Essbase ASO key structure

Yet another post on Essbase )
The internal ASO storage structure is a black box, quite unlike BSO. So as long as Roske hasn’t written a book on the ASO option, we have to wander in the dark.

The main description of ASO storage is “data is stored as key\value pairs”. This quote comes from the ASO Tuning white paper, the only technical description of ASO internals out there. It was written for version 7, but all the concepts are still valid.

So, “key-value pairs” it is. Key length is very important then, because it directly affects:

a) the size of the database: if you’ve got a cube with a billion rows, then saving 1 byte on every cell key means about 1 GB less overall cube size (that’s the raw size; it gets compressed later).
b) query processing: for every query, the cell key has to be calculated to retrieve the data. The shorter the key, the faster the query, or so I think.
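
To put point a) in numbers, here is a quick back-of-the-envelope sketch in Python. The function name is my own, and it deliberately ignores compression and any padding or alignment of keys, since I don't know how ASO handles those.

```python
def raw_bytes_saved(cell_count: int, key_bytes_saved_per_cell: int) -> int:
    """Raw (uncompressed) bytes saved when every cell key gets shorter."""
    return cell_count * key_bytes_saved_per_cell


saved = raw_bytes_saved(1_000_000_000, 1)
print(saved / 10**9)            # 1.0  (decimal gigabytes)
print(round(saved / 2**30, 2))  # 0.93 (binary gibibytes)
```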

You can see the key description on the Application Properties page or by issuing the MaxL command “query database test.db list aggregate_storage runtime_info;”.
Here’s the sample output. 

parameter                                         value                                            
+-------------------------------------------------+-------------------------------------------------
 Dimension [Date] has [3] levels, bits used                                                        1
 Max. key length (bits)                                                                            1
 Max. key length (bytes)                                                                           8

Here we see the characteristics of each dimension: the number of levels and the number of bits used to encode values in that dimension. The cell key is the concatenation of the individual dimension keys.
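
If you want to add up those per-dimension bits yourself, a small Python sketch like this could do it. It assumes the output lines look exactly like the sample above; the regex and the function name are mine, not anything that ships with Essbase.

```python
import re

# Assumed line shape, taken from the sample output only:
#   Dimension [Date] has [3] levels, bits used        1
DIM_LINE = re.compile(
    r"Dimension \[(?P<name>.+?)\] has \[(?P<levels>\d+)\] levels, "
    r"bits used\s+(?P<bits>\d+)"
)


def parse_dimension_bits(output: str) -> dict[str, tuple[int, int]]:
    """Map dimension name -> (number of levels, bits used)."""
    result = {}
    for line in output.splitlines():
        match = DIM_LINE.search(line)
        if match:
            result[match["name"]] = (int(match["levels"]), int(match["bits"]))
    return result


sample = """
 Dimension [Date] has [3] levels, bits used          1
 Max. key length (bits)                              1
 Max. key length (bytes)                             8
"""

dims = parse_dimension_bits(sample)
total_bits = sum(bits for _levels, bits in dims.values())
print(dims)        # {'Date': (3, 1)}
print(total_bits)  # 1
```

For a real cube you would paste the full output in place of `sample` and compare `total_bits` with the reported maximum key length.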

But that number of bits is pretty cryptic. On a project we’re doing we have a couple of dimensions with 5 million elements each, and the key length for them is around 40 bits. With simple binary encoding, 40 bits can encode 2^40 elements (that’s a really huge number, 13 digits long), while 5 million elements would fit in just 23 bits. So it’s not a simple binary encoding.
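
The mismatch is easy to check with plain arithmetic. Here is a minimal Python sketch; `bits_needed` is just a helper name I made up, nothing from Essbase.

```python
def bits_needed(n_members: int) -> int:
    """Smallest number of bits that gives n_members distinct binary codes."""
    if n_members < 1:
        raise ValueError("need at least one member")
    return (n_members - 1).bit_length()


print(bits_needed(5_000_000))  # 23
print(2 ** 40)                 # 1099511627776
print(len(str(2 ** 40)))       # 13
```

So a flat numbering of 5 million members leaves about 17 of the 40 reported bits unexplained.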

So how are dimension elements encoded? 

Well, it’s always easy afterwards, so I’m not as proud now as I was when it first struck me )

So I now think that a technique called hierarchical encoding (or indexing), or something similar, is used, so each dimension element is encoded the following way:

Let’s assume that we have 3 levels in the dimension; then the key will be formed like:
xxxyyzzzz
where the x’s are the bits required to binary-encode the level 2 parent,
the y’s are the bits for the level 1 parent,
and the z’s are the bits for the bottom-level element.

So to get the key length for the whole dimension, you have to concatenate the binary keys required to encode the elements on every level.
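
Here is how I picture it, as a small Python sketch. This is only my guess at the scheme, not something confirmed by Oracle: the widths 3, 2 and 4 come straight from the xxxyyzzzz picture, while the function names and the example positions (5, 2, 9) are made up for illustration.

```python
def pack_key(positions, widths):
    """Concatenate per-level positions into one integer key, top level first."""
    key = 0
    for pos, width in zip(positions, widths):
        if not 0 <= pos < (1 << width):
            raise ValueError(f"position {pos} does not fit in {width} bits")
        key = (key << width) | pos
    return key


def unpack_key(key, widths):
    """Split a packed key back into per-level positions."""
    positions = []
    for width in reversed(widths):
        positions.append(key & ((1 << width) - 1))
        key >>= width
    return positions[::-1]


widths = [3, 2, 4]                    # xxx yy zzzz
key = pack_key((5, 2, 9), widths)     # level 2 parent, level 1 parent, leaf
print(format(key, f"0{sum(widths)}b"))  # 101101001
print(unpack_key(key, widths))          # [5, 2, 9]
```

The nice property of such a layout is that every prefix of the key identifies an ancestor, which would fit a storage engine built around aggregating along hierarchies.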

Therefore there are a few things to keep in mind while doing large-scale ASO projects:
- think about the number of levels in big dimensions; keeping the hierarchy shallow greatly decreases the key size
- try to keep the number of elements even on every level. It sounds funny, but it helps to fully “pack the key”
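
To see why both points matter, here is a rough Python sketch that compares a few hierarchy shapes. It rests on one big assumption of mine: that the bit width of a level is set by the largest number of children under any single parent on that level. The shapes and names are invented for the example.

```python
def bits_needed(n: int) -> int:
    """Smallest number of bits that gives n distinct binary codes."""
    return (n - 1).bit_length()


def key_bits(max_children_per_level):
    """Total key width if each level needs enough bits for its widest parent.

    ASSUMPTION: this is my reading of the scheme, not documented behaviour.
    """
    return sum(bits_needed(n) for n in max_children_per_level)


print(key_bits([4096]))          # 12: flat list of 4096 members
print(key_bits([16, 16, 16]))    # 12: 3 levels, 16 children each, fully packed
print(key_bits([17, 17, 17]))    # 15: one extra child per level costs 3 bits
print(key_bits([10, 10, 1000]))  # 18: a single very wide parent dominates
```

Under that assumption, fan-outs just above a power of two and a few unusually wide parents are what inflate the key.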

But that brings up a few more things about ASO I’m wondering about:
- how are pages in the tablespace filled for data storage? I suspect some sort of hierarchical clustering, using key parts to point at pages
- what is stored in the outline (.otl file), and how? Member names and aliases for sure, but also the dimension element keys, which are concatenated to get the cell key and find the data page in the tablespace.

 
So I’m still waiting for Edward to publish an ASO book; there are a lot of questions. The book seems to be close )