TM1 + Enterprise Planning, When, Why and How?

This post will be some extracts from a recent PoC, where we made a complex planning system including a rich scenario modeling part. So it was our first big TM1 + EP integration experience.

Some basic definitions:
Cognos Enterprise Planning (aka Adaytum EP) is a very scalable system, all computing is client-side, nice workflow organization, nice GUI (version 8.4 Eclipse one especially) with huge limitations when it comes to the moment of “show me the whole budget in one place” or to serious what-if modeling. But scaling is trivial, each server acts as many clients simultaneously (number of cores\CPUs) and adding more servers scales the system linearly. So to build really scalable systems, we made Contributor only working places, with all analysis done in Cognos 8 BI and Analyst was turned into a modeling tool. In adaytum and cognos 7 times a lot of things(consolidating, what-if) could be done (and to be honest, still can be) in Analyst, but just up the point you reached 40mln cells cube limit in Analyst, then you started dividing cubes and the system became nightmare. Transforming the same consolidation task into series of Contributor applications connected by admin links gave huge performance enhancements, but the "show me the whole budget" limitation was a major problem.
Cognos TM1 (aka Applix TM1) is an in-memory OLAP engine, ultra-speed calculations, server side computing, excel based gui (with web publication), limited workflow capabilities (compared with current EP), scalability limitations (imagine putting servers across whole Russia to ensure response time)

So when it comes to a planning system with large user base – it’s EP, but if some serious modeling and calculations are requied as well – it’s time to fire up TM1 as well.

Overall interaction scheme seems pretty simple:
tm1ep

1) Most of users work in EP inputting budgets, submitting and doing all other workflow related activities
2) After some user actions (saving budget, submitting, accepting it) data is transferred into TM1.
a. Detecting the desired action is easy – we’ve got Event Studio for that (see here and here).
b. Incremental publish is fired and changed elist is republished – that’s pretty fast
c. Then there’s a problem to extract only changed data from publication. That’s the same as with incremental admin links, you just write a simple database procedure that logs last data transfer time and create views showing only recent data (as usual, you want code samples – drop a line)
d. Changed data is bcp’ed\sqlloaded to TM1 server
e. TM1 TurboIntegrator process is fired, loading the extracted data into TM1
3) Analysts work with “live” data in TM1, doing what-if analysis (they dynamically add scenario versions for this purpose), having access to all enterprise budget and actual data.
4) Then a version could be used for top-down propagation and exported into EP (that’s pretty simple, just some writing into EP applications import tables)

The whole system seems pretty healthy, data transfers happens with 1-2 minutes latency (we used Save as trigger action). Doing any kind what-if analysis on a budget version only 2 minutes "stale" with ultra-speed seems very impressive to me. For instance, in that particular PoC the goal was to speed up overall company budget consolidation, including complex allocation and elimination rules. It took over 8 hours in their current system. And only 4 minutes in Tm1 -- imagine all the benefits you can get from that kind of speed.

Since there are more and more rumours about coming TM1+EP bundle, I just hope that step 2 will be more automated. For example, they can track delta xml's sent by contributor client, containing all data updates, instead of publishing and bcping. But I can see no way they can get rid of neccesity of creating 2 models, one in Analyst for Contributor applications and another in TM1. Although dlists\dimensions can be synchronized, calculations are utterly different (you can target specific cube cell in TM1 formula and only the cube slice in EP).

Essbase ASO key structure

Yet another post on Essbase )
Internal ASO storage structure is a block box, so unlike BSO. So while Roske haven't writen a book on ASO option, we have to wonder in the dark. 

The main description of ASO storage is "data is stored as key\value pairs". This quote comes from ASO Tuning WhitePaper, the only "internals" technical description of ASO out there. It's for 7, but all the concepts are still valid.

So, "key-value pairs" it is. Key length is very important then, because it directly affects:

a) size of the database, if you've got a billion rows cube, then -1 byte for every cell means 1 Gb less overall cube size (that's raw size, it'll be compressed later). 
b) query processing -- for every query cell key got to be calculated for data retrieval. Less key, faster the query, as I think.

You can see the key description on Application Properties page or issuing "query database test.db list aggregate_storage runtime_info;"  MaxL command .
Here's the sample output. 

parameter                                         value                                            
+-------------------------------------------------+-------------------------------------------------
 Dimension [Date] has [3] levels, bits used                                                        1
 Max. key length (bits)                                                                            1
 Max. key length (bytes)                                                                           8

we see following characteristics of dimension -- number of levels and number of bits used to encode values in this dimension. Cell key is concantenation of each dimension keys. 

But that number of bits is pretty cryptic, though. On a project we're doing we have a couple of 5 mln elements dimensions and key length for them is around 40 bits. If you use simple binary encoding -- you can encode 2^40 elements (that's really a huge number, it contains 14 digits). It's no a simple binary encoding then. 

So how are dimension elements encoded? 

Well, it's allways easy afterwards, so I'm not so proud now, as I was when it first struck me )

So I now think that the technique called hierarchial encoding (or indexing) or something similar is used, so each dimension element is encoded the following way:

Let's assume that we have 3 levels in dimension, then the key will be formed like:
xxxyyzzzz
where xs -- are the bits requiered to binary encode level 2 parent
ys -- level 1 parent
zs -- the bottom level element

So to form the key length for all dimension you have to concantenate binary keys required to encode elements on every level. 

Therefore there are a few things to keep in mind while doing big-scale ASO projects:
- think about number of levels in big dimensions, keeping them short greatly decreases size
- try to keep number of elements even on every level -- that's funny, but it'll help to fully "pack the key" 

But that brings up a few more things about ASO i'm thinking of:
- how are pages in tablespace filled for data storage? I think some sort of hierarchial clustering, using key parts to point at pages
- what and how is stored in the outline (otl file)? Membernames and aliases for sure, but also the dimension elements keys, which are concantenated to get cell key and find the data page in tablespace.

 
So I'm still waiting for Edward to publish ASO book, there's a lot of questions. The book seems to be close )

Loading data into ASO cube

Hm, I guess it's the time for a first big essbase-related post. Today we'll talk about data loading in ASO cube.

Those who've listened to our recent Oracle events presentations (russian TechForum 2008, BI Forum 2009), know that we've made a nice PoC project with SportMaster, consisting of loading their year's worth of daily stock and sales data (around billion fact cells with over a million elements in products dimension).  All in all, I've decided to post a write-up about how to achieve acceptable load speed on such volumes and tell a scary-story, with a detective twist and a happy ending.

I won't go in much detail on basics, so:

Since it's a billion cells cube, it's ASO storage option.

Since it's an ASO Cube, we use load buffers for data loading.

Since we use load buffers, we try to split input data into chunks, to maximize data read speed.

Since we use load buffers and we were doing PoC on 9.3.1, we have to extract data from DWH to text files (in 11 we could use ODBC based load buffers).

Just a remark: Essbase Integration Services does a lousy job on either building dimensions (we're adding 14 attributes for products and loading a file again for _every_ attribute is a disaster) or loading data (no visible ability to run multiple buffer load). No comments on 11 Essbase Studio, haven't tried it.

So, we've extracted data from DWH to text files and are loading these files into text buffers. While loading data from file into buffer, with a simple rule file I could achive data read speed of around 6 Mb\s (on a 80 Mb\s disk system, we've got two such disk systems attached on PoC server). Loading two files in parallel, gave 12 Mb\s. So to fully load up disk system we need to chunk files into 6 parts. That actually doesn't present a problem, but seems a bit awkward. While testing, I've accidentially forgot "using server rules_file" in MaxL script and was astonished by load speed of 25-40 Mb\s! That's how I've found out about 'free-form data load' )

Essbase has a built in ability to load data without rule file (see DBAG). And, moreover, this is almost 10 times faster than loading with rule file. Easiest way to find more about free-form load is to export an ASO cube to text file (right-click on database in Essbase Administration console). Moreover, free-form allows to load data in cross-tab (pivot accounts in columns), therefore greatly suppressing number of rows. Using this technique requires additional data formating: spaces as separator, double quotes enclosure, writing "Missing" or "MI" for null values. But load speed is worth it )

So the bottom line: free-form data load, additional data massage -- 70 Gbs of data in less than 30 minutes, almost reaching half of disk system speed.

That's it for fast data loading, if you're on 9.3.1 i'd recommend trying to massage data into free-form and load it this way.

Now for detective part of the story. Read the rest of this entry »

Cognos 8 BI Dashboard Studio

Finally got my hands to installing and trying it.
Well, it looks nice, allows flash-based chart formating and turns Cognos portal pages into more dynamic "dashboards".

But it's strangely slow (tried on a couple of computers) and all in all looks kinda "alpha" and plugged in at last moment. Installation is not that easy too, it requires war building and deployement, procedures that not every BI consultant is aware off.

So I think we should wait for another couple of service packs and there'll be a new standard tool for portal page development. XCelcious should be afraid )

Cognos and MS SQL — happy together

 Correcting myself. I was ranting about MS SQL and Cognos interaction. I was totally wrong.

The problem, described shortly is:

  1. you have 2 databases in MS SQL residing on the same physical server

  2. you create a join “table A from db1 with table B from db2”

  3. Cognos creates two select queries (select * from db1.A) and (select * from db2.B) and then does join locally on server. That’s terribly slow. Cognos should be generating one select query, performing the join on the db server and that’d be really fast.

The answer is simple: there’s no way Cognos can find out that these two databases are on the same physical server. But we can help old fella a bit, setting a datasource property (Content Manager Datasource) the same for both datasources. Therefore, the same datasource prefix will be generated for both tables and they’ll be regarded as residing on the same server. Query speed will increase dramatically.

Morale:  always set the datasource property for MS SQL databases located on the same server. 

Default Framework Manager package publish folder

Just a quick tip -- this setting is stored in "lastPublishCM" property of fm.ini file

Gave me some trouble recently, when language changes led to unappropriate publish folder name. Moreover, we were not able to publish anything at all )

Changing this file helped. Found via FileMonitor from sysinternals, as usual.

Back from winter-sleep

Okay, it has been almost half-a-year (oh, dears) since the last decent post on this blog. Russian one has suffered as well, I must note.

Reasons vary, but mostly it's that I've been, you know, busy-busy. As I now start to reflect on it, it's always a point of view thing and a question of self-control and ability to say no )

But enough philosphy, brief recap on what happend while I was out "there" in real world. I'll divide this into two parts (by vendors ))

Read the rest of this entry »

Cognos EP Cut-down models

Just a quick tip: read this cognoise thread thoroughly -- really in-depth discussion.

Cognos 8.4 is finally out

Go check the download page.
One wonders, what's that 64-bit BI server is all about -- they've really pushed the report server further, or that's just compatability release.

Cognos BI and MS SQL — not meant for each other?

We've encountered an annoying Cognos 8 bug recently. 

Usual scenario if you're creating reports over Enterprise Planning data includes "unioning" data from several applications. 

For example, you have an Operating Expenditures planning application and a Sales Planning application. To make a simple P&L, you just have to show up data from both these applications. So you can either:
Read the rest of this entry »

FireStats icon Powered by FireStats