Applied Dimensionality

Should we use TM1 for reporting?

Posted at — Sep 29, 2010
Should we use TM1 for reporting?

Another point worth clearing out: I really think TM1 is not a very good choice for OLAP reporting engine. On a decent data volumes, of course, millions of rows will be fast as lightning ;) Let me explain why.

There’s a fundamental difference choice you take when you’re building an OLAP engine:  you decide whether it will be bulk updated or data will be inserted real-time. Consequences are substantial, you either optimize system for faster reads or inserts.

TM1 was designed for data input and calculations, so it lacks the main ‘read-only’ system optimization: robust aggregates.

Aggregates

Aggregates are a common solution to performance issues that rise up when you ask “So how many socks did we sell last year” on dataset of billion rows. You can either try to walk through all detail records and count socks, or you can store an answer beforehand, creating a so-called aggregate view (products,years). Moreover this aggregate view can be used to answer question like ‘How were our sales at all last year’, since this view is way smaller than original data. But to do so, system must be ‘hierarchy-aware’, which is to know that you can calculate all product sales by omitting product dimension from (products,years) view. This property is sometimes called aggregate navigation. This all boils down to cube lattices that i like dearly. Working with such lattices was my PhD topic )

All read-optimized OLAP engines are built around this concept:

Each of these engines has a lot of optimizations in query rewriting. aimed on optimal aggregates usage.

As a side note: so does Oracle with materialized views and dimensions concept and IBM MQ with cube views )

But there’s no such thing in TM1. There are stargate views, which simply store the opened view and it’s very unclear how good is query-rewriting afterwards. At least nobody I talked with knows anything about that and my tests didn’t show good results.

Generally, TM1 projects were all about planning, which doesn’t involve billions of rows, so performance questions didn’t come up often.

Another big question, of course, is data loading speed.

TM1 loads both data and dimension at the same speed of 30k records per second (it depends on record size, but we couldn’t raise it much higher) in one CPU thread (say hello to Transformer). I’ve loaded data ten times faster in Oracle Essbase and it’s not the maximum you can reach.

30k\s is actually not bad enough, but inability to scale up is very bad.

IBM positions TM1 as reporting-OLAP server as well as foundation for planning products, which, I think, is a bit inaccurate. I’ve tried to sum up my concerns, feel free to comment on your experiences.

comments powered by Disqus