DMR’s pros and cons
March 13th, 2008DMR stands for Dimensionally Modelled Relational, a Cognos modelling technique allowing to present relational data sources as OLAP cubes (this meaning adding notion of dimensions with hierarchies and measures with various aggregation rules). All OLAP-style queries, roll-ups\drill-downs are then transformed into appropriate sql (group by's, aggregations) by Cognos Server. It's not a Cognos-only idea, see Microsoft UDM (Unified Dimensional Model) and Oracle BI EE works essentially the same way. The whole idea of dimensions\measures seems easy to understand (not by relationally bread-ones, though) so this modelling concept is gaining popularity.
And Cognos EP published data is represented by DMRs automatically )
Having played with DMR's on recent projects I'd like make some conclusions and complains.
I'll start with complains section -- more annoyance, more desire to write it out. If you know any solutions to problems stated below -- please mail me.
Complains:
1) No Sorting of Dimension Elements
We've encountered a simple but more than annoying problem: dimension elements wouldn't sort anyhow.
With help on cognoise.com got to the idea that business keys sorting (alphabetical) would work for all levels, except bottom one. Then cognos support answer revealed that we had to sort elements in underlying database, so we've added order by clause for dimension queries.
Then it appeared that order by is ignored due to the whole dimension query is packed in partition by's with min and etc. Manually patching oracle sql generation helps, but breaks other queries.
Using pass-through sql helps, but only as an insult -- sorting works, but no query optimization from that point, only "select * from" and filtering being done at cognos server. Totally performance no-go.
And there's no fix - officially Cognos BI doesn't sort DMR modelled dimensions for now.
2) Dimension element Hide toggles full table scan
Some insights on how cross-tabs (all, including DMRs) work in Cognos BI.
Suppose you want a simple product - years sales cross tab.
Just like
| Products | 2005 | 2006 |
|---|---|---|
| Product 1 | 100 | 200 |
| Product 2 | 400 | 300 |
| Product 3 | 500 | 600 |
What happens when run such a cross-tab against your huge fact table?
1) Cognos generates
select year,product, sum(sales)
from sales_fact
group by year,product
returning 6 rows
2) Result of the query are transformed into a small "cube" (powercube like) in temp\ directory (called ***.dmc)
3) This "cube" is used as source for cross-tab
Cool, eh?
Now what will happen if we want to hide "Product 2", for example?
1) Cognos will generate
select year,product, sales
from sales_fact
returning millions of rows
2) then filter them for <>"Product 2" and sum qty locally
3) create a "small' cube and show you hidden result (I couldn't wait for that, to be sincere, the mere sight of gigabyte temporary files being downloaded from dwh stressed me too much).
That's a problem, I don't understand why simple "where Product <>" in sql wouldn't do the job.
3) Bottom level search impossible.
Having a long Customer dimension (50k of elements) -- search by customer name like %Smith&% is vital. But if you've got hierarchies in DMRs, you can only search the current level, 'search all details' option is unavailable. We've had to create powercubes to get this functionality (and performance)
4) Hard dimensional formula writing\debug.
It ain't easy to write MDX from the start, but it's way harder in Cognos DMR's.
Basically, it's because of:
a) poor\none documentation
There's an excellent document called Dimensional Functions in Cognos 8 that really helps. But it's an overview, all precious details and peculiarities have to be discovered by testing and that's not the easy way. Detailed specs would be way much better.
One of the recent examples was with dimension roles.
In time dimension I've had a special field, storing the last work day of week key.
I've thought that I could use it to create work days subset (like all works of the current week up to day chosen in prompt). Writing something like "filter([Sales].[Time].[Day];[Sales].[Time].[Day] between roleValue('start_of_work_week';[Selected Week]) and roleValue('_businessKey';[Selected Week])).
And it turned out that roleValue results can not be used for > comparison (that's what I think after day spent testing). Why isn't that explicitly written out?
b) unintuitive debug
It's really hard to see what's going under the hood with dimensional functions, what is passed, how it's evaluated, how -> operator works and all that. Can I use [Sales].[Time].[Day]->roleValue('start_of_work_week';[Selected Week]) to get the desired day? And it isn't written anywhere I can reach. So it all up to changing display types and trying to guess.
*Pros, or why should you use DMRs:*
All other things work -)
1) Analysis studio is available and that's a great + in my opinion. It's really a nice tool, though still worse than client-side powerplay in aspects of advanced subsets and etc.
2) You can write dimensional formulas (like PeriodsToDay, TopSum and a lot of others) in reports. Eases report design a lot.
3) You can add drill-down functionality to your reports at zero cost.
4) Dimension Attributes. There are none in PowerCubes and it's a nice analytical capability that I sure don't want to miss. Like filtering stores by square footage > 100 m^2 in Analysis Studio.
“It ain’t easy to write MDX from the start, but it’s way harder in Cognos DMR’s.”
Where can you write MDX for DMR? I know you can write MDX in Report Studio for OLAP sources but I can’t figure out how to write MDX for any DMR (in FM or Report Studio)
I’ve meant that Cognos dimensional functions are essentially mdx one’s. See filter, periodstodate, siblings and all the others.
What you really lack is “Select on Rows, columns”, but that’s what cross-tabs are for.
Hi,
We are in the proces of deciding whether to employ a DMR or Powercube. Currently we are leaning towards a PowerCube simply due to the improvement in performance. But I’m conscious of whether there are any disadvantages to this approach (apart from the hassles of updating cubes)?
This document has helped me understand the options a little better.
But I was wondering what your opinion is? Would you recommend we use DMRs or PwerCubes?
Regards
Mick
Hi, Mick.
Some problems you’d face with powercubes:
1) Potential security duplication. Security is defined for powercubes in completely separate fashion (dimension views), not inheriting FM security. So if you are going to use powercubes and relational reports with security defined — you’d gonna write some automation scripts or do duplicate security definitions.
2) Some problems with custom rollups. FM allows adding aggregation rules on every dimension, in transformer it’s just time and all others. It requires some tricks time from time.
3) Another utility to tune up. Trying to optimize hdd usage, tinker with load rate — that’s a bit different story and a different view.
I usually start with DMRs (easy to debug\reload while etling a DWH) and switch to PCs just for performance reasons.
So if you have lots of data (more than 5m rows in fact table) — consider PCs, but be aware of problems mentioned above.
Hi
Have you had any issues with DMR performance?
The Cognos SQL rewrite in Analysis Studio means that the query never utilises star transformation, and therefore causes ‘nested loops’, which have a knock on effect on perfomance.
Regards
Rob
Hi, Rob.
In my experience, it only took some minor tuning (switching to native sql for fact tables) and then Star transformations worked brilliantly. I’m currently using 10mln row fact tables with star queries and it works pretty nice.
Can you share some details on your configuration (oracle + cognos versions) and generated sql?
Yuri.
Hello,
We are trying to build out a DMR with Cognos 8.3 and MS SQL 2005 (32 bit). Our consultant suggested the idea of a DMR and said that performance would not be a problem. Our consultant then quite to work for another firm and we are left in the lurch to figure this out on our own.
Our fact table is currently about 20 mln records; 8 years of data. If you joined all the tables as they are used by the OLTP, you end up with a pretty big snowflake with several circular joins.
We have imported that tables from a replicated copy of the OLTP, created views on the database and have imported the tables and views into FM. The views were built anywhere we created a snowflake in our model so that our model is a star schema in FM.
We have not dimensionalized the data yet beyond this – no hierarchies.
When we run a query agains a years worth of records (about 2 million records), it can take up to 10 minutes to return.
Is this normal, or are we missing something?
Any ideas on how to improve the performance?
Thanks,
Bob
Hi, Bob.
Well, 20 mln of data doesn’t seem big enough to start causing troubles on Ms SQL 2005. Although your “circular joins” are a bit worriyng — there shouldn’t be anything like circular joins in well-designed reporting scheme. If it’s possible, I’d like to look at your current design, to see if there are are any improvements at database level..
10 mins of data processing for query isn’t reasonable time (although it depends on your hardware as well) to do analysis, 5s-1m is only interval possible.
DMRs are pretty good on performance side, when you’ve got a good DBA nearby to tune up the SQL server itself and your FrameWork manager scheme is pretty straight-forward (just a simple star, for example). In other cases, it may turn out usefull to build some aggregate (in your case, full) cognos powerplay cubes to speed up query processing.
- Yuri.
Hi Y,
How do you implement Relative Time in DMR? Do you have to pre-calculate all previous year/quarter/month, YTD/MTD/WTD … and populate them in your time dimension table in RDBMS?
Thanks,
E
I’ve seen these same DMR issues in 8.1. I haven’t used DMRs in any later versions though. Have these problems been addressed by IBM/Cognos? Thanks for your insight!
— Jason
Hi, Eric.
You use analytical functions like PeriodsToDate to get the set of months and apply some aggregation like “total [Sales]” over this set. It’s relatively easy and Cognos handles SQL generation. That’s what DMR’s are good for.
You can store these values, but I wouldn’t be bothered with that untill you reach serious data volumes (10-100 mlns of rows in fact tables).
Hi, Jason.
Not in 8.3 ( I’m currently working with 8.4 and I’ll post out findings. Thanks for pointing to this list, I’ll simply check them all )
Hi Y,
Now I am reviewing some approachs and architecture defined by the external consultants that have worked on our Cognos BI… All of our relational star schema data is referenced in Cognos by DMRs, but as far I can see it’s the WORSE option for large tables, specially for this full table scan behavior, no matter what filters we use…
But if it is not the best option, which one could be? Configure ‘pure’ relational model in Framework Manager? Powercubes are very restrict because the attributes are not available.
By the way, congratulations for the blog.
Yuri,
I’d share some add-ons from my experience.
I usually follow the next way:
1. Trying to create aggregated tables and use them as back-end to the DMR (lots of the analyze about indexes user’d need) meta-data model.
2. Agree with you that are easy to start from using DMR.
3. Only in case of performance issue I am gonna start using the Cubes.
Cubes vs. DMR:
Timeliness – Contain “real time” data (DMR); Data in cubes need to be updated (Cube).
Flexibility for user – Static (DMR); Dynamic (Cube).
Usage – Management/Statistical reporting (DMR); Reporting and Analysis (Cube).
Then it’s always too much important to find out about dimensions to use in one Cube, ~4-5 it’s pretty critical amount on my mind.
And I cannot say anything about security. Lots of time I had to refuse from start using Cubes when the security scheme was pretty complicated with a lot of dynamic combination especially when I read security information from RDBMS.
Good Luck!
Hi, Andrei.
I usually try to avoid 1 where possible (on Oracle sites, it means) by adding Materialized Views.
Agreed on all other points.
TM1 will change dimension limits a bit, it seems )
Security will remain a main issue though.
Thanks for sharing,
Yuri.
So far building DMR models in C8.4.1 has been an absolute pleasure. With small datasets performance has been excellent.
Next models will introduce multifacts. In relational models we can use determinants and cardinality and Cognos makes use of stitch queries to allow multifact reports.
I would be VERY interested to know if this is feasible with DMR. Actually I will be in a world of hurt when this fails and it will mean going back to PC’s again.
(Can’t really , cause I have multilingual models)
Please share your thoughts!!
Answered you email.
Just for other readers — multifact DMRs work )
- Yuri
Thanks for this interesting blog.
Have been in the same boat to decide to go with DMR or with Cubes.
We have few large dimensions (5mln-10mln records), few large facts tables (again 5mln-10mln records) and rest other tables with maybe less records or so..in our DW.
Currently, we have a relational package, canned reports and ad-hoc reports based on this relational package.
The main business need is for ad-hoc and analysis.
DMR sounds appealing, as we have the realtional foundation. But, will DMR have performance issues
Or going the cube route will be more appropriate?
Appreciate if some expereinces are shared or any suggestions.
Thanks,
Hi,
I would go on DMR’s because of:
- single point of data delivery, no special routines for cube building, schedules, etc
- single point of providing data security — only FrameWork Manager, no propagation of rights to Transformer
- ability to use attributes and multiple languages — if you’re thinking of Transformer.
With following remarks:
1) Make sure that your DW design is good for reporting. If you’re having dimensions of the same size as fact tables — it seems a bit strange. Maybe you’re trying to catch to much history in dimension, whereas it should be in fact table? Like employees moving from one dept to another, or customer address changes. But it depends on subject area as well.
2) Make sure that you’re having decent hardware for DBMS – for 10s of millions any ‘mid-server’ (2-4 CPUs, 16G of Ram, normal I\O — a RAID5 or 10 controller with 5-10 disks) will suffice
3) Make sure that you get all of your DBMS — use 64bit version, use DW-features of your DBMS (bitmap indexes, materialized views for Oracle, MQT, MDC for DB2) and spend some time designing indexes for your schema.
I’ve personally used DMR’s over 100mln rows (in a very compact fact table, around 10gb size) — all was well enough.
- Yuri.