4 Things you can do to improve TM1 performance today

We’ve started a regular newsletter here at PMsquare, so I was asked whether I could write a ‘not-so-technical’ how-to article. You can judge the result yourself; I’m reposting it below. Sign up for the newsletter, there’s lots of good stuff in it.

4 Things you can do to improve TM1 performance today

TM1’s in-memory calculation engine is lightning fast, but you might be wondering whether your installation has been performing up to that standard lately. After all, we all know that humans are extremely fast in theory (think Usain Bolt), but not every one of us is capable of showing such performance in the backyard right now. Fortunately, you don’t have to spend months in the gym to improve TM1. Read on for 4 simple steps to better performance and increased user happiness. And yes, you can read it on the treadmill.

Upgrade

I’m sorry to open with something so obvious, but if you’re currently on good old 9.5.2 or even 10.1.x, by far the biggest performance boost you will get is upgrading to 10.2. There are 2 main things that will improve performance drastically:

– Multi-threaded queries (MTQ). You know those ’bird’s eye’ views on large cubes that make you wait 30 seconds or more? With MTQ they’ll finally be calculated on multiple CPU cores at the same time, giving you a near-linear increase in performance. That’s right, with 4 cores the same view will be rendered almost 4 times faster (see the configuration sketch after this list).

– New TM1Web. IBM rewrote TM1Web from scratch in 10.2, migrating it to Java instead of .Net. No more IIS tuning or .Net framework issues, and up to 5x faster rendering of web reports and Contributor applications. Network performance is also massively improved, so if you’ve got a distributed user base (multiple countries or cities), this will improve everyone’s lives. We’ve got anecdotal evidence that this alone caused users to ‘flock back’ to TM1 from their beloved Excel spreadsheets. If you’re currently using Citrix to alleviate TM1-over-WAN problems, upgrading to 10.2 will let you switch those licenses to some other application that needs them more.

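In 10.2, MTQ is switched on with a single parameter in the server’s tm1s.cfg. A minimal sketch, assuming a box with at least 4 cores to spare; the thread count is yours to pick, and MTQ=ALL uses every available core:

# tm1s.cfg excerpt: let view calculations use up to 4 worker threads
MTQ=4

A common approach is to leave a core or two free for the operating system and other services on the host rather than simply setting it to the maximum.
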
There are many more things that make the switch worthwhile:

– Scorecarding cubes

– Contributor applications if you’re not using them right now

– Deployment packages in IBM Performance Modeler

– And much more


Check virtual machine resource allocation, CPU allocation and hyper-threading

Virtualized environments are the de facto standard these days, so it’s always worth checking that your TM1 server resides on a VM host that has ample memory (no contention, no swapping) and the fastest CPUs (by per-core speed) available in your virtual server farm. Even in 10.2 a lot of things (all TurboIntegrator processes, for example) are still single-threaded, so per-core performance is still very important. There’s roughly a 40% spread in per-core speed across modern Intel CPUs, which makes it an important factor when selecting hardware for a TM1 server. Simply moving TM1 to a faster host within the same virtual farm could give you a 20-40% boost, so it’s well worth trying out.

And test turning off hyper-threading for the VM that hosts your TM1; this can potentially give you a significant performance boost as well.

 

Set up VMM and VMT

Relatively few people are aware of the nuts and bolts of the Stargate cache mechanism used in TM1. In a nutshell: when you query a view, TM1 tries to store the results in a cache to avoid recalculating the rules again if somebody else asks for the same data. It’s actually far more complex and clever than that: TM1 tries to guess what else you’d be interested in and pre-calculates that as well by expanding the view.

All this caching magic is controlled by 2 parameters:

– VMM: how big the cache can be for a cube

– VMT: how long a query must run before its result is selected for caching, in seconds

So if you increase VMM, you allow more results to be cached, and if you decrease VMT, even short queries will end up in the cache.

Try adjusting these parameters for the cubes that drive your slowest reports to see if you can improve things with caching.
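
Both parameters live in the }CubeProperties control cube, so you can change them either directly in Architect or from a TurboIntegrator process. A minimal TI sketch, assuming a hypothetical ‘Sales Reporting’ cube; the cube name and values are illustrative only, with VMM given in KB and VMT in seconds:

# Prolog of a TurboIntegrator process: give the Stargate cache a ~250 MB
# budget and cache any view that takes longer than 1 second to calculate.
# }CubeProperties stores these settings as strings, hence CellPutS.
CellPutS( '256000', '}CubeProperties', 'Sales Reporting', 'VMM' );
CellPutS( '1', '}CubeProperties', 'Sales Reporting', 'VMT' );

Keep in mind that cache memory comes out of the same RAM budget as everything else, so it’s better to raise VMM for a handful of reporting cubes than across the board.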


On a side note: any user input invalidates the caches, and preserving caches is usually one of the key reasons for separating input and reporting cubes.


Try to cut out overfeeding

This isn’t as easy as it sounds, since feeders and overfeeding are arguably the most complex and intricate part of a TM1 model. But there’s at least one fairly simple and straightforward way to detect overfeeding that we commonly use:

1) Start the Performance Monitor.

2) Open up the }StatsByCube control cube and calculate the ratio of Stored and Populated Cells (both Numeric and String) to the Number of Fed Cells for the largest cubes (by Total Memory Used).

3) We use a rule of thumb to judge whether this ratio is unreasonable: in most cases 50 is suspicious, and more than 100 is almost certainly overfeeding. In general, you can quickly estimate the number of fed cells you need from the calculation requirements. For example, if we know that for every input cell in Local Currency we need to calculate a Reporting Currency one (a ratio of 2) and 10 different accounts (gross sales, net sales, discounts, etc.), we should expect a ratio of about 1:20 (2 currencies x 10 accounts). Ratios above 100 are most commonly a sign of overfeeding that can be fixed by redesigning the feeders (see the sketch below).
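
If you’d rather not eyeball these numbers in the Cube Viewer, the same check can be scripted. A minimal TurboIntegrator sketch that flags cubes whose fed-to-populated ratio exceeds 100; the measure names and dimension order are as we see them in }StatsByCube on 10.2 installs, so verify them against yours, and remember that the Performance Monitor must be running for the statistics to be populated:

# Prolog sketch: write out cubes that look overfed, based on }StatsByCube.
sStats = '}StatsByCube';
sOut = 'overfed_cubes.csv';
i = 1;
WHILE( i <= DIMSIZ( '}PerfCubes' ) );
  sCube = DIMNM( '}PerfCubes', i );
  # Fed and populated cell counts for the latest sampling interval
  nFed = CellGetN( sStats, sCube, 'LATEST', 'Number of Fed Cells' );
  nPop = CellGetN( sStats, sCube, 'LATEST', 'Number of Populated Numeric Cells' );
  nPop = nPop + CellGetN( sStats, sCube, 'LATEST', 'Number of Populated String Cells' );
  IF( nPop > 0 );
    nRatio = nFed / nPop;
    IF( nRatio > 100 );
      # More than ~100 fed cells per populated cell: review these feeders first
      ASCIIOUTPUT( sOut, sCube, STR( nRatio, 10, 1 ) );
    ENDIF;
  ENDIF;
  i = i + 1;
END;

The output lands in the server’s data directory as a simple CSV, which is usually enough to decide which rule files to open first.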


And you can always apply the more reliable and accurate technique of building an OverFeeds cube, as described in this proven practice article, to fine-tune the rules.

Posted in tm1
  • Alexander Dvoynev

I think TM1 9.5.2 will be like Windows XP: 80% of users will use it until the support ends :)

    I usually use the formula “Memory used for feeders takes more than 90% of Total memory used”:
    (1 is bad, MUF = Memory used for feeders, TMU = Total memory used)
    IsLarge = ABS( SIGN( (TMU - 200000000) - ABS(TMU - 200000000) ) );
    IsOverfeeded = ABS( SIGN( (MUF/TMU - 0.9) - ABS(MUF/TMU - 0.9) ) );

    IsBad = IsLarge * IsOverfeeded;

    Do you think using the numbers of cells is better than using memory? Why?

  • ykud

    Hi Alexander,

    Thank you for your comment, nice formulas, although I think you’re way too restrictive with 0.9 ;)

    I generally try working with cell counts, because memory used is derived from the number of cells multiplied by some factor (usually 16-32 bytes per cell). Memory per cell depends on the number of dimensions and the dimension order, so I prefer to exclude all that complexity and isolate just the cell counts. After all, it’s the number of cells that drives the memory, not the other way around.

    Re 9.5.2: sure, it will take quite a bit of time, but one can always hope )

    Cheers,
    Yuri

  • nick_leeson

    I would have added MTQ as well!! It is by far the single biggest improvement in TM1 10.2. Otherwise a good article.