Cognos Content Store Security Overview in SQL

Ever found yourself wondering which users are really in your BI Consumers group, or in any other group? Sure, you’ve got a few of them explicitly listed there (bad practice), but mostly you’ve got just a list of other groups like “Western Division”, “Marketing” and so on (good practice). And they’re most likely nested into each other a few times (like Western Division OpEx and Marketing Asia). Or, looking at it from the opposite point of view: which groups is Jack Smith really in (again, counting all the nested ones)? Is he a member of Western Division or not?

Built-in Cognos security administration is quite awkward (compared to the long-gone Access Manager, sigh), so questions like the ones above come up a lot. There are 3 main solutions:

1) Buy an externally developed tool from BSP or Motio. I had a chance to look at BSP MetaManager recently and it does a whole lot of stuff to make your administration life easy. It’s a bit of an overkill just to ask “who’s there”, but totally worth it if you’re going big scale / long term.

2) Buy an SDK license, take the sample script from IBM as a base, extend it for just this task, and then use the SDK left and right to do anything you want in Cognos. If you already have an SDK license through some global license deal, the sample script may be just what you need to get the job done.

3) Write SQL queries against the Content Store database to return just this info. It’s an unsupported (I’ll stress this again: unsupported) way of doing it, but if you just want to have a quick look, or are happy to lose it at some upgrade point (although it works on both 8.4 and 10.1, since the CS database schema isn’t changing that much), here are the scripts for both MS SQL Server and Oracle.

I used the script from this SQL.RU topic as a base and added unwrapping of nested groups with CONNECT BY in Oracle and CTEs in SQL Server. You can turn this SQL into an FM query subject and have a nifty report.
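To give a feel for what unwrapping nested groups with a recursive CTE looks like, here is a minimal sketch. It does not touch the real Content Store schema: the `membership` table, its columns and the sample rows are all made up for illustration, and it runs on an in-memory SQLite database (SQLite spells it `WITH RECURSIVE`, SQL Server uses plain `WITH`).

```python
import sqlite3

# Hypothetical flat membership table, NOT the real Content Store schema:
# each row says "member belongs directly to parent".
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE membership (parent TEXT, member TEXT);
INSERT INTO membership VALUES
  ('BI Consumers', 'Western Division'),
  ('BI Consumers', 'Marketing'),
  ('Western Division', 'Western Division OpEx'),
  ('Marketing', 'Marketing Asia'),
  ('Western Division OpEx', 'Jack Smith'),
  ('Marketing Asia', 'Jane Doe');
""")


def effective_members(conn, group):
    """Return every direct or nested member of `group`, sorted by name."""
    rows = conn.execute("""
        WITH RECURSIVE unwrapped(member) AS (
            SELECT member FROM membership WHERE parent = ?
            UNION
            SELECT m.member FROM membership m
            JOIN unwrapped u ON m.parent = u.member
        )
        SELECT member FROM unwrapped ORDER BY member
    """, (group,)).fetchall()
    return [r[0] for r in rows]


def groups_of(conn, member):
    """Return every group `member` belongs to, directly or via nesting."""
    rows = conn.execute("""
        WITH RECURSIVE up(grp) AS (
            SELECT parent FROM membership WHERE member = ?
            UNION
            SELECT m.parent FROM membership m
            JOIN up u ON m.member = u.grp
        )
        SELECT grp FROM up ORDER BY grp
    """, (member,)).fetchall()
    return [r[0] for r in rows]


print(effective_members(conn, "Western Division"))
print(groups_of(conn, "Jack Smith"))
```

The first function answers “who’s really in this group”, the second answers “what is Jack Smith really in”. Using UNION rather than UNION ALL also keeps the recursion from looping forever if two groups happen to contain each other.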

 

 

New recruit to my ETL toolbox

I’ve recently completed my first real DataStage project and took the chance to get certified while all the stuff is still “fresh”. The certification itself is quite complex, and I hadn’t used most of the tricks covered in the questions up until the moment when one of the jobs had to process a quarter of a billion rows in a reasonable timeframe. From that point on I learned quite a lot about partitioning, balancing, debugging and choosing the right stages to do the job (who would’ve thought that RemoveDuplicates is waaaaay slower than Sort with the Remove Duplicates option? Why put an RD stage in at all?). Anyhow, now I’m also an IBM Certified Solution Developer, InfoSphere DataStage v8.5 )

So my current ETL tools breakdown goes something like this (not counting PoCs and the like):

  • Oracle Data Integrator: 3 projects
  • Pentaho Data Integration: 2 projects
  • IBM InfoSphere DataStage: 1 project

And that’s my current preference list as well. I love ODI’s flexibility (it’s actually very simple once you get how it works, and it’s extremely configurable) and its ELT approach (I’d rather be tuning my DBMS than a DBMS plus a separate ETL engine). PDI is very open and quite user-friendly (compared to DS, for example), and it’s easier to understand & debug than ODI. The PDI community edition is enough for most integration projects with small data volumes, and the enterprise version is very affordable. DataStage is terrifically well-suited for big data volumes and parallel processing, but is quite an overkill in small projects.

It’s interesting that although I’ve done quite a bit of DWH model design, I have written just a few posts on this topic. But every time I think about writing out some advice, I conclude that the best advice is to just go read the books. And if you still have questions, reread them ) I’ve reread Kimball’s books a few times already, and every time gives me an “ah, that’s what they meant” moment based on my recent experience.

Anyhow, my last couple of major DWH projects were for government agencies, and I’ve collected a number of simple but effective modeling tips specifically for them. Hopefully I’ll write them up in the near future. Just need a free weekend or two.

Cognos Express and BI free trials

I must be the last one to notice, but you can actually try out Cognos 10.1 BI and Cognos Express for free (time-limited trial). That’s very convenient and is a major shift from the usual ‘we have products soooo good that you have to ask partners to show them’ policy.

The ‘open download and use, but paid support’ policy advocated by Oracle (for example) can actually attract more people to work with the products and to buy them eventually, even when they have already moved to a different company. It has worked really well in the Russian DBMS market, where Oracle is the only ‘big enterprise’ choice.

On the other hand, both BI and Express are quite complex products, so the mere effort of trying to make them work on non-obvious tasks can lead to utter frustration. But both of them come packed with demos and samples, so that should make life a bit easier.

All in all, I’m really glad this is happening and am rooting for time-unlimited trials. You won’t build a serious system without fix packs anyway )

Want your TM1 to run twice as fast on an Intel box? Turn Hyper-Threading off

This’ll be a bit long (but with a hidden bonus for the attentive reader), so I’ll start with the conclusions. If you’re using a recent server with Intel CPUs, you’d better check whether they have Hyper-Threading (HT) and try turning it off: you may gain a 2x speed boost.

How to do it:

1) Go to the server, open a command line and type systeminfo (processor info will be in the first ten lines). Or, if you don’t like doing it the cool way, open Control Panel: CPUs are described on the second pane )

2) Check whether your CPUs are in this list

3) Ask your server admin, or reboot the server yourself, go into the BIOS and turn Hyper-Threading off

4) Test your TM1 processes, they may speed up considerably

 

What’s behind the scenes.

Hyper-Threading, in a nutshell, is a technique that presents each physical CPU core as 2 logical ones. Since during normal work a lot of time is spent on thread switching and the related register load/unload and yada yada, physical cores are actually underloaded in a typical multithreaded system. Adding a duplicate register set and a ‘virtual’ core allows the physical core to be utilised up to 20-40% more, which improves throughput.

The key words here are “threading” and “throughput”, meaning that HT benefits multithreaded applications where lots of small requests are processed at the same time. OLTP databases are a prominent example.

But most of the systems I work with, almost all OLAP engines and even DWH-tuned DBMSes, actually suffer in this scenario.

For example, TM1 calculations are executed in a single thread, so if this thread is assigned half a CPU core, speed drops significantly. The same logic applies to the Essbase metadata update process, for example, or to Cognos Enterprise Planning job processing.

I’ve seen significant performance degradation (30%) in Cognos EP job processing due to HT context switching and have advised turning HT off on all Cognos EP job servers.

Encountering the same issue with TM1 recently left me puzzled, since there were plenty of idle CPUs on the server and there shouldn’t have been any thread switching. But we got a 2x+ speedup after flipping the switch anyhow )

 

I promised a bonus and here it goes:

When buying a TM1 server with Intel CPUs, buy way more cores than you’ll license. Physical cores, not logical.

The spare cores sit idle, which lets Intel Turbo Boost raise the clock of the busy ones, and it might give you another 30-40% speed boost. I haven’t tried that yet, and Wikipedia voices a bit of scepticism about the current boost-detection approach, but it’s definitely worth testing in a lab before you buy hardware. Cores are cheap compared to licence costs, so you may save yourself a ton of money by having a faster system. As in the Wikipedia example, a Core i7-2920XM can boost up to 3.5 GHz per core from a 2.5 GHz base speed. That’s a hefty speed-up for good old single-threaded TM1.
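The arithmetic behind that 30-40% figure is simple enough to check. A tiny sketch, using only the two clock speeds quoted for the Core i7-2920XM; the function name is mine, and the clock ratio is a best case for a single-threaded load, not a measured TM1 result.

```python
def turbo_speedup(base_ghz, turbo_ghz):
    """Best-case relative gain of a single thread running at turbo clock
    instead of base clock. Assumes speed scales linearly with frequency,
    which real workloads only approximate."""
    return turbo_ghz / base_ghz - 1.0


gain = turbo_speedup(base_ghz=2.5, turbo_ghz=3.5)
print(f"{gain:.0%}")
```

Anything memory-bound will see less than the clock ratio suggests, which is one more reason to test in a lab first.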

Cognos Enterprise Planning PAD contents, duplicate entries and deployment problems

Seems like I can rename this blog ‘Reminiscences’ any day now. After some years of not poking into Cognos EP, I was asked to look at a rather confusing error. Export deployment wouldn’t work, throwing a ‘null’ error while initialising the object selection dialogue. Yes, as these two IBM support articles (1,2) state, it’s a problem with duplicated objects in the PAD. But since this was the first time I had seen such a massive instance (80+ apps, hundreds of macros), I was interested in finding a way to automate the check. Although the number of EP users is decreasing, I hope that anybody facing this issue in the future will benefit from this post. And if you’re curious about how things are stored in the PAD, read on )

 

There are a few things that can be duplicated in the PAD, so I’ll describe each situation along with the code I wrote to check for duplicated items. Fortunately, the export dialogue shows what it’s checking at the point of failure, so you’ll know where to look.

Macros

All you need to know about macros is stored in the P_MACRO* tables. P_MACRO holds general macro info, including its name, which is the potential duplicate. EP developers must’ve overlooked the fact that the macro name is case-sensitive from the Administration Console’s point of view and case-insensitive from the export deployment wizard’s point of view. So if you have two macros, copy_sales and copy_Sales, you’ll find out about this duplication only when you try to export a deployment.

The code checking for such situations is very straightforward.

-- list macros whose names collide once case and surrounding spaces are ignored
SELECT * FROM p_macro
WHERE LOWER(RTRIM(LTRIM(macroname))) IN
  (
    SELECT LOWER(RTRIM(LTRIM(macroname))) FROM p_macro
    GROUP BY LOWER(RTRIM(LTRIM(macroname)))
    HAVING COUNT(*) > 1
  )
ORDER BY LOWER(RTRIM(LTRIM(macroname)));

I found some duplicated macros this way, but the problem persisted.

Macros are also recorded in the PAD XML (see below), but the table approach is way easier.

Applications

Applications pose a more interesting problem.

ADMINOPTION table

Most of the application details are stored in the ADMINOPTION table of each application’s schema (Oracle) or database (MS SQL Server). I wrote some time ago about updating JVM options for the Contributor client via this table. In my search for duplicates I wanted to know whether there were duplicate application ids or display names. So it made sense to query all ADMINOPTION tables at once.

Here’s a sample Oracle SQL script for such queries. It’s easy to replicate it for MS SQL Server (it seems I haven’t published such examples yet, mail me if you need them).

-- staging table collecting ADMINOPTION rows from every application schema
CREATE TABLE temp
(USR_NAME VARCHAR(50) NULL,
OPTIONID VARCHAR(250) NULL,
DESCRIPTION VARCHAR(512) NULL,
OPTIONVALUE VARCHAR(512) NULL);

-- copy ADMINOPTION from each schema that has one, tagging rows with the owner
BEGIN
  FOR usr IN (SELECT owner FROM all_tables
              WHERE table_name = 'ADMINOPTION')
  LOOP
    EXECUTE IMMEDIATE 'INSERT INTO temp (usr_name, optionid, description, optionvalue)
      SELECT ''' || usr.owner || ''' USR_NAME, OPTIONID, DESCRIPTION, OPTIONVALUE FROM ' || usr.owner || '.ADMINOPTION';
  END LOOP;
  COMMIT;  -- commit the inserted rows (the CREATE TABLE itself auto-commits)
END;
/

-- check for duplicate application_display_name
SELECT optionvalue, COUNT(*) FROM temp
WHERE optionid = 'APP_DISPLAY_NAME'
GROUP BY optionvalue
HAVING COUNT(*) > 1;

Strangely, this exercise did show me some duplicated names, but detaching those apps from the PAD didn’t help: the Export wizard was still complaining about duplicated apps.
So I started SQL profiling to see what was really happening under the hood and was impressed to find the PAD XML.

PAD xml

Turns out, all application details are stored in an XML document kept in the P_PAD table. This document is actually huge and messy, but it contains some fields that are not available in the ADMINOPTION table or in the Administration Console output. Deep inside it, applications are described like this:

<applications>
<application id="{94BF5F99-E6AF-4D9C-BA63-E2303055296B}" name="go_capex_contributor" 
objidref="obj_APPLICATION"><properties><property propidref="pr_DATASTORE_CON_DOC">
<![CDATA[<connection>   
...
  <applicationname>    go_capex_contributor   </applicationname>  
</connection>  ]]>
</property><property propidref="pr_DISPLAY_NAME"><![CDATA[go_capex_contributor]]></property>
<property propidref="pr_UNIQUE_ALIAS"><![CDATA[test]]></property>
...
</applications>

And it turns out that the ‘name’ property held duplicates in my case. Since manually extracting properties is a tad too tedious for me, I wrote a simple Python script that does exactly that. It’s not a complete solution, more of a sample approach. I’m using pyparsing, as usual.
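For a rough idea of the approach, here is a minimal sketch that is not the pyparsing script mentioned above: it swaps in the standard library’s xml.etree.ElementTree instead. The sample XML is a cut-down, made-up stand-in for the real PAD document (only the `application` element and its `id`, `name` and `objidref` attributes come from the fragment shown earlier), and comparing names case-insensitively is my assumption, borrowed from the macro case.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up, heavily simplified stand-in for the PAD XML; the real document
# is far bigger and carries nested <properties> with CDATA sections.
SAMPLE_PAD = """
<applications>
  <application id="{1}" name="go_capex_contributor" objidref="obj_APPLICATION"/>
  <application id="{2}" name="go_sales_contributor" objidref="obj_APPLICATION"/>
  <application id="{3}" name="GO_Capex_Contributor" objidref="obj_APPLICATION"/>
</applications>
"""


def duplicate_app_names(pad_xml):
    """Return application names that occur more than once, ignoring case
    and surrounding whitespace (an assumption, see the note above)."""
    root = ET.fromstring(pad_xml)
    names = [app.get("name", "").strip().lower()
             for app in root.iter("application")]
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


print(duplicate_app_names(SAMPLE_PAD))
```

In practice you would first pull the XML out of P_PAD and feed that string to the function instead of the sample.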

A cow jumped over the earth

Big news: we’ve relocated to Sydney, Australia (yep, the one with the Opera House, kangaroos and Foster’s).

And as of the 1st of November I’ll be joining a small but proud “bunch of guys that share the same passion and dedication for Cognos (and sometimes the same sick humour :))” ©, namely PMSquare.

There’ll be more upside-down stories here, I promise.

 

As they say here, “Cheers to you all” ;)

Cognos BI SDK revisited

After, oh my god, 4 years, I’ve spent a couple of days typing Cognos 10 SDK Java code. What has changed in BI automation during this period?

Well, the biggest change was trivial: I’d forgotten everything ;)

But that turned out to be good news, since the changes from version 8.3 to version 10 are huge. I remember reading the SDK Developer Guide overnight for 8.3, but now it’s 3,500+ pages long! As it turns out, you can still read all the needed stuff in a short time, but working out what to read is a challenge.

So it’s best to start with this proven practice article, which gives a kind of helicopter view of what can be achieved with which parts of the SDK.

Also take a look at this 4-part blog series by Peter Beck of the BI Professional site.

From there, I suggest you dive into the sdk\samples folder and look through the code provided there. However complicated it may look, it’s actually way more integrated than any Knowledge Base / Technote code sample you can download, since extending those turns into rewriting the built-in samples’ functionality. Trust me on this, I spent half a day doing exactly that before giving up and building upon the existing samples.

Where to look when you’re stuck trying to understand a piece of code or a function? I used the Javadoc-generated documentation instead of the Developer Guide, because it’s Java-specific. I used the Developer Guide just as an overall class & method reference.

As for packaging the samples’ content: the minimum required set for Content Store manipulation (add/remove groups, members etc.) is as follows (all paths are relative to sdk\java\):
Common\CRNConnect.java: connecting to Cognos services
Security\Logon.java: you’re not using Anonymous logon, are you?
HandlerCS\CSHandler.java: all Content Store manipulation wrappers

Some more SDK sample links:

Cognoise SDK for Cognos 8 forum

Cognoise SDK for Cognos 10 forum

IBM SDK-related articles and Technotes

 

PS: It’s funny, I’ve written both TM1 Java API code and BI 10 SDK code in the last three months ) Can’t think of any combined methods that’d be useful. Any ideas?

Using session variables in Cognos BI

Just a quick Cognos BI hint: you can use session variables to store project-level constant values.

I’m a big fan of ‘feature-rich’ ETL reports that show not only which dimension elements mismatch between systems, but also allow seamless editing of the element mapping. This usually means drilling down from the report into an external application for dimension mapping. Parameters are usually passed via URL (the easiest possible way).
After the server name and, therefore, the URL changed for a second time in the current project, I set up a project-level ‘servername’ constant to avoid an XML find/replace in each report.
It’s really easy:

1) Add a session variable to your Framework Manager project and write the required constant value there. Like ‘servername’ = ‘awesomebi’

2) Use it in Report Studio, just type

#sq($variable_name)#

in the expression editor. sq encloses the string in single quotes.

Cognos BI OLAP querying

Sometimes it’s best to be late in fulfilling promises. Had I written this post a month earlier, I wouldn’t have been able to give people using Cognos BI with Microsoft Analysis Services and Oracle Essbase a piece of life-saving advice. At least, advice that I’d have been really grateful for a couple of years ago.

 

A month ago I talked about how Oracle BI interacts with OLAP servers and promised a write-up on Cognos. Here it comes.

 

Let’s start with some obvious history. In the dinosaur era, Cognos became popular mostly due to its OLAP server, PowerPlay. And during the development of the unified reporting platform that we now know as Cognos 8 or 10, special attention was paid to integrating PowerPlay cubes.

 

In a nutshell, there are a few major architectural decisions:
1) There’s almost no metadata stored in Framework models for OLAP cubes. Really, it’s just a connection string: you will not see dimensions or measures when you plug an OLAP source into an FM model. Cube metadata is accessed by each studio/report at runtime (not strictly true, since there’s caching, but that’s insignificant in this context) and stored in the report definitions. So when you change the cube’s dimensional structure (add/remove levels or dimensions), there’s a possibility your reports will break. But contrary to Oracle BI, you’ll be fixing that in the reports themselves rather than in the Framework model (I’m not sure if that’s good news)
2) To make OLAP reports more durable, all references to data elements in reports are based on dimension Member Unique Names (MUNs), which can be made less prone to change than plain element names or paths. If you play your cards well, additive structural changes (adding a dimension, a hierarchy, or a level to an existing one) will not break any existing report.
3) Since PowerPlay was a desktop solution, it had an interesting approach to zero suppression: it was done on the client machine. When this approach migrated to the unified Cognos platform, the Cognos server became the client doing all the suppression (I ranted about this in this post). But I couldn’t understand why the same approach (server-side suppression) was used when connecting to Essbase or MS Analysis Services. It took me quite a while to build some workarounds in the project where I used Cognos BI with Essbase. But not anymore: there’s a solution (where was it 3 years ago, one wonders)

 

And there’s a whole new Dynamic Query Engine mode in Cognos 10 devoted to OLAP querying. It can be used with SAP BW (the main pain point, it seems), Essbase, TM1 and, interestingly, Netezza (see the software environments page). Dynamic Query mode provides 64-bit, in-memory, security-aware caching, null suppression and a nice-looking query analyzer.

 

As sometimes happens, I suggest you read the first 5 pages of the Dynamic Query Cookbook even if you’re not using Cognos 10, since they describe how the original query execution (meaning Cognos 8) works. It’s actually the first document I’ve seen that clearly states that temporary micro-cubes are created for each DMR request (these are the dmc files appearing in the temp folder).

BedRockTM1: code samples and methodology

A new TM1 resource just popped up on my radar. Take a close look at bedrocktm1.org.

The overall approach excites the coder part of me: modular development, agile, best-practice source code, all of this is actually very helpful. Having so many Turbo Integrator how-tos solves a lot of ‘reinventing the wheel’ cases.

As far as I understand, the bulk of this project’s code is developed by CubeWise. Hats off to them for being so open and willing to push the whole TM1 community forward.

It’ll be interesting to see whether this site gains momentum: whether there’ll be non-CubeWise committers, how adoption will go, and so on.

I would be proud to contribute some code there someday, if I’m doing TM1-related stuff. Maybe they will accept my small utilities for a start...