Archive for the ‘open source’ Category


Ways of calculating Running Count to import data into Multi-Line models

I've nicked the term "Multi-Line model" from some Cognos best practices presentations. I never knew it was called that :)

Multi-Line is the only way to go when you have a potentially huge dimension, only a tiny part of which should be available to an end user at a time. Take employee planning: the whole dimension has 10k people, but there are strictly fewer than 100 in any department. So you create a fake dimension of 1..100 and add an Employee name column, shrinking the cube volume by a factor of 100. Access tables plus cut-down might help, but it's sometimes better to let people pick any employee, client or product, given that there won't be more than a fixed number of rows.

The main problem with this modeling technique arises when it's time to import data into such a cube. That data usually doesn't have a 1..100 running count attached, so it's your task to add it.

In this post I'll sum up the ideas on how to calculate a running count split by elist (that's the usual case, ain't it?).

As an example, I'll use this simple table:

dept emp salary
Finance Pete 100
Finance Ann 200
Finance Jo 300
HR Nick 100
HR Sam 200

And you have departments as an elist, so you have to number the rows so that numbering restarts for each dept. Numbering requires an order, so let's use alphabetical order of employee names.

So this is what we want to get:

dept emp salary running_count
Finance Ann 200 1
Finance Jo 300 2
Finance Pete 100 3
HR Nick 100 1
HR Sam 200 2
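
Just to make the target concrete, here is a minimal Python sketch of one way to get that numbering. It is only an illustration, not necessarily one of the approaches from the full post; the `rows` list and the `add_running_count` name are made up for this example.

```python
from itertools import groupby
from operator import itemgetter

# The sample table from above as (dept, emp, salary) tuples.
rows = [
    ("Finance", "Pete", 100),
    ("Finance", "Ann", 200),
    ("Finance", "Jo", 300),
    ("HR", "Nick", 100),
    ("HR", "Sam", 200),
]


def add_running_count(rows):
    """Sort by (dept, emp) and append a counter that restarts for each dept."""
    ordered = sorted(rows, key=itemgetter(0, 1))
    numbered = []
    # groupby needs sorted input: each group is one department.
    for _dept, group in groupby(ordered, key=itemgetter(0)):
        for count, (dept, emp, salary) in enumerate(group, start=1):
            numbered.append((dept, emp, salary, count))
    return numbered


result = add_running_count(rows)
for row in result:
    print(*row)
```

Sorting first and then restarting the counter whenever the dept changes is the whole trick; the same idea carries over to whatever tool does the actual import.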


Open Source BPMS

Intalio makes an Open Source Business Process Management System (BPM on its own is Business Performance Management; add an S and you get a completely different product). It seems that the OS community lacks only an OLAP server with breakback (write-back, back-solve, call it whatever you like) to get a planning/budgeting system. Mondrian cannot do that, as far as I know.
By analogy with the MS PerformancePoint solution, component by component (just an example):

Component              MS                  Open Source Analogue
Reporting              ProClarity          Pentaho (there's a lot more, I know)
OLAP Analysis          MS AS               Mondrian (I don't think Palo is ready for multi-user environments)
Workflow               BizTalk             Intalio
ETL                    SSIS                Kettle
BreakBack Calculation  WriteBack in MS AS  ???
Database               MS SQL Server       MySQL
OS                     Windows             Linux

Just a sketch, of course.

Atlassian

Just wanted to say that those guys rock. I mean it: their products always make me feel sorry for implementing something else. Stable as hell (stopping only for updates) and working on slow PCs (a usual desktop).
We've been using Jira for about 3 years now, and it's just the way it should be. I recently discovered Jelly, so now all that automatic answering and status-changing business is rolling.
And Confluence is simply my love. The way it should be. The way everything should be. We've got 3+ GB of info in it covering all Cognos-related issues (and non-Cognos ones too), we document projects in it, and we use it as the department site. I'd be using it instead of Wordpress for this blog (I really prefer wiki markup to AJAX WYSIWYG) and everything else, but the hosting costs three times as much.

I've just put a Russian translation of Confluence up on Atlassian; hopefully it will help someone.

Cognos EP log analysis

One of the main things I carried out of my alma mater is a passion for data. When you've got data, so many interesting things can be done ;) . Various analyses, comparisons, stats - just give me the data.

That was a rather pathetic preamble. Let's get straight to business.

Problem.
On a recent project I was doing some technical support (dllhost problems, if you recall, and more). It is a rather mature installation (about 8 months in production) with quite a number of servers, so the logs contained around 100 MB of data. And the common question was -- ok, this error, did it appear before? When? Accompanied by what errors? What's the overall trend? Common patterns? Charts?
All those analytical questions, posed about EP error logs.

As we all know, Cognos Contributor errors are recorded in PlanningErrorLog.csv files. Those log files contain rather detailed information, including EP version, module, time and, of course, error description.

I tried to find a complete solution, but as usual on the Microsoft platform, there are only paid tools, and even those don't solve the task completely.

Solution.

Having the whole of Cognos 8 BI at my fingertips, I thought it would be nice to have all those errors in a PowerPlay cube. I'm an OLAP guy, after all.

So, technically the task was divided into 3 parts:

  1. Gathering the logs
  2. Forming a datasource for Transformer
  3. Creating a Transformer model and cube

Step 1 is solved by a .bat file -- I'll post it separately as "Backing up with timestamping".
Step 3 is quite straightforward if you have a single file containing the errors from multiple ones, except for the time dimension, as usual.
More details on forming a datasource follow. Got no time? Skip to this scheme for the overall picture.
Merging the logs.
At first I thought it was a rather simple task, since the source files are csv (comma-separated) and they just need to be merged n to 1 with some additional transformations (adding top-level error categories based on the error description, for example).
It's never easy, I must admit.
Well, csv is comma-separated for everyone but Cognos: PlanningErrorLogs are tab-separated. That's not a problem, let's convert n tab-separated files to csv.
Planning logs contain a wonderful pack of unprintable chars (meant for easy opening in Excel, I hope, because there is no other reason for them). Among those chars are some kind of "EOF" chars, for example (I can only see their hex codes anyway), so VBScript cannot parse those files line by line correctly. One option is to open the file in Excel and save it as a "normal csv", but that's impossible with the 60 MB log file I've got here.
For the sake of my nerves and Internet space I won't describe all the other problems, like parsing returned sql statements in error descriptions (those contain tabs and ";" in the same line).

Since I like Python a lot, the final script is a .py. It takes a directory containing logs, merges them into 1 file, and adds timesort values and error categories.
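
To give a feel for the merging step, here is a rough Python sketch. It is not the actual script: the column count, the assumption that the error description is the last column, the `*.csv` file pattern and the keyword-to-category table are all made up for illustration, and the timesort values are left out because they depend on the real timestamp format.

```python
import csv
import pathlib
import re

# Control characters other than tab, CR and LF (this includes 0x1A, the DOS "EOF" byte).
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")

# Hypothetical keyword -> category table; a real one has to be built from real logs.
CATEGORIES = {"timeout": "Network", "dllhost": "COM", "sql": "Database"}


def clean_line(raw: bytes) -> str:
    """Decode one raw line and drop unprintable characters."""
    text = raw.decode("latin-1")  # latin-1 never fails on arbitrary bytes
    return CONTROL_CHARS.sub("", text).rstrip("\r\n")


def categorize(description: str) -> str:
    """Pick a top-level category from keywords in the error description."""
    lowered = description.lower()
    for keyword, category in CATEGORIES.items():
        if keyword in lowered:
            return category
    return "Other"


def read_sources(directory):
    """Yield (file name, raw bytes) for every log in the directory (assumed *.csv)."""
    for path in sorted(pathlib.Path(directory).glob("*.csv")):
        yield path.name, path.read_bytes()


def merge_logs(sources, out, num_columns):
    """Write all sources into one comma-separated file with a category column.

    Assumes the description is the LAST of num_columns tab-separated columns,
    so any extra tabs (e.g. inside SQL text) stay within the description.
    """
    writer = csv.writer(out)
    for name, raw in sources:
        for line in raw.split(b"\n"):
            cleaned = clean_line(line)
            if not cleaned:
                continue
            fields = cleaned.split("\t", maxsplit=num_columns - 1)
            writer.writerow([name] + fields + [categorize(fields[-1])])
```

Usage would be something like `merge_logs(read_sources("logs"), open("merged.csv", "w", newline=""), num_columns=5)`, where the directory name and the column count are placeholders. Reading the files as bytes and splitting on newlines by hand means no byte gets treated as end-of-file along the way.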

So that's the final scheme.

We're adding client servers to this sending net and plan to use the 8 BI Event Notifier as a technical support catalyst.

-----
I'm eager to give out the scripts for the same reasons I do this blog (plus a dim hope that overall Cognos support will get better), so if you're interested - mail bark-bark ykud.com. A rather good error categorization can be created with some joint effort, imho.

Pentaho Data Mining

The acquisition trend affects the open-source world as well -- Pentaho "acquired" Weka, a data-mining toolset, to put another brick in their all-covering BI wall.
Weka is a new name to me, so I'll toy with it for a while. I just need a sample dataset to get started. The only one at hand is an aggregated list of client site EP errors (the ever-famous PlanningErrorLog.csv). Around 100 MB of raw data, nearly 250k records -- maybe there is something in there we can't figure out without data mining. Except, of course, the fact that Cognos EP is buggy. :)

Firefox

I like FireFox a lot, and was always disappointed when I had to switch to IE to view Contributor applications.

Recent versions of the IETab addon let you open tabs that use IE as the renderer and, moreover, start contrib apps in FireFox tabs.

So I don't have to switch browser anymore. Hoooray?

Back from vacation. Digging through piled-up mail and feeds.

Adaptive Planning - the first open-source planning application? (It's just a less-featured version, but that's a start.)

Compiere + some OS BI (including planning) could be a solution in the future.

Open Source BI.

They are really baking them like pancakes, I must admit:
Pentaho - http://www.pentaho.org
GreenPlum - http://www.greenplum.com
Jaspersoft - www.jaspersoft.com
Eclipse/BIRT - http://www.eclipse.org/birt/
Jpivot - http://jpivot.sourceforge.net/
The Bee Project - www.bee.insightstrategy.cz/en/index.html
OpenI - www.openi.org
PALO - www.palo.net
SQL Summit - www.sqlsummit.com/trends/OpenSourceBI.htm
Breadboard BI - www.breadboardbi.com
Spago BI - http://spagobi.objectweb.org/
Got the list here.
