
How have you handled huge data

Started by cognostechie, 03 May 2013 02:39:11 AM


cognostechie

Not sure where to post this so posting it here.

I am curious to know what the largest data size is that people here have handled, and how they handled it.

So, let's say you have 10 fact tables, the smallest of them having 70 million records. How did you use that in Cognos? Did you just model it the normal way in Framework Manager, or did you build a Transformer/TM1 cube, create summary tables, or take some other approach?


CognosPaul

One of my clients builds probes for telecoms companies. Each probe retrieves different CDRs and populates different tables. The tables themselves are fairly wide; a few of them have upwards of 200 fields. The expected volume of CDRs varies between the telecoms companies, anywhere from 50,000 to 200,000 rows per second.

Because of the volume of data, the detailed data is only stored for 48 hours, and various aggregate tables handle the historical information. The aggregate tables are updated periodically, each at its own aggregation interval.
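
To give a rough idea of the pattern (a simplified sketch, not our actual load job; the names CDR_DETAIL, CDR_AGG_HOURLY, call_start, probe_id and duration_ms are made up, and the date functions are Oracle-style):

-- Roll the latest completed hours of detail into the hourly aggregate before the
-- detail rows age out; :last_loaded_interval marks where the previous run stopped.
insert into CDR_AGG_HOURLY (interval_start, probe_id, call_count, total_duration_ms)
select trunc(call_start, 'HH'), probe_id, count(*), sum(duration_ms)
from CDR_DETAIL
where call_start >= :last_loaded_interval
  and call_start < trunc(sysdate, 'HH')
group by trunc(call_start, 'HH'), probe_id;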

Obviously this presents a few interesting challenges. Some reports need to contain unions of the different technologies, the probes load the tables with nulls, and the vast volumes of data can cause reports to run slowly.

Most reports need to be filtered by one of the agg tables, but fortunately they never need to mix data between the different aggregations. This allows us to use a macro to decide which table to use: select * from [DS].table#prompt('AggType','token')# (This is a simplification; using macros in the data layer forces Cognos to use the governor "Allow enhanced model portability at runtime", which is problematic on some RDBMSs.) Users can choose which aggregation to use via a prompt on the prompt page, or we can create a hidden prompt with a default value if the report needs to run against a specific aggregation.
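
For anyone who hasn't used the prompt macro this way, here is a slightly fuller sketch of the same idea (the aggregate table name AGG_DAILY is made up):

-- Data-layer query subject in FM; the token prompt supplies the table name at run time.
-- The third argument to prompt() is a default, which is what lets a hidden prompt
-- lock a report to one specific aggregation.
select *
from [DS].#prompt('AggType', 'token', 'AGG_DAILY')#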

Partitions and indexes are supremely important - any query that does a full table scan is rejected, so the model is built very very carefully.
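
To illustrate what that looks like on the database side (Oracle-style DDL with hypothetical names; the real CDR tables are of course far wider):

-- Range-partition the detail table by call time so queries prune to a handful of
-- partitions instead of scanning the whole table.
create table CDR_DETAIL (
    call_start   date   not null,
    probe_id     number not null,
    duration_ms  number
    -- ...plus the rest of the CDR columns
)
partition by range (call_start) (
    partition p20130501 values less than (to_date('2013-05-02', 'YYYY-MM-DD')),
    partition p20130502 values less than (to_date('2013-05-03', 'YYYY-MM-DD'))
    -- ...one partition per day, added by the load process
);

-- Local index so lookups by probe stay within the pruned partitions.
create index ix_cdr_detail_probe on CDR_DETAIL (probe_id) local;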

They are in the process of upgrading to 10.2, so we haven't had a chance to try dynamic cubes. I suspect they won't work for us, as most of the reports use more than 12 attributes.

cognostechie

Thanks Paul.

The biggest data I worked with was for a dairy products client who was adding 80,000 records to the fact table every day. That was about 4 years ago in Chicago. They had 30 product lines, and we had split the data into different databases, one for each product line. They had decided not to use cubes for reasons best known to them, so the only option was to use FM. The data was then integrated using Cognos macros in FM, depending on what the user chose. FM had an extra namespace called 'Model Layer', which was the integration layer between the Database Layer and the Business Layer. A third-party tool called LoadRunner was used to deal with performance issues.
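
The macro trick was basically the same idea Paul describes, just pointed at per-product-line databases instead of aggregate tables. Roughly like this (the data source, schema and table names here are invented, not the client's actual ones):

-- Data-layer query subject: the prompt picks the product line's schema, and the
-- 'Model Layer' namespace then presents one integrated view to the Business Layer.
select *
from [DairyDS].#prompt('ProductLine', 'token', 'PRODLINE_01')#.SALES_FACT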

Anyway, it is quite surprising that in the most widely used Cognos user forum, only a couple of people have exposure to big data.

bdbits

With all the hype these days on "big data", I've often wondered how large the marketplace really is for truly large datasets.

We have one warehouse with hundreds of millions of rows in some fact tables and tens of millions in others, but many are smaller. For the larger ones we build FM models, but we also build cubes for certain frequently used subject areas. Most of the reports end up built off the cubes for sheer speed, but the users do prefer FM packages for some things. If you tune the database, it can perform well enough.

But I have never had anything on the scale PaulM is talking about. Yikes! It would actually be pretty cool to work on something like that.