If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Distinct customer count in Transformer

Started by satyagorantla, 12 Mar 2008 09:27:57 AM

Previous topic - Next topic

satyagorantla

Hello Everyone,

I have a requirement to build a cube with distinct customer count that is dynamic across time dimension. I am using FM to generate IQDs as datasources for Transformer. Here is the sample data

Cust ID          TransactionID       TransactionDate
--------         -------------       -----------------
101                 1A                     01-01-2007
102                 1B                     01-31-2007
101                 1C                     02-20-2007
101                 2A                     01-31-2008
103                 2B                     02-01-2008

I have built a time dimension on Transaction Date column. And I need to show the distinct cust. count by this dimension. So Overall cust. count should be 3 and drill down on time dimension by year should show year 2 for 2007 and for 2 2008.

I cannot use running count and set the minimum of running count to 1 as the subset on which the running count needs to be applied changes with the level on the time dimension.

Also, category count is ruled out as I have over 50 million customers. External rollups are not feasible as there a quite a few dimensions to rollup.

I have posted this topic under Cognos 8 -> BI modelling (Transformer) as well.

Any inputs you have are greatly appreciated!!!

Thanks in advance!

satyagorantla

I am still stuck with this issue and any suggestions you have are much appreciated. Thanks!

seweryng

Hi,

As far as I know this is old dilema for transformer cubes :)

As you said,
Category count measure will not work for anything exceding 10k-15k of categories. External rollup will work but number of queries required to prepare the data grows exponentially by number of dimensions and levels.

The only options I can think of:
- Go for external rollup but talk to the user and limit the scope of allocation to 3-4 dimensions, maybe 7-8 levels. You still may have all your dimensions for other mesures but your count distinct will be just unallocated in them (you will still allocate this measure for top level in all dimensions with one single query).
- validate your requirements. Maybe they can be satisfied by number of simple reports.
- move to dimensionally modeled relational data in FM in v8. But you need to be carefull with this one, especially with your volume of data.

I would also hear some opinions on this subject...

Regards,

Jurii

Category count measure will not work for anything exceding 10k-15k of categories

About 8 years ago I built a PowerPlay Transformer cube containing 1 million customers (16 million rows of data in fact table). The hardware was a workstation Pentium-2, 256 Mb RAM. Category count worked well, but not fast. The first query took about 7 minutes, but after drilling down the response time became better, a few seconds per query.

50 million customers is a great challenge for category counting, but it would be interesting to test Cognos Transformer on more powerful hardware than Pentium-2 ;)

seweryng


Can I ask more details? Like how big was this cube in terms of dimensions/file size?
What was loading time?

When you said "after drilling down", what were you drill down on? Customer dimension?

I remember my trials with category count and it worked well (means retrieving data in seconds) only for small cubes build for marketing campaign efficiency for specific, small customer segments. Maybe I should give it another chance...

Jurii

Can I ask more details? Like how big was this cube in terms of dimensions/file size?
What was loading time?
When you said "after drilling down", what were you drill down on? Customer dimension?


There were 10 dimensions and 15 measures in that cube. The Customer dimension had 3 geographical levels. The size of .MDC file, as I remember, was between 1GB and 2GB. The loading time was about 10 hours.
Drilling down - on both Customer dimension and other dimensions.

seweryng

ok, thank you, that's relatively big cube than...
7 minutes to start with is reasonable given the circumstances, however most clients would complain.

Jurii

7 minutes to start with is reasonable given the circumstances, however most clients would complain.

I suppose it would take less time than 7 minutes if this cube was placed on powerful hardware (more powerful than a workstation Pentium-2 with 256 MB RAM).
The lion share of projects I take part usually include not more than 50 thousands of categories for category count, and it takes a few/several seconds to calculate distinct counts in Cognos BI reports.
Since 64-bit architecture became compatible with Cognos BI, I will find the possibility to conduct some big tests with category counting...

ms_anna

I have a similar question but it is regarding the accomodation of about 800K categories in a unique calculated column (as a dimension), knowing the limitations of max categories, is there a way i can have all these 800K categories counted through category count measure in transformer.

Please give some detailed steps.