Hello Everyone,
I have a requirement to build a cube with distinct customer count that is dynamic across time dimension. I am using FM to generate IQDs as datasources for Transformer. Here is the sample data
Cust ID TransactionID TransactionDate
-------- ------------- -----------------
101 1A 01-01-2007
102 1B 01-31-2007
101 1C 02-20-2007
101 2A 01-31-2008
103 2B 02-01-2008
I have built a time dimension on Transaction Date column. And I need to show the distinct cust. count by this dimension. So Overall cust. count should be 3 and drill down on time dimension by year should show year 2 for 2007 and for 2 2008.
I cannot use running count and set the minimum of running count to 1 as the subset on which the running count needs to be applied changes with the level on the time dimension.
Also, category count is ruled out as I have over 50 million customers. External rollups are not feasible as there a quite a few dimensions to rollup.
I have posted this topic under Cognos 8 -> BI modelling (Transformer) as well.
Any inputs you have are greatly appreciated!!!
Thanks in advance!
I am still stuck with this issue and any suggestions you have are much appreciated. Thanks!
Hi,
As far as I know this is old dilema for transformer cubes :)
As you said,
Category count measure will not work for anything exceding 10k-15k of categories. External rollup will work but number of queries required to prepare the data grows exponentially by number of dimensions and levels.
The only options I can think of:
- Go for external rollup but talk to the user and limit the scope of allocation to 3-4 dimensions, maybe 7-8 levels. You still may have all your dimensions for other mesures but your count distinct will be just unallocated in them (you will still allocate this measure for top level in all dimensions with one single query).
- validate your requirements. Maybe they can be satisfied by number of simple reports.
- move to dimensionally modeled relational data in FM in v8. But you need to be carefull with this one, especially with your volume of data.
I would also hear some opinions on this subject...
Regards,
Category count measure will not work for anything exceding 10k-15k of categories
About 8 years ago I built a PowerPlay Transformer cube containing 1 million customers (16 million rows of data in fact table). The hardware was a workstation Pentium-2, 256 Mb RAM. Category count worked well, but not fast. The first query took about 7 minutes, but after drilling down the response time became better, a few seconds per query.
50 million customers is a great challenge for category counting, but it would be interesting to test Cognos Transformer on more powerful hardware than Pentium-2 ;)
Can I ask more details? Like how big was this cube in terms of dimensions/file size?
What was loading time?
When you said "after drilling down", what were you drill down on? Customer dimension?
I remember my trials with category count and it worked well (means retrieving data in seconds) only for small cubes build for marketing campaign efficiency for specific, small customer segments. Maybe I should give it another chance...
Can I ask more details? Like how big was this cube in terms of dimensions/file size?
What was loading time?
When you said "after drilling down", what were you drill down on? Customer dimension?
There were 10 dimensions and 15 measures in that cube. The Customer dimension had 3 geographical levels. The size of .MDC file, as I remember, was between 1GB and 2GB. The loading time was about 10 hours.
Drilling down - on both Customer dimension and other dimensions.
ok, thank you, that's relatively big cube than...
7 minutes to start with is reasonable given the circumstances, however most clients would complain.
7 minutes to start with is reasonable given the circumstances, however most clients would complain.
I suppose it would take less time than 7 minutes if this cube was placed on powerful hardware (more powerful than a workstation Pentium-2 with 256 MB RAM).
The lion share of projects I take part usually include not more than 50 thousands of categories for category count, and it takes a few/several seconds to calculate distinct counts in Cognos BI reports.
Since 64-bit architecture became compatible with Cognos BI, I will find the possibility to conduct some big tests with category counting...
I have a similar question but it is regarding the accomodation of about 800K categories in a unique calculated column (as a dimension), knowing the limitations of max categories, is there a way i can have all these 800K categories counted through category count measure in transformer.
Please give some detailed steps.