COGNOiSe.com - The IBM Cognos Community

Legacy Business Intelligence => COGNOS PowerPlay => Topic started by: satyagorantla on 12 Mar 2008 09:27:57 AM

Title: Distinct customer count in Transformer
Post by: satyagorantla on 12 Mar 2008 09:27:57 AM
Hello Everyone,

I have a requirement to build a cube with distinct customer count that is dynamic across time dimension. I am using FM to generate IQDs as datasources for Transformer. Here is the sample data

Cust ID          TransactionID       TransactionDate
--------         -------------       -----------------
101                 1A                     01-01-2007
102                 1B                     01-31-2007
101                 1C                     02-20-2007
101                 2A                     01-31-2008
103                 2B                     02-01-2008

I have built a time dimension on Transaction Date column. And I need to show the distinct cust. count by this dimension. So Overall cust. count should be 3 and drill down on time dimension by year should show year 2 for 2007 and for 2 2008.

I cannot use running count and set the minimum of running count to 1 as the subset on which the running count needs to be applied changes with the level on the time dimension.

Also, category count is ruled out as I have over 50 million customers. External rollups are not feasible as there a quite a few dimensions to rollup.

I have posted this topic under Cognos 8 -> BI modelling (Transformer) as well.

Any inputs you have are greatly appreciated!!!

Thanks in advance!
Title: Re: Distinct customer count in Transformer
Post by: satyagorantla on 13 Mar 2008 01:58:17 PM
I am still stuck with this issue and any suggestions you have are much appreciated. Thanks!
Title: Re: Distinct customer count in Transformer
Post by: seweryng on 08 Feb 2009 05:52:46 AM
Hi,

As far as I know this is old dilema for transformer cubes :)

As you said,
Category count measure will not work for anything exceding 10k-15k of categories. External rollup will work but number of queries required to prepare the data grows exponentially by number of dimensions and levels.

The only options I can think of:
- Go for external rollup but talk to the user and limit the scope of allocation to 3-4 dimensions, maybe 7-8 levels. You still may have all your dimensions for other mesures but your count distinct will be just unallocated in them (you will still allocate this measure for top level in all dimensions with one single query).
- validate your requirements. Maybe they can be satisfied by number of simple reports.
- move to dimensionally modeled relational data in FM in v8. But you need to be carefull with this one, especially with your volume of data.

I would also hear some opinions on this subject...

Regards,
Title: Re: Distinct customer count in Transformer
Post by: Jurii on 10 Feb 2009 12:02:00 PM
Category count measure will not work for anything exceding 10k-15k of categories

About 8 years ago I built a PowerPlay Transformer cube containing 1 million customers (16 million rows of data in fact table). The hardware was a workstation Pentium-2, 256 Mb RAM. Category count worked well, but not fast. The first query took about 7 minutes, but after drilling down the response time became better, a few seconds per query.

50 million customers is a great challenge for category counting, but it would be interesting to test Cognos Transformer on more powerful hardware than Pentium-2 ;)
Title: Re: Distinct customer count in Transformer
Post by: seweryng on 10 Feb 2009 07:00:59 PM

Can I ask more details? Like how big was this cube in terms of dimensions/file size?
What was loading time?

When you said "after drilling down", what were you drill down on? Customer dimension?

I remember my trials with category count and it worked well (means retrieving data in seconds) only for small cubes build for marketing campaign efficiency for specific, small customer segments. Maybe I should give it another chance...
Title: Re: Distinct customer count in Transformer
Post by: Jurii on 10 Feb 2009 10:14:36 PM
Can I ask more details? Like how big was this cube in terms of dimensions/file size?
What was loading time?
When you said "after drilling down", what were you drill down on? Customer dimension?


There were 10 dimensions and 15 measures in that cube. The Customer dimension had 3 geographical levels. The size of .MDC file, as I remember, was between 1GB and 2GB. The loading time was about 10 hours.
Drilling down - on both Customer dimension and other dimensions.
Title: Re: Distinct customer count in Transformer
Post by: seweryng on 10 Feb 2009 11:39:16 PM
ok, thank you, that's relatively big cube than...
7 minutes to start with is reasonable given the circumstances, however most clients would complain.
Title: Re: Distinct customer count in Transformer
Post by: Jurii on 11 Feb 2009 01:43:46 AM
7 minutes to start with is reasonable given the circumstances, however most clients would complain.

I suppose it would take less time than 7 minutes if this cube was placed on powerful hardware (more powerful than a workstation Pentium-2 with 256 MB RAM).
The lion share of projects I take part usually include not more than 50 thousands of categories for category count, and it takes a few/several seconds to calculate distinct counts in Cognos BI reports.
Since 64-bit architecture became compatible with Cognos BI, I will find the possibility to conduct some big tests with category counting...
Title: Re: Distinct customer count in Transformer
Post by: ms_anna on 01 May 2009 08:18:12 PM
I have a similar question but it is regarding the accomodation of about 800K categories in a unique calculated column (as a dimension), knowing the limitations of max categories, is there a way i can have all these 800K categories counted through category count measure in transformer.

Please give some detailed steps.