If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Count distinct performance

Started by mrcool, 09 Oct 2010 08:36:56 AM

Previous topic - Next topic

mrcool

Hi All,

I am facing performance problem in Analysis with the usage of Count distinct as regular aggregate property in FM. Cognos is writing weird queries when I use count distinct as regular property. I tried implementing count distinct inside the expression and setting the regular aggregate as calculated and Automatic but it didn't work as the totals in analysis studio are getting summed up instead of count distinct.
Did any one face a similar issue and found the resolution please post it here.

Thanks,
Mrcool

Alp

I have just tried Count distinct via report studio. The SQL looks normal.

Do you see performance difference between SQL if you write it yourself and the version from Cognos?

Try testing count distinct on a simple query subject, may be your problem is result of some model complications?

- Alp

mrcool

Hi ALP Thanx for your reply. Yes I am seeing performance difference.

Here is the scenario I am trying in analysis studio.

My requirement is to count distinct billing Account numbers for a period.

In Summary
Date          Count(distinct Billing Account)
2009/2010    100
Total 2009/2010 100

After drill down to Month level where distinct billing account numbers fall between months:
Date  count(distinct BA)
JAN       25
FEB       30
MAR       25
APR       25
Total     105(expected 100)

Here the expected total is 100 which I am able to achieve by applying Regular aggregate as Count distinct.
By adding count distinct inside the expression am getting the total as 105.

Thanks,
Mrcool

Alp

Unfortunately, I do not know how to workaround the problem in analysis studio. In RS, you could possibly get the result in 2 separate queries.

- Alp

mrcool

Thanx Alp...In report studio you can achieve this in single query by using for clause.
Hi Guru's please post your thought's...
This is how cognos is generating query when count distinct is applied:
SELECT   "T2"."C0" (NAMED "Yearkey" ) , "T2"."C2" (NAMED "Year0" ) ,
      "T2"."C1" (NAMED "Monthkey" ) , "T2"."C3" (NAMED "Month0" ) ,
      "T1"."C2" (NAMED "Billing_Account" ) , "T0"."C1" (NAMED "Billing_Account1" )
FROM   (
SELECT   "Approved_Calendar"."Cstm_Year_Id" (NAMED "C0" ) , COUNT ( DISTINCT "Compensation_Payable_Fact"."BAccnt_Num" ) (NAMED "C1" )
FROM   "EDWIT1E_VW40_DLR_COMP"."Cstm_Clndr_Dt" "Approved_Calendar" ,
      "EDWIT1E_VW40_DLR_COMP"."COMP_PAYABLE_FACT" "Compensation_Payable_Fact"
WHERE   "Approved_Calendar"."Clndr_Dt" = "Compensation_Payable_Fact"."Comp_Payable_Apprd_Dt"
   AND   "Approved_Calendar"."Cstm_Clndr_Tp_Id" = "Compensation_Payable_Fact"."Cstm_Week_Tp_Id"
   AND   "Approved_Calendar"."Cstm_Year_Id" = 2010
GROUP   BY "Approved_Calendar"."Cstm_Year_Id" ) "T0" ,

(
SELECT   "Approved_Calendar"."Cstm_Year_Id" (NAMED "C0" ) , "Approved_Calendar"."Cstm_Month_Id" (NAMED "C1" ) ,
      COUNT ( DISTINCT "Compensation_Payable_Fact"."BAccnt_Num" ) (NAMED "C2" )
FROM   "EDWIT1E_VW40_DLR_COMP"."Cstm_Clndr_Dt" "Approved_Calendar" ,
      "EDWIT1E_VW40_DLR_COMP"."COMP_PAYABLE_FACT" "Compensation_Payable_Fact"
WHERE   "Approved_Calendar"."Clndr_Dt" = "Compensation_Payable_Fact"."Comp_Payable_Apprd_Dt"
   AND   "Approved_Calendar"."Cstm_Clndr_Tp_Id" = "Compensation_Payable_Fact"."Cstm_Week_Tp_Id"
   AND   "Approved_Calendar"."Cstm_Year_Id" = 2010
GROUP   BY "Approved_Calendar"."Cstm_Year_Id" , "Approved_Calendar"."Cstm_Month_Id" ) "T1" ,
   
      (
SELECT   "Approved_Calendar"."Cstm_Year_Id" (NAMED "C0" ) , "Approved_Calendar"."Cstm_Month_Id" (NAMED "C1" ) ,
      "Approved_Calendar"."Cstm_Year_Name" (NAMED "C2" ) , "Approved_Calendar"."Cstm_Month_Name" (NAMED "C3" )
FROM   "EDWIT1E_VW40_DLR_COMP"."Cstm_Clndr_Dt" "Approved_Calendar" ,
      "EDWIT1E_VW40_DLR_COMP"."COMP_PAYABLE_FACT" "Compensation_Payable_Fact"
WHERE   "Approved_Calendar"."Clndr_Dt" = "Compensation_Payable_Fact"."Comp_Payable_Apprd_Dt"
   AND   "Approved_Calendar"."Cstm_Clndr_Tp_Id" = "Compensation_Payable_Fact"."Cstm_Week_Tp_Id"
   AND   "Approved_Calendar"."Cstm_Year_Id" = 2010
GROUP   BY "Approved_Calendar"."Cstm_Year_Id" , "Approved_Calendar"."Cstm_Month_Id" ,
      "Approved_Calendar"."Cstm_Year_Name" , "Approved_Calendar"."Cstm_Month_Name" ) "T2"
WHERE   ( "T2"."C0" = "T0"."C0"
   OR   "T2"."C0" IS NULL
   AND   "T0"."C0" IS NULL )
   AND   ( "T2"."C0" = "T1"."C0"
   OR   "T2"."C0" IS NULL
   AND   "T1"."C0" IS NULL )
   AND   ( "T2"."C1" = "T1"."C1"
   OR   "T2"."C1" IS NULL
   AND   "T1"."C1" IS NULL )

Thanks,
Mrcool

blom0344

I would check:


SELECT   "Approved_Calendar"."Cstm_Year_Id" (NAMED "C0" ) , "Approved_Calendar"."Cstm_Month_Id" (NAMED "C1" ) ,
      "Approved_Calendar"."Cstm_Year_Name" (NAMED "C2" ) , "Approved_Calendar"."Cstm_Month_Name" (NAMED "C3" )
FROM   "EDWIT1E_VW40_DLR_COMP"."Cstm_Clndr_Dt" "Approved_Calendar" ,
      "EDWIT1E_VW40_DLR_COMP"."COMP_PAYABLE_FACT" "Compensation_Payable_Fact"
WHERE   "Approved_Calendar"."Clndr_Dt" = "Compensation_Payable_Fact"."Comp_Payable_Apprd_Dt"
   AND   "Approved_Calendar"."Cstm_Clndr_Tp_Id" = "Compensation_Payable_Fact"."Cstm_Week_Tp_Id"
   AND   "Approved_Calendar"."Cstm_Year_Id" = 2010
GROUP   BY "Approved_Calendar"."Cstm_Year_Id" , "Approved_Calendar"."Cstm_Month_Id" ,
      "Approved_Calendar"."Cstm_Year_Name" , "Approved_Calendar"."Cstm_Month_Name"




Any chance this is returning  multiple rows for a given month?

mrcool

Hi Bloom,

No It is retrieving single row per month. I think it is writing one more sub query to fetch count distinct in the total.
But my doubt is why cognos is writing 3 subqueries as 2 queries should be sufficient to achieve the desired output.

Thanks,
Mrcool

IceTea

Quote from: mrcool on 10 Oct 2010 03:10:32 AM
In Summary
Date          Count(distinct Billing Account)
2009/2010    100
Total 2009/2010 100

After drill down to Month level where distinct billing account numbers fall between months:
Date  count(distinct BA)
JAN       25
FEB       30
MAR       25
APR       25
Total     105(expected 100)


Only a guess, but maybe this helps in some way:

Looks like in your first attempt it does a distinct over all BAs of the whole year. Then, after drilldown to month level it does a distinct over the monthly BAs. So if there is the SAME BA in JAN AND (!) FEB, it could be counted as ONE BA at the yearly view, but at the month level it is counted for JAN AND (!) FEB. Could this be the problem?

So what is your Business rule to deal with this case? Would you like to distinct count always on the actual hierarchy-level? So you are alright now, although it's confusing when the total's are changing while drilling.




mrcool

Hi IceTea,

QuoteLooks like in your first attempt it does a distinct over all BAs of the whole year. Then, after drilldown to month level it does a distinct over the monthly BAs. So if there is the SAME BA in JAN AND (!) FEB, it could be counted as ONE BA at the yearly view, but at the month level it is counted for JAN AND (!) FEB. Could this be the problem?

There is no problem in that. I am doing that as per the requirement.
I just want to know if there is a way to force cognos to write better SQL than this as it is eating up the performance.

Cheers,
Mrcool

blom0344

You mention Analysis Studio. So, this is based on a DMR model?
You define Months as JAN,FEB etc.
Did you uncheck unique level in the dimension-level, as with this notation you will need the next higher level to get uniqueness..

mrcool

Hi Blom,

Yes I am using DMR model. I have unique level unchecked. Do I need to check this?


Thanks,
Mrcool

mrcool

Hi All,

Need your help in this.


Thanx,
Mrcool

blom0344

It looks to me that Icetea got it nailed. Counting distinct occurences against the year level can yield different figures than a count distinct against the combination of year+month.

Unique combinations can increase by adding a dimension level

jive

Hi Mr. Cool,
I think for your number to get normalized you will have to add max or min date to get your information only one times. That's mean what is appear in march will be count only one times and display for that date
.
For the quality of the select written by Cognos, I think every Sql request write by cognos are base only on the definition include in the framework Manager,even if your FM it's hyper design it's stay it's a tool with his own rules. If you used the tools it will write Sql as it's define in the cognos rule and I think we don't have access of theses rules.

For the performance, the Sql is the key point, each time I got that kind of problem i look at my FM first, at the Sql and most of the time I have to write down a small query subject who will include the information I need to speed up the process and increase performance.

Thanks