If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Count Distinct Fact

Started by grifter102, 04 Oct 2012 09:42:02 AM

Previous topic - Next topic

grifter102

I have a package created which is at a purchase order item level and I'd like to create a fact for users in Query Studio which shows a distinct count of purchase orders.  I tried something very simple:

count(distinct [Po Number])

but it doesn't seem to work.  When I created a very simple query using that fact within query studio, it times out.  I opened the query studio report in report studio to check the generated sql, and it just makes no sense to me:

select distinct "T1"."C0" "C0" , "T1"."C1" "C1" , "T1"."C2" "C2" , "T0"."C2" "C3"
from (
select "T1"."C0" "C0" , "T1"."C1" "C1" , min("T0"."C0") "C2"
from (
select count(distinct "T1"."C2") "C0"
from (
select "T1"."CLIENT_NAME" "C0" , "T1"."BRAND_DESC" "C1" , count(distinct "T1"."PO_ID") "C2"
from CIDM."PURCHASE_ORDER_ITEM_SUMMARY" "T1"
where "T1"."RECORD_ORIGINAL_SOURCE" in ('SAPGP1100') and "T1"."CO_CD" in ('0147') and "T1"."PO_ISSUE_DT" between {d '2011-01-01'} and {d '2011-01-31'}
group by "T1"."CLIENT_NAME", "T1"."BRAND_DESC") "T0",
(select "T1"."CLIENT_NAME" "C0" , "T1"."BRAND_DESC" "C1" , "T1"."PO_ID" "C2"
from CIDM."PURCHASE_ORDER_ITEM_SUMMARY" "T1"
where "T1"."RECORD_ORIGINAL_SOURCE" in ('SAPGP1100') and "T1"."CO_CD" in ('0147') and "T1"."PO_ISSUE_DT" between {d '2011-01-01'} and {d '2011-01-31'}) "T1"
where "T1"."C0" = "T0"."C0" and "T1"."C1" = "T0"."C1") "T0",
(select distinct "T1"."C0" "C0" , "T1"."C1" "C1" , "T0"."C2" "C2"
from (
select "T1"."CLIENT_NAME" "C0" , "T1"."BRAND_DESC" "C1" , count(distinct "T1"."PO_ID") "C2"
from CIDM."PURCHASE_ORDER_ITEM_SUMMARY" "T1"
where "T1"."RECORD_ORIGINAL_SOURCE" in ('SAPGP1100') and "T1"."CO_CD" in ('0147') and "T1"."PO_ISSUE_DT" between {d '2011-01-01'} and {d '2011-01-31'}
group by "T1"."CLIENT_NAME", "T1"."BRAND_DESC") "T0", (
select "T1"."CLIENT_NAME" "C0" , "T1"."BRAND_DESC" "C1" , "T1"."PO_ID" "C2"
from CIDM."PURCHASE_ORDER_ITEM_SUMMARY" "T1"
where "T1"."RECORD_ORIGINAL_SOURCE" in ('SAPGP1100') and "T1"."CO_CD" in ('0147') and "T1"."PO_ISSUE_DT" between {d '2011-01-01'} and {d '2011-01-31'}) "T1"
where "T1"."C0" = "T0"."C0" and "T1"."C1" = "T0"."C1") "T1"
group by "T1"."C0", "T1"."C1") "T0", (
select "T1"."C0" "C0" , "T1"."C1" "C1" , "T1"."C2" "C2"
from (
select count(distinct "T1"."C2") "C0"
from (
select "T1"."CLIENT_NAME" "C0" , "T1"."BRAND_DESC" "C1" , count(distinct "T1"."PO_ID") "C2"
from CIDM."PURCHASE_ORDER_ITEM_SUMMARY" "T1"
where "T1"."RECORD_ORIGINAL_SOURCE" in ('SAPGP1100') and "T1"."CO_CD" in ('0147') and "T1"."PO_ISSUE_DT" between {d '2011-01-01'} and {d '2011-01-31'}
group by "T1"."CLIENT_NAME", "T1"."BRAND_DESC") "T0", (
select "T1"."CLIENT_NAME" "C0" , "T1"."BRAND_DESC" "C1" , "T1"."PO_ID" "C2"
from CIDM."PURCHASE_ORDER_ITEM_SUMMARY" "T1"
where "T1"."RECORD_ORIGINAL_SOURCE" in ('SAPGP1100') and "T1"."CO_CD" in ('0147') and "T1"."PO_ISSUE_DT" between {d '2011-01-01'} and {d '2011-01-31'}) "T1"
where "T1"."C0" = "T0"."C0" and "T1"."C1" = "T0"."C1") "T0", (
select distinct "T1"."C0" "C0" , "T1"."C1" "C1" , "T0"."C2" "C2"
from (
select "T1"."CLIENT_NAME" "C0" , "T1"."BRAND_DESC" "C1" , count(distinct "T1"."PO_ID") "C2"
from CIDM."PURCHASE_ORDER_ITEM_SUMMARY" "T1"
where "T1"."RECORD_ORIGINAL_SOURCE" in ('SAPGP1100') and "T1"."CO_CD" in ('0147') and "T1"."PO_ISSUE_DT" between {d '2011-01-01'} and {d '2011-01-31'}
group by "T1"."CLIENT_NAME", "T1"."BRAND_DESC") "T0", (
select "T1"."CLIENT_NAME" "C0" , "T1"."BRAND_DESC" "C1" , "T1"."PO_ID" "C2"
from CIDM."PURCHASE_ORDER_ITEM_SUMMARY" "T1"
where "T1"."RECORD_ORIGINAL_SOURCE" in ('SAPGP1100') and "T1"."CO_CD" in ('0147') and "T1"."PO_ISSUE_DT" between {d '2011-01-01'} and {d '2011-01-31'}) "T1"
where "T1"."C0" = "T0"."C0" and "T1"."C1" = "T0"."C1") "T1") "T1"
where "T1"."C0" = "T0"."C0" and "T1"."C1" = "T0"."C1"
order by "C3" asc  FOR FETCH ONLY


The record source and co_cd limits are part of the index on the source table... if I run one piece of this query directly against DB2:

select "T1"."CLIENT_NAME" "C0" , "T1"."BRAND_DESC" "C1" , count(distinct "T1"."PO_ID") "C2"
from CIDM."PURCHASE_ORDER_ITEM_SUMMARY" "T1"
where "T1"."RECORD_ORIGINAL_SOURCE" in ('SAPGP1100') and "T1"."CO_CD" in ('0147') and "T1"."PO_ISSUE_DT" between {d '2011-01-01'} and {d '2011-01-31'}
group by "T1"."CLIENT_NAME", "T1"."BRAND_DESC") "T0",


it returns from the database in 2 seconds.  I have no idea what cognos is doing with all of the other pieces of the query.  I know that the generated SQL can be a little bit odd... but this seems ridiculous.

Am I doing something wrong?  What would be the proper way to have a distinct count of POs measure in Query Studio via FM?

tjohnson3050

First off, there is no need to create an expression to do count distinct.  In FM, you can just create a another data item based on purchase order, and set the aggregation property on the item to count distinct, that may remove one level of unecessary aggregation.

grifter102

That doesn't seem to work quite right either.  I created a new data item pointing to PO Number, set the Regular Aggregate property to Count Distinct and set it as a fact.  When I open the report in RS and view the sql, it's much cleaner now, but it shows this:

select "T1"."CLIENT_NAME" "C0" , "T1"."BRAND_DESC" "C1" , count("T1"."PO_ID") "C2"
from CIDM."PURCHASE_ORDER_ITEM_SUMMARY" "T1"
where "T1"."PO_ISSUE_DT" between {d '2011-01-01'} and {d '2011-01-31'} and "T1"."RECORD_ORIGINAL_SOURCE" in ('SAPGP1100') and "T1"."CO_CD" in ('0147')
group by "T1"."CLIENT_NAME", "T1"."BRAND_DESC" FOR FETCH ONLY


No "Distinct" is set, and since the table is at the PO item level, it's counting one PO per every item in the report.  Is there some other flag that needs to be set in order to enable "distinct"?

tjohnson3050

The property you set in FM is the default property used in report studio. If you already had a report created, make sure the aggregation property in report studio is set to count distinct.

grifter102

I'm using query studio.  I created a new data item with the definition of [PO Number], then set the regular aggregation to "Count Distinct".  I set it as a fact, then published the package.  I added 2 other items from the same table to a query studio report and added my "PO Count Distinct" item as well.  Once I ran it the numbers looked off, so I opened the report in RS and checked the SQL. That's where I noticed that  "Distinct" wasn't being used.

Are there any other properties that I can check or other settings in QS or the framework that would cause this?

Thanks fpr the quick responses.

tjohnson3050

Look at the regular aggregation properties of the data item in Report Studio.  What is that property currently set to?

grifter102

In report studio both aggregate and rollup aggregate are set as Count, not Count Distinct.  If I change it to Count Distinct, the SQL generates properly.

Unfortunately, this doesn't help me in Query Studio.  How do I change the property so it uses distinct in Query Studio?

grifter102

I also went into Query Studio, right-clicked on the "PO Count Distinct" data item I set up, went to Information... and the Aggregation there says "Count Distinct".  Even though it is definitely calculating only a Count, and when I "Open in Report Studio" it does only Count.

tjohnson3050

In Query Studio, select the data item, then click summarize (big Sigma button).  Then you will get a screen that allows you to change the detail aggregation of the data item you have slected.

grifter102

The item already had a summary (the report was set to auto-summarize), so clicking the sigma button didn't do anything.  When I right-clicked on the column and then clicked Summarize, the summary said Count.  I changed it to Count Distinct and ran the report.  The results were still only a Count (no distinct) and the summary total was completely wrong (showed 43 even though there are around 1000 distinct POs for that month).

Is there some other option that I'm missing or have set improperly?  The users of the report need a PO count distinct they can drag into their reports... it seems like it should be simple but nothing seems to be working.

tjohnson3050

In Query Studio, when you click summarize, the aggregate function is for the summary row.  You need to click the advanced button to access the aggregate property for the details rows.

grifter102

Okay, that worked.  I set the advanced row summary for cells to be "Count Distinct" and that worked.  The overall total for the column is still very, very wrong, but I think I've read several things in the past about issues with the overall summarization in QS.

Is there any reason why Query Studio doesn't use the proper aggregation that I define in framework manager?  It's going to be very confusing for the users to have to adjust the advanced summarization to convert the column into a Count Distinct... even though it's already set up as such.  Is this just a bug or limitation of Query Studio?

Thanks very much for your help, I really appreciate it.

Lynn

#12
I wonder if it behaves differently in Business Insight Advanced/Cognos Workspace Advanced (name has been changed for 10.2)?

tjohnson3050

Keep in mind that you changed the value in Framework Manager after the Query Studio report had been created.  When you create a new report, the data items will have the aggregation properties that were in the model at the time you created the report.  If you change those aggregation properties in the model, the values in existing reports will not change.

This is good argument for testing the model fully before report creation begins.

blom0344

Grifter,

What middleware are you using to connect to DB2?  The query syntax you show us point more at generic ODBC than native connectivity to DB2..

grifter102

@tjohnson

I deleted the item in Query Studio after changing the framework, then readded it.  I still had the same strange behavior.  I'm also not really sure how to "test the model fully" without using something like Query Studio.  I can test a little bit in FM, but since my users will be using Query Studio mainly, that's where I perform most of the testing.

@blom

I'm not sure exactly how the connection is being made to DB2... I'll try to find out.  It's difficult since the Cognos installation is managed by a very bureaucratic group and getting detailed answers is a grueling process.