If you are unable to create a new account, please email support@bspsoftware.com

 

How forces Report Studio to generate a "GROUP BY" instead "DISTINCT"

Started by safafin, 04 Aug 2016 07:47:00 AM

Previous topic - Next topic

safafin

Hey guys,

On my report in dimensional with one dimension and one fact (sum aggregate). Report Studio generate a SQL query with a DISTINCT.

How I can force Report Studio to generate a SQL query rather with a GROUP BY ?

Thanks for your help.

MFGF

Quote from: safafin on 04 Aug 2016 07:47:00 AM
Hey guys,

On my report in dimensional with one dimension and one fact (sum aggregate). Report Studio generate a SQL query with a DISTINCT.

How I can force Report Studio to generate a SQL query rather with a GROUP BY ?

Thanks for your help.

What behaviour does the underlying item have? Identifier? Attribute?

MF.
Meep!

safafin

I have two items :

  • Article is a dimension
  • Quantity is a measure with sum in aggregate

Aggregate is correct. It's the time of treatment which is bad because of DISTINCT into SQL query.

When I switch the DISTINCT by GROUP BY in DataStudio, the performances are corrects.

Michael75

QuoteIt's the time of treatment which is bad

<interfering mode>

I suspect that safafin is a native French speaker, and is referring to the processing time.

</interfering mode>

safafin

To complemente the last post.

Report Studio generates this query :


WITH   "REF_ARTICLE0" AS (
SELECT DISTINCT           
"REF_ARTICLE"."CD_ARTUC" AS "CD_ARTICLE",           
"REF_ARTICLE"."CD_FILART" AS "CD_FILART"     
FROM         
"REF_ARTICLE" "REF_ARTICLE_TOTAL"


SELECT     
"REF_ARTICLE0"."CD_FILART" AS "column0",
SUM("REL_AGG_DETNOM_SEM"."IN_QUANTITY") AS "column1" 
FROM     
"REF_ARTICLE0"
INNER JOIN
"REL_AGG_DETNOM_SEM" "REL_AGG_DETNOM_SEM"
ON
"REF_ARTICLE0"."CD_ARTICLE" = "REL_AGG_DETNOM_SEM"."CD_ARTICLE"   
GROUP BY
"REF_ARTICLE0"."CD_FILART"
FOR FETCH ONLY


It's the WITH for the dimension which increase the processing time.

I would like force Report Studio to make a GROUP BY :


WITH   "REF_ARTICLE0" AS (
SELECT
"REF_ARTICLE"."CD_ARTUC" AS "CD_ARTICLE",
"REF_ARTICLE"."CD_FILART" AS "CD_FILART"     
FROM         
"REF_ARTICLE" "REF_ARTICLE_TOTAL"
        GROUP BY
"REF_ARTICLE"."CD_ARTUC" AS "CD_ARTICLE",
"REF_ARTICLE"."CD_FILART" AS "CD_FILART"   


SELECT     
"REF_ARTICLE0"."CD_FILART" AS "column0",
SUM("REL_AGG_DETNOM_SEM"."IN_QUANTITY") AS "column1" 
FROM     
"REF_ARTICLE0"
INNER JOIN
"REL_AGG_DETNOM_SEM" "REL_AGG_DETNOM_SEM"
ON
"REF_ARTICLE0"."CD_ARTICLE" = "REL_AGG_DETNOM_SEM"."CD_ARTICLE"   
GROUP BY
"REF_ARTICLE0"."CD_FILART"
FOR FETCH ONLY


I hope that help you to understand my problem.

<interfering mode>

Your assomption is right. Thanks for your help Michael75.

</interfering mode>

AnalyticsWithJay

safafin,

I suspect your issue is something else.

The subquery within the "WITH" clause does not contain any aggregate functions. DISTINCT and GROUP BY do the same thing -- in fact -- many databases are smart enough to substitute a DISTINCT for a GROUP BY when it notices you have no aggregate functions in your query. The only difference between these (essentially - I'm oversimplifying, but it is true for this example) is that GROUP BY allows you to use aggregate functions.

In other words, Cognos is generating the correct SQL since it's a dimension, and there is no difference in the result of the subquery between the two. This is why I suspect your issue is something else.

Could you post more details about the issue? Perhaps showing the tabular data, final data, what number you expect and what number you see.


safafin

Thanks for your response.

I have a DB2 BLU v10 database. It's a IBM expert who we have spoken difference between "DISTINCT" and "GROUP BY" into SQL query on this database.

After to have try the both SQL query. There is a very difference between DISTINCT and GROUP BY. Not in the finality but in the processing time.

Example :








FILEARTQUANTITY
150
1100
1200
TOTAL350

In  this example, there is no difference in the result but a difference in the processing time. With a GROUP BY the query is executed in 1 second but with a DISTINCT the query is executed in 54 second.

AnalyticsWithJay

The query plan generated between GROUP BY and DISTINCT should be the same in nearly all cases, assuming you are comparing the exact same SQL, and neither query uses aggregate functions.

You should talk to your DBA and ask them to generate an explain/execution plan so you can see what is happening. This is uncommon behavior at your database level. Cognos is generating the correct SQL given your query. Rather than focusing on forcing Cognos to use SQL clauses which it shouldn't be using, the focus IMO should be on why the database is performing poorly against DISTINCT clauses.

Could be your indices, could be your DBA's configuration. But IMO you need to be investigating the DB, not Cognos.

safafin

I generated a explain plan of my queries. There aren't the same.





I will to be investigating the DB.

It's strange sometimes Report Studio Generates a query with GROUP BY instead of DISTINCT. That's why I wanted to force Report Studio to use GROUP BY.

Thanks CognoidJay.

MFGF

Quote from: safafin on 04 Aug 2016 10:24:31 AM
I have two items :

  • Article is a dimension
  • Quantity is a measure with sum in aggregate

Aggregate is correct. It's the time of treatment which is bad because of DISTINCT into SQL query.

When I switch the DISTINCT by GROUP BY in DataStudio, the performances are corrects.

Hi,

If you look at the Article item in Framework Manager, what us the Usage property set to be? Is it set to be an Attribute or an Identifier?

MF.
Meep!

AnalyticsWithJay

Quote from: safafin on 05 Aug 2016 10:05:06 AM

It's strange sometimes Report Studio Generates a query with GROUP BY instead of DISTINCT. That's why I wanted to force Report Studio to use GROUP BY.


It generates a DISTINCT when your query has no fact (aggregate) item. It generates a GROUP BY if your query has a measure/aggregate. In theory, you could always add a measure item to the query to force it to use a GROUP BY, but I wanted to point out that you may potentially have some other issue that you're not considering (ie: the DB).

Any luck with your DBA?

safafin

Hi,

@MFGF
Article is in regular dimension and Quantity is in measure dimension from DMR.
Article is a identificator and Quantity is a fact from query subject.

@CognoidJay
I tried with a iregular dimension item and a measure dimension item.

Any lucky with DBA because we haven't DBA.

Thanks.

safafin

Hi,

For your information, the difference between GROUP BY and DISTINCT in DB2 V10.5 is IBM bug.
A ticket was put on the ibm support.

Thanks you to all for your help.

AnalyticsWithJay

Thanks for the update! I had a feeling that was very strange behavior.

safafin

Hi,

We have found a solution to force COGNOS to generate a GROUP BY instead of DISTINCT on a DB2 database.

In the config file db2.sql.properties in folder xqe into folder configuration . Edit the parameter "performance.convertGroupByToDistinct=true" by "performance.convertGroupByToDistinct=false".

Thanks.

Michael75

Thanks for taking the trouble to post your fix here. That will make this thread much more valuable to anybody else who encounters this problem, and searches Cognoise looking for solutions.