If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Using report datasource generates full outer join error

Started by Michael75, 05 Jun 2013 08:15:44 AM

Previous topic - Next topic

Michael75

We are on C10.1.1 FP2, and I'm experiencing a strange error with Transformer.

In this project I first built an "Orders" cube, a totally conventional one, where the datasources are as follows:
  package-based : 16
  report-based  : 4 (including Customers dim & Orders measures)
and all works fine.

Recently, starting from my Orders cube, I've developed a "Key Account Orders" cube. As its name implies, this has stricter customer selection criteria, and it also has a different Orders measures datasource (created as a separate report).

So I am now at this number of datasources:
  package-based : 17 (including one new KA-specific dimension)
  report-based  : 4  (of which two are newly-developed reports -  KA Customers dim & KA Orders measures)

The problem occurs with the new KA Orders measures datasource. As I get the hang of this Cognos BI thingy, I've learnt to check the generated SQL wherever possible - it can avoid a lot of pain later! I'd checked the SQL generated by my new KA Orders measures report, and it was fine, as was its o/p. And initially, I didn't check the SQL in the new cube via Data Source Viewer, because I didn't think this would have served any purpose.

A bit later, seeing wrong results in my testing of the new cube, I did this very check in Transformer via Data Source Viewer, and was surprised to see that the generated SQL contained the dreaded full outer join.

I went back to my Key Account Orders measures report, and started playing with the "Use With Clause" parameter on its final query, as I've found in the past that Transformer can be sensitive to this. But regardless of the setting in the saved report, my cube datasource still generates full outer join SQL, although the report, via Show Generated SQL/MDX does not.

I realise that this is a bit of a long shot, but has anybody come across a similar scenario, and if so did they manage to solve it?


TIA, Michael

cognostechie

A full outer join is not neccecarily bad ! If your package has Facts joined with a conformed dimension, then Cognos will create a full outer join regardless of whether it is being used in Transformer or Report Studio.

What is wrong here is that you are using a package and reports that are using multiple tables !! When this feature was introduced (ver 8.4) IBM had declared in the Transformer User Guide that the package can be used as a data source but it should be made sure that multiple data sources are used in Transformer and association within them is done inside Transformer. What that means is that any query subject that has a join in it should not be used as a data source in Transformer. What you can do is to create seperate query subjects for Facts and Dimensions and associate them in Transformer.

The feature of allowing a package as a data source does not imply that every package that is ready to be used in reporting studios can also be used in Transformer. To understand this, I would suggest you should get a proper training on Transformer to understand how it builds the cube from the data sources.

Michael75

@ Cognostechie

Thanks for taking the time to answer. I am in fact familiar with stitch queries resulting in a full outer join, but that is not at all what's happening here. To take your next point, I also know that Transformer establishes the relationships between datasources based on matching column names.

However, I was not familiar with the recommendations you give about using a (report) datasource based on multiple tables. Very interesting! I'll try to break this datasource down into several simpler ones.

Lynn

I don't think there are issues using a report data source that references multiple tables. I do that all the time. Perhaps it is just package based data sources that have that issue? I don't use package based data sources.

cognostechie

I did not mention 'report' data source. I never use a report as a data source in Transformer. If somebody changes that report, it can adversely effect Transformer cube. If somebody applies a filter to that report then the entire cube will be
missing that data set.  A package based data source is more preferable as long as that package is NOT used for reporting.

There are diff ways to do the same thing. One way is to do it and if it works for now then retire to the bar happily. Another way to do it in a way that it will work in every case, not work today and fail tomorrow.

What I suggested is what IBM mentions as the best practice (not to use any query that has a join in it, whether it is a report or a package).

Reason 1 - The join can be an inner join between dimension and fact tables which will exclude all dimensions values which do not have any fact data yet. So if you are selling fruits and Mangoes did not get sold yet because they are a new product for the company then Mangoes would not show up in the cube with zero value. The business would definately want to see
Mangoes with zero value to know which products do not get sold !

Reason 2 - Transformer rolls up the data as opposed to reports which may be designed to show the data at the granular level. The joins will roll up the data based on how it is joined which might work for the tables that are joined inside the query. However a cube may have multiple dimensions that reference other fields from other tables which do NOT have any relation with those joined tables. Result can be that the roll up which is performed by Transformer can be wrong for those dimensions. 

It is difficult to explain how Transformer accumulates data from diff sources and associates them to different dimensions without writing a whole book so it is better to go thru the IBM document which explains what are the implications of using
queries which have joins in it.

The best way is to create individual queries for each dimension which will feed the dimension information and once that is done then the queries which contains the fact information will roll up the fact information to each dimension. This is where the association comes into play by common field names and ensures that the rollup of data at every level is perfect. It is quite possible that one fact may not have relation to every dimension of the cube and this method would also ensure that the data
in that case does not gets rolled up because there will be no joins inside the queries to force it !

Anyway, like I said, it is up to you which way you want to take.

There is also a setting in the measures that sometimes causes a wrong rollup. The 'precision' setting of the measures determines how it handles the rounding off which can cause huge differences in the totals. Normally a field with 2 decimals with precision = 2 and format with 2 decimal places gives accurate results. If you do not need decimals in the cube then you can set the precision as much as the decimals in your data but that can also be overkill in terms of resources needed to build the cube!     

Lynn

Cognostechie, it was Michael75 who mentioned report data sources in response to your earlier post.

I agree 100% with your comments and I should have clarified further. All the reports that feed the cubes are built specifically for that purpose, so they are not available to users and are subject to the same change control and deployment processes as the transformer model and Framework Manager model.

Also, the joins are not between dimensions and fact tables. There are separate reports for each dimension and separate reports for the facts. The joins exist in the dimension reports since we are building cubes from operational systems which do not have denormalized dimension tables, so joins are required to build the complete dimensional hierarchies.

There is a good best practices document on Transformer cubes that might help Micheal. If you have not seen it then google Cognos Transformer Proven Practices...well worth reading.

cognostechie

Ok Lynn, we seem to be on the same page  :) The reason I put emphasis on this is because the beginners usually think
about the existing reports (which are made for reporting and not made specifically as data sources for cubes) and use them
in Transformer so I wanted the OP to know the implications.

Michael75

Thanks Cognostechie & Lynn! Your support and ideas are very much appreciated.

@ Lynn you should add to your profile of statesman (stateswoman?) the fact that you are a brilliant mind reader! Had I replied first to Cognostechie, I'd have written exactly what you did. We are indeed talking about reports which are:
-   created by me, for the specific purpose of supplying data to this cube
-   only accessible by me
-   stored in a folder called "Reports for Cube Builds"
-   each one handling only one dimension or one set of measures
-   a "measures" datasource report only reports measures, plus FKs in its fact table; any joins it may perform to dimensions are inherent to the FM model, and are not created by me

There's only one thing in your reply which is not applicable to the current situation. I'm not working off an OLTP system, but off a DM, whose design is quite respectful of best practices. But this is a global project within our company, I only have read access to the FM DM layer, and I've had to create, for example, a Customer Dim datasource report which reads 4 or 5 snowflaked dimensions around Customer, and includes my own filters.

Maybe I should also add that I was being a bit disingenuous, implying in my first post that I was a novice. I have:
-   followed a large number of IBM Cognos courses, including Transformer 10
-   read & understood "IBM Cognos BI as a Data Source for Transformer"
-   created several functioning cubes (including the "parent" of my problem cube, where there is 95% similarity – see my first post)

Having said all that, my original problem remains. I have an "Orders Measures" report which works perfectly as a datasource for my original cube.  From that, for my new cube, I created my "Key Account Orders Measures" report which runs perfectly as a standalone report (I've comprehensively checked the output of both), but which, when referenced in my Key Account Orders cube, generates rubbish results, and whose SQL, when checked by right click – Data Source Viewer, includes horrors such as:
    full outer join
    on  ( D3.sc = D2.sc and D3.sc2 = D2.sc2 )  and D3.sc17 = D2.sc4


I don't claim to fully understand the integration of Transformer with Cognos BI since 8.4, and I'm left wondering how Transformer can make such a different interpretation of a query which is, even if it comes from a specific report, package-based.

Nor do I expect you to have a snappy answer to this, but I'm sure that you'll have ideas which help me find the answer.

@ Lynn When you suggested searching for "Cognos Transformer Proven Practices", did you have other articles in mind than the one I've already mentioned?

Thanks again,
Michael

Lynn

Ha ha, brilliant mind reader is not a skill I can claim!

I can't imagine what is causing your problem, but I would check to see if all the columns in your report/data source are referenced within your cube. Could it be that minimized SQL is the culprit and changing the SQL being executed vs what the report layout includes? Wild guess, but that's all I've got.

The document I'm talking about was proven practices for cubes in general. I don't have the link handy, but google transformer proven practices.

Let me know if you solve it!


Sent from my iPhone using Tapatalk