If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Bridge tables in FM - cross join problem

Started by dougcl, 28 Nov 2008 01:28:00 PM

Previous topic - Next topic

dougcl

Hi folks, if any of you have Kimball's book, 2nd ed, I'm talking about Fig 13.4 on page 264 "Diagnosis group dimension to create primary key relationship."

Given: I have a fact table, Health Care Billing Line Item Fact. I have a Diagnosis Dimension with many diagnoses in it. Since each row of the fact can have many diagnoses, I introduce the concept of a "Diagnosis Group" and now each row of the fact table has a single diagnosis group associated to it. I then create a bridge table (really just a fact table) to associate each diagnosis group to a number of diagnoses.

Problem: If I select * on the fact table, I get duplicate group keys. Similarly, if I select * on the bridge I get duplicate group keys. If I then join the two results together on the group keys, I need a cross join to map the duplicates in both together. Cognos doesn't want to do this. When you ask Cognos to do this it decides that it needs to use RSUM to establish a distinct set of keys in each, because it doesn't want to make a cross join.

This seems like an idiotic situation, so I am hoping there is an easy answer.

Thanks,
Doug

blom0344

Quote
Since each row of the fact can have many diagnoses

Would you care to comment on this?
I assume that your fact stores a certain set of columns that each store a FK to the diagnosis table?
Can you give an example?

dougcl

Hi yes I can elaborate. As you suggest, one way to handle multiple attributes associated to a single fact row is to add columns to the fact table. In this case, you would have a column for each diagnosis, and maybe the columns would be diagnosis1, diagnosis2, diagnosis3,...etc. That's great if
1) You always have a fixed number of diagnoses, and
2) If the multiple diagnoses are typically of interest to the users of the fact table

In general, neither if these assumptions can be made. To resolve this, Ralph Kimball (and I am assuming others) have suggested the use of a "bridge table." A bridge table in this context is essentially a factless fact table that associates a single key to a number of dimension rows. Now you can use a single column for fk_diagnosis_group in your fact table, and allow the bridge table to associate that group to many diagnoses (one or more).

So an example would be as shown in Kimball's book Data Warehouse Toolkit 2nd Ed on page 264.

I found another example using google (in this case AccountToCustomer is the bridge table):



So there are many repeating values of any given account in the AccountToCustomer table, and there are many repeating values of any given account key in the MonthAccountSnapshot. So even though these tables are being joined by the conformed Account dimension, the join column involves duplicate keys on both sides. Cognos will not work with this, and instead issues RSUM statements to construct it's own unique keys. I am now looking at turning on the options to allow cross products, but it seems to have no effect.

Thanks for your interest,
Doug



blom0344

Kimball will give you a proper solution to store this data , but the DW gurus are not that keen on how access it by use of BI-tools. Neither is Cognos very well equiped to deal with these type of structures.
My first inclination (often pretty fast with Cognos) is to by-pass the Cognos native solutions and to write a straightforward SQL solution to the problem at hand.
(almost always use some sort of complex union   8) )
With your example I now understand what you are trying to achieve storage-wise. I would be interested to know what your reporting goals are..

JGirl

How do you have your tables modelled in FM (ie  how many query subjects, and what are the cardinalities between them)?

J

dougcl

#5
Quote from: blom0344 on 29 Nov 2008 01:38:10 PM
My first inclination (often pretty fast with Cognos) is to by-pass the Cognos native solutions and to write a straightforward SQL solution to the problem at hand.
(almost always use some sort of complex union   8) )
With your example I now understand what you are trying to achieve storage-wise. I would be interested to know what your reporting goals are..

in fact, it's easily realized in SQL by hand against the database as a simple outer join. The problem arises when Cognos (correctly) constructs a stitch query, but then constructs RSUMs to establish a set of unique keys.

My reporting goals are extremely general and simple at the moment. I want to be able to represent mulitvalued dimensional attributes in a star schema. There are so many real world examples of this (consider the above examples), I don't see how you can live without it.

Thanks for your interest.

dougcl

Quote from: JGirl on 30 Nov 2008 06:32:26 AM
How do you have your tables modelled in FM (ie  how many query subjects, and what are the cardinalities between them)?

J

The tables are modeled in the database exactly as they are shown in the example above (although of course the names and attributes are different). The cardinality is 1..n on the fact and bridge table, and 1..1 on all the dimension tables (exactly as expected). Nothing is done in FM except the minimum. There is a ds query subject for each table, the determinants are set, the dimensions created from this, and the hierarchies set. Extremely simple. If you have dealt with a bridge table before this situation should be very familiar.

Thanks,
Doug

blom0344

Cognos uses stitch queries to resolve many-to-one-to-many relationships as in 2 facts converging on the same conformed dimension. The idea is to be able to have different grain in facts and STILL have Cognos aggregate them to the proper level. Your multivalued dimensional attributes are a different scenario. I am not saying that it is impossible to resolve it through Cognos means alone. I am just stating that I can resolve some scenario's by working around them.
The scenarion may be trivial to you, but I think most designers will not have run into the issue at all..

dougcl

I have a fact table that represents a customer visit to my website.
I have a dimension table that represents customers.
I have a growing list of IP addresses for each customer, and I want to be able to drag an Ip subnet address onto my visit report and filter on them to get counts.
Does this example help? Are you familiar with bridge tables and what they are used for? Have you used a bridge table in FM? I can think of so many examples, I'm not sure how they lend further to an understanding, but just in case...

Thanks again for your interest,
Doug


blom0344

Yes, I have worked with bridge tables before, though that was in an earlier millenium with a better tool (Business Objects client)  ;D
What you call a fact is what elsewhere is called a factless fact / event / transaction table. These typically do not store measures but only relationships.
Cognos FM is primarily focused on the 'real' fact objects and uses cardinality settings and determinants to cope with the multi-fact scenario.
If - as you mentioned - you can think of a very simple solution in SQL, there is a good chance that you can enforce it in the Cognos model by setting all cardinalities to 1:1  or 0:1 to mimic an outer join.

I hope you solve your scenario..

dougcl

Hi, changing all the cardinalities to the bridge/factless fact/fact table seems to "work" although I'm not sure why. I guess what this does is tells Cognos not to try to aggregate measures without double counting. It eliminates the stitch query, which I think is unfortunate, because I believe this will create a performance hit. It's not necessary to eliminate the stitch query, as it runs fine if created manually on the db without the RSUMs. But in any case, thanks for this advice, as it is worth investigating.

I suppose a question that came up was how to turn on "automatic summarization" probably also known as "automatic aggregation" since this appears to be another possible solution, based on information in the FM training materials. This is no doubt an extremely simple question, so hopefully someone has the answer.

Thanks,
Doug

blom0344

Perhaps I am not clear enough on the standard behavior, but the stitch query is primarily meant to solve the correct aggregate level if 2 'true' facts are involved (meaning facts that contain measures like quantities,amount etc) The stitching is performed on the Cognos server as it applies a full outer join between the 2 subqueries. I do not understand why you expect a stitch to be a good performer..

Automatic aggregation is applied to a measure by setting the aggregation type to the correct value in FM.
So , quantities will have to be summed, but non-measures are either not aggregated or counted. I believe this is a very elegant solution that Cognos provides as it does away with hardcoding these expressions..

jsperson

Hello, I'm reviving an old topic because it is quite pertinent to a problem I'm having right now.  I have a conformed dimension that joins two fact tables.  When using QS and querying both fact tables, Cognos creates a stitch query, but uses only the common fields that are in the query to join on for the full outer join.  From reading the documentation, I would expect the full outer join to be on the fields that the tables have in common.  Turns out this only happens if you include the fields in the select clause. 

This is not really an acceptable solution as it requires returning all detailed records rather than aggregating.  Also, it opens the user up to creating joins on unindexed fields.  I can't believe this is operating as designed. 

Any insight would be appreciated.

Cheers!

blom0344

The behavior you describe would AFAIK indicatie that determinants are not properly set. If Cognos detects a n--1--n path it ALWAYS tries to solve by splitting and stitching back again, but I guess It then needs to make a guess how to apply the outer join..
does that make any sense?   :'(

jsperson

That makes a lot of sense.  In fact, I think the stitch method is the correct behavior.  My only contention is that it's not stitching on the right fields.  Page 354 of the 8.3 FM manual says that it should full outer join on the common keys.  What they should have said is that the full outer join will be made on the common fields in each select clause.  This means that the user has to determine what the common fields are and drag them over to the query pane.  Even then, dragging over the surrogate key returns a row for every detail record and that messes up aggregation.

I have the common surrogate key defined as a determinate in both fact tables and the conformed dimension.  The fact tables are correctly indicated as 1-n.

At this point I have convinced the first line Cognos support that this is a bug.