If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Fact to Fact and Dimension to Dimension Join

Started by gpollock, 05 Aug 2015 11:58:39 AM

Previous topic - Next topic

gpollock

Hi all.  I'm trying to solve a problem with reporting multiple facts and dimensions where the reports aren't looking the way I want them to, and I'd like your feedback to see where I'm going wrong.

1) I have three fact tables relating to invoices and each table has its own grain.  Invoice header is at the invoice number level.  Invoice Detail is one level lower showing invoice line.  Finally, Invoice line errors is the lowest--a single invoice line could have 0 or more errors.  I really don't want to have one big table with all possible combinations, so I'm creating an invoice dimension that contains invoice number and line number, and the common elements, and joining the facts to the invoice dimension at the appropriate levels, using determinants.  I don't want to clutter the dimension with the possible invoice error types.  Good idea or bad idea?  As a note, at a later time I may need to add additional fact tables that go to a third grain, but it would be parallel with invoice errors i.e. the level would be Invoice Number, Invoice Line Number, unrelated key.

2) The invoices are joined to conformed dimensions, e.g. Vendor dimension at the header and detail, product at the detail and errors, and Date at all three levels.  I don't want to have redundant data, so I'd like to join these dimensions to the invoice dimension.  The argument against this is I would have dimensions joined to other dimensions.  The argument for is that for each new dimension, we only add it once instead of to all three fact tables.  Does this sound like a good idea to you?

I appreciate the help.

gpollock

Sorry, I forgot to mention that I'm on Cognos 10.2.1, and I don't think we're using DQM.

bdbits

A few thoughts that may or may not apply.

       
  • Is the invoice header really a fact table, or could it be a dimension? Does it have measures that are not aggregates of invoice details?
  • Are the errors from a collection of standard errors, i.e. could they be a dimension, perhaps linked to invoice details with a bridge table?
  • You should not normally be joining dimensions to dimensions, there are exceptions but it is generally not a good idea.
  • Data warehouses are seldom 3rd normal form, in other words it is ok to have some redundancies. This can take getting used to if you have a development background, as I did.

gpollock

Thanks for your help bdbits.

The invoice header has just a few non-aggregate measures, enough to justify it being a fact table.  Long story short, our data entry system doesn't divide certain measures by invoice lines (e.g. tax, shipping, etc.) and I'm using this example as a way of gathering principles for the moment instead of trying to guide the user requirement.

The errors fact table has three primary keys: Invoice number, Invoice line, and error code.  The error code is a generic error code, but is only specific to this fact table, so the decision was made to bring the error message into this fact as an additional field.  Not the most optimal, but this is a single fact table, and a single invoice line could have multiple errors depending on which validations it fails.

Not joining dimensions to each other seems to be the consensus, and I'm not surprised.  The main problem I struggle with is adding dimension foreign keys to every table.  Data redundancy rubs me the wrong way, even if it's just a foreign key.

On the bright side, I was able to model the query subjects so I don't need these keys on the invoice error table.  Just remember to set the identifiers on all tables, set the determinants on the common fields dimension appropriately.  For example, Invoice error table doesn't join to any dimensions other than the common invoice fields dimension; and the vendor dimension is joined to the invoice header table.  If I need to see all invoice errors for Vendor A, I need to make sure my query has a (any) value from the common dimension and the header fact, and then Cognos will stitch the values together.  I'm not 100% happy with this, but it will get the job done for now.  As another plus, we're rebuilding our entire data warehouse and model from scratch, so now's the time to experiment with our existing model.