If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Star Schema for This Table Setup?

Started by Francis aka khayman, 30 Aug 2018 05:02:22 PM

Previous topic - Next topic

Francis aka khayman

Not sure where I should post this as this topic is more of tables design. Please move to the appropriate location.

In my previous work, usually, sales data is stored in a table this way:

Sales Table
Month_ID   Sales_Amount
201801          50
201802        100
201803        200

We then use ETL to create star schema with the data from table above as our fact table. Then we use the star schema in our FM model.

In my current work, data is currently stored in such a way that the latest figure is updated monthly:

Sales Table
Month_ID    Sales_Amount
201801         50
201802       150                     ----> jan (50) + feb (100)
201803       300                     ----> jan (50 + feb (100) + mar (200)

Question:
Would you recommend to use ETL to convert this into star schema similar to the first setup? and use the star schema in FM model?

cognostechie

From a join and query generation perspective, it will not cause any problems but it will not be of much use either. The 2nd example you gave (your current work place) seems to be a table containing cumulative data which might have been built for a specific report. It will definitely not provide data for multiple reports. Even the 1st example is a summarized table. In a proper star schema design, the fact data is expected to be at a granular level to enable you to make detailed as well as summarized reports.     

bdbits

The first method is by far preferable. You should be persisting data, not running totals. And it would be a total pain to e.g. produce a chart showing monthly sales. Whereas if you have sales by month, it is trivial.

Likely even better would be transactional sales data with appropriate dimensional attributes, but that's a whole 'nother level beyond your question.

If you can change it, you definitely should.

sdf

Would it be possible to maintain both method in one model?
There may be reports that need the second method for running totals.
Something like alternate hierarchy in multi dimensions.

Francis aka khayman

thanks for your insights.

just some more info...
... i used sales data so that I could present clear example/comparison only. the actual system is something like loans payment. for some reason, the system tables were designed to store cumulative data and we have to create cognos reports against it.

our current cognos reports use SQL as data sources because the FM model is not very well designed. i am thinking of creating a better FM model. That is why i am asking if I should propose to have ETL process to convert to star schema or just create an FM model from the system tables.

so far my current choice is NOT to have ETL and star schema... unless a more superior reasoning is given to the contrary

MFGF

Quote from: khayman on 31 Aug 2018 08:26:07 AM
thanks for your insights.

just some more info...
... i used sales data so that I could present clear example/comparison only. the actual system is something like loans payment. for some reason, the system tables were designed to store cumulative data and we have to create cognos reports against it.

our current cognos reports use SQL as data sources because the FM model is not very well designed. i am thinking of creating a better FM model. That is why i am asking if I should propose to have ETL process to convert to star schema or just create an FM model from the system tables.

so far my current choice is NOT to have ETL and star schema... unless a more superior reasoning is given to the contrary

With your data in the form you describe above, you're not going to be able to write reports easily and efficiently that show (for example) the March sales amount. The issue is that the amount stored in the table on the March row actually relates to multiple months on multiple rows, and SQL based relational reporting tools are not designed to be able to handle this. You are going to find it really difficult to model this in FM to deliver accurate, consistent results, and even if you do manage to build something that works, it will most likely be hugely inefficient and slow. If you can go the ETL route and use your ETL process to calculate the real amount for each month and store this in your fact table, reporting will be simple and efficient.

Just my humble opinion, so feel free to ignore :)

Cheers!

MF.
Meep!