If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

ETL Design

Started by Srik, 13 Dec 2005 10:38:49 AM

Previous topic - Next topic

Srik

Hi,

what are the best practises in ETL Design using Decision Stream ?


Thanks & Regards,
kanth

MFGF

Wow!  That's a huge topic, which can only really be covered by attending the DecisionStream training.  It's a bit like asking "what are the best practices for driving a car?"

People will be able to give you some basic ideas in a forum like this, but you're not going to be able to appreciate anything like the whole story.

Topics include things such as
not delivering dimensions in fact builds.
running dimension builds prior to fact builds in order to generate surrogate keys for dimensions, then utilising lookups in the fact builds for surrogate key substitution.
configuring dimension builds to accommodate type 1 and type 2 scds.
configuring fact builds to handle late arriving facts.
setting up an environment where data is brought in, cleaned, merged and transformed via a staging area, enterprise layer and finally delivered to a presentation layer.
configuring jobstreams to detect failure and log appropriate messages
etc etc etc

Those who already use DecisionStream could write pages and pages on any of these topics, and still not fully explain them in this kind of forum.

Bottom line is to recommend the training, where all this is explaied and demonstrated.

Best regards,

MF
Meep!

Blue

G'day Kanth

I think your question should be broken up into two parts:

1. What are best practises for ETL in general?
2. What are best practises for DecisionStream in particular.

The answer to the first part will be the same whether you are using DecisionStream, Informatica, DataStage, DTS, QTP, or any other ETL tool or programming language.  Ralph Kimball has a great book dedicated to this subject and I suggest to read it.

The answer to the second part is a bit more complex and depends on your specific situation. 

I prefer keeping the use of DecisionStream "features" to a minimum and doing most of the work in the SQL code and using the wizard tools.  This is because it is A) easy to understand by other people and therefore cheaper to maintain and B) relatively simple to migrate the logic to another ETL tool when the client or the vendor decides that another tool is preferable to DecisionStream.

That being said I have seen some quite interesting things done with DecisionStream features including calculations, defaults, and the use of environment variables.  But every programmer wants to do it things his or her way and the result has been confusing to me.  Logic gets obscured by these features.  I do support the basic features though; the ones found in the wizards as well as lookups to get surrogate keys.

Have fun!
Robert Edis
Principal
Robert Edis Consulting
Rotorua, New Zealand

Srik

Thank you everyone