
Transfer of Data between Analyst Cubes

Started by gidster, 11 Jul 2007 10:02:52 AM


gidster

Currently we are downloading a huge (30,000+ row) file from a source system into Excel. The download contains a number of columns: some of them are numeric and others are text-based (e.g. Address Details etc.)

Our Users then go through these rows and apply some logic to them to flag any rows that need to be held in an "Error Pen".  They literally copy and paste these items into a separate Excel sheet.

I am thinking there has got to be a better way to do this via Analyst...but I can't figure out how to do it?!

My initial thoughts were to have a "Raw Data" cube.  I was thinking this would need to look almost exactly the same as the existing Excel dump (i.e. be a two-dimensional cube).  The reason I thought this is that we would want to help identify the logic for them, and to do that we would need a flat file so that the calculations would work.

I was then thinking we would have an "Error Pen" cube.  And this is where my problems begin... I think the Error Pen would also need to be a flat file, as once a row is flagged as an error we would want all the associated details in that row to be in the Error Pen... and I don't think any of the D-Link choices can achieve this.  Does anyone have any ideas?  Thanks in advance...

PS: I think ultimately we would want this to be a Contributor Application - not sure if that might influence your ideas

jluckhoff

In my opinion neither Analyst nor Contributor is the tool to do this in. Neither of them has the formatting functionality that you have with Excel, and flagging rows using D-List formatted items sounds like it could take a lot of time.

Not sure what type of data you are using, but if you are just trying to clean it, why don't you look at using some Transact-SQL or PL/SQL combined with scripting that uses regular expressions? For example, if you want to clean your data of rows containing a particular invalid data type, e.g. malformed email addresses, you can easily build this into a function.
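
Something along these lines, for example (just a rough T-SQL sketch - the staging table and column names are made up, and T-SQL has no built-in regular expressions, so this uses LIKE patterns; proper regex would need a CLR function or an external script):

    -- Rough sketch only: table and column names are hypothetical.
    -- Flag rows whose email address does not look valid.
    UPDATE stg_PolicyLoad
    SET    ErrorFlag   = 'Y',
           ErrorReason = 'Invalid email address'
    WHERE  EmailAddress NOT LIKE '%_@_%._%'   -- no name@domain.tld shape
       OR  EmailAddress LIKE '% %';           -- contains a space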

Don't know if this will work, as I really have no idea what kind of data you are talking about or what you want to do with it.

It's worth remembering that Cognos EP products are not great for flexible data manipulation in large volumes.

gidster

Thanks for your suggestion JLuckhoff (Jacques, is this you?).  I can see why you are saying this, but it is not strictly a data cleansing operation. The 30,000 rows contain data about insurance policies and would include items such as the Policy Number, the Country the policy was taken out in, the Currency Used, the Premium Amounts, the Amount of Insurance Tax, the Address of the Insured etc.

The end user then needs to review this and apply certain logic. For example, if the policy is a UK issued policy it MUST have 5% Insurance Tax; if it does not (say it has in error been given the 3% tax for Ireland), the policy needs to be set aside in an "Error Pen", and then the source system needs to have a 2% "corrective" tax transaction applied to it.  Then (at the next download) the end user can match up this new transaction (with the 2% corrective tax) to the original transaction (with the 3% tax) so that they can pass the two transactions together (with the correct 5% tax amount) down the line.  And apparently it is not possible to fix the original error... you MUST apply a separate transaction so that TOGETHER the two will fix it (not sure exactly why, but I did ask and was told this is the way it has to be??!!)

So I think this is more than a simple data cleansing exercise in the purest sense of the words.

The downloading is expected to be a monthly exercise.  And I am hoping there is some way of breaking down the 30,000 rows between the 15 - 20 End Users so that each one is looking at a manageable amount each month.

Given the above, what I wanted to try and do was spend a few hours trying to prove that this could work using EP.  But, as I said, I am stuck on how to transfer the "error" transactions from the "Raw Data" cube once I have flagged them as being errors into my "Error Pen" cube.  Is this even possible??!

Thanks again for your input.

Gideon.

jluckhoff

Hi Gideon,

Yeah, it's me - spending my free time on forums - pretty sad, huh?

I see your problem. What you need is some good old SQL functionality in a D-Link so you can say "transfer where error = 'Y'". You could always try asking Cognos for an enhancement request  :)

If I were to do this I'd probably have two small Contributor applications - one to keep the raw data (just a single cube) and the other to load the error rows into.
Get the users to flag all the bad rows in application one, then publish it, use some SQL to pull the data from application one's EV_ views into application two's IM_ tables while filtering on only the bad rows, then do a prepare import/GTP macro.
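
Something like this, roughly (only a sketch - the view, table and column names below are invented; the real EV_/IM_ object names and columns depend on how each application is built and published):

    -- Sketch only: object and column names are hypothetical.
    -- Pull the rows flagged as errors from application one's publish (EV_) view
    -- into application two's import (IM_) staging table.
    INSERT INTO App2DB.dbo.IM_ErrorPen
           (PolicyNumber, Country, Currency, Premium, TaxAmount)
    SELECT  PolicyNumber, Country, Currency, Premium, TaxAmount
    FROM    App1DB.dbo.EV_RawData
    WHERE   ErrorFlag = 'Y';
    -- ...then run the prepare import / GTP macro against application two.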

Not very slick, I know, but it should be robust. You can make it more user-friendly by giving the users a link on an ASP page that can fire off the CEP publish/prepare routine, or you could run it periodically. If the app is as small as you say, then a routine like this should not have too much impact.

I suppose another way to do this inside of CEP would be to use an allocation D-Link combined with an internal link or conditional to force cells to zero.
I.e. import data from the raw cube to the pen and then zero out all data that is not flagged as bad. This sounds messy, as you would probably end up with a lot of empty rows, but if your users know how to suppress zero rows in the client and don't mind using it, it might work.

Anyway, that's my two cents' worth - how are you anyway - still at that insurance place in the city?

Jacques

gidster

Jacques...why don't you send me your details? It would be good to catch up.

jluckhoff

would be good.


gidster

For those who hate it when a post is just left hanging without a resolution, here is ours... or at least where we have got to!

Basically we are doing a lot of the 'work' in SQL DTS and transforming the data into a tab-delimited text file so we can import it into Contributor (Copy/Load/Prepare).  This then pushes the data into Contributor.

The Contributor Model will have a certain degree of error checking (e.g. is the tax rate set correctly for each different country, are the address details completed, etc.).  If these parameters are off, a "Check Me!" error flag will appear.
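
Since the heavy lifting is in SQL DTS anyway, the SQL-side version of that kind of check could look something like this (a sketch only - the table, column and reference-table names are invented for illustration):

    -- Sketch only: names are hypothetical.
    -- ref_CountryTaxRate holds the expected insurance tax rate per country,
    -- e.g. UK = 5%, Ireland = 3%.
    UPDATE p
    SET    CheckMe = 'Y'
    FROM   stg_PolicyLoad p
    JOIN   ref_CountryTaxRate r ON r.Country = p.Country
    WHERE  p.TaxRate <> r.ExpectedTaxRate
       OR  p.AddressLine1 IS NULL;            -- address details not completed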

The User can look at these (as well as any other row of data) and pick from a Yes/No item to select the row for holding in an "Error Pen".

We are going to be accessing the data via the Contributor Excel Add-In (because our users need to be able to Filter on the data, as well as use freeze pane etc.)

We are going to be using Contributor's Incremental Publish to allow near real-time reporting on these cubes in Cognos 8.

The Cognos 8 reports will allow Users to take all rows with errors and try and fix them in the Source system.

And it will also let them take details from all the OK (i.e. non-error) rows and push them down the line to the relevant people.

So basically we are using a combination of SQL and Contributor to meet our business needs.

There are still going to be some issues along the way, but this will be our approach....Hopefully it will work!


ykud

Quote from: gidster on 24 Aug 2007 10:16:05 AM
The Contributor Model will have a certain degree of error checking (e.g. is the tax rate set correctly for each different country, are the address details completed, etc.).  If these parameters are off, a "Check Me!" error flag will appear.

Why not use validation rules in EP 8.2?

ovo

Also, if you are building DTS jobs, do not create a flat file; populate the IM_ tables directly.  This will save both time and administrative overhead.

gidster

OK, so now we have had to make a serious change to our plans: basically, even though the total cells per E-List in our single, two-dimensional cube were less than 500,000, performance was so slow that we had to rethink (it was taking over 4 minutes to open up an E-List, and pretty much the same 4 minutes to mark an entry as an error!)

So our new plan is this:
Run a Cognos 8 report on top of the data in the SQL database.  As per some of the earlier suggestions, we are using the SQL database to build in the logic (e.g. is the tax 5%? If not, mark the row as a "CheckMe", etc.).
We have then built some custom JavaScript (using a Cognos 8 HTML item) to add a checkbox to the report.
If the SQL logic has marked the row as a "CheckMe", the tickbox is marked with a cross.
Now, when a User runs the report it brings back the appropriate rows along with corresponding checkboxes.
All the User needs to do is review the row(s) of data and tick or untick the checkbox as they desire.
Whenever the User ticks or unticks the checkbox, the JavaScript fires off a SQL stored procedure to update the SQL database (see the sketch at the end of this post).
This is achieved using the SQL Add-In on Microsoft's Web Services.
The response is as good as instant.
And we take Contributor's performance totally out of the equation!
In fact, this solution is actually better than the one we were originally going with, so all in all, a good result!
Of course, seeing as we are using Cognos 8 to actually update our database, there is a risk that upgrading might break this process... but hopefully not!
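
For anyone curious, the stored procedure behind the checkbox only needs to be something very simple along these lines (a sketch - the table, column and parameter names are invented, and the web-service plumbing that calls it from the report's JavaScript is not shown):

    -- Sketch only: names are hypothetical.
    CREATE PROCEDURE dbo.usp_SetErrorPenFlag
        @PolicyNumber varchar(20),
        @InErrorPen   char(1)   -- 'Y' when the user ticks the box, 'N' when unticked
    AS
    BEGIN
        SET NOCOUNT ON;
        UPDATE dbo.PolicyTransactions
        SET    InErrorPen = @InErrorPen
        WHERE  PolicyNumber = @PolicyNumber;
    END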
Thanks again to everyone for contributing!

incognito

I loved the idea of embedding an update procedure in a Cognos 8 report.

Is it possible to share this code?


gidster

In terms of actually sharing the code, I am not sure I even know how to do that... I just explained what I wanted to the IT gurus, and the parts I understand are what I have explained above.  But hopefully, now that others know you can actually enable Report Studio reports to update data automatically via a stored procedure, maybe they will do it!  Regards, and thanks again.