Split CSV into multiple lines

msawyer · 15 Nov 2013 12:05:52 PM

I am working on a report in Report Studio and need to split a field where the values are separated by a comma (ex: S,M,L,XL). I need to split these values into their own separate lines. I also have other fields in the report.

Current data looks like this:

Style# Color Sizes
123456 123 S,M,L,XL

I need my data to look like this:

Style# Color Sizes
123456 123 S
123456 123 M
123456 123 L
123456 123 XL

bdbits · 15 Nov 2013 04:46:36 PM

If your data is dimensional, or even in a star schema, the best answer is to have your ETL developer or package modeler split that into a dimension, or more likely a level in a dimension, for you. That is what it actually appears to be when I look at your example.

If it is relational, depending on the database you could probably use pass-through SQL and use a pivot table. A stored procedure could also be an option.

Cognos report expressions just don't really offer a lot in terms of parsing strings. There are functions to split strings into arrays, but I am not really sure how you could reference the individual elements of the array. If you could find a way to do that, you could create rows in the result set with a data item that represents the Nth array item, and filter out those rows that do not have an element in that position (they would be null or empty strings). In any case, if you are dealing with larger data sets the performance would likely not be very good with a lot of manipulation like this.

The best answer really is to split it out before it gets to Cognos, in my opinion.

TheCognosDave · 18 Nov 2013 09:02:30 AM

There might be a simpler option ... what database are you reporting against ?

msawyer · 18 Nov 2013 10:14:45 AM

I believe the database we are using is Teradata.

TheCognosDave · 18 Nov 2013 11:11:55 AM

I think you can do this using SUBSTRING in Teradata SQL with something like this....

select substring ('S,M,L,XL' from 0 for position (',' in 'S,M,L,XL'))
,substring ('S,M,L,XL' from position(',' in 'S,M,L,XL') +1 for position (',' in 'S,M,L,XL')-1)

and so on...

Lynn · 18 Nov 2013 11:52:13 AM

It is possible, although tedious, to parse out the individual elements as separate columns on the *same* row. When you have a variable number of items to be parsed it becomes even uglier.

Transforming one row into multiple rows gets a bit more involved and performance issues are likely to arrive on the scene.

I'm with bdbits on this one. Let the database do the heavy lifting.

MFGF · 19 Nov 2013 11:35:42 AM

I agree. Having "multiple values" within a single data item on a row doesn't even conform to First Normal Form. Pretty much every mainstream BI reporting tool is geared up to generate SQL queries in a relational database, and expects data to conform to at least Second Normal Form. Trying to work around these kinds of data structure issues in each report (or in the metadata) is really ugly and hugely inefficient - SQL is not designed to do this easily. The absolute best way to resolve it is to normalize the data and (if you have chance) deliver it into star schemas.

Cheers!

MF.

BSP Software	Resources	About Us
MetaManager	BSP Software Training	BSP Software
Integrated Control Suite	YouTube Channel	Micro Strategies Inc
Security Migration		IBM Cognos
Integrated Management Suite

If you are unable to create a new account, please email support@bspsoftware.com

COGNOiSe.com - The IBM Cognos Community

News:

Split CSV into multiple lines

msawyer

bdbits

TheCognosDave

msawyer

TheCognosDave

Lynn

MFGF