If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Filtering out illegal characters

Started by grifter102, 12 Nov 2012 02:19:00 PM

Previous topic - Next topic

grifter102

Is there any way to filter out illegal characters from a given dimension in Cognos?  The database data I'm working will is causing the "Illegal Character (CTRL-CHAR 26)" error to show up whenever I run the report.  I've talked to the DBAs and they say there's nothing they can do about the bad characters - they will not be able to filter them at a database level.  I've tried filtering them in the SQL, but the DB2 database I'm using has the CHR() function disabled (so I can't even identify which lines have the control character in them).

I need some way of doing this within Cognos... does it even exist?  I'd like to do something like:

filter([Vendor Name], not caption([Vendor Name]) contains \x26)

but I'm not sure how to identify the control character with Cognos (it doesn't like the regex \x26).  Is this possible?

Lynn

I'm not certain that I believe the DBAs can't do anything about those characters. Is your funky character always at the end or embedded anywhere within? If it is always at the end, you can try the example below.

It uses a substring to take position 1 of the string up through the end of the string excluding any bogus characters at the end. I had this situation due to carriage return/line feed in the database column.

The length of the substring is the total length of the field minus the length of the field after all valid characters are replaced with a space and then trimmed. Sneaky, huh?

For example, let's pretend the "~" is a control code character. If the field contained ABC~~ then the length would be calculated as 5. The translate would replace the ABC with three spaces, but leave the ~~ unaltered. The trim function would remove all the spaces and the length of the result would be 2. Take the original length of 5 minus the 2 and your substring would yield positions 1 to 3, effectively stripping off the garbage at the end.

Of course if yours are interspersed this won't help you. Also, the range of legal characters in the translate function may be broader for you than mine below which includes only uppercase letters and numbers.


substr(
    [YourItem] ,
    1,
     length( rtrim ( [YourItem] )) -
     length( trim ( translate( upper( [YourItem] ),
          ' ',
           'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'))))

grifter102

That's a pretty slick way of handling the problem... unfortunately for me the characters are interspersed.  There are occasional issues with the system which loads in the names and sometimes unicode characters are recorded incorrectly, resulting in this issue.  I'm getting the data after it's passed through 4 different hands (source system stores the 'illegal' character then sends it to the purchasing system which records it in the business warehouse which a separate database ETLs into the backend for the cognos installation).  So, I spent several weeks going to each group... the cognos backend people say they can't perform the translate via their ETL since there's no function for it (translation: too much work and they don't want to), the business warehouse says they just store data, they don't transform it, the purchasing system says that there is no problem with the data and the source system says they just record whatever is entered.  /end rant

End result is that if I want to get my report working, I'm going to have to do this myself somehow.  I like your idea of translating out the characters... I might be able to work something out with that... but if that doesn't work... does anyone else know of a way to specifically identify these control characters in Cognos to do a direct filter?

Lynn

Maybe you can use the translate idea to identify the location of the bad characters by putting in a tilde or accent grave or some other rarely/never used character. Once you know where they are and how many you have to deal with you could use nested substrings. It would be really ugly to do, but without a CHR() function I'm not sure how else. The source that prompted my solution below is also DB2 and I struggled for quite a while before finding a way around it.

If you could figure out the logic then maybe you could get the lazy-a** DBAs to put it into a UDF or build a view so it could be modeled in FM rather than relying on the report to do all the heavy lifting.

I feel your pain with database guys and the various systems people. It never fails to amaze me that people responsible for a data warehouse fail to realize that it only exists to provide reporting and analysis. If they don't address issues that hinder (or could improve) reporting and analysis then what are they there for?

Good luck and post back if you come up with something further!

navissar

There may be a way, through some kind of exhausting manipulation which would affect query performance, to achieve this; you will be stuck with that problem for ages on end, whenever you will want to develop something new you'll have to do this all over again. This, exactly here, is how BI projects fail - looking for solutions on the wrong end. BI systems, and Cognos is no exception, display data. The data they display is the data that is stored in the DB. If that data is problematic, it should be handled in the DB layer.

grifter102

Thanks, all, for the ideas.  I'm going to take another run at the various DBAs now that I have confirmed there is no proper way to do this in Cognos.

Also, since this issue only occurs with some small suppliers, I think I'm going to use the substr/translate idea and convert everything non-standard to spaces, then take the initial substring of the first X space characters, then slap a big "-ERROR" on the end of the string.  Should at least temporarily solve the problem and when the users ask about it, I'll tell them it's bad data in the database and see if I can't drum up some ire for the lazy DBAs.