We have column that contains a text description and we want to filter on certain keywords, and a column which shows which keywords have been filtered (as not all keywords are in the same description).
I've tried several options with CASE, IF and IN_RANGE statements, but do not get the desired report.
Example
Keywords:
Test1
Fox
Cat
Dog
Test2
Text description:
The test1 module has been tested and shows good results on cats and dogs.
The example has 3 keywords, test1, cat and dog, which I try to filter with cognos queries. The other 2 keywords, fox and test2 are not in text description examples, but might appear in another text description.
Hope someone can help me out.
I think what you have shown are your two inputs: The list of keywords for the filter and a sample text description.
Please show the expected outcome. More sample text descriptions, at least one that contains none of the keywords, would also be helpful to better demonstrate the problem.
I want to find keyword(s) in the text value. Im looking for AND/OR solution that the query loops in 1 text value for all keywords and then goes to the next.
Query sample that im currently using:
[Dataitem].[textvalue] CONTAINS 'Test1' OR
[Dataitem].[textvalue] CONTAINS 'Fox' OR
[Dataitem].[textvalue] CONTAINS 'Cat' OR
Etc...
I found a solution in SQL with cross join but this one needs to be done in Cognos Analytics, which I not found a good solution.
Quote from: dougp on 26 Feb 2025 11:42:17 AMI think what you have shown are your two inputs: The list of keywords for the filter and a sample text description.
Please show the expected outcome. More sample text descriptions, at least one that contains none of the keywords, would also be helpful to better demonstrate the problem.
It is correct that I want 2 outputs. So one column that has the filtered text description and one column that shows the filtered words.
But I have the problem that the cognos filters do not loop in a text field. The filter searches one word at a time in the text description so you get double results.
Desired output:
The test1 module has been tested and shows good results on cats and dogs.
The module has been tested and shows good results on fox. |
|
That's better, but I'd still need more clarification.
"cats" and "dogs" are not in your list of search terms, but you indicate you want them found. How would that work?
Does the word need to be the first, last, succeeded by punctuation, or surrounded by spaces? (...meaning, it's the whole word.) What if it is found as part of a word?
description | keywords |
People in recovery groups can be dogmatic. | dog |
Cattle can be scatty. | cat, cat |
And are the terms case-sensitive?
Quote from: dougp on 05 Mar 2025 11:59:39 AMThat's better, but I'd still need more clarification.
"cats" and "dogs" are not in your list of search terms, but you indicate you want them found. How would that work?
Does the word need to be the first, last, succeeded by punctuation, or surrounded by spaces? (...meaning, it's the whole word.) What if it is found as part of a word?
description | keywords | People in recovery groups can be dogmatic. | dog | Cattle can be scatty. | cat, cat |
And are the terms case-sensitive?
|
Would like to find part of the words. So if the parameter is cat and test, it also finds cats, cattail and test01, test02, testing etc...
It's also necessary in our filter because the input of the text description is not always grammatically correct and therefore we need to filter on multiple parameters in 1 text description field.
It does not have to be case sensitive.
Have tried the IF ELSE THEN statement in the Data Item Text description, but not did not give a result.
IF (COLUMN TEXT DESCRIPTION CONTAINS 'Test1') THEN ('Test1')
ELSE (IF (COLUMN TEXT DESCRIPTION CONTAINS 'Fox') THEN ('Fox')
ELSE (IF (COLUMN TEXT DESCRIPTION CONTAINS 'Cat') THEN ('Cat')
ELSE (IF (COLUMN TEXT DESCRIPTION CONTAINS 'Dog') THEN ('Dog')
ELSE (IF (COLUMN TEXT DESCRIPTION CONTAINS 'Test2') THEN ('Test2')
ELSE (NULL)))))
Specific to my last post, should the same keyword be listed twice?
Quotedescription | keywords |
People in recovery groups can be dogmatic. | dog |
Cattle can be scatty. | cat, cat |
|
Is "cat, cat" correct, or should it just be "cat"?
You didn't say which RDBMS you are using. If you can have SQL do the work for you, here's an example for SQL Server 2016 and later:
with
searchTerms as (
select *
from (
values
('Test1')
, ('Fox')
, ('Cat')
, ('Dog')
, ('Test2')
) q (Term)
),
input as (
select *
from (
values
('The test1 module has been tested and shows good results on cats and dogs.')
, ('The module has been tested and shows good results on fox.')
, ('People in recovery groups can be dogmatic.')
, ('Cattle can be scatty.')
) q ([Text description])
),
a as (
select
i.[Text description]
, t.Term
, (len(i.[Text description]) - len(replace(i.[Text description], t.Term, ''))) / len(t.Term) as TermCount
, len(t.Term) as TermLen
from searchTerms t
inner join input i on i.[Text description] like '%' + t.Term + '%'
)
select
a.[Text description]
, string_agg(left(replicate(a.Term + ', ', a.TermCount), (a.TermLen + 2) * a.TermCount - 2), ', ') as 'Keywords Found'
from a
group by
a.[Text description]
order by 1
If it matters what order the search terms appear in the output, STRING_AGG has an ORDER BY clause in 2017 and later
If STRING_AGG is not available, or if you are using SQL Server 2016 and need the search terms in a specific order, you can use STUFF:
with
searchTerms as (
select *
from (
values
('Test1')
, ('Fox')
, ('Cat')
, ('Dog')
, ('Test2')
) q (Term)
),
input as (
select *
from (
values
('The test1 module has been tested and shows good results on cats and dogs.')
, ('The module has been tested and shows good results on fox.')
, ('People in recovery groups can be dogmatic.')
, ('Cattle can be scatty.')
) q ([Text description])
),
a as (
select
i.[Text description]
, t.Term
, (len(i.[Text description]) - len(replace(i.[Text description], t.Term, ''))) / len(t.Term) as TermCount
, len(t.Term) as TermLen
from searchTerms t
inner join input i on i.[Text description] like '%' + t.Term + '%'
),
b as (
select
a.[Text description]
, left(replicate(a.Term + ', ', a.TermCount), (a.TermLen + 2) * a.TermCount - 2) as Keyword
from a
)
select distinct t1.[Text description]
, STUFF(
(
SELECT ', ' + t2.Keyword
from b t2
where t1.[Text description] = t2.[Text description]
order by t2.Keyword
FOR XML PATH('')
)
, 1
, 2
, ''
) data
from b t1;
To do it natively in Cognos, STRING_AGG and STUFF are not available. You'll need to have the visualization perform the work using a repeater:
<report xmlns="http://developer.cognos.com/schemas/report/16.2/" useStyleVersion="11.6" expressionLocale="en-us">
<drillBehavior/>
<layouts>
<layout>
<reportPages>
<page name="Page1">
<style>
<defaultStyles>
<defaultStyle refStyle="pg"/>
</defaultStyles>
</style>
<pageBody>
<style>
<defaultStyles>
<defaultStyle refStyle="pb"/>
</defaultStyles>
</style>
<contents>
<list horizontalPagination="true" name="List1" refQuery="Input">
<noDataHandler>
<contents>
<block>
<contents>
<textItem>
<dataSource>
<staticValue>No Data Available</staticValue>
</dataSource>
</textItem>
</contents>
<style>
<CSS value="padding:16px;"/>
</style>
</block>
</contents>
</noDataHandler>
<style>
<CSS value="border-collapse:collapse"/>
<defaultStyles>
<defaultStyle refStyle="ls"/>
</defaultStyles>
</style>
<listColumns>
<listColumn>
<listColumnTitle>
<style>
<defaultStyles>
<defaultStyle refStyle="lt"/>
</defaultStyles>
</style>
<contents>
<textItem>
<dataSource>
<dataItemLabel refDataItem="Text description"/>
</dataSource>
</textItem>
</contents>
</listColumnTitle>
<listColumnBody>
<style>
<defaultStyles>
<defaultStyle refStyle="lc"/>
</defaultStyles>
</style>
<contents>
<textItem>
<dataSource>
<dataItemValue refDataItem="Text description"/>
</dataSource>
</textItem>
</contents>
</listColumnBody>
</listColumn>
<listColumn>
<listColumnTitle>
<style>
<defaultStyles>
<defaultStyle refStyle="lt"/>
</defaultStyles>
</style>
<contents/>
</listColumnTitle>
<listColumnBody>
<style>
<defaultStyles>
<defaultStyle refStyle="lc"/>
</defaultStyles>
</style>
<contents>
<repeater name="Repeater1" refQuery="Query1">
<contents>
<textItem>
<dataSource>
<dataItemValue refDataItem="Terms"/>
</dataSource>
</textItem>
<textItem>
<dataSource>
<dataItemValue refDataItem="comma"/>
</dataSource>
</textItem>
</contents>
<masterDetailLinks>
<masterDetailLink>
<masterContext>
<dataItemContext refDataItem="Text description"/>
</masterContext>
<detailContext>
<dataItemContext refDataItem="Text description"/>
</detailContext>
</masterDetailLink>
</masterDetailLinks>
</repeater>
</contents>
</listColumnBody>
</listColumn>
</listColumns>
<sortList>
<sortItem refDataItem="Text description"/>
</sortList>
</list>
</contents>
</pageBody>
<XMLAttributes>
<XMLAttribute output="no" name="RS_legacyDrillDown" value="0"/>
</XMLAttributes>
</page>
</reportPages>
</layout>
</layouts>
<XMLAttributes>
<XMLAttribute output="no" name="RS_CreateExtendedDataItems" value="true"/>
<XMLAttribute output="no" name="listSeparator" value=","/>
<XMLAttribute output="no" name="decimalSeparator" value="."/>
<XMLAttribute output="no" name="RS_modelModificationTime" value="2015-11-25T21:38:24.820Z"/>
</XMLAttributes>
<queries>
<query name="SearchTerms">
<source>
<sqlQuery name="SQL1" dataSource="great_outdoors_sales">
<sqlText> select *
from (
values
('Test1')
, ('Fox')
, ('Cat')
, ('Dog')
, ('Test2')
) q (Term)</sqlText>
<mdProjectedItems>
<mdProjectedItem name="Term"/>
</mdProjectedItems>
</sqlQuery>
</source>
<selection>
<dataItem name="Term" aggregate="none" rollupAggregate="none">
<expression>[SQL1].[Term]</expression>
</dataItem>
</selection>
</query>
<query name="Input">
<source>
<sqlQuery name="SQL2" dataSource="great_outdoors_sales">
<sqlText>select *
from (
values
('The test1 module has been tested and shows good results on cats and dogs.')
, ('The module has been tested and shows good results on fox.')
, ('People in recovery groups can be dogmatic.')
, ('Cattle can be scatty.')
) q ([Text description])</sqlText>
<mdProjectedItems>
<mdProjectedItem name="Text description"/>
</mdProjectedItems>
</sqlQuery>
</source>
<selection>
<dataItem name="Text description" aggregate="none" rollupAggregate="none">
<expression>[SQL2].[Text description]</expression>
<XMLAttributes>
<XMLAttribute output="no" name="RS_dataType" value="3"/>
<XMLAttribute output="no" name="RS_dataUsage" value=""/>
</XMLAttributes>
</dataItem>
</selection>
</query>
<query name="a">
<source>
<joinOperation>
<joinOperands>
<joinOperand cardinality="0:1">
<queryRef refQuery="SearchTerms"/>
</joinOperand>
<joinOperand cardinality="1:1">
<queryRef refQuery="Input"/>
</joinOperand>
</joinOperands>
<joinFilter>
<filterExpression>[Input].[Text description] contains [SearchTerms].[Term]</filterExpression>
</joinFilter>
</joinOperation>
</source>
<selection>
<dataItem aggregate="none" rollupAggregate="none" name="Text description">
<expression>[Input].[Text description]</expression>
</dataItem>
<dataItem aggregate="none" rollupAggregate="none" name="Term">
<expression>[SearchTerms].[Term]</expression>
</dataItem>
<dataItem name="TermCount" aggregate="none" rollupAggregate="none">
<expression>(len([Input].[Text description]) - len(replace([Input].[Text description], [SearchTerms].[Term], ''))) / len([SearchTerms].[Term])</expression>
</dataItem>
<dataItem name="TermLen" aggregate="none" rollupAggregate="none">
<expression>len([SearchTerms].[Term])</expression>
</dataItem>
</selection>
</query>
<query name="Query1">
<source>
<queryRef refQuery="a"/>
</source>
<selection>
<dataItem aggregate="none" rollupAggregate="none" name="Text description" sort="ascending">
<expression>[a].[Text description]</expression>
<XMLAttributes>
<XMLAttribute output="no" name="RS_dataType" value="3"/>
<XMLAttribute output="no" name="RS_dataUsage" value=""/>
</XMLAttributes>
</dataItem>
<dataItem name="Terms" aggregate="none" rollupAggregate="none" sort="ascending">
<expression>left(replicate([a].[Term] + ', ', [a].[TermCount]), ([a].[TermLen] + 2) * [a].[TermCount] - 2)</expression>
<XMLAttributes>
<XMLAttribute output="no" name="RS_dataType" value="3"/>
<XMLAttribute output="no" name="RS_dataUsage" value=""/>
</XMLAttributes>
</dataItem>
<dataItem name="comma">
<expression>case running-count([Terms] for [Text description])
when count([Terms] for [Text description])
then ''
else ', '
end</expression>
<XMLAttributes>
<XMLAttribute output="no" name="RS_dataType" value="3"/>
<XMLAttribute output="no" name="RS_dataUsage" value=""/>
</XMLAttributes>
</dataItem>
</selection>
</query>
</queries>
<modelPath>/content/folder[@name='Samples']/folder[@name='Models']/package[@name='GO sales (query)']/model[@name='model']</modelPath>
<reportName>warm</reportName>
</report>