
Data sets and limitations

Started by HalfBloodPrince, 12 Nov 2018 12:42:27 AM


HalfBloodPrince

Hi Experts,

I was reading about the new data sets feature in Cognos 11. Could someone help me with the points below?

1] Can a data set pull data from multiple sources (multiple packages, data sources)?
2] Is there any limit on the size of the data that can be stored in one data set?
3] Where on the server are the data set values stored, and what impact will very large data sets have on performance?

Thanks in advance

MFGF

Quote from: HalfBloodPrince on 12 Nov 2018 12:42:27 AM

Hi,

If it helps you understand data sets, think of them as simply a way of pulling data out of any supported connection(s) and loading the data into CA's columnar data store.

1. Yes, provided you have a single package or data module that spans all the data you need. You will be building a list object to extract the data, so the underlying data sources/packages will need to be joined in the metadata.
2. Don't treat the columnar data store as a replacement for your production data warehouse; it works best with small-to-medium data sets. The administrator can set limits on the size of uploaded files, but I'm not sure whether those limits apply when creating data sets, so my advice is to use common sense and not try to cram the world's supply of data in there :)
3. The data is stored in the content store database. When you need access to it, it is loaded into Apache Parquet files on the Cognos server's filesystem, and from there into memory on that server. You can imagine what trying to load terabytes of data into memory might do to your server :)
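If you want a feel for that decompression effect, here's a rough back-of-the-envelope sketch in plain Python with pandas and pyarrow (nothing Cognos-specific; the column names and sizes are made up for illustration):

import os
import numpy as np
import pandas as pd

# Build a sample frame roughly the shape of a data set extract.
# (Hypothetical columns; just to compare on-disk vs in-memory size.)
n = 1_000_000
df = pd.DataFrame({
    "order_id": np.arange(n),
    "region": np.random.choice(["EMEA", "AMER", "APAC"], size=n),
    "revenue": np.random.rand(n) * 1000,
})

df.to_parquet("sample.parquet")  # compressed columnar file on disk

on_disk_mb = os.path.getsize("sample.parquet") / 1e6
in_memory_mb = df.memory_usage(deep=True).sum() / 1e6
print(f"Parquet on disk: {on_disk_mb:.1f} MB, decompressed in memory: {in_memory_mb:.1f} MB")

The in-memory figure is typically several times the compressed on-disk figure, which is why very large data sets hurt.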

Cheers!

MF.
Meep!

HalfBloodPrince

It's very helpful. Thanks a lot, MF ;D

the6campbells

Data sets and file uploads both produce a binary columnar file format which is highly compressed (Apache Parquet).

If a user attempts to upload a file (which can be a zip in 11.1), the upload will be rejected if the file's size in bytes exceeds the limit set by the administrator, or if it would exceed the total amount of space their set of uploaded files is allowed to occupy.
Data sets do not impose those limits.
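Conceptually, that check works something like the sketch below (plain Python, purely illustrative; Cognos enforces this server-side, and the limit, quota, and file name here are made up):

import os

PER_FILE_LIMIT_MB = 100   # made-up admin per-file limit
TOTAL_QUOTA_MB = 500      # made-up per-user space quota
already_used_mb = 320     # made-up current usage

size_mb = os.path.getsize("extract.zip") / 1e6
if size_mb > PER_FILE_LIMIT_MB or already_used_mb + size_mb > TOTAL_QUOTA_MB:
    print("Upload would be rejected")
else:
    print("Upload would be accepted")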

As a very simple rule of thumb, the total size in bytes of a Parquet file will be similar to or smaller than, say, a 7-Zip archive of the raw CSV file.
Bottom line: subject to the number of columns, their precision, and the degree of compression, a single file can hold many rows of data.
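If you want to sanity-check that rule of thumb on your own data, a quick sketch in plain Python with pandas and pyarrow (nothing Cognos-specific; the file names and columns are made up):

import os
import numpy as np
import pandas as pd

# Write the same data as raw CSV and as Parquet, then compare file sizes.
n = 500_000
df = pd.DataFrame({
    "id": np.arange(n),
    "category": np.random.choice(list("ABCDE"), size=n),
    "value": np.random.rand(n),
})

df.to_csv("raw.csv", index=False)
df.to_parquet("data.parquet")  # Parquet applies columnar compression by default

csv_mb = os.path.getsize("raw.csv") / 1e6
pq_mb = os.path.getsize("data.parquet") / 1e6
print(f"CSV: {csv_mb:.1f} MB, Parquet: {pq_mb:.1f} MB ({pq_mb / csv_mb:.0%} of CSV)")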

Don't upload columns you will never use; it wastes time producing the file, wastes space, etc. (there's a column-trimming sketch below).
Avoid trying to stuff hundreds of columns into a row in a data set.
And while you can stuff several million rows into a single data set, you should evaluate what you are trying to achieve.
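Trimming unused columns before upload is cheap. A minimal sketch in pandas, assuming a hypothetical full_extract.csv and made-up column names:

import pandas as pd

# Keep only the columns you actually plan to use, then save the
# slimmed-down file for upload; unused columns are never read.
keep = ["order_id", "order_date", "region", "revenue"]
df = pd.read_csv("full_extract.csv", usecols=keep)
df.to_csv("upload_ready.csv", index=False)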

See customer docs for other pointers.