Accelerate DataStage (Sorts & Transforms) 
Sort Plug-In, Faster Transforms, and Safe Test Data

Challenges:
Large data volumes (e.g., more than one million rows) can be slow to transform, even after consulting and tuning. Particular bottlenecks are large sorts, joins, aggregations, loads, and sometimes unloads. Parallelizing or optimizing in other layers or tools can be unwieldy, if not expensive, and may degrade performance for other users.

Solutions:

1) CoSort Sort Stage Plug-In for DataStage
Speed sorting directly within DataStage Server Edition with CoSort's unique Sort Stage Plug-In for DataStage. This can improve sort performance by up to 10X with no interface changes. Subsequent join, aggregation, and load runtimes should also benefit.

2) Fast Transformations alongside DataStage
By running the CoSort product's Sort Control Language (SortCL) program in the file system alongside IBM WebSphere DataStage Server or Enterprise Edition, you can perform fast sorts, joins, and aggregations, all in the same job script and I/O pass. While running large data transformation tasks in parallel, you can also specify file-format and data-type conversions, field-level encryption and other data privacy functions, custom reports, and pre-sorted load files.

If you still wish to use the aggregation stage in DataStage, CoSort can help you improve its performance. Add a sequential file stage prior to the aggregation stage, and run a SortCL script to externally pre-sort the file on break keys. Then, define the sorted fields in the aggregation stage.
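To see why pre-sorting on break keys helps an aggregation step, consider the minimal Python sketch below. It is an illustration of the general technique only, not SortCL syntax or a DataStage stage: the function names `presort` and `aggregate_sorted`, the sample fields `region` and `amount`, and the in-memory sample data are all hypothetical. Once rows arrive ordered by the break key, each group can be totaled and released in a single pass, without holding every group in memory at once.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def presort(rows, key_fields):
    """Sort rows on the break keys (a stand-in for the external pre-sort)."""
    return sorted(rows, key=itemgetter(*key_fields))


def aggregate_sorted(rows, key_fields, value_field):
    """Sum value_field per break-key group in one pass over pre-sorted rows."""
    key_of = itemgetter(*key_fields)
    for key, group in groupby(rows, key=key_of):
        # Each group is contiguous because the input is already sorted,
        # so its total can be emitted as soon as the key changes.
        yield key, sum(float(row[value_field]) for row in group)


# Hypothetical sample data standing in for a sequential file.
SAMPLE = "region,amount\nwest,10\neast,5\nwest,2.5\neast,1\n"

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
totals = list(aggregate_sorted(presort(rows, ["region"]), ["region"], "amount"))
print(totals)  # [('east', 6.0), ('west', 12.5)]
```

If the rows were not pre-sorted, the same `groupby` pass would emit a separate partial total each time the key changed, which is why the sorted fields need to be declared to the aggregation step.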

3) Safe Test Data for DataStage
IRI's RowGen package can generate safe, realistic test data against CoSort metadata, .dsx-defined files, and your RDB data models. RowGen users can create intelligent test data from random computation and/or set-file selection, and they can further format that data with the same data manipulation and formatting capabilities found in CoSort.
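As a rough picture of what "random computation and/or set-file selection" means, here is a short Python sketch. It does not use RowGen or its syntax; the function `make_test_rows`, the field names, and the value range are hypothetical, and the list of names stands in for a set file that would normally be read from disk, one value per line.

```python
import random


def make_test_rows(n, names, seed=0):
    """Build n synthetic rows; no production data is read or copied."""
    rng = random.Random(seed)  # a fixed seed makes the output repeatable
    rows = []
    for i in range(1, n + 1):
        rows.append({
            "id": i,
            # Set-file selection: pick a value from a predefined list.
            "name": rng.choice(names),
            # Random computation: generate a value within a chosen range.
            "amount": round(rng.uniform(1.0, 500.0), 2),
        })
    return rows


sample_rows = make_test_rows(5, ["Ada", "Grace", "Linus"], seed=42)
for row in sample_rows:
    print(row)
```

Seeding the generator is a design choice for repeatable test runs; dropping the seed would give different rows on each run.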

To facilitate CoSort operations alongside DataStage, as well as the creation of realistic test data for DataStage, Meta Integration Technology's Model Bridge (MIMB) software can create SortCL and RowGen data definition files from the flat-file layouts you have already defined in .dsx format. This saves you from having to manually re-write all your input and output file field layouts, making it easier to run these tools with DataStage!

See also:
FAQ > DataStage
Solutions > Data Transformation

Solutions > Field Protection
Solutions > Business Intelligence
Products > CoSort > SortCL
Products > CoSort > SortCL Metadata
Products > RowGen (Test Data)
Meta Integration Model Bridge
