Part of the governance role in protecting data at risk is assessing, and
avoiding, the use of unsafe production data for prototyping. Realistic
test data is needed for:
• application stress and range-testing
• outsourced program development
• simulated database populations
• hardware and software benchmarking
• sharing custom file and report formats
Very often, those who need test data simply replicate the production database
or use snippets of real production files. Both practices are inherently
risky because they can expose sensitive, private data.
Meanwhile, generating safe, useful test data can be difficult because:
• integrity constraints cannot easily be preserved
• generated data is usually not in production file or report layouts
• the data often must be filtered and transformed as it is generated
• custom programs are hard to write or maintain
• most test data tools are slow or limited in scope
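To make the first difficulty concrete, the sketch below shows one way to keep
a foreign-key relationship intact while generating synthetic rows. It is a
generic Python illustration, not RowGen syntax; the table names, column names,
key ranges, and the function generate_test_tables are all hypothetical.

```python
import random


def generate_test_tables(n_customers, n_orders, seed=0):
    """Build synthetic parent and child rows with a valid foreign key.

    All names and value ranges here are illustrative assumptions.
    n_customers must be at least 1 so that orders have a parent to reference.
    """
    rng = random.Random(seed)  # seeded so test runs are repeatable

    # Parent table: primary keys are generated first.
    customers = [
        {"customer_id": 1000 + i, "name": f"Customer{i:04d}"}
        for i in range(n_customers)
    ]
    customer_ids = [c["customer_id"] for c in customers]

    # Child table: each foreign key is drawn only from existing parent keys,
    # which is what preserves referential integrity.
    orders = [
        {
            "order_id": 50000 + j,
            "customer_id": rng.choice(customer_ids),
            "amount": round(rng.uniform(1.0, 999.99), 2),
        }
        for j in range(n_orders)
    ]
    return customers, orders


customers, orders = generate_test_tables(n_customers=5, n_orders=20)
```

Because child keys are sampled from the parent key list rather than generated
independently, no order can point to a customer that does not exist, and no
real customer data is involved at any step.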
Solutions:
You can now recommend and use a tool that generates safe, referentially
correct test data in your database table formats and custom file formats for
all of the above uses. Specifically, IRI's test data generation package,
RowGen, can generate (and transform) intelligent test data in the same table,
report, and production file formats you need to load or to send outside the
firewall.
Any number, size, type, and position of fields, records, or files can
be specified in a single job script and I/O pass. Various selection and
ranging criteria can be applied to synthesize data and stress-test applications.
With RowGen, you can simulate sensitive database and file data, and assure
your risk management officials that the data are safe. RowGen's logging and audit
functions are also available to help you verify compliance with privacy
rules.
Common Metadata
If you also use the SortCL program (in IRI's CoSort package) to define and
manipulate production file layouts, you can use the same metadata to create
safe test data in the same formats using RowGen. Conversely, the metadata you
use to define test data in RowGen can be migrated to SortCL to transform real
data in the same format(s) when it becomes available.
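
The idea of shared metadata can be sketched generically: a single layout
definition drives both the generation of synthetic records and the parsing of
records in that format. The Python below is only an illustration of that
principle and does not reproduce RowGen or SortCL syntax; the LAYOUT structure,
its field names and widths, and both functions are hypothetical.

```python
import random

# Hypothetical fixed-width layout: (field name, width, kind of content).
LAYOUT = [
    ("account", 8, "digits"),
    ("surname", 12, "alpha"),
    ("balance", 9, "amount"),
]


def make_record(layout, rng):
    """Generate one synthetic fixed-width record that follows the layout."""
    parts = []
    for _name, width, kind in layout:
        if kind == "digits":
            value = "".join(rng.choice("0123456789") for _ in range(width))
        elif kind == "alpha":
            length = rng.randint(3, width)
            value = "".join(
                rng.choice("ABCDEFGHIJKLMNOPQRSTUVWXYZ") for _ in range(length)
            )
        else:  # "amount": a non-negative value with two decimal places
            value = f"{rng.uniform(0, 99999.99):.2f}"
        # Pad or truncate so every field occupies exactly its declared width.
        parts.append(value.ljust(width)[:width])
    return "".join(parts)


def parse_record(layout, line):
    """Split a fixed-width record into named fields using the same layout."""
    fields, pos = {}, 0
    for name, width, _kind in layout:
        fields[name] = line[pos:pos + width].strip()
        pos += width
    return fields


rng = random.Random(1)
record = make_record(LAYOUT, rng)
fields = parse_record(LAYOUT, record)
```

Since both functions read the same LAYOUT, code written against the synthetic
records should accept real records in that layout without changes to the
field definitions.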