Excerpts from the IRI White Paper:
Safe, Realistic Test Data: The Case for RowGen
1) New Applications Being Developed; Data Needed for User Acceptance Testing and Benchmarking
In one scenario encountered by a RowGen user, a high volume of test data was needed to support a new application. In this instance, there simply was no production data to draw from for testing. With the large volume needed to test the boundaries of operations, the program manager estimated that the project would require roughly 120 hours of development time to build the test data set.
Using a conservative estimate of $100 per development hour, he was looking at an unplanned cost of $12,000 against the fixed-price project budget. This is on top of the delay incurred by reallocating a developer who was providing code for the primary project deliverables.
By bringing in RowGen, connecting it to the model repository, and setting up and running a script, the manager realized the following benefits:
114 development hours saved, which equates to $11,400 in cost avoided
Nearly three (3) weeks of project schedule reclaimed
A repeatable process to create test data that will follow future data model revisions
A method to generate new test data for future releases to the application being developed
Error-free test data
Reduced project risk
High initial ROI for the RowGen application
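The cost figures in this scenario can be checked with a few lines of arithmetic. The sketch below is illustrative only: the hour counts and hourly rate come from the text above, while the 40-hour work week used to convert saved hours into schedule weeks is an assumption that the excerpt does not state.

```python
# Figures quoted in the scenario above.
ESTIMATED_HOURS = 120   # manual effort estimated by the program manager
HOURS_SAVED = 114       # development hours reported as saved
RATE_USD = 100          # conservative cost per development hour

# Assumption (not stated in the excerpt): one developer working 40 hours a week.
HOURS_PER_WEEK = 40


def cost(hours: int, rate_usd: int) -> int:
    """Return the cost in US dollars of a number of development hours."""
    return hours * rate_usd


unplanned_cost = cost(ESTIMATED_HOURS, RATE_USD)     # cost of building the data by hand
cost_avoided = cost(HOURS_SAVED, RATE_USD)           # cost avoided with RowGen
hours_still_spent = ESTIMATED_HOURS - HOURS_SAVED    # effort that remained
weeks_reclaimed = HOURS_SAVED / HOURS_PER_WEEK       # schedule reclaimed, in weeks

print(f"Unplanned cost:    ${unplanned_cost:,}")
print(f"Cost avoided:      ${cost_avoided:,}")
print(f"Hours still spent: {hours_still_spent}")
print(f"Weeks reclaimed:   {weeks_reclaimed:.2f}")
```

Under the 40-hour assumption, 114 saved hours come to just under three weeks, which is consistent with the schedule benefit listed above.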
2) Safe, Referentially Correct Data Needed for Integration Testing
While working on a systems integration project, a technology lifecycle consultancy called FlexITy Solutions required massive quantities of referentially correct test data. The project required the operation of a transaction database and the transmission of query results to validate the systems being installed. FlexITy's client, a financial institution, could not turn over any production data for testing due to regulatory and privacy concerns. FlexITy had to produce and incorporate test data that permitted queries against a representative database, and to transmit those results over a long-haul connection. RowGen was the logical fit for these requirements.
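To make "referentially correct" concrete: every foreign key in a child table must point to a row that exists in its parent table. The following is a minimal sketch of that property in Python. It is not RowGen's method or syntax; the table names (accounts, transactions), the column names, and the helper functions are all hypothetical and chosen only for illustration.

```python
import random


def generate_accounts(count, rng):
    """Build a hypothetical parent table with sequential primary keys."""
    return [
        {"account_id": i, "branch": rng.choice(["N", "S", "E", "W"])}
        for i in range(1, count + 1)
    ]


def generate_transactions(count, accounts, rng):
    """Build a hypothetical child table whose foreign keys are drawn
    only from the primary keys that exist in the parent table."""
    valid_ids = [row["account_id"] for row in accounts]
    return [
        {
            "txn_id": i,
            "account_id": rng.choice(valid_ids),   # foreign key into accounts
            "amount_cents": rng.randint(1, 500_000),
        }
        for i in range(1, count + 1)
    ]


def orphaned_rows(children, parents):
    """Return child rows whose foreign key matches no parent row."""
    parent_ids = {row["account_id"] for row in parents}
    return [row for row in children if row["account_id"] not in parent_ids]


rng = random.Random(42)  # fixed seed so the generated data is repeatable
accounts = generate_accounts(100, rng)
transactions = generate_transactions(1000, accounts, rng)

# Referential integrity holds when no child row is orphaned.
orphans = orphaned_rows(transactions, accounts)
print(f"{len(transactions)} transactions, {len(orphans)} orphaned")
```

Drawing foreign keys only from keys that were already generated is the simplest way to guarantee the property. A hand-written script has to maintain this ordering across every table in the model, which is part of why the manual effort discussed in this scenario is costly.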
When the Solution Architect, Technical Lead, and Project Manager confronted the problem, they considered the same alternatives presented here. They estimated that 200 man-hours of development time would be required to build the scripts necessary to populate the tables, and that this work would be hard to apply to future projects because it would be only a point solution. The time saved by using RowGen would provide a competitive advantage, and RowGen would be reusable on future projects. The decision was therefore made to move forward with RowGen.
The server used for the effort had four (4) CPUs and 8 GB of RAM. The data model the team decided upon contained 16 tables with referential integrity. When the RowGen job launched, the team was impressed to find that the entire suite of test data had been generated after four hours. They now had 33 million rows of data spanning 20 GB of storage. In just over 16 hours, compared with the 200 hours originally estimated, they saved a potential 184 hours of development time. The 184 man-hours that were saved were reallocated to other areas. From a financial perspective, those hours represented significant savings for the project budget. Total savings, with the cost of RowGen factored in, were estimated at US$20,000, based on a fully loaded rate of US$150 per hour. Whether applied to the project margin or reallocated to bolster efforts in other areas, the savings represent a significant benefit.
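The figures in this scenario can be cross-checked in the same way. The sketch below uses only numbers quoted above, with two labeled assumptions: that "GB" means decimal gigabytes (10^9 bytes), and that the gap between the gross labor savings and the reported net figure reflects the RowGen cost and any rounding. The excerpt does not itemize that gap, so it should not be read as a quoted price.

```python
# Figures quoted in the scenario above.
ESTIMATED_HOURS = 200           # manual scripting effort estimated by the team
HOURS_SAVED = 184               # development hours reported as saved
LOADED_RATE_USD = 150           # fully loaded hourly rate
REPORTED_NET_SAVINGS = 20_000   # estimated net savings, RowGen cost factored in
ROWS_GENERATED = 33_000_000
GENERATION_HOURS = 4
STORAGE_GB = 20

# Assumption: "GB" is read as decimal gigabytes (10**9 bytes).
BYTES_PER_GB = 10**9

hours_spent = ESTIMATED_HOURS - HOURS_SAVED         # matches the "just over 16 hours"
gross_savings = HOURS_SAVED * LOADED_RATE_USD       # labor value of the saved hours

# Not itemized in the excerpt: presumably the RowGen cost plus rounding.
unitemized_difference = gross_savings - REPORTED_NET_SAVINGS

rows_per_second = ROWS_GENERATED / (GENERATION_HOURS * 3600)
avg_bytes_per_row = STORAGE_GB * BYTES_PER_GB / ROWS_GENERATED

print(f"Hours spent:           {hours_spent}")
print(f"Gross labor savings:   ${gross_savings:,}")
print(f"Unitemized difference: ${unitemized_difference:,}")
print(f"Generation rate:       {rows_per_second:,.0f} rows/second")
print(f"Average row size:      {avg_bytes_per_row:,.0f} bytes")
```

The gross labor value of 184 hours at US$150 per hour is higher than the reported net savings, which fits the statement that the net figure already has the cost of RowGen factored in.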