Success Stories
E-Commerce Data Warehouse - InfoSpace, Inc.
(published July 2000, DM Review)
Background
InfoSpace is a worldwide leader in providing infrastructure services
to wireless carriers, merchants and Web sites. Through InfoSpace's
Promotions, consumers view online promotions either on wireless
devices or on Internet destination sites, such as American Express.
They then simply visit the online or physical storefront and make
a purchase as they normally would using their pre-registered credit
or offline debit card - with no need to print coupons or present
a copy of the offer. Consumers then receive promotions automatically
on their card account, with no extra effort on their part.
InfoSpace offers consumers, merchants, and distribution partners
an innovative, seamless method of enabling and measuring promotions
for online, off-line, and hybrid merchants at national, regional,
and local levels. One of the many important values for all partners
is InfoSpace's unique ability to close the loop in measuring the
effectiveness of each promotion at multiple points in the promotion
cycle. To enable this, the company has built a robust, highly available,
scalable, yet adaptable Internet solution to support the tens of
thousands of merchants and millions of consumers we expect to have
in the coming months.
Hardware
CoSORT is installed on Sun UltraSPARC servers running Oracle 7 and
Solaris 7.
Problem Solved
In order for [InfoSpace] to maintain a competitive advantage, we
must be able to efficiently receive, sort through, and select large
amounts of data coming from member banks that manage credit cards
and the InfoSpace Promotion transactions. The files we receive can
contain millions of rows of raw transaction data. We have to sort
and filter these massive files to find only those customers that
have registered online for promotions with client merchants, and
provide reports and usage statistics to our clients. To solve the
filtering issue, our data warehouse team looked at ways to increase
the speed of selecting and sorting these huge data files outside
of the database. Receiving files from a number of different banks posed
another problem: multiple formats.
Product Functionality
CoSORT natively supports more than a hundred data and record types,
and has several UIs, including SortCL [CoSORT's sort control
language]. SortCL syntax allows us to specify, join and re-map
differently-formatted files from input to output, letting us "reformat
at will." In SortCL's input phase, record or file layouts
are defined. The processing action can be sorting, merging, joining
and/or reporting. In the output phase, selected records are further
filtered and/or remapped to one or more output targets. Cross-calculations
and multiple formats can be written to the same or different target(s).
The key to the performance is still sorting, of course, which CoSORT
drives across multiple CPUs.
In addition, CoSORT's sort control language (SortCL) serves as both
an SQL-like DDL and DML. It not only natively supports different
data types, but also allows us to recognize, join and re-map
differently-formatted files from input to output.
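The input, processing, and output phases described in this section can be illustrated outside of SortCL. What follows is a minimal Python sketch of that flow, not SortCL syntax and not a description of how CoSORT works internally. The two bank layouts, the field names, the sample records, and the minimum-amount filter are all hypothetical.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Hypothetical sample feeds: the same facts arrive in two different layouts.
BANK_A = "4111000011112222,2000-06-02,25.00\n4000000000000002,2000-06-01,10.50\n"
BANK_B = "2000-06-01|1999|4111000011112222\n2000-06-03|500|4222000000000003\n"


def read_bank_a(text):
    """Input phase, layout A: comma-delimited card, date, amount in dollars."""
    for card, date, amount in csv.reader(io.StringIO(text)):
        yield {"card": card, "date": date, "amount": Decimal(amount)}


def read_bank_b(text):
    """Input phase, layout B: pipe-delimited date, amount in cents, card."""
    for date, cents, card in csv.reader(io.StringIO(text), delimiter="|"):
        yield {"card": card, "date": date, "amount": Decimal(cents) / 100}


def process(records):
    """Processing phase: sort the combined records by card, then by date."""
    return sorted(records, key=lambda r: (r["card"], r["date"]))


def write_outputs(records, min_amount=Decimal("10")):
    """Output phase: filter once, then remap to two differently shaped targets."""
    detail = io.StringIO()
    totals = defaultdict(Decimal)
    for r in records:
        if r["amount"] < min_amount:
            continue  # output-side filter drops small transactions
        # Target 1: tab-delimited detail lines with the fields reordered.
        detail.write(f'{r["date"]}\t{r["card"]}\t{r["amount"]:.2f}\n')
        # Target 2: running total per card for a summary report.
        totals[r["card"]] += r["amount"]
    summary = "".join(
        f"{card},{total:.2f}\n" for card, total in sorted(totals.items())
    )
    return detail.getvalue(), summary


combined = list(read_bank_a(BANK_A)) + list(read_bank_b(BANK_B))
detail_report, summary_report = write_outputs(process(combined))
print(detail_report, end="")
print(summary_report, end="")
```

Each reader normalizes its own layout into a common record, so the sort and the output step never need to know which bank a record came from. Adding another bank format in this sketch would mean adding one more reader function.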
Strengths
By sorting and filtering data outside Oracle, files did not need
to be custom-loaded into the database, which would have required
a great deal of programming. CoSORT thus helped reduce or eliminate
the development costs associated with an internal solution. Its join
function is essential to our filtering, and its ability to extract and
transform files from multiple banks so efficiently is an enormous
data warehousing advantage over other sort packages.
CoSORT's ease of implementation also helped cut our initial setup
time and costs. We did not spend a lot of time configuring CoSORT
to work with our parameters, nor did we have to make any changes
to the banks' data files; CoSORT handled it all. As we integrate
with more banks, we are confident, based on past performance, that
CoSORT will handle the large files and multiple formats sent to
us.
Weaknesses
SortCL's initial join syntax (file.field naming) restricted
us to using left-right file names without extensions or absolute
paths. A later version fixed this problem, however.
Selection Criteria
Our decision to purchase CoSORT was based on both performance and
functionality. After evaluating many sort solutions on the market,
we found that CoSORT was the fastest at sorting very large files
- critical to our daily operational time window - and the only sort
capable of performing an integrated (single pass) join to filter
records. In our evaluation, we found that CoSORT was 5 times faster
than the UNIX sort and 2-3 times faster than its major competitor.
Such speed allows our entire network to run more efficiently.
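The source does not describe how CoSORT implements its single-pass join. As an illustration of the general idea only, the sketch below uses a classic sort-merge approach in Python: once the transactions and the list of registered cards are both sorted on the card number, each stream is read once and only matching transactions are kept. The function name, field names, and sample data are hypothetical.

```python
def merge_join_filter(transactions, registered_cards):
    """Yield transactions whose card appears in registered_cards.

    Both inputs must already be sorted by card number; each is read once.
    """
    reg_iter = iter(registered_cards)
    current = next(reg_iter, None)
    for txn in transactions:
        # Advance the registration stream until it catches up with this card.
        while current is not None and current < txn["card"]:
            current = next(reg_iter, None)
        if current is None:
            break  # no registered cards remain, so nothing else can match
        if current == txn["card"]:
            yield txn


# Hypothetical sample data, for illustration only.
transactions = sorted(
    [
        {"card": "4222", "amount": "5.00"},
        {"card": "4111", "amount": "25.00"},
        {"card": "4000", "amount": "10.50"},
        {"card": "4111", "amount": "19.99"},
    ],
    key=lambda t: t["card"],
)
registered = sorted(["4111", "4333"])

matched = list(merge_join_filter(transactions, registered))
print([t["amount"] for t in matched])
```

Because neither input is rewound or loaded into a lookup table, the cost after sorting is a single sequential read of each file, which is why sort speed dominates the running time for this kind of filter.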
Deliverables
All CoSORT packages come with SortCL, sorti (an interactive
sort/merge program), coroutine and subroutine API calls, and drop-in
replacements for the UNIX, MF COBOL, SAS, and Natural sorts. The
SortCL program can be called from the command line, in batch,
from another program, or even from a new Java GUI launched remotely.
Also included are tools for converting metadata sources like COBOL
FDs, Microsoft CSV, and NCSA web logs into SortCL data definition
files. Several third party applications also hook directly to CoSORT.
Vendor Support
CoSORT's ease of implementation cut our set-up costs, and SortCL
is flexible enough that we can handle the multiple bank data formats
without asking for help. The questions we do have are answered promptly,
and web support is also available.
Documentation
Traditional UNIX man pages accompany each deliverable, but the full
Acroread (Adobe Acrobat Reader) manual is searchable and has more
runtime examples and logic flow diagrams.
