Core Technology
Text-Mining Automation: Based
on proprietary pattern recognition technology, the Praedea
DEP utilizes genetic algorithm-enhanced machine learning systems
to extract specific, targeted data from documents. In a two-stage
process, the Praedea DEP first parses the document(s) into
a hierarchy of generic components to decompose the often lengthy
text into a logically connected set of similar text elements.
Next, the Praedea DEP applies a statistical approach to identify
and extract the targeted data from the parsed document(s),
invoking genetic algorithms to compute optimal data location
determinants. These location determinants include
both distance and key phrase indicators. Unlike traditional
parsing methodologies that merely look at a limited number
of specific examples or string occurrences in attempting to
locate targeted data, the Praedea DEP observes hundreds or
even thousands of key phrases and text pattern layouts that
surround the targeted data in order to improve data extraction
accuracy and success rates.
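
The two-stage process described above can be illustrated with a small sketch. The Praedea DEP technology is proprietary, so the following is not its implementation; it is a toy Python example under stated assumptions. The key-phrase list, the scoring formula, the function names (parse, score, extract, fitness, evolve), and the sample documents are all hypothetical. Stage one splits a document into paragraphs, sentences, and tokens. Stage two scores each numeric token by nearby key phrases, discounted by distance, and a simple genetic algorithm searches for the key-phrase weights and distance decay that pick the labeled target most often.

```python
import random
import re

# Hypothetical key-phrase vocabulary; a real term model would hold far more.
KEY_PHRASES = ["rate", "term", "months", "percent"]

# Hypothetical labeled examples: (document text, targeted value).
EXAMPLES = [
    ("The loan term is 36 months. The interest rate is 7.5 percent.", "7.5"),
    ("Interest accrues at a rate of 4.25 percent over a term of 60 months.", "4.25"),
]

def parse(document):
    """Stage 1 (toy): split a document into paragraphs -> sentences -> tokens."""
    tree = []
    for para in re.split(r"\n\s*\n", document.strip()):
        sentences = re.split(r"(?<=[.!?])\s+", para.strip())
        tree.append([re.findall(r"[a-z]+|\d+(?:\.\d+)?", s.lower()) for s in sentences])
    return tree

def score(tokens, i, genome):
    """Score token i: sum of key-phrase weights, each discounted by token distance."""
    weights, decay = genome[:-1], abs(genome[-1])
    total = 0.0
    for j, tok in enumerate(tokens):
        if tok in KEY_PHRASES:
            total += weights[KEY_PHRASES.index(tok)] / (1.0 + decay * abs(i - j))
    return total

def extract(document, genome):
    """Stage 2 (toy): return the numeric token with the highest score, or None."""
    best, best_score = None, float("-inf")
    for para in parse(document):
        for tokens in para:
            for i, tok in enumerate(tokens):
                if tok[0].isdigit():
                    s = score(tokens, i, genome)
                    if s > best_score:
                        best, best_score = tok, s
    return best

def fitness(genome, examples):
    """Number of labeled examples for which the genome extracts the target."""
    return sum(extract(doc, genome) == target for doc, target in examples)

def evolve(examples, generations=30, pop_size=20, seed=0):
    """Toy genetic algorithm: truncation selection, uniform crossover, point mutation."""
    rng = random.Random(seed)
    n = len(KEY_PHRASES) + 1  # one weight per key phrase, plus a distance decay
    pop = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: fitness(g, examples), reverse=True)
        parents = pop[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child[rng.randrange(n)] += rng.gauss(0, 0.3)      # mutate one gene
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda g: fitness(g, examples))
```

In this sketch, a hand-set genome such as [1.0, -1.0, -1.0, 1.0, 1.0] (positive weights for "rate" and "percent", negative weights for "term" and "months", decay of 1.0) selects the interest rate rather than the loan term in both sample documents. Calling evolve(EXAMPLES) searches for a comparable genome automatically; how quickly it finds one depends on the seed and population settings.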
While the text-mining technology and underlying processes
of the Praedea DEP are complex, the product presents users
with a simple user interface built around a step-by-step
“Term Model Wizard.” The Term Model Wizard enables
users to independently build and optimize text-mining term
models – all without the need for in-house engineering
or text-mining expertise.
Technology Platform: The Praedea
DEP runs on operating systems compliant with Java™
version 1.3 or 1.4 technology and conforms to the published
specifications for the Sun™ Solaris™, Red Hat® Linux®, HP-UX,
and Microsoft® Windows NT®, Windows® 2000, and Windows® XP
platforms.
The Praedea DEP is broadly compatible with most database
platforms. Customers are required to have their own database
license(s) when the Praedea DEP is implemented in a production
environment or, alternatively, to utilize open source offerings
such as the MySQL® database.
The Praedea DEP requires a JDBC-capable RDBMS installed on a
database server machine. This machine can co-reside with the
Praedea server or be IP addressable from the Praedea server.