Information Extraction - Facilitating Smart Data Management
A wealth of information is hidden within unstructured text in company documents, e-mails, newspaper articles, web pages, and similar sources. This information is best exploited in a structured or relational form, which is better suited to searching, integration with relational databases, and text mining.
Accordia’s Information Extraction System produces a structured representation of the information that is buried in unstructured text documents: free-text documents written in natural language, and semi-structured pages.
There are three major components of our Information Extraction System:
- The Named Entity Recognizer (NER), which finds and classifies:
(1) names of people, organizations, and geographic locations
(2) date and time expressions, percentages, and money amounts
- The Co-Reference Resolution (CoRe) module, which discovers identity relations between entities in and across documents.
- The Relation Extraction (RE) module, which finds relations between recognized entities.
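
To make the idea of a "structured representation" concrete, here is a minimal sketch of the simplest part of the NER step: recognizing percentages and money amounts and emitting them as rows ready for a relational table. It is an illustration only and does not describe how Accordia's system is implemented. The names Entity, PATTERNS, recognize, and to_rows are hypothetical, and the regular expressions are simplifying assumptions that cover only plain numeric percentages and dollar amounts.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """One recognized span of text (hypothetical record layout)."""
    text: str
    label: str  # e.g. "PERCENT" or "MONEY"
    start: int  # character offset where the span begins
    end: int    # character offset just past the span


# Assumed, deliberately simple patterns; a real recognizer would need far more.
PATTERNS = {
    "PERCENT": re.compile(r"\d+(?:\.\d+)?\s?%"),
    "MONEY": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?"),
}


def recognize(text):
    """Return all pattern matches in the text, ordered by position."""
    entities = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            entities.append(Entity(match.group(), label, match.start(), match.end()))
    return sorted(entities, key=lambda e: e.start)


def to_rows(entities):
    """Flatten entities into tuples suitable for a relational table."""
    return [(e.label, e.text, e.start, e.end) for e in entities]


if __name__ == "__main__":
    sample = "Revenue rose 12.5% to $3,400,000 last year."
    for row in to_rows(recognize(sample)):
        print(row)
```

Each row carries the entity type, its surface text, and its character offsets, which is the kind of record that downstream co-reference and relation extraction steps could link together.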