LINGWAY LINGUISTIC AND SEMANTIC COMPONENTS
These components constitute the core technology of the LINGWAY KM semantic development platform, and are the result of 100 man-years of development.
1- Linguistic resources and services
This component provides a set of linguistic resources and services (dictionaries and e-grammars), including a 60,000-word multilingual dictionary representing 150,000 concepts, and a powerful automated language analysis module (recognition of words in their different forms, recognition of compound words, recognition of word meanings, expansion of synonyms, identification of word groups…).
2- Multilingual semantic search
The “full text” multilingual semantic search component is based on the open-source Lucene indexer. It can be used on its own, but it can also be interfaced with any “full text” indexer that offers a Boolean search language (connectors are currently available for Exalead, Hummingbird and Oracle).
The component interprets queries written in natural language and translates them into Boolean queries adapted to the underlying indexer. It reduces the number of queries that return no results or too many results, makes it easier to reach all the documents that correspond to a query, and enables non-specialists to access the information.
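The translation step described above can be illustrated with a minimal sketch. It is not the LINGWAY implementation: the synonym table, the stop-word list and the function name `to_boolean_query` are all hypothetical, and a real system would rely on the full dictionary and linguistic analysis rather than a simple word lookup.

```python
import re

# Hypothetical synonym table; a real system would draw on a semantic dictionary.
SYNONYMS = {
    "car": ["automobile", "vehicle"],
    "price": ["cost", "tariff"],
}

# Small illustrative stop-word list (assumption, not taken from the product).
STOP_WORDS = {"what", "is", "the", "of", "a", "an", "for", "in", "to", "how"}


def to_boolean_query(question: str) -> str:
    """Turn a natural-language question into a Boolean query string."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    clauses = []
    for word in words:
        if word in STOP_WORDS:
            continue  # words with little search value are dropped
        alternatives = [word] + SYNONYMS.get(word, [])
        if len(alternatives) > 1:
            # Synonyms are alternatives, so they are joined with OR.
            clauses.append("(" + " OR ".join(alternatives) + ")")
        else:
            clauses.append(word)
    # Every remaining concept must be present, so clauses are joined with AND.
    return " AND ".join(clauses)


print(to_boolean_query("What is the price of a car?"))
```

Expanding each concept with OR tends to reduce empty result sets, while joining concepts with AND keeps the result list focused.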
3- Search and categorization
The search and categorization component handles sentences, questions and short texts (verbatim reports, for example) and enables you to classify such data efficiently in specialized nomenclatures (the International Patent Classification, for instance).
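As a rough illustration of classifying a short text in a nomenclature, the sketch below scores each category by the number of indicative terms it shares with the text. The category labels, term lists and the `categorize` function are hypothetical, and simple term overlap is a stand-in for the linguistic analysis the component actually performs.

```python
import re

# Hypothetical mini-nomenclature: category label -> indicative terms.
NOMENCLATURE = {
    "agriculture": {"plant", "soil", "harvest", "seed"},
    "vehicles": {"wheel", "brake", "engine", "vehicle"},
    "computing": {"data", "processor", "software", "memory"},
}


def categorize(text: str) -> list[tuple[str, int]]:
    """Return (category, score) pairs, best match first, omitting zero scores."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = [(label, len(tokens & terms)) for label, terms in NOMENCLATURE.items()]
    matched = [pair for pair in scores if pair[1] > 0]
    return sorted(matched, key=lambda pair: pair[1], reverse=True)


print(categorize("The brake acts on the wheel of the vehicle"))
```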
4- Text analysis and structuring
This component transforms any type of text into an XML structure enriched with markup that indicates the general structure of the text, identifies the named entities (persons, organizations, products, places…), records the thematic indexing (free or controlled descriptors) and highlights key sentences or phrases to facilitate reading. It also allows you to build up a database of extracted terms.
The module also enables you to create an automated indexing system that complies with the Dublin Core metadata standard. It is also very useful in projects that involve updating old Intranet sites and converting them to more modern formats.
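To give an idea of what Dublin Core indexing output can look like, here is a minimal sketch that serializes a few descriptors as Dublin Core elements. The `dublin_core_record` function, the `metadata` wrapper element and the sample values are hypothetical; in practice the descriptors would come from the text analysis step rather than being passed in by hand.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(title: str, creator: str, subjects: list[str], language: str) -> str:
    """Build a small Dublin Core record and return it as an XML string."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in [("title", title), ("creator", creator), ("language", language)]:
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    # One dc:subject element per thematic descriptor.
    for subject in subjects:
        ET.SubElement(root, f"{{{DC_NS}}}subject").text = subject
    return ET.tostring(root, encoding="unicode")


record = dublin_core_record(
    title="Annual report",
    creator="Example Corp",
    subjects=["finance", "governance"],
    language="en",
)
print(record)
```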
5- Name approximation search
This component enables you to search a list of names or terms using an approximate spelling: it corrects typographical and phonetic errors, joined words and hyphenation.
For example, it allows visitors to e-commerce sites to find the information they need much faster.
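Approximate lookup of this kind can be sketched with the Python standard library. The example below is only an illustration: the name list, the `find_names` function and the 0.75 cutoff are assumptions, and it covers typing errors, joined words and hyphenation through string similarity, not true phonetic matching.

```python
import difflib

# Hypothetical reference list of names.
NAMES = ["Schwarzenegger", "Tchaikovsky", "Mississippi", "Hewlett-Packard"]


def normalize(text: str) -> str:
    """Lower-case and drop spaces and hyphens so joined or split words compare equal."""
    return text.lower().replace(" ", "").replace("-", "")


def find_names(query: str, names: list[str] = NAMES, cutoff: float = 0.75) -> list[str]:
    """Return up to three names whose normalized form is close to the query."""
    index = {normalize(name): name for name in names}
    matches = difflib.get_close_matches(normalize(query), list(index), n=3, cutoff=cutoff)
    return [index[match] for match in matches]


print(find_names("Shwarzeneger"))
print(find_names("hewlet packard"))
```

Normalizing both the query and the reference names before comparison means that "hewlet packard" and "Hewlett-Packard" differ by a single letter as far as the matcher is concerned.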