DIMAP (DIctionary MAintenance Programs)
DIMAP-4 is now available in a beta version and incorporates definition
parsing for building semantic networks, the word-sense
disambiguation functionality from Senseval and the
question-answering capability from TREC-8, using the Proximity Parser.
DIMAP-4 is available at a special price of $300 for single users (ordering information), with a 45-day money-back
guarantee and guaranteed updates for six months. (Note that DIMAP-4 does
not include MCCA, but with your purchase you will also receive access to
DIMAP-3, which does include this functionality.)
DIMAP-4 Features
- Dictionary Management
- open any number of dictionaries at a time, with quick overview of
dictionary entries;
- rapid opening of recently used dictionaries;
- merge several dictionaries; and
- easy backup or renaming
- Entry Management
- easy editing, deletion, and creation of new entries
- quick overview of an entry's senses
- create bilingual dictionaries, with entries in the source language and dictionary
information in English (enabling use of most of the DIMAP functionality)
- Entry Maintenance
- multiple senses
- special fields for
- part of speech (customizable)
- definition identifier
- label number
- usage label
- definition
- usage note
- multiple superconcepts (hypernym, genus)
- multiple instances (hyponyms)
- multiple feature structures (attribute-value pairs)
- multiple roles (semantic relations), each with multiple links
- customized editing for superconcepts, instances, feature structures, and
roles
- Search and Analysis of Entries/Senses
- Regular expression search on all fields, with search results shown on
screen or printed to a file (with format to your specifications)
- Extract subdictionaries (using search mechanism to create a file of
selected entries that can be uploaded into a new DIMAP dictionary)
- Definition parsing (using the Proximity Parser)
- Parse individual definitions or all definitions, in step or batch mode
- Start at any entry, with position remembered between sessions
- Automatically identify and/or add semantic relations discovered during
parsing (including synonyms), with user-customizable regular expression
patterns for recognition
- Diagnostic definition parsing aids (print to files such things as parse
output, identified semantic relations, bad parses, definitions with no
identified semantic relations, comparison to WordNet hierarchy, and unknown
words)
- Compare and map definitions across dictionaries, using either of two
methods
- Useful for mapping between a main dictionary and (independently developed)
derived dictionaries
- All or individual entries, with or without stop list
- Word overlap using best fit
- Componential analysis using score based on matches between hypernyms and
other semantic relations, using WordNet synsets to allow "fuzzy"
matches
- Optional use of reference mapping to highlight potential definition
differences
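The word-overlap method above can be illustrated with a short Python sketch. This is not DIMAP's implementation: the stop list, the function names (content_words, overlap_score, best_fit), and the sample definitions are all assumptions made for illustration. Each source sense is mapped to the target sense whose definition shares the most words with it, optionally after removing stop words.

```python
import re

# Hypothetical stop list; DIMAP's actual stop list is not given in this document.
STOP_WORDS = {"a", "an", "the", "of", "or", "and", "to", "in", "that", "is", "for"}


def content_words(definition, use_stop_list=True):
    """Return the set of lower-cased words in a definition."""
    words = set(re.findall(r"[a-z]+", definition.lower()))
    return words - STOP_WORDS if use_stop_list else words


def overlap_score(def_a, def_b, use_stop_list=True):
    """Count the distinct words shared by two definitions."""
    return len(content_words(def_a, use_stop_list) & content_words(def_b, use_stop_list))


def best_fit(source_senses, target_senses, use_stop_list=True):
    """Map each source sense id to the target sense id with the highest overlap.

    A source sense with no overlapping words maps to None.
    """
    mapping = {}
    for sid, sdef in source_senses.items():
        best_id, best = None, 0
        for tid, tdef in target_senses.items():
            score = overlap_score(sdef, tdef, use_stop_list)
            if score > best:
                best_id, best = tid, score
        mapping[sid] = best_id
    return mapping


# Illustrative data only (invented definitions, not taken from any dictionary).
source = {
    "bank.1": "a financial institution that accepts deposits",
    "bank.2": "the sloping land beside a river",
}
target = {
    "bank.a": "sloping ground along the edge of a river",
    "bank.b": "an institution for receiving deposits and lending money",
}

if __name__ == "__main__":
    print(best_fit(source, target))
```

A raw count favors long definitions; a normalized score (for example, dividing by the size of the smaller word set) would be a reasonable variation.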
- Analysis of dictionary digraph based on hypernym links to identify
primitive senses among the definitions (for whole or partial dictionaries such
as thesaurus groupings)
- Summarizes hypernym links among entries
- Identifies non-primitive (derived) entries and senses
- Identifies primitive defining vocabulary
- Identifies definitional cycles
- Particularly useful when thesaurus entries are linked to definitions
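As a rough illustration of this kind of digraph analysis, the Python sketch below works on a table of hypernym links. It is one simple reading of the idea, not necessarily the algorithm DIMAP uses. The assumptions are: an entry is a candidate primitive if it takes part in a definitional cycle or if none of its hypernyms is itself an entry in the dictionary, and every other entry is derived. The function names and sample links are hypothetical.

```python
def find_cycles(hypernyms):
    """Return definitional cycles found by depth-first search.

    hypernyms maps each entry to a list of its hypernyms. Each cycle is
    returned as a list of entries. Only cycles reached through a back edge
    are reported, so this is not an exhaustive enumeration.
    """
    cycles = []
    seen = set()
    state = {}  # entry -> "active" (on the current path) or "done"
    path = []

    def visit(node):
        state[node] = "active"
        path.append(node)
        for parent in hypernyms.get(node, ()):
            if state.get(parent) == "active":
                cycle = path[path.index(parent):]
                key = frozenset(cycle)
                if key not in seen:
                    seen.add(key)
                    cycles.append(cycle)
            elif parent not in state:
                visit(parent)
        path.pop()
        state[node] = "done"

    for node in hypernyms:
        if node not in state:
            visit(node)
    return cycles


def classify(hypernyms):
    """Split entries into candidate primitives and derived entries."""
    cycles = find_cycles(hypernyms)
    in_cycle = {entry for cycle in cycles for entry in cycle}
    primitive = {
        entry
        for entry, parents in hypernyms.items()
        if entry in in_cycle or not any(p in hypernyms for p in parents)
    }
    derived = set(hypernyms) - primitive
    return primitive, derived, cycles


# Illustrative hypernym links (invented for this example).
links = {
    "dog": ["canine"],
    "canine": ["animal"],
    "animal": ["organism"],
    "organism": ["entity"],
    "entity": [],
    "thing": ["object"],
    "object": ["thing"],
}

if __name__ == "__main__":
    primitive, derived, cycles = classify(links)
    print("primitive:", sorted(primitive))
    print("derived:", sorted(derived))
    print("cycles:", cycles)
```

The recursive search is adequate for small examples; a full dictionary would call for an iterative traversal or a strongly-connected-components algorithm.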
- Conversion routines for
- Uploading dictionary data from other sources (requires specific format)
- Downloading dictionary data for use elsewhere according to your own format
- Template editor to facilitate format specification, including addition of
your own strings (such as SGML, HTML, or XML codes)
- Lexical acquisition features to facilitate batch creation of dictionaries
- Create dictionaries based on analyzing your own texts
- Create empty dictionary based on all words in text (lexical analyzer can
create empty entries of words in non-English, Latin-1 languages)
- Create entries based on capitalized phrases (allowing for certain
"join" words such as "for" and "of"), approximating
named-entity acquisition
- Create entries based on longest contiguous non-interrupted phrase without a
stop word or punctuation separator, approximating compound noun acquisition
- Create entries based on WordNet lookup (thus creating a WordNet subset
corresponding to your texts)
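The two phrase-based acquisition heuristics above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not DIMAP's lexical analyzer: the tokenizer, the stop list, the join-word list, and the two-word minimum for a phrase are all choices made here for the example.

```python
import re

# Hypothetical word lists; the document names only "for" and "of" as join words.
JOIN_WORDS = {"for", "of"}
STOP_WORDS = {"a", "an", "the", "of", "for", "in", "on", "and", "is", "was", "by", "to", "at"}


def tokenize(text):
    """Split text into alphabetic words and single non-letter characters."""
    return re.findall(r"[A-Za-z]+|[^\sA-Za-z]", text)


def _flush(current, phrases, min_len=2):
    """Store the pending phrase, dropping any trailing join words first."""
    while current and current[-1] in JOIN_WORDS:
        current.pop()
    if len(current) >= min_len:
        phrases.append(" ".join(current))


def capitalized_phrases(text):
    """Collect runs of capitalized words, allowing join words inside a run."""
    phrases, current = [], []
    for tok in tokenize(text):
        if tok[0].isupper() and tok.lower() not in STOP_WORDS:
            current.append(tok)
        elif tok in JOIN_WORDS and current:
            current.append(tok)
        else:
            _flush(current, phrases)
            current = []
    _flush(current, phrases)
    return phrases


def contiguous_phrases(text, min_len=2):
    """Collect maximal runs of words unbroken by a stop word or punctuation."""
    phrases, current = [], []
    for tok in tokenize(text):
        if tok.isalpha() and tok.lower() not in STOP_WORDS:
            current.append(tok.lower())
        else:
            if len(current) >= min_len:
                phrases.append(" ".join(current))
            current = []
    if len(current) >= min_len:
        phrases.append(" ".join(current))
    return phrases


# Invented sample sentence.
SAMPLE = ("The Department of Defense is funding the word sense disambiguation "
          "project at the Center for Applied Linguistics.")

if __name__ == "__main__":
    print(capitalized_phrases(SAMPLE))
    print(contiguous_phrases(SAMPLE))
```

Both heuristics only approximate their targets: the capitalized-phrase rule skips sentence-initial stop words such as "The", and the contiguous-phrase rule will also pick up verb-noun runs that are not true compounds.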
- Integrated WordNet lookup, with all information converted into DIMAP
format, thus allowing a word-based use of WordNet, rather than a synset-based
use
- Text parsing using the Proximity Parser
- Parse individual sentences in text window (saving parse results, if
desired). (See our online
demo of the sentence parser.)
- Parse one or more files creating databases of discourse entities, their
semantic roles in the sentence, and their governing word (particularly adapted
for use in processing TREC documents, containing multiple texts, for use in
question-answering track), with options for
- parsing individual texts
- parsing a range of texts
- Parse one or more files for word-sense disambiguation (particularly adapted
for Senseval-style training and test data)
- Can be used with corpus instance files
- Modify DIMAP dictionary entries (used in the disambiguation) to test out
lexical entry information for disambiguation
- Evaluate disambiguation results against a tagged (training) set
- Examine disambiguation results against an untagged corpus
- Question-answering using the Proximity Parser
- Build text databases (for example, of help files or historical texts) using
text parsing
- Create or import question sets (such as those used in the TREC
question-answering track)
- Build your own question set interactively (such as frequently asked
questions)
- Parse questions and build a question database (showing well-formedness and
the "discourse" structure of the questions)
- Answer questions individually or in batch mode (with top five answers saved
to question database)
- See (sentence) answers immediately and highlight to see answers in context
in the document to which they pertain
- Link results to answer key and see how well your texts provide answers
(scored using the TREC question-answering inverse rank)
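The inverse-rank scoring mentioned above can be sketched as follows. In the TREC question-answering track, a question receives the reciprocal of the rank of the first correct answer among the top five, or zero if none is correct, and a run is scored by the mean over all questions. The Python below is a minimal sketch of that calculation. The answer-key layout (regular-expression patterns per question id), the function names, and the sample data are assumptions for illustration and do not describe DIMAP's file formats.

```python
import re


def reciprocal_rank(ranked_answers, patterns, max_rank=5):
    """Return 1/rank of the first answer matching any pattern, else 0.0.

    Only the first max_rank answers are considered.
    """
    for rank, answer in enumerate(ranked_answers[:max_rank], start=1):
        if any(re.search(p, answer, re.IGNORECASE) for p in patterns):
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(results, answer_key):
    """Average the reciprocal rank over all questions in results.

    results maps a question id to its ranked answer sentences;
    answer_key maps a question id to a list of regex patterns.
    """
    scores = [
        reciprocal_rank(answers, answer_key.get(qid, []))
        for qid, answers in results.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Invented sample data.
results = {
    "q1": ["Paris is the capital of France.", "Lyon is a city in France."],
    "q2": ["The river is long.", "The Nile flows through Egypt."],
    "q3": ["No relevant sentence was found."],
}
answer_key = {
    "q1": [r"\bParis\b"],
    "q2": [r"\bNile\b"],
    "q3": [r"\bEverest\b"],
}

if __name__ == "__main__":
    print(mean_reciprocal_rank(results, answer_key))
```

Here the three questions score 1, 1/2, and 0 respectively, so the mean is 0.5.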
- Integrated context-sensitive help file
- Can be customized easily to integrate user dictionary as a lexical
resource; and
- Can be customized easily to add other features to meet user needs