Alphabetic WordNet 3.0
WordNet 3.0 has been fully converted into an alphabetically organized DIMAP dictionary. The conversion uses the official WordNet distribution. See the full description of how the alphabetic version of WordNet 3.0 was created, then follow the link here to register your download. A separate entry has been made for each distinct word (including underscore words) in every synset of WordNet, with a distinct sense for each synset in which the word appears. All information available in WordNet has been converted into DIMAP format. All hypernyms have been entered as DIMAP superconcepts, all hyponyms as DIMAP instances, and all other relations (synonyms, meronyms, holonyms, troponyms, antonyms, pertainyms, entailments, causes, similars, and also sees) as distinct DIMAP roles. All verb frames have been converted into features in DIMAP senses, with complex frames represented as collocation patterns. Each sense is identified by an id feature corresponding to the WordNet file number and sense number. Adjective types are identified in DIMAP features. Glosses have been split into definitional components, example usages, and grammatical patterns (usually prepositional accompaniments). Full details of the conversion can be found in the DIMAP help file. A separate Heads dictionary contains one sense for each word that appears as either the final word of multiword and hyphenated noun and adjective entries or the first word of multiword verb entries. DIMAP dictionaries are also available for earlier versions of WordNet (from version 1.5 onward). The compressed files are approximately 20 MB; uncompressed, the DIMAP dictionaries occupy 55 MB.
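The inversion from synsets to alphabetic entries described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy synsets, the field names, the identifiers such as `n-001`, and the functions `build_alphabetic` and `head_word` are hypothetical and do not reflect the actual DIMAP file format or the conversion program used by CL Research.

```python
from collections import defaultdict

# Hypothetical toy synsets. Field names and ids are illustrative only;
# they are not the WordNet database format or the DIMAP format.
SYNSETS = [
    {"id": "n-001", "pos": "noun", "words": ["dog", "domestic_dog"],
     "hypernyms": ["n-002"], "hyponyms": ["n-003"],
     "relations": {"meronym": ["n-004"]}},
    {"id": "n-002", "pos": "noun", "words": ["canine"],
     "hypernyms": [], "hyponyms": ["n-001"], "relations": {}},
    {"id": "v-001", "pos": "verb", "words": ["dog", "give_chase"],
     "hypernyms": ["v-002"], "hyponyms": [], "relations": {}},
]


def build_alphabetic(synsets):
    """Create one entry per word, with one sense per synset containing it."""
    entries = defaultdict(list)
    for synset in synsets:
        for word in synset["words"]:
            roles = {"synonym": [w for w in synset["words"] if w != word]}
            roles.update(synset["relations"])  # all other relations become roles
            entries[word].append({
                "id": synset["id"],                          # sense identifier
                "pos": synset["pos"],
                "superconcepts": list(synset["hypernyms"]),  # hypernyms
                "instances": list(synset["hyponyms"]),       # hyponyms
                "roles": roles,
            })
    # Return the entries in alphabetical order of headword.
    return dict(sorted(entries.items()))


def head_word(word, pos):
    """Return the head of a multiword entry, or None for a single word.

    Nouns and adjectives: final word of multiword or hyphenated entries.
    Verbs: first word of multiword entries.
    """
    if pos == "verb":
        parts = word.split("_")
        return parts[0] if len(parts) > 1 else None
    if pos in ("noun", "adjective"):
        parts = word.replace("-", "_").split("_")
        return parts[-1] if len(parts) > 1 else None
    return None


entries = build_alphabetic(SYNSETS)
for headword, senses in entries.items():
    print(headword, [sense["id"] for sense in senses])
```

In this sketch the word "dog" receives two senses, one for each synset in which it appears, and `head_word("domestic_dog", "noun")` yields "dog", the kind of word that would be collected into a Heads dictionary.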
UMLS Specialist Lexicon
The UMLS Specialist Lexicon (2006) has been fully converted into an alphabetically organized DIMAP dictionary. The Specialist Lexicon of the Unified Medical Language System is designed for the specialized lexical needs of the medical community. This lexicon contains over 220,000 terms and was developed to provide the lexical information needed for the SPECIALIST Natural Language Processing System. It is intended to be a general English lexicon that includes many biomedical terms. Coverage includes all commonly occurring English words, as well as specialized biomedical vocabulary. The data elements in the lexicon describe syntactic characteristics of each entry, including inflection codes, case, gender, syntactic category, complements for verbs and nouns, modification types for adverbs, and more. This lexicon was developed as a free, publicly available resource, with only moderate restrictions (e.g., you can't claim it as your own).
Alphabetic FrameNet Dictionary
The FrameNet 1.3 data have been converted into an alphabetic dictionary. This dictionary contains 9471 entries, with 7575 entries for lexical items (many having multiple senses with different parts of speech) and 1896 entries that encode the frames and frame relations. Details of these items can be found through the main FrameNet site. A more detailed description of the DIMAP dictionary and how it was used in SemEval-2007 can be found in the paper "CLR: Integration of FrameNet in a Text Representation System".
The Preposition Project Data
Data from The Preposition Project Online include a DIMAP dictionary of all English prepositions (November 2008), courtesy of Oxford University Press. The dictionary contains much of the project data, along with the disambiguated hypernymic relationships used in the digraph analysis of preposition classes.
DIMAP ODE Dictionary
The electronic versions of the Oxford Dictionary of English are not publicly available, but may be licensed through CL Research for research purposes. See Oxford University Press - CL Research Collaboration for details (also see the paper on the synergy between NLP and computational lexicography). DIMAP versions are available for the 1st edition (1998), the 2nd edition (2003), and the next edition currently under development.
DIMAP Macquarie Dictionary
The electronic versions of the Macquarie Dictionary are not publicly available, but may be licensed through CL Research for research purposes. See Macquarie - CL Research Collaboration for details (also see the TREC-9 report and the paper on the synergy between NLP and computational lexicography). DIMAP versions are available for the 3rd and 4th editions and include integrated links to the Macquarie Thesaurus (known as the Dictaurus).
The Minnesota Contextual Content Analysis (MCCA) DIMAP Dictionary (included in MCCALite) contains 11,000 entries, many with multiple senses. Each sense has one of 116 emphasis categories as used in MCCA.
As part of the Dictionary Parsing Project (DPP), the publicly available Webster's Revised International Dictionary (1913) was converted into DIMAP dictionaries, one for each letter of the alphabet. These dictionaries were created in 1998. An initial effort was made to parse these dictionaries and identify semantic relations from the definitions. These efforts are described in detail at the links provided.
DIMAP dictionaries have been used in various research investigations, particularly those described in the Comparison of Lexical Resources and Subordinating Conjunctions papers.
The compiled dictionary used in the Proximity Parser as integrated in DIMAP can be examined in the Proximity Parser Demo.
Last modified: April 20 2009 16:42:35.
This document maintained by Ken Litkowski.
Copyright © 2009 CL Research