
ACL SIGLEX Lexicon Acquisition, Development, and Analysis
Links to (1) research groups investigating the nature and structure of the
lexicon, (2) tools for development
and analysis of lexicons (including word-sense disambiguation), and (3)
acquisition of lexical items and their properties.
- FrameNet, a
lexicon-building effort in which the investigators (1) study words; (2)
describe the frames or conceptual structures which underlie these; (3) examine
sentences, using a very large corpus of contemporary English that contains
these words; and (4) record the ways in which information from the associated
frames is expressed in these sentences. FrameNet I data are available for
download. (A Windows program, FrameNet Explorer, is available from
CL Research.)
- Extended WordNet,
a project to develop a set of tools that can be applied to current and future
versions of WordNet to extend it for knowledge processing applications. The
extensions are enhancements of the glosses that currently contain definitions,
comments, and examples of sets of words that are linked in WordNet. Enhanced
glosses will be syntactically parsed, will have each word tagged with its part
of speech, and will themselves be linked with other glosses that describe
related concepts.
- The UMLS SPECIALIST lexicon is a large syntactic lexicon of biomedical and
general English, designed to provide the information needed for the SPECIALIST
Natural Language Processing System. Coverage includes both commonly occurring
English words and biomedical vocabulary. The lexicon entry for each lexical
item records syntactic, morphological, and orthographic information. This is a
rich lexicon providing linguistically motivated data including
subcategorization patterns for verbs, nouns, and adjectives, fully described in
the SPECIALIST Lexicon Technical Report. (An alphabetic version of this
dictionary is available from CL Research.)
- Lexical Conceptual Structure (LCS) Lexicon, capturing the semantics of a
lexical item (verbs and prepositions)
through a combination of semantic structure (specified by the shape of the
graph and its structural primitives and fields) and semantic content (specified
through constants). An online, interactive
version allows lookup of specific lexical items.
- VerbNet
addresses questions of word sense distinctions with respect to verbs, and how
regular extensions of meaning can be achieved through the adjunction of
particular syntactic phrases.
- Verb Frame Search Tool is a tool to
examine data from the various verb frame projects: VerbNet, LCSs, and FrameNet.
Besides reformatting the original data to ease reading, present different
views, and allow head-to-head comparison, the tool permits searching or
summarizing of commentary (e.g., thematic roles), and adds WordNet 1.6 glosses
where practical.
- SENSEVAL, an affiliated
subgroup of ACL SIGLEX, whose purpose is to evaluate the strengths and
weaknesses of computer programs for automatically determining the sense of a
word in context (Word Sense Disambiguation or WSD) with respect to different
words, different varieties of language, and different languages. An umbrella
page for Senseval has now been established at
http://www.senseval.org/.
- WASPS, a
semi-automatic lexicographer's workbench for writing word sense profiles,
exploring the synergy between the lexicographer's task of identifying and
describing word senses, and the computational task of word sense disambiguation
(WSD).
- Open Mind Word
Expert, an active learning system for collecting word sense tags from
the general public over the Web and building a sense-tagged corpus.
- Tools from the
Linguistic Computing Laboratory,
including Structural Semantic Interconnections (which outputs a semantic graph
including the senses chosen and the semantic interconnections between them for
sample texts), Valido (a visual tool for supporting the validator in the
difficult task of assessing the quality and suitability of sense annotation),
and TermExtractor (a software package for the extraction of relevant terms from
a specific domain), all based on the OntoLearn methodology.
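As a rough illustration of the kind of evaluation the SENSEVAL entry above describes, the sketch below scores a word sense disambiguation system against a gold standard. It assumes the common convention that precision is computed over the instances a system attempts and recall over all instances. The function name `score_wsd`, the instance ids, and the sense labels are hypothetical, and the code is not taken from any Senseval scoring software.

```python
from typing import Dict, Optional


def score_wsd(gold: Dict[str, str],
              predicted: Dict[str, Optional[str]]) -> Dict[str, float]:
    """Score sense predictions against a gold standard (illustrative only).

    gold      maps an instance id to its correct sense label.
    predicted maps an instance id to the chosen sense label; a missing
              id or a value of None means the system abstained.
    """
    attempted = 0
    correct = 0
    for instance_id, gold_sense in gold.items():
        guess = predicted.get(instance_id)
        if guess is None:
            continue  # abstentions lower recall but not precision
        attempted += 1
        if guess == gold_sense:
            correct += 1

    total = len(gold)
    return {
        # share of attempted instances that were tagged correctly
        "precision": correct / attempted if attempted else 0.0,
        # share of all gold instances that were tagged correctly
        "recall": correct / total if total else 0.0,
        # share of gold instances the system attempted at all
        "coverage": attempted / total if total else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical instance ids and sense labels, for illustration only.
    gold = {"bank.1": "river", "bank.2": "money",
            "bank.3": "money", "bank.4": "river"}
    predicted = {"bank.1": "river", "bank.2": "river", "bank.3": "money"}
    print(score_wsd(gold, predicted))
```

Reporting coverage alongside precision and recall makes it visible when a system achieves high precision only by abstaining on difficult instances.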
Last modified October 20, 2005
Maintained by Ken Litkowski (webmaster@siglex.org)