A PRIMER ON COMPUTATIONAL LEXICOLOGY

Kenneth C. Litkowski
Copyright 1992

CONTENTS

SAMPLE DICTIONARIES AND COMPUTATIONAL LEXICOLOGY
1 BACKGROUND AND ORIENTATION FOR DIMAP LEXICOLOGY
2 LEXICOGRAPHIC PRINCIPLES FOR ORGANIZING A COMPUTATIONAL LEXICON
  2.1 Main entries and headwords
  2.2 Grouping and ordering of senses
  2.3 Pseudoentries (linguistic regularities)
3 PART OF SPEECH (CATEGORY) INFORMATION
4 ORTHOGRAPHY, PHONOLOGY, MORPHOLOGY, AND ADMINISTRATIVE INFORMATION
5 LEXICOGRAPHIC INFORMATION, INCLUDING DEFINITIONS
6 SYNTACTIC FEATURES
7 SYNTACTIC STRUCTURES
  7.1 A Dictionary for Unification Grammars
  7.2 Lexical Functional Grammar
  7.3 Diathesis Alternations
8 SEMANTICS
  8.1 Semantic Roles and Their Relationship to Syntactic Structure
  8.2 Semantic representation, a word's type, and selectional restrictions
9 ONTOLOGIES AND TYPE HIERARCHIES
  9.1 Theoretical structure for an ontology
  9.2 Basic ontological modeling choices
  9.3 WordNet - An ontological database
10 LEXICAL RELATIONS
  10.1 Semantic networks and conceptual graphs
  10.2 Collocational functions
  10.3 Lexical subordination and qualia structures
  10.4 Lexical rules

SAMPLE DICTIONARIES AND COMPUTATIONAL LEXICOLOGY

DIMAP is a toolkit for lexicon building that has its roots in computational linguistics and natural language processing. As the lexicon assumes an ever-increasing importance in these fields, these roots give DIMAP a particular flexibility, one easily adaptable to emerging trends. More significantly, the effort to encode current formalisms in DIMAP has facilitated the sharper identification of their commonalities and differences. These topics form the basis for this chapter of the users manual, providing a brief introduction to principles of computational lexicology, showing how to make use of DIMAP's tools and data structures, and presenting sample lexicons.

This chapter is organized primarily around the theoretical perspective and actual use of the several elements comprising a lexical entry. After a brief introduction describing the background and genesis of DIMAP, the following topics are addressed: (1) lexicographical principles underlying a computational lexicon; (2) grammatical category or part-of-speech; (3) orthographical, phonological, morphological, and administrative components of an entry; (4) lexicographical information, including the definition; (5) syntactic features; (6) syntactic structures; (7) semantics (including meaning representation and selectional restrictions); (8) type hierarchies and ontologies; and (9) lexical, derivational, and collocational relations.
In describing these topics, references are made to sample dictionaries included with DIMAP and to the corresponding literature, including
-- basic principles of lexicon development from Allen; Atkins; Atkins, Kegl, and Levin; and Mel'čuk;
-- grammar formalisms that rely heavily on the lexicon, including lexical functional grammar (Meyer et al.) and head-driven phrase structure grammar (HPSG) (Flickinger);
-- type hierarchies and ontologies (Allen; Carlson and Nirenburg; WordNet; Meyer et al.);
-- collocational patterns (Mel'čuk; Velardi et al.); and
-- semantic relations within a dictionary and the lexicon, including the notions of lexical subordination (Levin and Rapoport), qualia structures (Pustejovsky), semantic networks (Sowa, 1984), and lexical rules and derivations (Copestake and Briscoe; Flickinger).

This literature is not comprehensive, nor is there full agreement on the issues it discusses. However, it provides a useful vehicle for discussing many issues relevant for computational lexicons.

1 BACKGROUND AND ORIENTATION FOR DIMAP LEXICOLOGY

Three seminal papers in studies of ordinary dictionaries were published in 1978. Amsler and Litkowski provided descriptions of the network of definitions present in ordinary dictionaries. Amsler focused on the taxonomies, while Litkowski provided a graph-theoretic model for the taxonomic and other relations. Meanwhile, Evens and Smith provided details on lexical-semantic relations. The DIMAP data structure is especially designed to contain fields for specifying hierarchical and other semantic links between (distinct senses of) the lexical entries.

These papers and many others, both before and since, demonstrate the highly interwoven nature of ordinary dictionaries. The different meanings (senses) of many words provide evidence that many concepts are derivative, made up of more primitive components. There have been many papers discussing the nature of semantic and syntactic primitives. Regardless of the conclusions of such discussions, it is well-accepted that definitions are related to one another. When toy lexicons are developed for computational linguistics applications and natural language processing, they seldom include such derivational links. As you develop lexicons for your own applications, you will find that, at the beginning, very few links are present. DIMAP was designed with such relationships in mind, so it will facilitate their identification and specification.

The more immediate genesis of DIMAP is to be found in Ahlswede, which described a set of interactive routines to create, maintain, and update lexicons. Ahlswede emphasized that the lexicon produced in his system was based on lexical-semantic relations, but easily extended to other models of the lexicon structure. The sample dictionary available to you when first starting DIMAP includes the word "aphasia", which appears in Ahlswede's paper. (This lexicon is also backed up as DICT1 and its associated index files. See chapter 7 for a full description of a suite of dictionary files.)

DIMAP includes the basic facilities described by Ahlswede, but also extends these routines in several directions. The first expansion is mostly transparent to the user, in providing variable-length rather than fixed-length fields for the many components. You do not have to worry about a limited amount of space for any part of a DIMAP entry. The system is designed to handle these variable-length records through random access routines.
DIMAP does not as yet provide facilities for automatically generating the data (that is, hierarchical and other lexical-semantic relations) in the entries. However, you can establish such links, and by a judicious structuring of a hierarchy you can store information parsimoniously.

You can perform a considerable amount of lexicological research using DIMAP, since it has the attached CED. You can, for example, convert a selected set of definitions from the CED to DIMAP entries (following the procedures described in chapter 5). You can then convert these entries to an ASCII file, perhaps containing only the definitions from the CED (following the procedures described in chapter 6). You can then edit these definitions, perhaps excluding function words, or parse the definitions using your own parser to identify the head words of the definitions, finally creating a set of words with which to perform a batch download (again following procedures in chapter 5). Based on your analysis of these definitions, you may want to create a special set of links in the DIMAP entries. Although the CED is not the most comprehensive dictionary, it does contain sufficient information for serious investigation of specialized vocabularies or of lexical relations.

2 LEXICOGRAPHIC PRINCIPLES FOR ORGANIZING A COMPUTATIONAL LEXICON

The dictionary available when you first start DIMAP contains entries for all the lexical items appearing in the first 250 pages of Allen. Each word used in his sample lexicons has been included, along with any features that are part of his entries. This sample dictionary is useful not only for demonstrating how individual components of DIMAP can be used, but also for integrating this lexicon with parsers that are capable of working his examples. The entries in this dictionary provide a useful starting point for presenting principles of organizing a computational lexicon.

2.1 Main entries and headwords

Entries in a computational lexicon generally contain orthographic, phonological, and morphological information; syntactic features and syntactic structure information; and semantic and pragmatic information. In DIMAP, the basic organizational unit is the main entry, which is comparable to what Ilson and Mel'čuk and Meyer et al. call a SUPERENTRY. The lexical units in a dictionary are intended to ensure the lexicalization of the meaning, uniting bundles and configurations of semantic elements into actual lexical units and supplying syntactic and lexical co-occurrence information, thus providing all information associated with the behavior of the lexical unit. In ordinary dictionaries or in some computational lexica, there may be several ENTRIES corresponding to homographs; each of these may be called a LEXEME. As will be seen, there is no necessity for making each lexeme a main entry in DIMAP.

Each entry should be viewed as a complete frame data structure, intended eventually to allow structure-sharing, where entries containing the same information in a particular subframe will point to the same structure. In DIMAP, this is generally handled by encoding a distinct entry containing the repeated information and having a pointer to that entry in the SUPERCONCEPT, INSTANCE, or ROLE fields. Pointers can also be encoded in the FEATURE field, as will be shown below.

What constitutes a main entry headword is subject to various criteria. In Meyer et al., the main entry headword can only be a solid word or a hyphenated compound.
Proper nouns are intended to be put into a separate knowledge structure (although with the same format). Idioms (including true idioms, noun-noun compounds, non-compositional compounds, and verbs with particles) are entered under the syntactic head of the idiom. In DIMAP, all of these should be entered as the main entry headword. The principle reflected in DIMAP is that recognition proceeds from left to right, so that any compound or idiom is recognized as beginning with its first word. In DIMAP, this is captured by the entry type, 'r' for regular and 'i' for idiom. If an entry is coded with an 'i', then the dictionary contains another entry which begins with this entry. This accounts for the presence of an entry "have a" coded in DICT1, where this phrase has no meaning in itself, but is the initial phrase of the idiom "have a ball", also an entry in DICT1.

2.2 Grouping and ordering of senses

The creation of senses for a computational lexicon has important consequences for the commitment to parsing that is implemented. In toy lexicons implemented in texts on natural language processing, the entries are simplistic and oriented toward the grammatical component, with the semantic component playing a subordinate and trivial role. Entries seldom have more than one sense, and when they do, they generally reflect different parts of speech or senses with widely different usages. As more information is reposed in the lexicon, the structure of an entry assumes greater importance, particularly the manner in which the senses relate to one another.

The principles for identifying a sense in DIMAP should follow those used in ordinary lexicography, although some interesting possibilities are available with a computational lexicon. In general, the principles stated in Mel'čuk and Meyer et al. are appropriate. In summary, they are: (1) if, for a suggested lexical unit, two possible mappings to the ontology can apply, then two lexical units must be created (that is, create two senses if you wish to have separate meanings pointing to different parts of a type hierarchy); (2) if there are incompatible selectional restrictions for a suggested lexical unit, there should be two senses; (3) if there are two incompatible co-occurrence sets (morphological, syntactic such as subcategorization frames, or lexical such as collocations), two senses should be created; and (4) if there are two possible readings of a word, two senses should be created. These principles will become clearer as the components of an entry are described below.

In Mel'čuk, a vocable is the set of all lexical units (senses) for which the lexicographic definitions are linked with a semantic bridge. A semantic bridge between lexical units L1 and L2 is a component common to their definitions, which formally expresses a semantic link. A basic lexical unit of a vocable is a lexical unit which has a semantic bridge with the majority of the other lexical units of the vocable. A semantic field is the set of all lexical units that share an explicitly distinguished non-trivial semantic component. A lexical field is the set of all vocables whose basic lexical units belong to the same semantic field.

Although Mel'čuk uses a vocable to group similar senses under a SUPERENTRY, any main entry can have any number of sense groupings under it. In DIMAP, there is no real need for separate entries for homographs. In computational linguistics, parsing (recognition) must always begin with the least common denominator, the spelling of a word.
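To make this organization concrete, the following sketch shows a lexicon keyed by spelling, with homograph senses grouped under one main entry and the 'r'/'i' entry-type coding for the idiom "have a ball". The Python rendering, the field names, and the "bank" senses are illustrative assumptions, not DIMAP's actual record layout.

# Illustrative only: main entries keyed by spelling, each holding all senses,
# with the entry type 'r' (regular) or 'i' (initial phrase of an idiom).
# Field names are hypothetical, not DIMAP's record layout.

lexicon = {
    "bank": {
        "entry_type": "r",
        "senses": [
            {"id": 1, "cat": "noun", "definition": "a financial institution"},
            {"id": 2, "cat": "noun", "definition": "the sloping side of a river"},
            {"id": 3, "cat": "verb", "definition": "to deposit money in a bank",
             "superconcept": [("bank", 1)]},    # sense-to-sense link for grouping
        ],
    },
    "have a": {"entry_type": "i", "senses": []},        # flags a longer idiom entry
    "have a ball": {
        "entry_type": "r",
        "senses": [{"id": 1, "cat": "verb",
                    "definition": "to enjoy oneself greatly"}],
    },
}

def lookup(spelling):
    """Recognition always starts from the spelling, the least common denominator."""
    return lexicon.get(spelling)

entry = lookup("have a")
if entry and entry["entry_type"] == "i":
    # a longer idiom beginning with "have a" (here, "have a ball") should be tried next
    longer = [w for w in lexicon if w.startswith("have a") and w != "have a"]
    print(longer)   # ['have a ball']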
Sets of senses within a main entry may be linked in any grouping. Mel'čuk's terminology provides concepts that may be useful in thinking about such groups. Mel'čuk articulates the decomposition principle that the definition of a lexical unit must contain only terms that are semantically simpler than the lexical unit. Further, through his semantic bridge principle, the definitions of any two lexical units of the same vocable must be explicitly linked, whether by a semantic bridge or by a sequence of semantic bridges. These principles should be followed in constructing a lexicon and ensuring its internal consistency. Most importantly, these principles should be applied in determining the relationship between one definition and the rest of the lexicon, including other definitions of the same main entry. The importance of these principles is considered below in discussions of issues such as lexical subordination, type coercion, lexical rules and relations, and lexical inheritance.

Mel'čuk makes six observations pertinent to grouping and ordering the senses of an entry: (1) grouping into one polysemous vocable has a semantic motivation, namely that all lexemes must share at least one important semantic component; (2) division into sense groups is also semantically based; (3) ordering is based on semantic proximity; (4) ordering is based on which entry is semantically simpler (for example, the change of state of the same thing is simpler than the change of state into something else); (5) an intransitive sense is placed before a transitive sense, again based on semantic simplicity, in that the transitive includes a causal component (the transitive is defined in terms of the intransitive); and (6) sometimes the placement is not at issue but only whether a distinct sense is needed (this can be somewhat arbitrary).

In DIMAP, these kinds of groupings and orderings can be accomplished through SUPERCONCEPT links. Instead of a link to another main entry, the links are made to other senses of the same main entry.

2.3 Pseudoentries (linguistic regularities)

The preceding discussion focuses on lexical entries that characterize the world around us. A distinct group of lexical entries can be encoded to characterize linguistic and lexical generalities. For this purpose, DIMAP introduces entries that begin with the symbol '#'. These are called pseudoentries because they encode only grammatical or semantic abstractions. They constitute metalinguistic entries in the lexicon. Pseudoentries vary in importance with the grammatical theory.

The sample dictionary containing entries for the words used by Allen (DICT1) also contains pseudoentries (encoding primarily semantic interpretation rules) that Allen identifies as necessary for a consistent framework for performing compositional semantics. Hence, the prefix gives these entries a metalinguistic stature. These entries include:

#abstract, #action, #animate, #anything, #assert, #automata, #command, #event, #inf, #inst/action, #legal-entity, #living, #location, #name, #non-animate, #non-living, #obj/action, #org, #past, #person, #physobj, #present, #pro, #time, #to/action, #vegetative, #wh, #wh-query, #y-n-query

These entries correspond to nodes in the various type hierarchies Allen presents to describe grammatical entities. The power captured in these formalisms is still in a nascent stage in computational linguistics. In the unification-based HPSG (Pollard and Sag), a considerable amount of syntactic generalization resides in the lexicon.
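Before turning to the details, a minimal sketch may help to picture how such generalizations can live in the lexicon itself: '#'-prefixed pseudoentries act as classes whose properties ordinary entries inherit. The class names below echo two of the DICT2 word classes described next, but the feature bundles assigned to them, the field names, and the Python rendering are all illustrative assumptions.

# A minimal sketch (not DIMAP's storage format) of '#'-prefixed pseudoentries
# acting as classes whose properties ordinary entries inherit.  The feature
# bundles assigned to the classes are invented for illustration.

entries = {
    "#main-verb": {"inherits": [], "features": {"CAT": "Verb"}},
    "#3rd-sing":  {"inherits": [], "features": {"person": "third", "number": "singular"}},
    "walks":      {"inherits": ["#main-verb", "#3rd-sing"], "features": {}},
}

def inherited_features(word):
    """Collect the features of an entry and of every class it inherits from."""
    entry = entries[word]
    features = {}
    for parent in entry["inherits"]:
        features.update(inherited_features(parent))
    features.update(entry["features"])        # locally specified features win
    return features

print(inherited_features("walks"))
# {'CAT': 'Verb', 'person': 'third', 'number': 'singular'}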
Flickinger describes the role of the lexicon in HPSG in detail. He presents a word class hierarchy which makes a commitment to HPSG, so that the syntactic properties defined for each word class are consistent within that framework. A class is defined by its set of discrete properties making up a lexical entry, which is inherited by each member of the class. In this scheme, a given lexical item may belong to many classes and hence inherit the particular syntactic, morphological, or semantic properties (or cluster of properties) of each class.

A sample dictionary (DICT2) consists solely of pseudoentries encoding the syntactic regularities that Flickinger would relegate to the lexicon. The focus in the sample DIMAP dictionary (and indeed in Flickinger) is on syntactic regularities and properties. The sample dictionary contains entries for the word classes elaborated in chapter 2 of Flickinger (pp.17-56). These include the following:

#3-1, #3rd-sing, #adj, #adv, #anaphoric, #anomalous-equi, #aux, #base, #c-conj, #c-noun, #comp, #complementation, #complete, #control, #copula, #det, #det-number, #det-plu, #det-sing, #det-type, #ditrans, #ditrans-to, #each-type, #equi, #every-type, #expletive, #fin, #incomplete, #lex-np, #main-verb, #major, #mass, #minor, #name, #non-3rd-sing, #non-past, #non-reflexive, #noun, #noun-type, #number, #numeral, #object-equi, #object-raising, #part-of-speech, #passive, #past, #past-part, #perf, #plu, #prep, #pres-part, #pron, #prop, #raising, #reflexive, #s-inf-it, #s-inf-norm, #s-it, #s-norm, #sing, #transitive, #verb, #verb-form, #verb-ger, #verb-type, #word-class

Although the terminology adopted in creating these entries (in DIMAP) and in their definitions (in Flickinger) is not self-evident, their tenor is captured by the above list. Examining the contents of these entries, along with a knowledge of HPSG, will make the details clear.

Despite the attempt to place as much information as possible into the lexicon, there still remain issues of what separation must be maintained between the grammar and the lexicon. Ilson and Mel'čuk discuss several lexico-grammatical problems:

Quasi-passives vs. real passives - Quasi-passives should constitute lexemes separate from their actives, whereas true passives are grammatical forms of the same lexeme. Quasi-passives are not possible with all verbs, whereas all active transitive verbs can be passivized by general rules. It can be argued that real passives should not be described as separate entries in the dictionary entry proper.

Diathesis alternations - Two government patterns for a verb sense may have the same meaning. Therefore, it can be argued, only one sense is necessary in the dictionary.

Subject/object complements - Some of these are obligatory and must be included among the arguments of the corresponding verbs, while others are optional and freely added. Thus, it can be argued, recognition should be treated in the grammar and not result in distinct entries.

In each of these cases, some information can be placed in the dictionary. Perhaps the key distinction is one of processing efficiency: place information in the lexicon if it can be accessed and used more efficiently than backtracking through several paths in a parser. The separation between the grammar and the lexicon is an active area of research today.
With the development of lexical rules, derivational rules, and collocational functions that can be placed in the lexicon itself, it is difficult to determine exactly where to leave off in the creation of dictionary entries.

3 PART OF SPEECH (CATEGORY) INFORMATION

The first piece of information required for NLP is a word's part of speech. The possible values for any given parser will vary. DIMAP provides a list of categories ordinarily used. If CED entries are converted to DIMAP format, the category is always included in the conversion. A standard list of categories is used in DIMAP, one of which is chosen for each sense. This list includes a category 'none', an option that is particularly applicable to pseudoentries. An alternative representation in DIMAP would be to use the category 'none' in the required part-of-speech or category field of a sense and then to identify a CAT feature name, with any desired value from the list one wishes to employ. The only requirement is user-imposed, to ensure consistency of the entries.

4 ORTHOGRAPHY, PHONOLOGY, MORPHOLOGY, AND ADMINISTRATIVE INFORMATION

In Meyer et al., distinct fields are used to record variants, abbreviations, unpredictable phonology, irregular forms, stem variants, and declension classes. In DIMAP, there are no specific fields intended to capture these types of information. These data may be regarded as "instances" of a specific entry. This information is generally recorded in DIMAP by creating a distinct entry for the variant, abbreviation, irregular form, and so forth, containing only pointers through the SUPERCONCEPT links to the entry where more detailed syntactic and semantic information is available. At that entry, an INSTANCE pointer can be used to provide the reverse links to the abbreviations, variant spellings, declension forms, and irregular forms, as desired.

When CED entries are converted to DIMAP entries, inflected forms, variants, irregular forms, and abbreviations are identified as distinct main entries with links to the base form through the SUPERCONCEPT field. The reverse links are not created in the interests of saving space. Inflected forms also contain pointers to pseudoentries for the specific form; the sample dictionary DICT3 contains these entries and may be merged with other dictionaries.

The use of this information generally depends on the specific system the user has in mind. As a result, the amount of information included in an entry is highly application-dependent. Additional information that may be viewed as residing under this general heading could include when the entry was created and by whom, along with the time and nature of updates. In general, this type of information is not used computationally.

5 LEXICOGRAPHIC INFORMATION, INCLUDING DEFINITIONS

Information that appears in an ordinary dictionary (particularly including definitions, usage notes, examples, and status labels) is not used computationally. However, this material can be analyzed for the purpose of establishing representations that will be employed computationally. Indeed, this is the main reason why the CED is included with DIMAP. In DIMAP, the primary fields directly available to the user appear under the dictionary information category and include specific fields for the definition of a sense, usage notes as usually transcribed in a dictionary, and status or usage label (such as "archaic" or "chiefly Australian" or "Astronomy" or "(of a person)").
Although DIMAP is intended primarily as a tool for developing computational lexicons, it can serve as a tool for lexicography as well. But if DIMAP is viewed more broadly as the basis for a new kind of lexicography, without the limitations of a printed dictionary, there is an opportunity for greater thoroughness in the treatment of ordinary lexical information, particularly the definitions themselves.

Atkins provides a set of lexicographic rules for building a template MRD structure. These rules can be elaborated for building a lexicon using DIMAP. There are several principles that need to be followed in considering the template structure:
-- regular polysemy (lexical rules);
-- modulation ("the ways in which the effective semantic contribution of a word form may vary under the influence of different contexts");
-- a very general or 'major' sense (or a series of major senses) for each headword (hence, a hierarchical structure within the senses of an individual headword); and
-- basing the structure for each lexical item on appropriate theory.

Atkins views this template as maintaining the traditional notion that a definition should be viewed as consisting of a genus term and differentiae. The genus comes from the core word in a definition (perhaps after analysis of lexicographical defining conventions), either directly or from a defining formula (for example, 'object used for' or 'object on which' for device or 'used to hold' for container). Differentiae come from analysis of any material other than the core word and may incorporate such central notions (for nouns) as whether the object is free-standing or in some meronymous relationship ('part-of' or 'attached-to'), use of a device, its form ('cylindrical'), and domain specificity. An important defining mechanism in her view is the extension of a sense through a link-rule (called a derivational rule or a rule of lexical subordination by others). These notions are explored further from a computational point of view in section 10 of this chapter.

In this spirit of bringing a new level of rigor to the definitions themselves, Mel'čuk views a definition in an explanatory and combinatory dictionary (ECD) as having at least the following three important features: (1) the definiendum (or main entry) is a propositional form consisting of the lexical unit in question, variables representing its semantic arguments (with the same variables appearing in the definition), and structural elements (such as prepositions with, out of, ...) linking the variables to the lexeme; (2) the definition must render explicit the semantic invariants found in the definiendum; and (3) the definition must avoid idiomatic expressions. Mel'čuk offers several heuristic criteria for the formation of definitions that satisfy these principles. In Ilson and Mel'čuk, these are summarized as follows (and constitute the semantic zone of an entry).
The definition is a paraphrase of the propositional form satisfying several formal requirements:
-- Substitutability - the definition must be substitutable for the propositional form in all possible environments,
-- Decomposition - the definition paraphrase is formulated in terms of lexemes such that each one of them--identified by a sense number--is semantically simpler than the definiendum, with each semantic component playing one of three roles, specifying either a relation between arguments, semantic restrictions on its arguments, or a modification of another component, and
-- Inheritance of Arguments - a component of the definition brings to its host ALL its own actants, which must be explicitly accounted for in the definition.

This makes ECD definitions satisfying all formal requirements equivalent to semantic networks.

6 SYNTACTIC FEATURES

The next piece of information usually described in elementary textbooks on NLP is the feature. This field is used to record syntactic features associated with each particular sense. Each word sense may have any number of features that characterize its syntactic and semantic properties. Many features have been used in NLP; here, we recount the ones mentioned in Allen, Flickinger, and Meyer et al., which generally follow the literature. A feature is represented by giving its name and its value (with multiple values allowed for a given word).

Allen describes useful feature systems (pp.89ff) that include (at least for English) the following:

number: singular, plural
person: first, second, third (can be combined with number)
verb forms: infinitive, present, present participle, past participle, past
auxiliary verb type: be, do, have, modal
verb subcategorization (complement) type: none, direct object, indirect object, adjective, prepositional phrase, infinitive, "for-to" complement, "to" infinitive, "that" complement, wh-complement, gerundial complement
mood: declarative, yes-no question, wh-question, imperative, embedded
voice: active, passive

The features above generally suffice for syntactic NLP. In DIMAP, they are best handled in the field defined specifically for them, that is, the feature component of the definition structure. You can enter as many features as you like for each sense of an entry word. To enter this information in the change routines (see chapter 4), you need to enter a feature name and a feature value. You may provide multiple values for a feature, either by simply separating each value by a space or by enclosing them with braces (as does Allen). The choice is up to you and depends on how you want to process them. The names and values are also arbitrary. However, it is well to decide on these matters beforehand, to ensure consistency in your representation.

In parsing systems that give more importance to the lexicon, the capability that makes it possible to store syntactic generalizations in the lexicon is the system of attribute-value pairs (feature structures) that are encoded. In developing the lexicon for the HPSG formalism, Flickinger identifies two types of syntactic properties: a set of features and a set of subcategorization specifications. The features are separated into those with atomic values and those with category values (feature-value pairs). The atomic-value features are drawn from a small finite set where each feature has a limited set of possible atomic values.
The following lists the features and their values as present in Flickinger:

CAT (category): Noun, Verb, Preposition
VFORM (verb form): Finite, Infinitive, Base, Past
INVERTED: +, -
CASE: Accusative, Nominative
PFORM (preposition form): To, Neutral
NFORM (noun form): Normal, It
COMP (complement): For, That
DTYPE (determiner): Each, Every
PREDICATIVE: +, -
NTYPE (noun type): Common, Proper, Pronoun
AGREEMENT: Mass, Singular, Plural

To represent these features in DIMAP, use the feature field of the dictionary entry and enter the feature name from the left-hand column and one or more of the values from the right-hand column. The sample dictionary DICT2 makes use of these features in encoding its entries. No part of speech values are present in this sample dictionary, but note how this provides an alternative to identifying a part of speech from a predetermined set, as is done in DIMAP. If you wish to encode a different set of categories than the predetermined set, you can do so under the features.

Meyer et al. indicate that syntactic features can be inherited from a class. In DIMAP, this can be accomplished through the SUPERCONCEPT field; pseudoentries (entries prefixed with '#') can be used to record class information. For features that are not sufficiently regular to be captured in a class, entries can be made in the FEATURES field of a DIMAP sense.

This is the first place that the possibility of inheriting information has been mentioned specifically. It is well to consider carefully the feature combinations that are present within a lexicon, to remove as much redundancy as possible and to allow information in an entry to be inherited as much as possible. The immediate approach that is suggested envisions entries like '#plural' or '#masculine' or '#declarative'. Such entries might consist of only a single feature. But you should consider the possibility that pseudoentries can contain two or more features. For example, '#1st_sing' might encode a person feature and a number feature.

A generalized approach for identifying potential complex feature bundles is, first, to develop the features that make sense for a particular system. Second, encode these features and their values for particular entries. Third, examine several entries and factor their features and values (that is, determine their least common denominator), identifying the possibility for creating pseudoentries that bundle several feature and value combinations. This approach was followed by Flickinger in creating the word-class hierarchy contained in DICT2. The classes were carefully constructed to capture syntactic regularities that could be inherited within a parsimonious structure.

7 SYNTACTIC STRUCTURES

As expressed by Meyer et al., in parsing, information is to be obtained from the lexical entry for a word, as necessary to characterize the use of the word. This will include the syntactic features associated with the sense and also something about the syntactic structure. Each word sense should have only one permissible associated syntactic structure. The framework of the structure (which should be an underspecified representation--that is, with certain slots whose values need to be identified from the surrounding text) needs to be encoded in the sense.
The encoded information identifies the elements of the parse structure which the current lexeme requires as arguments (with verbs identifying all of their arguments, modifiers specifying requirements for their heads, prepositions identifying their objects and mode of attachment, and nouns perhaps indicating constraints on their verbs--see discussion of qualia structures in section 10). As with syntactic features, structural information can be inherited from a class; again, this can be captured by SUPERCONCEPT links in DIMAP. Otherwise, the structural information is encoded in the sense entry. The precise nature of what is to be encoded depends on the parser and knowledge representation that is desired. To that extent, the entry must be tailored. Systems in which very little syntactic structure resides in the lexicon do not identify the structure per se, but rather simply specify the subcategorizing properties of an entry. The responsibility for parsing lies in the recognition procedures built into the grammar rules for a particular subcategorization. For these systems, the use of syntactic features as described in the last section might suffice. 7.1 A Dictionary for Unification Grammars More recent parsing systems theorize that many idiosyncrasies of syntactic structure are intimately associated with specific items of the lexicon. Therefore, they attempt to encode the syntactic structure directly. The lexical, syntactic hierarchy constructed by Flickinger is included as a sample dictionary (DICT2). In this dictionary, extensive use is made of the feature component of the DIMAP dictionary structure to encode subcategorization specifications. This sample implements an approach to the lexicon that represents a major departure from earlier linguistic theories. It is claimed that, in HPSG, the number of phrase structure rules can be reduced to fewer than 20 very general ones (Proudian and Pollard; Flickinger, Pollard, and Wasow). Moreover, the syntactic and semantic regularities present in the lexicon can be structured into a lexical hierarchy that substantially eliminates redundant information in the entries by allowing the use of inheritance. For these reasons, it is particularly compelling to demonstrate how this can be accomplished in a DIMAP dictionary. The most important piece of syntactic information in an entry is the subcategorization information, which characterizes the pattern of complements and adjuncts permissible for the word sense. It is this information which combines with the grammar rules in HPSG in parsing. This is the location where the syntactic generalizations have been placed in the lexicon. In DIMAP, this information is set up under the FEATURE component of a dictionary entry. The feature name in the sample entries is either 'compl' (complements) or 'adjns' (adjuncts). The feature value consists of a category specification (set of feature-value pairs) and semantic properties (thematic role assignments). The feature value in these entries is itself a list of the thematic roles and feature-value pairs. In DIMAP, a feature value is limited to 128 characters, so there is some limitation on the amount of information that may be contained in any one feature value. However, as can be seen in some of the sample entries, there can be several features with the name 'compl' or 'adjns', so that all the necessary information can be portrayed. Each feature value in DICT2 is enclosed in brackets. 
The first element inside the bracket is the thematic role; it is followed by a set of feature-value pairs which together make up the syntactic restrictions (or subcategory specification) imposed by the lexical item on its complements and adjuncts. Each of these is nothing more than a feature-value pair taken from the list of features and their possible values, as identified in the previous section on syntactic features. The sample dictionary is only a small part of what would be needed in a robust parsing system. Nonetheless, it is a legitimate superstructure in itself for any parser that follows the HPSG approach. Close examination of this sample shows the flexibility of DIMAP and gives some ideas on how other lexicons based on similar unification grammars might be structured.

7.2 Lexical Functional Grammar

Meyer et al. follow principles of Lexical Functional Grammar (LFG), which is closely allied to HPSG. (The sample dictionary DICT4 contains examples encoded using this formalism; see particularly the entries for a, quickly, bright, smell, eat, and drop by.) In this case, the syntactic structure information encoded into the lexical entry will be part of an f-structure parse of a sentence (or other fragment), and is referred to as the "fs-pattern". In this pattern, variables are used to identify nodes in the f-structure with which the current lexeme has syntactic or semantic dependencies. The variable name (encoded as $var0, $var1, $var2, ...) is associated with the root node of a subtree in the representation. In the lexical entry, the word "root" is used to indicate an f-structure and the variable then identifies the syntactic or semantic relationship. The current lexeme is always encoded as $var0.

In parsing, a bottom-up active chart parser retrieves the (inherited or local) fs-pattern of each word in an input sentence (with phrase structure rules providing top-down expectation). The fs-pattern is unified with the f-structure produced by the syntactic parser. The unification serves to bind the variables identifying the lexemes falling in syntactic or semantic dependency to the lexeme in question. (Of course, the unification will fail if obligatory structures are missing, syntactically disallowed structures are present, or there is a mismatch in agreement features.)

7.3 Diathesis Alternations

A particularly important phenomenon for some verbs is that of the alternation in where their arguments might be placed. For example, "John broke the window" and "The window broke" involve the same essential sense of the verb "break", but vary in what arguments are used and where they are placed. These examples show that it is possible for two senses of a word to have the same meaning, but have different syntactic realizations. For Levin (1991a), these examples "reflect the interaction between a representation of the meaning of a verb and the principles that determine the syntactic realization of its arguments." For Pustejovsky, these examples lead to "qualia" structures for nouns, moving some of the lexical responsibility for determining the appropriate syntactic realization away from verbs. Each syntactic realization will have its own entry in the lexicon; in parsing, only one sense will emerge as having been recognized. The placement of such entries within a hierarchical lexicon raises some interesting considerations (including merging several senses into one), which will be discussed at more length in the next three sections on semantic representation, ontologies, and lexical rules.
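A rough sketch of how such an alternation might be recorded follows: two senses of "break" with different argument realizations but the same core concept. The field names, the role labels, and the '#change-of-state' concept are illustrative assumptions, not a prescription from DIMAP or from the works cited.

# Hypothetical sketch of the two syntactic realizations of "break" recorded as
# separate senses that share a core meaning.  Field names and the role/realization
# notation are illustrative.

break_senses = [
    {   # causative/transitive: "John broke the window"
        "id": 1, "cat": "verb",
        "roles": {"agent": "subject", "theme": "object"},
        "core_meaning": "#change-of-state",
    },
    {   # inchoative/intransitive: "The window broke"
        "id": 2, "cat": "verb",
        "roles": {"theme": "subject"},
        "core_meaning": "#change-of-state",   # same concept, different realization
    },
]

def realizations(senses, concept):
    """Collect the argument realizations that a single core concept allows."""
    return [s["roles"] for s in senses if s["core_meaning"] == concept]

print(realizations(break_senses, "#change-of-state"))
# [{'agent': 'subject', 'theme': 'object'}, {'theme': 'subject'}]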
8 SEMANTICS

The representation of the meaning of a word consists of two parts: (1) an identification of the relationship between a word's syntax and the semantic roles played by its arguments (if any), and (2) the compositional structures that the lexical item contributes to the representation of the meaning of a text fragment within which it is used. The first part is strongly linked to the syntactic structures discussed in the previous section, and is discussed in the first subsection below. The second part (discussed in the second subsection below) is strongly linked to the view of the world, and, as suggested by Meyer et al., would include (1) knowledge about how concepts may fit together (an ontology), (2) knowledge about the world, (3) knowledge about speaker/hearer intentions, (4) knowledge about speaker/hearer attitudes, and (5) knowledge about the structure of the discourse or text.

8.1 Semantic Roles and Their Relationship to Syntactic Structure

For Levin (1991b), representing semantic information means primarily capturing, for a verb, the number and types of arguments it requires and the semantic relation each of these arguments bears to the verb. This ignores other kinds of semantic relations, such as synonyms, antonyms, hyponyms, appropriate modifiers (which might be identified by Mel'čuk--see section 10 below), and other relations identified by Cruse. To Levin, the central concern "is the formulation of representation that makes explicit the semantic relations between a verb and its arguments, as well as other aspects of the meaning of a verb related to its status as an argument-taking lexical item." The representation should allow the placement of a word within the larger organizational schema of verbs.

-- Semantic representation (logical forms)

Allen provides the basic requirements for representing the semantic relation component of meaning. In Allen's semantic interpretation rules, the right hand side is the semantic representation. Allen intends (pp. 212ff) the semantic representation to be a logical form for a particular word sense that can then be composed with other logical forms. This approach is well accepted and provides the basis for knowledge representation and reasoning systems (which thus rest on the composition of meaning structures specified within the senses of an entry).

In Allen's notation, a logical form consists of 4 elements within parentheses (sketched after the list of operators below):
-- an operator, indicating the type of structure being used to describe an entity;
-- a name, that is, an arbitrary variable that is used to identify the specific instance of the entity being described;
-- the type of the entity; and
-- any modifiers of the entity, each of which is a logical form.

The operator generally corresponds to a syntactic entity or semantic case relation, including:
-- sentences, including declarative sentences (ASSERT), yes/no questions (Y/N-QUERY), wh-questions (WH-QUERY), and commands (COMMAND);
-- tense, including distinct operators for past (PAST), present (PRESENT), etc.;
-- embedded sentences, including compounds (BUT), relative clauses (EMBEDDED), and infinitive phrases (INF);
-- verb cases, including any case relations governed by a verb (such as AGENT, THEME, BENEFICIARY, TO-LOC, AT-LOC, and EXPERIENCER), where noun phrases are treated as modifiers of verbs;
-- adjectives, where the operator is the attribute name (COLOR for white, RACE for white); and
-- determiners, including DEF/SING, DEF/PLU, INDEF/SING, INDEF/PLU.
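The notation just listed can be pictured with a small sketch. The nesting below follows the four-element scheme only loosely (the handling of tense and determiners in particular is simplified), and the sentence, variable names, and type names are invented for illustration.

# Allen-style logical forms sketched as nested 4-tuples:
# (operator, name, type, modifiers), where each modifier is itself a logical form.
# The sentence, variables (e1, d1), and types are invented for illustration.

# "The dog barked."
lf = ("ASSERT", "e1", "#bark-event", [
        ("PAST", "e1", "?", []),                 # tense operator
        ("AGENT", "d1", "#dog", [                # noun phrase as a verb modifier
            ("DEF/SING", "d1", "?", []),         # determiner
        ]),
     ])

def modifiers_of(form):
    """Return the modifier logical forms of a logical form."""
    operator, name, entity_type, modifiers = form
    return modifiers

for m in modifiers_of(lf):
    print(m[0], m[2])     # PAST ?  /  AGENT #dog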
Verbs (but other parts of speech as well) require specific complement patterns, frequently articulated as case relations. These are the constituents for which the verb subcategorizes. Allen calls these the inner cases of the verb, to distinguish them from constituents that are not obligatory. To distinguish these obligatory constituents, Allen uses brackets ('[', ']') rather than parentheses around a modifier.

In DIMAP, the logical form for a sense in a dictionary entry is encoded in the right hand side of the "semantic interpretation rule" field. This information is entered through prompts which ask for operator, name, type, and modifier information, as described in chapter 4. Usually the name field is filled with a '*' for instantiation during parsing. The operator and type fields are specified with any user-entered string or perhaps with a question mark ('?') to indicate that no value is given for the field (for later composition with other elements).

In Allen's logical forms, the operator indicates a syntactic, case, or semantic relation and the type characterizes the meaning of a component. The type is filled with the most specific applicable concept in a type hierarchy or ontology of all concepts. In Allen, when the type is unknown, it is specified by T; in DIMAP, it is indicated by !T. Thus, T(MAIN-V) is intended to retrieve the type of the main verb. Alternatively, Allen uses the notation V (in DIMAP, this is indicated by !V), followed by some syntactic structure relation, to retrieve the semantics of the item. Thus, V(OBJ) is supposed to retrieve the meaning or semantics of the object of the verb. Exactly what should be retrieved is considered in more depth in the next section, in the discussion of type hierarchies and ontologies.

Instead of using the SEMANTIC INTERPRETATION RULE field, this information can be entered in DIMAP by encoding the case relation in the FEATURE field, where the feature name is the case relation and the feature value is tied to a particular syntactic relation. Thus, for a verb, we can have a feature AGENT with value V(SUBJ). The type or meaning of an item can be implicitly expressed in its SUPERCONCEPT links.

To Meyer et al., there is a similar importance given to the interaction between the information contained in the semantic fields and the syntactic structure fields. The mapping rules that are specified in the syntactic structure are unified with the f-structure, thereby binding the variables and enforcing the constraints (this constitutes most of the parsing process and is discussed in the next subsection). Then, in the meaning pattern that is specified in the semantics, the meaning of the current lexeme is obtained by the "^" operator preceding the reference to that variable. The meaning pattern of the entire lexeme is the meaning of the variable $var0, that is, ^$var0. The case relations, however, are considered part of the semantics of the lexeme, and hence intimately part of the representation in the ontology.

For the remainder of this subsection, we consider specific case relations associated with particular types of verbs. This will provide an indication of what general structures are usually thought to be included in these representations.

-- Linking regularities

Arguments that are perceived to bear a particular semantic role are consistently expressed in the same way across a wide variety of verbs.
For the causative use of verbs of change of state (break, freeze, redden), the agent and patient arguments are expressed as the subject and object. These are referred to as agent-patient verbs, which describe actions where some generally animate entity, the agent, brings about a direct (usually physical) effect on or a change in the location of another entity, the patient. Subclasses of agent-patient verbs are verbs of contact-effect (cut, smash), ingesting (eat, drink), and causative uses of verbs of change of position (roll, move, rotate). The term theme, rather than patient, is used to refer to the argument that denotes the entity whose position changes (for verbs of motion) or whose position is specified (for verbs of position). Agent-patient verbs with more than two arguments that describe the placement or attachment of an entity at some location include verbs of placing (put, stand) and verbs of attaching (fasten, bolt).

With verbs of change of possession (including verbs of giving (sell, lend) and taking (buy, steal)), the agent argument is the subject and the argument denoting the entity transferred is the object. Verbs of psychological state (admire, astonish, like), cognition, desire, authority, and perception have the same arguments. Verbs of change of position (including directed motion (come, go, rise), manner of motion (dance, run, jog), placing (put, stand), and exerting force (push, pull, drag)) also require prepositional phrases specifying the trajectory that the theme travels, frequently known as path, source, and goal roles. Directional complements are also found with verbs belonging to more abstract domains that appear to involve a notion of transfer, including verbs of communication (talk, speak, whisper). For verbs with these linking regularities, the lexical semantic representation involves simply listing the arguments that a verb requires and identifying the semantic roles played by these arguments.

-- Verb classes

As mentioned earlier, diathesis alternations (alternations in the expression of arguments of verbs) involve several arguments that may or may not be realized. Further, these alternations identify systematic semantic-syntactic correspondences that reflect semantically coherent classes. Transitivity alternations involve a change in the verb's transitivity. Many verbs of change of state (break), and more generally, verbs of change of position (roll, move, turn) frequently experience a causative-inchoative alternation (for example, the difference between "John broke the window" and "the window broke"). Verbs of contact-effect (cut, slash, bite) do not experience this alternation, but do experience the conative alternation (for example, the difference between "John slashed the meat" and "John slashed at the meat").

According to Levin, generalizations involving diathesis alternations apparently refer to components of meaning that can in turn be used to induce a verb classification. Verbs of change of possession (buy, sell) and verbs of change of position (slide) both require the same set of semantic roles (agent, source, goal, and theme) but associate these roles with different syntactic realizations. Verbs of transfer of possession fall into two classes according to whether they pattern like buy (where the subject is both agent and goal) or sell (where the subject is both agent and source).
The subjects of some intransitive verbs, the unergative verbs (shout, smile), pattern like the subject of transitives (that is, an agent), while the subject of others, the unaccusative verbs (die, appear), pattern like the objects of transitives (that is, a patient or theme). The unaccusative class includes telic verbs (verbs denoting events with an inherent endpoint, primarily change of state and location verbs), and the unergative class includes atelic verbs (denoting events with no inherent endpoint, essentially activity verbs). Verbs of light emission (flicker, glow, shine), whose single argument is neither volitional nor animate, are classified as unergative verbs, suggesting that the distinguishing property of activity verbs is that of an event without an inherent endpoint, rather than taking an animate and volitional argument. -- Adjunct characteristics of verbs Adjuncts qualify the event or state denoted by the verb by adding information expressing when, where, how, and why the event took place or the state held. Benefactive adjuncts indicate a person who benefits from the action denoted by the verb; instrumental adjuncts indicate the tool used in performing the action; manner adjuncts indicate the manner in which an action is performed. Regularities (with constraints on distribution) are found in the expression of adjuncts. Some adjuncts are found only with verbs that take an agent argument, denoting an action that is controllable and able to be performed intentionally. Certain adjuncts permit alternate syntactic realizations. Benefactive adjuncts can be expressed as the indirect (first) object in a double object construction as well as by a for phrase. Indirect objects with a benefactive interpretation are only found with verbs of particular semantic classes: verbs of creation (make) or obtaining (buy), but not with verbs involving change of position (put). -- Lexical semantic representation The preceding discussion of types of arguments for verbs portrays the range of arguments that are generally found. Every noun phrase in a sentence should be associated with a semantic role; it would appear that in some systems of roles a noun phrase may be assigned two roles. Many sets of semantic roles have been proposed. Levin (1991b) discusses the difficulties of developing a theoretical framework that is consistent in the use of semantic roles. Basically, the issue is whether a set of roles can be proposed to cover all verbs or whether it is necessary to define roles with respect to individual verbs. The problem is that it is difficult to encode multiple relations in a semantic role list. This difficulty will emerge when the set of verbs is arranged hierarchically to take advantage of syntactic and semantic regularities. The solution seems to be the decomposition of a verb's meaning, but here the problem is the identification of primitive elements. This issue is discussed more fully below in the discussion of type hierarchies and ontologies. In a list of semantic roles, the elements are labels that identify arguments according to the semantic role they bear to the verb. The criteria used for adopting a particular set are usually based on (1) a brief description intended to capture the intuitive understanding of what qualifies as an instance of the role and (2) an examination of systematic semantic-syntactic correspondences. Minimally, the set of semantic roles is chosen in order to account for the entailment and paraphrase relations involving verbs and their arguments. 
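As a concrete, purely illustrative example of a semantic role list, the following sketch pairs a few verbs with the grammatical function that realizes each of their roles, following the regularities described above. The role inventory, the realization labels, and the verbs chosen are assumptions made for the example, not an inventory proposed by Levin.

# Illustrative role lists: for each verb, the grammatical function that realizes
# each required semantic role.  A function may carry more than one role
# (e.g., the subject of "buy" is both agent and goal).

role_lists = {
    "put":   {"subject": ["agent"], "object": ["theme"], "pp": ["goal"]},
    "buy":   {"subject": ["agent", "goal"], "object": ["theme"], "pp": ["source"]},
    "sell":  {"subject": ["agent", "source"], "object": ["theme"], "pp": ["goal"]},
    "die":   {"subject": ["theme"]},     # unaccusative: subject patterns like an object
    "shout": {"subject": ["agent"]},     # unergative: subject patterns like an agent
}

def verbs_assigning(role):
    """A crude verb-class query: which verbs assign the given semantic role?"""
    return [verb for verb, linking in role_lists.items()
            if any(role in roles for roles in linking.values())]

print(verbs_assigning("theme"))      # ['put', 'buy', 'sell', 'die']
print(verbs_assigning("source"))     # ['buy', 'sell']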
8.2 Semantic representation, a word's type, and selectional restrictions The interpretation of a sentence involves determining the appropriate meaning of each word and representing this meaning in a useful way. Each word may have several meanings or senses. Although syntactic features and parsing may enable some disambiguation (or identification of the appropriate sense), other semantic information provides further insights. A fundamental piece of information about a word sense is the 'type' of the concept it represents. The representation of the meaning of a word is closely related to its type; and, insofar as that meaning has arguments associated with it, there may be selectional restrictions for those arguments expressed in terms of 'types'. In this section, therefore, we shall be concerned with representing the essential meaning of a word and identifying selectional restrictions for its arguments. Both these aspects relate very strongly with one's view of the world, which needs to be expressed by an ontology, the subject of the next section. -- Kernels of meaning The meaning of a lexical item consists of the concept that it expresses and its argument structure. A concept is operationalized through words, but in general should not be viewed as a word or even a set of words. Concepts are related to one another. A grammar expresses how concepts, as expressed by words, may relate to each other syntactically. An ontology, the subject of the next section, groups concepts by their properties. But an ontology does not characterize the elements of meaning we are trying to represent. Instead, these elements of meaning must either be primitives, expressed as a word or short phrase, or may sometimes need to be expressed, not in words, but as an image or a sound or other sensation. (Images or sounds may conveniently be included in DIMAP as data files or programs attached to FEATUREs.) As described by Levin (1991b), decomposition assumes that meaning is composed of a number of primitive predicates. Jackendoff (1990) suggests that such entities as thing, event, state, action, place, path, property, and amount are primitives. Similarities in meaning are captured by attributing common elements to decompositions. Verbs group by sharing properties. The decomposition should predict and explain regularities in the expression and distribution of arguments and adjuncts. The basic requirement of this approach is the selection of an appropriate set of primitive elements. One criterion is the use of entailment. If the meaning of one word entails that of a second, the decomposition of the first is typically assumed to include that of the second. A representation that uses decomposition must specify the means of combining the elements that enter into the decomposition. Schank proposes functional composition. (See the discussion under ontology in section 9 for methods for representing elements in DIMAP.) Some decompositions are intended to be exhaustive, some are partial. Conceptual dependency diagrams are supposed to be exhaustive, while Jackendoff's system envisions the decompositions supplemented by modifiers which encode certain idiosyncratic aspects of the meaning. The distinction between linguistic knowledge and real world knowledge must be taken into account. Schank wants to elucidate the causal structure of the event denoted by a verb to facilitate drawing inferences; hence, chains of events are included if they are part of the meaning of a verb. 
Some decomposition approaches allow the introduction of constants to fill certain argument positions. Some works in lexical semantics assume that the notions of motion (events) and location (states) are the concepts around which predicates can be classified and their argument structures organized. Works with a lexical orientation assert that verbs fall into two major classes: verbs of location (taking arguments with the role of theme--the located object--and location) and verbs of motion (taking arguments with the roles of theme--the moving object, source, goal, and path). Under this approach, verbs that are not verbs of motion or location are viewed as verbs of motion or location by analogy. An integral part of this approach is the notion of fields. According to Jackendoff's Thematic Relations Hypothesis (Jackendoff 1983, p.188), the principal event-, state-, path-, and place-functions are a subset of those used for the analysis of spatial location and motion, differing according to what types of entities that may appear as theme or reference objects and what kind of relation assumes the role played by location in the field of spatial expressions. In this approach, there is also an independent causal dimension (Jackendoff 1987, 1990), the 'actional tier', to deal with causation, instruments, and related notions. -- Selectional restrictions As indicated in the last section, Allen's approach to semantic interpretation (p.197ff) is centered around the notion of a logical form: the meaning of each word sense is encoded according to a precise structure enabling that meaning to be incorporated (composed) with the meaning of other words (to which it is tied syntactically) to build the meaning of a larger unit of text. The composition (merging, unification) process involves determining which word sense to use and then performing the actual composition. Allen determines which sense(s) to use from the "if" part of semantic interpretation rules, which specifies the conditions when the rule is applicable. The "if" part (referred to as the left hand side, or LHS) specifies both syntactic and semantic criteria, encoded as patterns that describe the constraints on the phrases containing the lexical entry. The pattern identifies the syntactic position the lexical entry must occupy, along with the syntactic constructs surrounding it. The values in the surrounding context are usually not identified exactly, but rather are specified through a list of selectional restrictions. Thus, the adjective green (referring to the color) must appear in the following syntactic and semantic context: (NP ADJS green HEAD +physobj) while in the "inexperienced" context, we would require (NP ADJS green HEAD +human). Selectional restrictions encoded in this way can act as constraints enforced in parsing by using them in conjunction with a type hierarchy or ontology. The way this works is as follows: When a candidate for a particular slot is proposed (for example, for the HEAD slot), the candidate is checked for its position in the hierarchy. We determine whether it is possible to reach a node with the specified characterization from the entry definition of the head. (Negative restrictions can also be used--searching the type hierarchy beginning with a candidate and reaching a node that is the value following the negative sign means that the candidate should be rejected.) In DIMAP, selectional restrictions and syntactic context can be specified in the LHS of the SEMANTIC INTERPRETATION RULE field. 
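The check behind an LHS restriction such as +physobj can be pictured as a walk up the type hierarchy from the candidate filler. The following Python sketch shows the idea with a toy hierarchy; the types and function are assumptions for illustration, not DIMAP's internal routine:

    # Toy type hierarchy: each type points to its parent (None at the root).
    HIERARCHY = {
        "entity": None,
        "physobj": "entity",
        "animate": "physobj",
        "human": "animate",
        "lawn": "physobj",
        "recruit": "human",
    }

    def satisfies(candidate_type, restriction):
        """Check one selectional restriction of the form '+type' or '-type'.
        '+type': the candidate or one of its ancestors is `type`.
        '-type': reaching `type` from the candidate means the candidate is rejected."""
        wanted, negative = restriction[1:], restriction.startswith("-")
        node = candidate_type
        while node is not None:
            if node == wanted:
                return not negative
            node = HIERARCHY[node]
        return negative

    # green (the colour) wants a +physobj head; green ("inexperienced") wants +human.
    print(satisfies("lawn", "+physobj"))    # True  -> colour sense applicable
    print(satisfies("lawn", "+human"))      # False -> "inexperienced" sense rejected
    print(satisfies("recruit", "+human"))   # True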
This LHS is structured using slot-filler notation. In LISP format, the type of the syntactic constituent is identified directly after the opening parenthesis (usually S for sentence, NP for noun phrase, or PP for prepositional phrase). An arbitrary number of slot-filler pairs may then follow, as desired. Each pair consists of a slot name and a slot value. The slot names are arbitrary, but typically identify syntactic roles. The slot values may specify a lexical item or any number of selectional restrictions, each one prefixed by a plus or minus sign. DICT2 provides several examples using Allen's formalism. Complex semantic interpretation rules appear under several entries, particularly the pseudoentries (those prefixed with "#"). Also, see chapter 3 for details on viewing semantic interpretation rules and chapter 4 for details for entering this information into a DIMAP dictionary. -- Semantic composition within an ontology In Meyer et al., the representation of meaning is specified by a map into a separate ontology or into structures encoding attitudes or relations. The ontology provides a system of concepts (that is, identifying the concepts and any relations among them). Each word sense is linked to some concept in the ontology, which is expected to be independent of particular languages. In DIMAP, a separate ontology and a set of attitudes and relations can be created by using pseudoentries. Thus, the ontology, attitudes, and relations contain the general structure of the meaning, while the individual lexical entries contain specific values (or variables for values), and selectional restrictions on these values, to be placed into the slots of the general structure. DICT4 implements examples from Meyer et al. showing semantic representations that explore these notions, as discussed below. (See the entries for drop by, eat, smell, coffee, fresh-brewed, bright, delicious, by, in, and of, as well as #visit, #ingest, #voluntary-olfactory-event, #involuntary- olfactory-event, #olfactory-attribute, and #olfactory-sense.) The essence of the meaning is contained in the ontology, through its links within the hierarchy and the argument structure encoded within its feature list (see particularly the entries in DICT4 for #visit, #voluntary-olfactory-event, #involuntary- olfactory-event, and #ingest). For these examples, the argument structure in the ontology indicates that these concepts have agent, experiencer, theme, or instrument arguments and identifies the range of ontological concepts that may validly fill these positions. The lexical entries for drop by, eat, and smell (both a noun and a verb sense), on the other hand, only identify the syntactic variable that will fit into those argument positions, in some cases with additional selectional restrictions. In general, there will not be a one-to-one mapping between a word sense and an ontological concept. For more general words, it is to be expected that several lexemes will map to a single concept in the ontology. In a terminological lexicon, there will be a tendency for nomenclature to correspond to conceptual objects precisely (such as chemical compounds, machinery, and electronic components). If there is not a single concept in the ontology to which a lexeme is mapped, the mapping is taken to the concept that is the most specific concept that is still more general than (that is, that subsumes) the meaning of the lexeme in question. Once this concept is determined, constraints/information are specified in appropriate slots in the lexicon. 
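A minimal sketch of this division of labor may be helpful: the ontology holds the argument structure and the restrictions on fillers, while the lexical entry only names the concept and says which syntactic variables land in which slots. The concept and entry below are assumed names in the spirit of the DICT4 examples, not their actual contents:

    # Ontological concept: argument structure plus restrictions on fillers (assumed values).
    ONTOLOGY = {
        "#ingest": {"agent": "+animal", "theme": "+foodstuff", "instrument": "+tool"},
    }

    # Lexical entry: names the concept and maps syntactic variables into its slots.
    LEXICON = {
        ("eat", 1): {"concept": "#ingest",
                     "map": {"agent": "$subject", "theme": "$object"}},
    }

    def skeleton(word, sense):
        """Build the unfilled meaning structure for a word sense: the concept's
        slots, each paired with its restriction and the syntactic variable (if
        any) that will supply the filler during composition."""
        entry = LEXICON[(word, sense)]
        slots = ONTOLOGY[entry["concept"]]
        return {slot: {"restriction": restr, "variable": entry["map"].get(slot)}
                for slot, restr in slots.items()}

    print(skeleton("eat", 1))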
The information can be the value for the slot, either a default, a constant, or a reference to a variable that will be filled during semantic processing and conceptual dependency building. The constraints constitute selectional restrictions on what the value for a slot is allowed to be. The selectional restrictions may be any concept (or boolean combination of concepts) from the ontology, where the value must then be a descendant node of (or equal to) the concept in the ontology. The ontological category to which a lexeme is mapped does not have to be of the same syntactic category. Thus, a verb can map to an ATTRIBUTE ("the flower smells sweet"), rather than an EVENT. A verb can map to a RELATION (e.g., OWNER-OF or CONTAINS); a noun to an EVENT (e.g., discussion or process), rather than an object; a noun to an ATTRIBUTE (e.g., age, temperature, color, or size); a noun to a RELATION (e.g., ownership or possession); and an adjective to a complex mapping, such as RELATION (as head concept) + OBJECT (value of RELATION slot) (e.g., wooden ==> (MADE-OF (WOOD)). In DICT4, the entry for smell contains one noun sense and one verb sense that map into the ontological category #voluntary-olfactory- event. In some cases, the meaning of a word may indicate the modification of another concept, or the semantic relationship in which the meaning of other words stand relative to each other. Many adjectives do not instantiate a concept, but specify a particular value of an attribute of some other concept, typically the concept which corresponds to the meaning of the word which the adjective modifies syntactically. Therefore, instead of a link to an ontological concept, the variable representing the head ("^$var1") is used, indicating that the concept instantiated by the lexeme which is bound to $var1 in the syntactic structure fs-pattern will have the particular characteristic represented by the adjective. If the slot and value being added are already specified in the concept in question, then this new information overrides that information, since it is more specific. The adjectives fresh-brewed, bright, and delicious in DICT4 show how this is implemented. Note also that since both bright and delicious are adjectives that can be used both attributively and predicatively, but the nouns they modify will have the same characteristics, an intermediate entry is created in DIMAP to avoid the duplication of information. The selectional restrictions on the noun they modify are actually contained in the entries bright_1 and delicious_1, since otherwise this information would be repeated in both senses of bright and delicious. Variables are also used in the semantic representation of prepositions. In this case, selectional restrictions are placed on both the head of the phrase and the object of the preposition. (See the examples in DICT4 for the prepositions in, by, and of, where the semantics instantiates a relation between two other concepts, in these cases identifying instrument, location, destination, ownership, and domain relations between the two concepts.) In addition to representing the conceptual meaning of a text, Meyer et al. also encode knowledge about speaker/hearer attitudes and structure of the discourse or text (which they term domain and textual relations). Attitudes encode belief (epistemic), value (evaluative), deontics, expectations, volitions, and importance (saliency). Each of these is encoded in a quintuple consisting of a type, a value, an attributed-to slot, a scope, and a time. 
In DIMAP, these are encoded as pseudoentries with slots and values (see #evaluative and #attitude in DICT4) that may also be filled in particular lexical entries (see sense 8 for smell in DICT4). 9 ONTOLOGIES AND TYPE HIERARCHIES As humans, we frequently categorize the world around us in our attempt at understanding. So too in computational linguistics. We categorize the meaning of a word sense through specification of its type: What type of object is represented by a word and where does that concept fit within the world of concepts. The types or kinds of objects or concepts can usually be arranged into a hierarchy. However, defining or specifying these types (or ontology) needs to be done carefully. (See Allen, pp.195-7, for some discussion of these issues.) In general, types should be characterized using words. DIMAP specifically includes mechanisms for representing types and placing them into a hierarchy. The principal component for handling this hierarchy is the SUPERCONCEPT field of an entry. This field is designed to point to a parent (or superconcept or genus or hypernym or AKO (a-kind-of) link). Each sense of an entry can be linked to one or more superconcepts: You simply enter the word you wish to use as the type. Since it is presumed that the type is a word, it is likewise presumed that the type will have its own entry in the dictionary, along with one or more senses. Therefore, you are requested to identify which sense of the genus term you wish to be identified as the parent of the sense you are creating. You may specify only one sense; if you wish to specify all senses, DIMAP suggests using zero ("0") as the sense link. If you wish to specify more than one, but not all, senses of a genus term, create intermediate entries to which the several senses are linked. 9.1 Theoretical structure for an ontology For computational purposes, it is important that the ontology be structured in a way that facilitates its use in both computational linguistics and knowledge representation. In Meyer et al. and Carlson and Nirenburg, the ontological concepts are developed according to the following principles: (1) whenever possible, scientific rather than lay terms are used; (2) consistency in the naming of ontological concepts going down a subtree is maintained; (3) an indication of some distinguishing characteristic of the ontological concept (that is, a characteristic distinguishing the concept from its sister-concepts) is included in the name; and (4) definitions are provided for all concepts, so that when the name of a concept corresponds to a polysemous word, the intended meaning is clear. Carlson and Nirenburg identify "ontological" links between world model elements. The basic top-level ontological classification divides all concepts into free-standing entities and properties. Properties (whose semantics is relational in nature) are described in terms of constraints on entity classes that they can relate. When this structure is superimposed on knowledge representation, the meanings from the ontology trickle down to text meaning representation as instances of world model entities, sometimes somewhat modified. This representation may include speaker attitudes and domain and textual relations (hence such information must be included in the lexicon and perhaps the ontology as well). The ontology is a knowledge base in which world model elements are specified (in theory, independent of any specific language). The model is formulated in the syntax of the knowledge representation language. 
The knowledge base in this language thus consists of a collection of frames, where a frame is a set of slots and fillers or values. A filler can be any symbol, a function call, or even a data file. Frames are used to represent concepts. A concept is the basic building block of the ontology. Slots are interpreted as a subset of concepts called properties. Fillers are (1) names of elements of the ontology, (2) expressions consisting of ontology elements and modifiers of the elements, (3) collections of the elements and modifiers, (4) demons or lambda expressions, or (5) special-purpose symbols and strings. Mostly, fillers or values will simply be other elements of the ontology. The modifiers can be facets used to identify the status of the values of various properties; they can include actual values of the property referred to by the slot name, a function describing a range of values, or defaults listing the most typical value(s) for the given property of the given concept. The fillers or values can be expressed as semantic constraints (selectional restrictions); they refer to ontological concepts (and their subclasses) from which the fillers of the value and default facets must be selected. A filler can be a string, symbol, number, or a (numerical or symbolic) range. A symbol in a filler can be an ontological concept, signifying that the actual filler can be either the concept in question or any of the concepts that are defined as its subclasses. Symbolic (disjunctive or conjunctive) value sets may be given. Numbers, numerical ranges, and symbolic ranges are also legal fillers. One syntactic convention in the representation is to prepend an ampersand (&) to symbolic value set members in order to distinguish them from ontological entity names.

9.2 Basic ontological modeling choices

The specific development of an ontology may vary from person to person; there is unlikely to be agreement for some time on the precise elements to be included. However, a specific example is instructive. In Carlson and Nirenburg, the concepts OBJECT, EVENT, and PROPERTY are SUBCLASSES of the concept ALL, which serves as the root of the network:

(ALL (SUBCLASSES (value +property +object +event)))

In DIMAP, this root is implemented as the entry #all in DICT4, with the subclasses identified as INSTANCES. The SUBCLASSES property and its inverse, IS-A, are the major classifying relations in the world model. (In general, each relation has an inverse.) The three subclasses here are interpreted as a disjunctive set. To override this default interpretation, it is necessary to prefix the set with "and" or "or". Properties are described with two special slots, domain and range. These are special properties that apply to other properties and specify the beginning and end points, respectively, of the links that the properties represent. Thus, we can have the following entry (see #is-a in DICT4):

(IS-A (DOMAIN +all) (RANGE +all))

When the value for a slot is the name of an ontological concept X, its semantics is "X and all entities that are descendants of X in the ontology." The filler +all in both slots of the #is-a frame stands for any concept in the ontology. Based on the semantic constraint on the filler of the range slot in a property, properties are classified into two large classes--attribute and relation. Relations have references to concepts in their range slots; attributes have references to values from value sets.
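A frame of this kind can be sketched as a nested mapping in which each slot carries facets such as value, sem (the semantic constraint), and default. The Python below paraphrases the ALL and IS-A examples above, plus one assumed property frame, and is only an illustration, not the knowledge-representation language itself:

    # Frames as nested mappings: slot -> facet -> filler(s).  Illustrative only.
    FRAMES = {
        "all":  {"subclasses": {"value": ["property", "object", "event"]}},
        "is-a": {"domain": {"sem": ["all"]}, "range": {"sem": ["all"]}},
        "age":  {"domain": {"sem": ["object"]}, "range": {"sem": ["number"]},
                 "default": {"value": [0]}},        # assumed attribute-like property
    }

    def classify_property(name):
        """Relations have concepts in their range; attributes have value sets
        (here, anything that is not itself a frame in the ontology)."""
        rng = FRAMES[name]["range"]["sem"]
        return "relation" if all(f in FRAMES for f in rng) else "attribute"

    print(classify_property("is-a"))   # relation
    print(classify_property("age"))    # attribute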
Objects can typically have identifiable parts or constitute a part of some other objects, can belong to somebody, can be at a specifiable location and can have the time of its coming into existence specified. Events can have identifiable component events or can be a component of another event, can take place at specifiable times and/or in specifiable locations, can be caused by another event and cause another event, can have effects and can be instantiatable only provided certain conditions are met. The basic formats for their frames are (see also #object and #event in DICT4): (OBJECT (IS-A +all) (HAS-AS-PART +object) (PART-OF +object) (BELONGS-TO +human +organization) (AGE > 0) (measuring-unit +year) (LOCATION +place)) (EVENT (IS-A +all) (SUBEVENTS +event) (SUBEVENT-OF +event) (TIME > 0) (measuring-unit +second) (LOCATION +place) (CAUSED-BY +event) (CAUSES +event) (PRECONDITION +event) (EFFECT +event)) Case relations in the ontology are different from case relations in a grammar, that is, they do not describe the predicate- argument structure of the verbs of a particular language. Rather, they are conceptual roles typically associated with events and objects. (Notwithstanding the separation from a grammar--and as was described in the last section, a lexical entry that refers to an ontological entry indicates how the syntactic structure maps into these case roles.) Case roles must have a domain and a range. The domain specifies the types of frames in which a particular case-role can occur as a slot, while the range specifies what the fillers of the slot can be. An example of a case role frame is: (AGENT (IS-A +case-role) (DOMAIN +event) (RANGE +animal +force (default +intentional-agent)) (INVERSE +agent-of) (DEFINITION "the entity that causes or is responsible for an action")) The entry for #agent in DIMAP-2 in DICT4 has a SUPERCONCEPT link to #case-role in the ontology. The domain and range slots are entered as FEATURES, while the inverse slot is entered as a ROLE. The entries for #is-a, #object, #event, and #agent extend the ontology beyond a simple delineation of the objective world. The inclusion of these metalinguistic entities makes it possible to characterize a great deal of linguistic processing itself. Their inclusion imposes a discipline on the ontology to ensure that it is self-consistent. This is very important. The basic inventory of case roles included by Carlson and Nirenburg are: agent, theme, experiencer, beneficiary, instrument, location, source, goal, and path (see definitions on pp. 7-8). They can be extended to include: co-agent, co-theme, and "modifying" properties of events (spatiotemporal relations and conditions). Spatiotemporal relations specify the general time and location of an event. (See also Allen, pp. 198-206.) Carlson and Nirenburg next turn to a breakdown of the top-level object and event subtrees. Objects are broken down into physical [discrete (animate, inanimate), mass-like, places], mental [abstract, representational-objects (mathematical objects, language- related objects, icons, pictorial objects)], and social [geopolitical entities, organizations]. Events are subdivided similarly into mental [cognitive, communicative, perceptual], physical [change-location, perceptual], and social [communicative (speech-act)]. Further detail is provided for the perceptual events to demonstrate the processes of concept identification and delineation of conceptual boundaries. 
This generally involves a careful examination of what might be a discrete and exhaustive subset of a particular concept. One result is that different case roles are assigned to each subconcept; this is ultimately the distinguishing factor. An entry in the ontology can include properties other than those that might be directly considered as linguistic in nature. Conditions specify the causal and intentional structure of events in terms of preconditions and postconditions, and include relations such as purpose, cause, effect, presupposition. With the addition of these slots, the role of the ontology moves beyond computational linguistics into knowledge representation. These slots serve as the basis for conceptual information processing as described by Schank, including the establishment of causal structures (Schank) and scripts, plans, and goals (Schank and Abelson). An essential component to enabling these relations to serve larger knowledge representation is the establishment of a formalism for representing and manipulating these conceptual structures. As described by Meyer et al. and Nirenburg and Defrise, there are three categories of relations, each with their types and subtypes: -- Domain relations (connecting events, states, and objects), including causal (volitional, nonvolitional, reason, enablement, purpose, and condition), conjunction (addition, enumeration, adversative, concessive, and comparison), alternation (inclusive-or and exclusive-or), coreference, temporal (at, after, and during), and spatial (in-front-of, left-of, above, in, on, and around); -- Textual relations (connecting elements of text), including particular, reformulation, and conclusion; and -- Intention-domain relations (connecting speaker intentions to the events described in the text). Each subtype (e.g., #enablement) would be a main entry (in an ontology or DIMAP), linked through its type (e.g., #causal) to its category (e.g., #domain-relations), and to the overarching ontology entry (that is, #relations). Each relation has a set of arguments, which can be named (such as is the case for arguments like agent or theme) or simply identified by position (such as first or second, with or without any positional significance). A relation comes into play through some discourse clue, established by parsing or other reasoning component to trigger the relation. Alternatively, a particular lexical item can overtly result in the filling of a slot in an entry linked to a relation. This would be accomplished in DIMAP by having a SUPERCONCEPT link to a relation, with one or more features, identified by name or position as the feature name and a variable name as the feature value. The lexical item can have slots for particular preconditions, effects, and subevents. These slots will have values that are other entries in the ontology--what Carlson and Nirenburg call ontological instances. These ontological instances will be linked somewhere in the ontology, for example, to script names, that will be activated thereby with slots filled as a result of the activation of the lexical item. These slots will be filled starting with the lexical item, into some place in the ontology, thence downward to ontological instances, and then upward to scripts or other reasoning components. The sample entry #teach in DICT4 provides an example of how this would work. 
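As a rough sketch of this chain (lexical item, to ontological entry, downward to ontological instances, and upward to scripts), the Python below uses hypothetical frame and slot names that loosely parallel the #teach entry walked through next; the dotted frame.slot paths anticipate the coreference device described there, and none of this is the DICT4 encoding itself:

    # Ontological entry with precondition/subevent slots whose values are
    # ontological instances; instance slots refer back via "frame.slot" paths.
    ONTOLOGY = {
        "#teach":          {"agent": "$subj", "theme": "$obj",
                            "precondition": ["#teach-know-1"],
                            "subevents": ["#teach-describe"]},
        "#teach-know-1":   {"experiencer": "#teach.agent", "theme": "#teach.theme"},
        "#teach-describe": {"agent": "#teach.agent", "theme": "#teach.theme"},
    }

    def resolve(path, bindings):
        """Follow a 'frame.slot' path, then any syntactic variable binding,
        so that coreferenced slots end up with the same filler."""
        while isinstance(path, str) and "." in path:
            frame, slot = path.split(".", 1)
            path = ONTOLOGY[frame][slot]
        return bindings.get(path, path)

    bindings = {"$subj": "the tutor", "$obj": "algebra"}
    print(resolve("#teach-know-1.experiencer", bindings))   # the tutor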
The ontological entry #teach is triggered by the word "teach" or some other lexical item (e.g., "instruct"), whose entries include features for such slots as agent, theme, and location linked to syntactic variables. The ontological entry #teach also has feature slots that include preconditions, postconditions, and subevents. The values for these slots are ontological instances, each of which has its own ontological entry, in this case, #teach-know-1, #teach-know-2, #teach-describe, #teach-request-info, and #teach-answer. Each of these instances has reference to other ontological classes or scripts through its SUPERCONCEPT link, but also has links to the ontological entry for #teach via its features. Their feature slots are the usual types of semantic relations (experiencer, theme, agent), but their values are unusual in that they reference features of the ontological entry for #teach. Thus, for example, the experiencer slot of #teach-know-1 has the value #teach.agent. So, whatever value is attached to the agent slot of #teach is coreferenced to #teach-know-1. Finally, the SUPERCONCEPT links indicate which other ontological entries or scripts are triggered in developing the representation of a particular piece of knowledge.

Complex events include named sets of component events; the frame for a complex action therefore includes a slot for subevents. The representation of a complex action is typically a set of frames, not a single frame. Conditions and components are described through a reference to particular instances of certain events. These component actions are not just any instances of their respective types but instances specifically constrained in the ways necessary for the description of the main event. Even though their descriptions contain a reference to their class, their major allegiance is to the concept in whose definition they appear. Introducing ontological instances is parallel to further constraining the values of some fillers. The difference between the two approaches is that a modified concept introduced as an ontological instance can be referred to explicitly and lexically, whereas a concept that is merely further constrained as a filler value cannot. Ontological instances also allow paths to be used, with the notation frame.slot meaning "the filler of the given slot in the given frame." This achieves coreferentiality.

Information recorded in the ontology relates to all objects, events, and properties, but not to any particular instance of any such class. Developing the representation for and reasoning about entities in the world always involves dealing with instantiations of these ontological types. The lexicon thus provides static knowledge about the ontology and text meaning representation fragments. During analysis (parsing), the source text is converted into this meaning representation based on world and contextual knowledge (including pragmatics and discourse).

9.3 WordNet - An ontological database

WordNet (Miller et al.) is an on-line lexical reference system in which nouns, verbs, and adjectives are organized into synonym sets, each representing one concept and linked by different relations. This database, while not computationally oriented, can be used as the basis for seeding data into an ontology of the type described in the previous two sections. In addition, several principles and observations that emerged in its construction are important in the design and implementation of a computational ontology.
Each synonym set consists of a set of words and a periphrastic expression (definition). The overall structure of WordNet is a directed graph of nodes, with each node representing a concept, and with primitive nodes labeled by several lexicalizations (words), corresponding to the structure described in Litkowski.

-- Nouns

Miller describes the nouns in WordNet. He notes that nouns are usually conveniently defined with a superordinate and distinguishing features. WordNet is organized into a lexical inheritance hierarchy of synonym sets linked by pointers to the superordinate and by pointers involving the distinguishing features of attributes (modification), parts (meronymy), and functions (predication), although only the meronymic relations are implemented. The nouns are partitioned into 25 "beginners" as follows: {act, action, activity}, {animal, fauna}, {artifact}, {attribute, property}, {body, corpus}, {cognition, knowledge}, {communication}, {event, happening}, {feeling, emotion}, {food}, {group, collection}, {location, place}, {motive}, {natural object}, {natural phenomenon}, {person, human being}, {plant, flora}, {possession}, {process}, {quantity, amount}, {relation}, {shape}, {state, condition}, {substance}, and {time}. These categories can be grouped into a hierarchy using {thing, entity} as the top, and {living thing, organism} and {non-living thing, object} as the second level. The hierarchies involved in these groups seldom go more than 10 or 11 levels deep, and the deeper levels consist mostly of technical terms. Distinguishing features are attached to each level and are inherited. Attributes are given by adjectives, parts by nouns, and functions by verbs. Short explanatory phrases are attached to each synonym set. Attributes associated with a noun are reflected in the adjectives that can normally modify it. Several part-whole relations are observable, including component-object, member-collection, portion-mass, stuff-object, feature-activity, place-area, and phase-process. A functional feature of a noun describes what instances of the concept normally do or what is normally done with or to them; nouns play various semantic roles as arguments of the verbs with which they co-occur (instruments, materials, products, containers, etc.). Functional information should be included by pointers to verb concepts. (See also the discussion of lexical relations in section 10 of this chapter.)

-- Adjectives

Gross and Miller describe the adjectives in WordNet. Adjectives modify, modulate, or elaborate the meanings of nouns and verbs. In WordNet, it is assumed that adjectives modifying verbs are simply adjectives to which the suffix -ly has been added. Syntactically, adjectives can appear immediately before the noun they modify (attributive or prenominal position) or in the predicate of a sentence after a copular verb (predicative position). An adjective is usually marked when it can fill only particular positions. A qualifying adjective (heavy, old) gives a value to a particular attribute of a noun (weight, age). The organizing principle of adjectives is the antonymy relation; this occurs within a dimension based upon the attribute. In WordNet, synonym sets for adjectives are related to one another on the basis of a "focal adjective," one that has a direct antonym in another adjective. Other adjectives that seem to have no direct antonym are related to particular focal adjectives by a 'similarity' pointer. A large class of nonpredicative adjectives (musical, atomic) seems to play a role similar to that of a modifying noun.
These relational adjectives can be joined to nouns but not with qualifying adjectives, are not gradable, cannot be nominalized, and do not have direct antonyms. These adjectives are maintained in WordNet by pointers to the corresponding nouns. Gradation is an important semantic relation organizing lexical memory for adjectives; however, this relation is not included in WordNet. Since most attributes have an orientation, they tend to be anchored at a point of origin which is the expected or default value; deviation from this default is called the marked value of the attribute; it is not coded in WordNet. Color adjectives can be graded, nominalized, and conjoined with other qualifying adjectives. Only one color attribute is coded in WordNet--light/dark and white/black; the opposition chromatic/achromatic is used to introduce the names of color. Finally, adjectives are selective about the nouns they modify (except in figurative or idiomatic use), so that the noun must have the attribute whose value is expressed by the adjective; these are assumed to be computed as needed and are not prestored in WordNet. (Further discussion of these issues is presented below under lexical relations.) Qualifying adjectives are coded in WordNet into bipolar clusters, with the direct antonyms listed as the "head synonym set" of two clusters, one headed by each, giving pointers to the similarity set of each. Each of the members of the similarity set has its own synonym set, with a link back to the direct antonym set. If an adjective has a syntactic limitation (to prenominal, immediately postnominal, or predicative position), a code for this limitation is given. -- Verbs Fellbaum describes the semantic network of English verbs in WordNet. Verbs change meaning easily, based on the nouns with which they co-occur or by different elaborations of one or two common core components shared by most senses of the verb. Verbs are divided into 15 groups, largely on the basis of semantic criteria, including verbs of bodily care and functions, change, cognition, communication, competition, consumption, contact, creation, emotion, motion, perception, possession, social interaction, weather verbs, and verbs referring to states. These groups reflect the major conceptual categories event and state. These groups derive their names from the topmost verbs, or 'unique beginners,' which head these groups. These topmost verbs resemble 'core components', the unelaborated concepts from which the verbs constituting the semantic field are derived via semantic relations. Within each verb group are hundreds of 'synsets' or closely synonymous sets of verbs, often with a periphrastic expression (rather than a lexicalized synonym) that gives subtle meaning differences and selectional restrictions. The expression shows the way in which the verb has become lexicalized by showing constituents that have been conflated in the verb. WordNet is not organized with sets of components, but rather through semantic relations linking verbs to each other. (This relation can be expressed explicitly; the components can probably also be expressed explicitly. Together, this explication can serve to act as seeds into an explicit computational inheritance hierarchy. As given, WordNet cannot be used computationally.) Some semantic relations (or primitive components) that are embodied in WordNet include cause, opposition, path, and manner. The components constitute subpredicates and correspond to root verbs, or topmost 'unique beginners,' heading semantic fields. 
(A given verb may have two such components.) This componential analysis can be viewed in terms of entailment, in that a verb V1 that is a component of another verb V2 must be entailed by V2. The semantic relations among verbs in WordNet all interact with entailment. Lexical entailment is a semantic relation between two verbs V1 and V2 that holds when the sentence Someone V1 logically entails the sentence Someone V2. Lexical entailment is a unilateral relation. Negation reverses the direction of entailment. The converse of entailment is contradiction. The entailment relation between verbs resembles meronymy between nouns, but meronymy is better suited to nouns than to verbs. Any acceptable statement about part-relations among verbs always involves the temporal relation between the activities that the two verbs denote. One activity or event is part of another activity or event only when it is part of, or a stage in, its temporal realization. Some activities can be broken down into sequentially ordered subactivities. These are complex activities that are said to be mentally represented as scripts. They tend not to be lexicalized. The analysis into lexicalized sub-activities is not available for the majority of simple verbs in English. Yet there are some and the reason lies in the kinds of entailments that hold between the verbs. The sets of verbs related by entailment have in common that one member temporally includes the other. A verb V1 will be said to include a verb V2 if there is some stretch of time during which the activities denoted by the two verbs co-occur, but no time during which V2 occurs and V1 does not. If there is a time during which V1 occurs but V2 does not, V1 will be said to properly include V2. The sentence frame used to test hyponymy between nouns, An x is a y, is not suitable for verbs. The semantic distinction between two verbs is different from the features that distinguish two nouns in a hyponymic relation. For verbs, lexicalization involves many kinds of semantic elaborations across different semantic fields. The many different kinds of elaborations that distinguish a 'verb hyponym' from its superordinate have been merged into a manner relation (dubbed troponymy). The troponymy relation between two verbs can be expressed by the formula To V1 is to V2 in some particular manner. 'Manner' is interpreted here very loosely. Troponyms can be related to their superordinates along many semantic dimensions. Subsets of particular kinds of manners tend to cluster within a given semantic field. Troponymy is a particular kind of entailment, in that every troponym V1 of a more general verb V2 also entails V2. The activities referred to by a troponym and its more general superordinate are always temporally co-extensive. Troponymy therefore represents a special case of entailment: pairs that are related by troponymy are also always temporally co-extensive and related by entailment. Verbs related by entailment and proper temporal inclusion cannot be related by troponymy. Verb Taxonomies Verbs cannot easily be arranged into the kind of tree structures onto which nouns are mapped. Within a single semantic field it is frequently the case that not all verbs can be grouped under a single unique beginner; some semantic fields must be represented by several independent trees. Motion verbs, for example, have two top nodes, {move, make a movement} and {move, travel}. 
Verb hierarchies tend to have a more shallow, bushy structure than nouns; in few cases does the number of hierarchical levels exceed four. Moreover, virtually every verb taxonomy shows a bulge, that is, a level far more richly lexicalized than the other levels in the same hierarchy. In most hierarchies, the level below the most richly lexicalized one has few members. For the most part, they tend not to be independently lexicalized, but are compounded from their superordinate verb and a noun or noun phrase. As one descends in a verb hierarchy, the variety of nouns that the verbs on a given level can take as potential arguments decreases. This seems to be a function of the increasing elaboration and meaning specificity of the verb. Opposition Relations between Verbs After synonymy and troponymy, opposition is the most frequently coded semantic relation. Much of the opposition is based on the morphological markedness of one member of an opposed pair. Some pairs are conceptually opposed, but are not direct antonyms. Many deadjectival verbs formed with a suffix such as -en or -ify inherit opposition relations from their root adjectives. These are, for the most part, verbs of change and would decompose into "become + adjective" or "make + adjective." A variety of negative morphological markers attach to verbs to form their respective opposing members. The semantics of this morphological opposition is not simple negation. Some pairs are gradables (terms are points on a scale) that can be modified by degree adverbs. Some direct antonyms are associated with each other rather than with verbs that are synonyms of their respective opposites and that express the same concept as that opposite. These pairs are illustrative of an opposition relation that is found between co- troponyms (troponyms of the same superordinate verb). They constitute an opposing pair because the direction of motion, upward or downward, is opposed or the manners are opposed (slow or fast) in ways that distinguish each troponym from its superordinate. Converses are opposites that do not have a common superordinate or entailed verb; they occur within the same semantic field: they refer to the same activity, but from the viewpoint of different participants. Most antonymous verbs are stative or change-of-state verbs that can be expressed in terms of attributes. There are many opposition relations among stative verbs: live/die, exclude/include, differ/equal, wake/sleep. Many verb pairs are not only in an opposition relation, but also share an entailed verb (hit and miss entail aim). These verbs are not related by temporal inclusion. The relation between the entailing and the entailed verbs is one of backward presupposition, where the activity denoted by the entailed verb always precedes in time the activity denoted by the entailing verb. Entailment via backward presupposition also holds between certain verb pairs related by a result or purpose relation. A verb V1 that is entailed by another verb V2 via backward presupposition cannot be said to be a part of V2. (The set of verbs related by entailment can be classified exhaustively into two mutually exclusive categories on the basis of temporal inclusion.) Some opposition relations interact with the entailment relation in a systematic way. One member of these pairs constitutes a 'restitutive'; this kind of opposition also always includes entailment, in that the restitutive verb always presupposes what one might call the 'deconstructive' one (damage/repair). 
Many reversive un- or de- verbs also presuppose their unprefixed, opposed member (tie/untie). The Causal Relation The causative relation picks out two verb concepts, one causative (like give), the other what might be called the 'resultative' (like have). The subject of the causative verb usually has a referent that is distinct from the subject of the resultative; the subject of the resultative must be an object of the causative verb, which is therefore necessarily transitive. The causative member of the pair may have its own lexicalization, distinct from the resultative. English does not have many lexicalized causative-resultative pairs; it has an analytic, or periphrastic, causative, formed with cause to/make/let/have/get to, that is used productively. A periphrastic causative is not semantically equivalent to a lexicalized causative, but refers to a more indirect kind of causation than the direct, lexicalized form. WordNet recognizes only lexicalized causative-resultative pairs. The synonyms of the members of such a pair inherit the cause relation, indicating that this relation holds between the entire concepts rather than between individual word forms only. Unlike entailment, the causation relation is not inherited by the troponyms. Causative verbs have the sense of cause to be/become/happen/have or cause to do. They relate transitive verbs to either states or actions. In both cases, causation can be seen as a kind of change. Many verbs clearly have the semantics of such a causative change, but they do not have lexicalized resultatives. There are many verbs in English that have both a causative and an anticausative usage. Most of them cluster in WordNet among the verbs of change, where many verbs alternate between a transitive causative form and an intransitive anticausative (or unaccusative, or inchoative) form. Most anticausative verbs imply either an animate agent or an inanimate cause. A few verbs are compatible only with an inanimate cause. The causative relation also shows up systematically among the motion verbs. Causation is a specific kind of entailment: if V1 necessarily causes V2, then V1 also entails V2. The entailing verb denotes the causation of the state or activity identified by the entailed verb. The entailment between these verbs is also characterized by the absence of temporal inclusion. But unlike backward presupposition, the entailed verb precedes the entailing verb in time: A must first bequeath something to B before B owns it. The causative relation is unidirectional. Syntactic Properties and Semantic Relations Considering research that analyzes the constraints on verbs' argument taking properties in terms of their semantic make-up, based on the assumption that the distinctive syntactic behavior of verbs and verb classes arises from their semantic components (see particularly the discussions of Levin in the previous sections), WordNet does not incorporate all of a speaker's knowledge about semantic and syntactic properties of verbs. To cover at least the most important syntactic aspects of verbs, therefore, WordNet includes for each verb synonym set one or several sentence frames, which specify the subcategorization features of the verbs in the set by indicating the kinds of sentences they can occur in. 
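A toy sketch of how verb synsets, relation pointers, and sentence frames might be stored and queried for the kind of exploration described next may be useful; the synsets, pointers, and frame labels below are assumed examples in Python and bear no relation to WordNet's actual file format:

    # Toy verb synsets: members, relation pointers, and sentence-frame labels.
    SYNSETS = {
        "move.v.1":  {"members": ["move", "travel"], "troponym_of": None,
                      "frames": ["Somebody ----s", "Something ----s"]},
        "walk.v.1":  {"members": ["walk"], "troponym_of": "move.v.1",
                      "frames": ["Somebody ----s", "Somebody ----s PP"]},
        "march.v.1": {"members": ["march"], "troponym_of": "walk.v.1",
                      "frames": ["Somebody ----s", "Somebody ----s PP"]},
        "snore.v.1": {"members": ["snore"], "troponym_of": None,
                      "entails": "sleep.v.1", "frames": ["Somebody ----s"]},
        "sleep.v.1": {"members": ["sleep"], "troponym_of": None,
                      "frames": ["Somebody ----s"]},
    }

    def shares_frames(a, b):
        """Sentence frames two synsets have in common."""
        return set(SYNSETS[a]["frames"]) & set(SYNSETS[b]["frames"])

    def superordinates(synset):
        """Follow troponym pointers upward (troponymy implies entailment)."""
        chain = []
        while SYNSETS[synset]["troponym_of"]:
            synset = SYNSETS[synset]["troponym_of"]
            chain.append(synset)
        return chain

    print(shares_frames("walk.v.1", "march.v.1"))
    print(superordinates("march.v.1"))   # ['walk.v.1', 'move.v.1']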
One can search among the verbs for the kinds of semantic-syntactic regularities found in the literature, search for all the synonym sets that share one or more sentence frames and compare their semantic properties, or start with a number of semantically similar verb synonym sets and see whether they exhibit the same syntactic properties. An exploration of the syntactic properties of co-troponyms occasionally provides the basis for distinguishing semantic subgroups of troponyms.

10 LEXICAL RELATIONS

The type hierarchies and ontologies described in the previous sections instantiate the principal relations between entries in a computational lexicon. In addition, by following principles of lexical inheritance, these hierarchies will reduce duplication in representation. By caching the top-level nodes of these hierarchies, a fully-specified entry can be quickly reconstructed during computation. These hierarchies, however, do not exhaust the range of relations that might exist between lexical entries and that might be used during computation. Several formalisms are explored in this section to provide further options for incorporation into the lexicon. (The user is also strongly encouraged to consider the extensive discussions of relations in Evens, Cruse, and Sowa. Relations play a key role in both computational linguistics and knowledge representation.)

10.1 Semantic networks and conceptual graphs

The type hierarchies and ontologies represent an initial linking of the entries into a semantic network. Further linking and specification of a semantic network in DIMAP can be achieved using the role field of the entries. Some of this linking was described in the first section of this chapter in connection with the word aphasia. A further demonstration of the linking is given in the entry for #action, where cases associated with the entry are identified as roles and where the links are equated to selectional restrictions.

-- Conceptual graphs

Polovina and Heaton provide a simplified introduction to conceptual graphs based on the work of Sowa (1984). Conceptual graphs are based upon the general form [CONCEPT_1] -> (RELATION) -> [CONCEPT_2], which may be read as "A RELATION of a CONCEPT_1 is a CONCEPT_2". Thus, [Mammal] -> (part) -> [Trunk] reads "A part of a mammal is a trunk." All concepts have referents, which refer to a particular individual of that concept. The concept [Mammal: Clyde] indicates that Clyde is a mammal. A concept that appears without an individual referent has a generic referent, [: *]. The generic concept [Trunk] may take on an individual referent with a unique number, as in [Trunk: #1234]. Larger graphs may be constructed. Sowa presents a conceptual catalog including such relations as agent, object, instrument, part, and material. In representing a complex conceptual graph, it may be necessary for a concept to have several relations attached to it, with the second concepts having further relations. Thus, we can have the graph:

[Spoon] -
   (instrument) <- [Eat] -
      (object) -> [Walnut] -> (part) -> [Shell: *n]
      (agent) -> [Monkey],
   (material) -> [Shell: *n].

This represents "A monkey eating a walnut with a spoon made out of the walnut's shell." The hyphen indicates that the relations of a concept are continued on a subsequent line. The comma terminates the part of a graph that relates to the last hyphen. Any part of the graph following the comma relates directly back to the hyphen before the last hyphen. The * is a coreferent marker for concepts. The period terminates the whole graph.
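For computation, the linear notation above can be captured as a set of concept-relation-concept arcs plus a table of referents; the small Python encoding of the walnut graph below is only one possible representation, not a standard conceptual-graph API:

    # The monkey/walnut/spoon graph as (concept, relation, concept) arcs.
    # Referent markers (*n, #1234) become shared node identifiers.
    NODES = {"e1": "Eat", "m1": "Monkey", "w1": "Walnut",
             "sh1": "Shell", "sp1": "Spoon"}
    ARCS = [
        ("e1", "agent", "m1"),
        ("e1", "object", "w1"),
        ("e1", "instrument", "sp1"),
        ("w1", "part", "sh1"),
        ("sp1", "material", "sh1"),   # *n coreference: the same Shell node
    ]

    def neighbours(node, relation):
        """All concepts reachable from `node` along `relation`."""
        return [NODES[tgt] for src, rel, tgt in ARCS if src == node and rel == relation]

    print(neighbours("e1", "instrument"))   # ['Spoon']
    print(neighbours("w1", "part"))         # ['Shell']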
Note that some of this conceptual graph is factual and not definitional. The factual part should not be represented in a dictionary. Rather, we might expect that the concepts that make up this graph would be composed during the process of representing a sentence or other phrase. In DIMAP, the basic conceptual graph that is considered definitional would have an entry for concept_1. This entry would have a ROLE relation, with a link to concept_2. In conceptual graphs, type labels fall into a type hierarchy. Thus, Mammal < Animal. These relationships are represented as before, using the SUPERCONCEPT links in DIMAP, and can fit within an ontology as described in the previous sections. These conceptual graphs have certain properties (based on the lattice that is formed) and are suitable for further computation to identify generalizations, combination, and inference (for which, see the references).

Velardi et al. use conceptual graphs to provide the formalism for collocative meaning representation. Conceptual meaning provides the cognitive content of words--it can be expressed by features or primitives (as described in previous sections). Collocative meaning provides the associations between words or word classes--describing the uses of a word, but not attempting an explanation of word associations in terms of meaning relations between a lexical item and other items or classes. Collocative meaning corresponds to the syncategorematic distinction in psychology: such words are "almost entirely defined by their pattern of use." Humans may more naturally describe word senses by their syntactic feature and structure characteristics and their relations with other words than by conceptual kinship and other internal features. In principle, the inferential power of collocative, or surface, meaning representation is lower than that of conceptual meaning. Nonetheless, representing word senses and sentences with surface semantics is useful for many NLP applications. Velardi et al.'s system is able to acquire syncategorematic concepts by learning and interpreting patterns of use from text exemplars. The input to the system is:

-- a list of syntactic collocates (subject-verb, verb-object, noun-preposition-noun, noun-adjective, etc., extracted through morphologic and syntactic analysis of the selected corpus). This can be accomplished with syntactic parsing of sentence parts, using some context-dependent heuristics to cut sentences into clauses.

-- a semantic bias. The semantic bias is the kernel of any learning algorithm, as no system can learn much more than what it already knows. It consists of: a domain-dependent concept hierarchy (a many-to-many mapping from words to word sense names and an ordered list of conceptual categories, to whatever extent a type hierarchy has been developed); a set of domain-dependent conceptual relations, and a many-to-many mapping between syntactic relations and the corresponding conceptual relations; and a set of coarse-grained selectional restrictions on the use of conceptual relations, represented by concept-relation-concept (CRC) triples.

The system produces two types of output: (1) a set of fine-grained CRCs that are clustered around concepts or around conceptual relations; and (2) an average-grained semantic knowledge base, organized in CRC triples. Fine-grained CRCs are those in which concepts directly map into content words (e.g., [COW] <- (PATIENT) <- [BREED]). These CRCs are true because they are observed in the domain subworld.
Average-grained CRCs are those in which concepts are ancestors of content-word concepts. These CRCs are typically true, but they may have a limited number of exceptions observed in the domain sublanguage (e.g., [ANIMAL] <- (PATIENT) <- [BREED] is typically true, even though breeding mosquitoes is quite odd). Coarse-grained CRCs are those in which concepts are at a higher level in the taxonomy (e.g., [ACTION] -> (BENEFICIARY) -> [ANIMATE_ENTITY]). They state necessary, but not sufficient, conditions on the use of conceptual relations. An algorithm is given to acquire syncategorematic knowledge on concepts. This algorithm is based on machine learning principles of the type enunciated by Langley. Several research problems are described, along with an evaluation of the adequacy of the technique. The research is compared to research on concept formation and lexical acquisition. To represent CRCs in DIMAP, for example,

[ACTIVITY] -> (LOCATION) -> [PLACE]
[CHANGE] -> (FINAL_STATE) -> [PRODUCT]
[ARTIFACT] -> (MATTER) -> [MATTER]
[ACTIVITY] -> (FIGURATIVE_LOCATION) -> [AMBIT]
[FARMING] -> (LOCATION) -> [GREENHOUSE]
[AGRICULTURAL_ACTIVITY] -> (LOCATION) -> [BUILDING_FOR_CULTIVATION]

the ROLE field should be used. The first member of each CRC would be the main entry, with the relation coded as the role name, and the third member of the relation would be entered as a role link.

10.2 Collocational functions

The collocational relations identified by Velardi et al. focus on the semantic categories of their arguments. A slightly different view focuses on the relations themselves. Mel'čuk, in developing the combinatory facet of the explanatory and combinatory dictionary, is concerned with identifying the lexical collocations of the headword of a lexical entry. These collocational functions include both syntagmatic and paradigmatic information (about synonyms, superordinates, antonyms, and conversives) and information about derivatives and compounds. The arguments of these lexical functions are themselves lexical units and therefore will also be dictionary entries. The lexical combinatoric describes the syntax and meaning of those idiomatic or semi-idiomatic expressions containing the lexeme. There are about 50 elementary lexical functions (with specific syntactic roles) whose terms, taken either alone or in combination, can express the meanings of many semi-idiomatic expressions. Lexical functions also include a set of "substitution functions," which express semantic or syntactic relations between lexemes. Furthermore, these functions can have semantic constraints (selectional restrictions) on their arguments. The functions describe relations to other lexemes; such functions include the ISA, AKO, and instance links. In DIMAP, these functions can be represented in ROLE structures, with the function name identified as the role name, the argument to the function as the entry word, and the value of the function as the role value. The value of the function should be another entry in the lexicon. The following lexical functions are in general use.
Sample Lexical Functions

A0 - adjective derived from entry word: A0(dog) = doggy
A1, A2 - typical adjective for numbered participant: A1(suspicion) = full of
Able1, Able2 - ability of numbered participant: Able1(fear) = fearful; Able2(fear) = fearsome
Adv0 - adverb from entry word: Adv0(happy) = happily
Adv1, Adv2 - adverb from numbered participant: Adv1(fear) = fearfully
Anti - antonym (exact or near): Anti(happy) = sad
Bon - standard praise for entry: Bon(advice) = sound
Caus - cause: Caus(sit) = set
Centr - center of: Centr(city) = heart
Cont - continue: Cont(go) = keep, keep on
Contr - non-antonymic contrast: Contr(chair) = table
Conv (with participant-number subscripts) - conversive (opposite where participants switch roles): Conv321(buy) = sell
Culm - culmination of: Culm(ability) = peak
Degrad - degradation of: Degrad(marriage) = fall apart
Epit - standard epithet (representing a part of entry): Epit(body) = physique
Excess - excessive functioning of: Excess(eyelid) = flutter
Fact0, Fact1, Fact2 - verb meaning "the realization of," with the entry as grammatical subject and the participants as objects: Fact0(suspicion) = confirm
Figur - metaphor of the entry: Figur(love) = fire
Fin - stopping of: Fin(fly) = land
Func0, Func1, Func2 - verb which takes the entry as subject of the first participant, second participant, etc.: Func1(idea) = come to
Gener - generic word: Gener(blue) = color
Germ - the core of: Germ(problem) = crux
Imper - the command associated with: Imper(care) = Watch out!
Involv - verb meaning non-participant involvement: Involv(scent) = fill
Incep - the beginning of: Incep(fly) = take off
Labor (with participant-number subscripts) - verb which takes the numbered participants as subject and object and the entry as secondary object: Labor12(esteem) = hold (i.e., x holds y in esteem)
LabReal (with participant-number subscripts) - verb meaning "the realization of," with the first two participants as subject and object and the entry as secondary object (a combination of Labor and Real): LabReal12(mind) = bring to (x brings y to mind)
Liqu - the elimination of: Liqu(group) = disband
Loc-in - preposition for "in": Loc-in(house) = in
Loc-ab - preposition for "from"
Loc-ad - preposition for "to"
Magn - intensity: Magn(hatred) = deep
Manif - is manifest in, with the entry as subject: Manif(tear) = well up
Minus - less of: Minus(wind) = slacken
Mult - a regular aggregate of: Mult(paper) = ream
Nocer - to harm, injure, or impair: Nocer(access) = cut off
Obstr - to function with difficulty: Obstr(justice) = obstruct
Oper1, Oper2, Oper3 - verb which takes the numbered participant as subject and the entry as object: Oper1(party) = throw
Perm - permit or allow: Perm(go) = let
Plus - more of: Plus(joy) = grow
Pos1, Pos2, Pos3 - positive attributes of numbered participants: Pos1(game) = skilled
Pred - copula for nouns and adjectives: Pred(prey) = fall, be
Propt - preposition for "because of": Propt(greed) = out of
Prox - to be on the verge of: Prox(disaster) = on the brink of
Qual1, Qual2, Qual3 - highly probable qualities of numbered participants: Qual1(theft) = sneaky
Real1, Real2, Real3 - verb meaning "to realize," with the entry as object and the numbered participant as subject: Real1(ambition) = realize
S0 - noun for entry: S0(hate) = hatred
S1, S2, S3 - typical noun for numbered participant: S2(crime) = victim
S-inst - typical instrument
S-loc - typical location: S-loc(house) = yard
S-med - typical means
S-mod - typical mode
S-res - typical result
Sing - one instance of: Sing(paper) = sheet
Son - to emit a typical sound: Son(frog) = croak
Sympt - to be a physical symptom of: Sympt(fire) = smoke
Syn - synonym and near synonym: Syn(happy) = glad
V0 - verb for entry: V0(song) = sing
Ver - true, correct, or proper: Ver(ruling) = fair

In the functions followed by a number (0, 1, 2, or 3), the numbered participant corresponds to an argument identified in the definition
The entry for smell (senses 6 and 8) in the sample dictionary DICT4 shows examples of the lexical functions Gener and Magn. As we have seen, many of these lexical functions can be represented in DIMAP using the SUPERCONCEPT, FEATURE, and INSTANCE structures. However, these functions provide further insights into the use of language and the relationships that may exist between lexical items. They have not yet been well incorporated into computational models, having been viewed primarily as capturing relationships between lexical items, but their potential is considerable. Moreover, just as agent was made a lexical entry in Carlson and Nirenburg's ontology, so too can these lexical functions be made into lexical entries.

10.3 Lexical subordination and qualia structures

Lexical items are frequently related to one another by derivation. The process of lexical extension involves the ability of a lexical item in one semantic class to take on an extended use in a second, existing class; the second use generally stands in some specific relationship to the first. Thus, a verb of attachment is used as a verb of creation; this might be tied to the fact that the process of attaching can be the means of creating something. This is an instance of a more general pattern of extension that arises when the action denoted by verbs in one class is a means of achieving the action denoted by verbs in the second class. Thus, verbs of gesture frequently show extended uses as verbs of expressing feelings.

Levin and Rapoport introduce the notion of lexical subordination, which they describe as responsible for extended meanings of words: an existing verb productively takes on a new and predictable meaning, that is, a meaning that may be considered to have been derived from an existing meaning of the verb. Some examples are

-- resultative construction: the state denoted by an adjective holds of a noun phrase as a result of the action denoted by the verb, including transitive verbs (hammer the metal flat), unergative intransitive verbs (laugh herself silly), and certain verb-particle constructions (scrape the putty off);

-- conflation of cause, manner, motion, and path components: the extension involves "causing a change of location" (John floated a bottle into the cave);

-- gesture-expression: expressing by means of a gesture, with the transitive use of a class of typically intransitive verbs (smiled her thanks); and

-- "one's way" and "a hole" constructions: results brought about by the action denoted by their verb (explained his way past the guard and kicked a hole in the fence).

These constructions do not simply involve a verb plus preposition, particle, adjective, or noun. Rather, both lexical units make independent contributions to the meaning of the construction, compositionally. Lexical subordination operates at the level of lexical conceptual structure. It takes a verb in its original, or basic, sense and subordinates it under a lexical predicate. In the 'result' construction, the new representation involves the addition of a variable in the new verb sense, a variable not present in the lexical conceptual structure of the original, unsubordinated verb. (The reason the term 'subordination' is used is that the new lexical conceptual structure contains the original sense as a subordinate clause.)
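To make the subordination operation concrete, here is a minimal sketch (hypothetical Python, not DIMAP code and not Levin and Rapoport's notation) in which the lexical conceptual structure of the basic verb sense is embedded as a subordinate clause of the derived resultative sense:

    # Toy lexical conceptual structures (LCS) as nested tuples.
    # Basic sense: x hammers y.
    basic_hammer = ("HAMMER", "x", "y")

    def subordinate(basic_lcs, result_state):
        """Derive the resultative LCS from the basic one.  The derived sense
        adds a result-state variable absent from the original LCS, and the
        original LCS survives as a subordinate 'BY' (means) clause."""
        return ("CAUSE", "x",
                ("BECOME", "y", result_state),   # y comes to be in the result state
                ("BY", basic_lcs))               # ...by means of the basic action

    # "hammer the metal flat"
    print(subordinate(basic_hammer, ("FLAT", "y")))
    # ('CAUSE', 'x', ('BECOME', 'y', ('FLAT', 'y')), ('BY', ('HAMMER', 'x', 'y')))

The predicate names CAUSE, BECOME, and BY are placeholders; the point is simply that the derived sense both contains the basic sense as a subordinate clause and introduces a variable (the result state) that the basic sense lacks.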
Pustejovsky builds upon these notions of lexical subordination by proposing structured forms (or templates) that embody many of the derivational forms within a single sense, rather than having several senses, each with its own set of features. The particular part that is extracted from the single sense depends on the syntactic and semantic context. Thus, when considering the several senses of break, one allowing an agent subject, another having a theme subject, and another allowing an instrument subject, he takes advantage of the fact that each sense involves the central notion of "breaking". To accomplish this, Pustejovsky would give each lexical item a qualia structure, specifying four aspects of its meaning:

-- the relation between it and its constituent parts (constitutive role);
-- that which distinguishes it within a larger domain (its physical characteristics) (formal role);
-- its purpose and function (telic role); and
-- whatever it brings about (agentive role).

This minimal semantic distinction is given expressive force when combined with a theory of event types (the event structure). Since the lexical semantic representation of a word is not an isolated expression, but is in fact linked to the rest of the lexicon, the semantics for a lexical item is integrated through the different qualia associated with a word (the lexical inheritance structure). Finally, part of the meaning of a word is translated from its underlying semantic representation into expressions that are utilized by the syntax (the argument structure).

Argument Structure

The construction of a lexical entry begins with a simple listing of the parameters or arguments associated with a predicate. To this extent, the entry would be constructed in a way similar to that described earlier in this chapter. However, because several senses are to be folded into one, the argument structure will become more sophisticated.

Event Structure

One level of semantic description involves an event-based interpretation of a word or phrase, recursively defined on the syntax, so that it is also a property of phrases and sentences. There are three classes of events: states (eS), processes (eP), and transitions (eT). These events may be decomposed into other events, as needed (hence, a subeventual analysis).

Qualia Structure

Many senses for a word are derived from a base sense, hence implying a richer notion of compositionality. The expressions that behave as arguments to a function are not simple, passive objects, but are active in the semantics. Certain complements add to the basic meaning by virtue of what they denote, through a process of semantic type coercion (a semantic operation that converts an argument to the type that is expected by a function, where it would otherwise result in a type error). Thus, processes can shift their event type to become a transition event, process verbs can participate in a resultative construction, or a subpart or related part of an object can stand for the object itself (metonymy). There is a system of relations that characterizes the semantics of nominals (nouns), where the qualia structure of the noun determines its meaning as much as the list of arguments determines a verb's meaning.
-- Constitutive Role: the relation between an object and its constituents, or proper parts (material, weight, parts and component elements);
-- Formal Role: that which distinguishes the object within a larger domain (orientation, magnitude, shape, dimensionality, color, position);
-- Telic Role: purpose and function of the object (purpose that an agent has in performing an act, built-in function or aim that specifies certain activities); and
-- Agentive Role: factors involved in the origin or "bringing about" of an object (creator, artifact, natural kind, causal chain).

If we begin with a verb's lexical entry specifying the type of its complements, we can search the lexical entry of the noun complements for values matching the specified type. Each of the qualia roles can be viewed as a partial function from a noun denotation into its subconstituent denotations. When one of these functions is applied, it returns the value of a particular qualia role. If the complement does not match and needs to be coerced, type coercion requires the complement to conform to the type specification, and its qualia roles are searched for an appropriate type. If there is none, a type error is produced. There may be several readings available, sometimes resulting in ambiguity.

An example is provided by the words novel and began in DICT4. A sentence like John began a novel is problematic for the word began, which requires that its object describe a transition event. However, the word novel does not fit this type, and so we would require a different sense of the word began; this sense does not exist. By enlarging the representation for the word novel to include different types of roles, the problem can be solved. The entry for novel would have the following components:

    novel ($var0)
        const = narrative ($var0)
        form = book ($var0)
        telic = read (T, $var1, $var0)
        agentive = (*OR* artifact ($var0) write (T, $var1, $var0))

There are several ways of viewing a novel: as constituting a narrative, in the form of a book, to be read or written, and as an artifact of someone's effort. Only when viewed as something to be read or written does novel fit the context. (The 'T' in the telic and agentive positions indicates that these are transition events.) The entry for began would thus have a theme feature with the selectional restriction that it be a transition event.

A more complex example is provided by the entry for the word bake, where the object may change the sense of bake that is selected. For sense 1, the superconcept link would be in the hierarchy involving an agent causing a change of state (when the object is a potato); for sense 2, the superconcept link would be to verbs of creation (when the object is a cake). It is only when the verb interacts with the possible roles of the object that it is determined that a potato is changed or a cake is created.

Once semantic weight is given to lexical items other than verbs, the semantic distinctions that are possible are quite wide-ranging. We can think of certain modifiers as modifying only a subset of the qualia for a noun. Pustejovsky distinguishes the following systems and the paradigms that lexical items fall into:

-- count/mass alternations;
-- container/containee alternations;
-- figure/ground reversals;
-- product/producer diathesis;
-- plant/fruit alternations;
-- process/result diathesis;
-- object/place reversals;
-- state/thing alternations; and
-- place/people alternations.
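Returning to the John began a novel example, the coercion step can be sketched roughly as follows (a toy Python illustration with assumed role names; it is not Pustejovsky's formalism or DIMAP's internal representation):

    # Toy qualia structure for 'novel': role -> value, where event-valued
    # roles are tagged with their event type.
    NOVEL_QUALIA = {
        "const": "narrative",
        "formal": "book",
        "telic": ("transition", "read"),      # a novel is for reading
        "agentive": ("transition", "write"),  # ...and comes about by being written
    }

    def coerce_to_event(qualia, required="transition"):
        """Return the event readings hidden in a noun's qualia roles.
        Raises a type error if no role supplies the required event type."""
        readings = [value[1] for value in qualia.values()
                    if isinstance(value, tuple) and value[0] == required]
        if not readings:
            raise TypeError("no coercion to a %s event is available" % required)
        return readings

    # 'began' demands a transition-event complement; 'novel' supplies two.
    print(coerce_to_event(NOVEL_QUALIA))   # ['read', 'write']

The two readings correspond to beginning to read the novel and beginning to write it, the kind of ambiguity noted above; a noun with no event-valued qualia role would produce the type error instead.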
Lexical Inheritance

The flexibility that arises when a word's meaning can be generated by composition can be placed within a global knowledge base, capturing the inheritance relations between concepts and how the concepts are integrated into a coherent expression in a given sentence. There are two inheritance mechanisms: fixed and projective. The first consists of a fixed network of relations, which is traversed to discover existing related and associated concepts (e.g., hyponyms and hypernyms). Projective inheritance operates generatively from the qualia structure of a lexical item to create a relational structure for ad hoc categories; it can deal with what is usually assumed to be commonsense knowledge.

With a fixed inheritance structure, we can identify a sequence Q1, P1, ..., Pn as an inheritance path, which can be read as the conjunction of ordered pairs <xi, yi>. The conclusion space of a set of sequences σ is the set of all pairs <Q, P> such that a sequence Q, ..., P appears in σ. The traditional is-a relation relates the pairs by a generalization operator (ordering the concepts in a lattice), as well as by other relations.

In addition to these fixed relational structures, we can dynamically create arbitrary concepts through the application of certain transformations on lexical meanings. For example, for any predicate Q (the value of a qualia role), we can generate its opposition, ¬Q. A projective transformation, π, on a predicate Q1 generates a predicate Q2 such that π(Q1) = Q2, where Q2 need not itself appear in the conclusion space. The set of transformations includes: ¬ (negation), < (temporal precedence), > (temporal succession), = (temporal equivalence), and act (an operator adding agency to an argument). The space of concepts traversed by the application of such operators will be related expressions in the neighborhood of the original lexical item. A series of applications of transformations, π1, ..., πn, generates a sequence of predicates, Q1, ..., Qn, called the projective expansion of Q1, written P(Q1). The projective conclusion space, P(σR), is the set of projective expansions generated from all elements of the conclusion space σ on role R of predicate Q; that is, P(σR) = { P(Q1), ..., P(Qn) | Q1, ..., Qn ∈ σR }.

For example, we can have the concept of "being confined" with its opposite "not being confined"; these concepts can be related temporally, with an operator arising from the transition event of "escaping". Thus, "the prisoner escaped" shows a closer association between subject and verb than "the prisoner ate": "escaping" falls within the conclusion space for the telic role of prisoner. Generating the projective conclusion space as a graph, we can take those graphs that result in no contradictions to be the legitimate semantic interpretations of the entire sentence.

10.4 Lexical rules

Flickinger describes lexical rules relating lexical entries both for inflection and for derivation. The present sample dictionary does not include the lexical rules, but they can easily be added using typed feature structures (see Copestake and Briscoe). Atkins uses the term LINK-RULE, while Levin talks about meaning extension; both assume that the basic sense and the derived sense exist within the dictionary. Copestake and Briscoe formalize these notions by calling them lexical rules and making them a component of a unification-based lexicon employing (default) inheritance and typed feature structures.
In many cases, there might not be a derived sense in the dictionary; rather, the derivation exists through some sort of coercion during syntactic and semantic interpretation (for example, when a metaphorical interpretation is adopted). Even in these cases, lexical rules characterize what is occurring. Lexical rules can cover a variety of situations: derivational morphological processes, change of syntactic class (conversion), argument structure of the derived predicate, affixation, and metonymic sense extensions. Establishment of lexical rules within the lexicon must also take into account possible blocking (where a lexeme already exists that expresses the sense that would otherwise be derived). Thus, lexical rules should "express sense extension processes, and indeed derivational ones, as fully productive processes which apply to finely specified subsets of the lexicon, defined in terms of both syntactic and semantic properties expressed in the type system underlying the organization of the lexicon."

Copestake and Briscoe present a lexical representation system using typed feature structures, which are necessary to formalize the notions of an ontology, as described earlier. Feature structures must be well-formed with respect to types. Particular features will only be appropriate to specified types and their subtypes. Types are hierarchically ordered. Constraints can be associated with types to allow non-default inheritance. Default inheritance consists of default unification of feature structures ordered by an inheritance hierarchy. The type system constrains both default inheritance and lexical rule application.

The type system defines a partial ordering on the types, thus identifying which types are consistent. Only feature structures whose types have a common subtype can be unified; if two types are unordered in the hierarchy and share no subtype, they are inconsistent. Every consistent set of types has a unique greatest lower bound. Thus, when two feature structures of types a and b are unified, the type of the result will be the greatest lower bound of a and b, which must be unique if it exists. If no greatest lower bound exists, the types are inconsistent and unification fails.

Every type must have a feature structure to provide the constraints necessary to identify the type and establish the range of features that are appropriate for that type. This makes a well-formed feature structure. Constraints are inherited by all subtypes of a type, but a subtype may introduce new features (which will be inherited by all its subtypes). The constraints for a type must be mutually consistent. This inheritance of constraints allows concise definitions of all lexical entries.

A type is a bare atom naming the feature structure (and would be entered as an ontological entry in DIMAP--see DICT4), as the word #artifact in (1). The "artifact" type has a telic feature whose value is a feature structure of type formula. An atomic type would have only a name, but no features. A feature value can either be atomic or be a feature structure, perhaps given just through the mention of the type, as the word formula in (1).

    (1) [artifact TELIC = formula]

    (2) [physobj FORM = physform
                 PHYSICAL-STATE = solid]

To represent the structure in (1), we would have one sense, with no part of speech, and with only an entry in the feature slot, with feature name telic and feature value formula. In (2), the same principles of representation hold. Here, solid is an atomic feature structure and physform is a complex feature structure.
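As a rough illustration of these constraints (a toy Python sketch; the type names ind_obj and substance anticipate (3) and (4) below, and a real system such as Copestake and Briscoe's performs recursive unification and enforces per-type constraints), typed unification can be pictured as finding the greatest lower bound of the two types and merging their features:

    # Toy type hierarchy: each type maps to the set of its (proper) supertypes.
    HIERARCHY = {
        "top": set(),
        "physobj": {"top"},
        "ind_obj": {"physobj", "top"},
        "substance": {"physobj", "top"},
    }

    def subtypes(t):
        """All types at or below t in the hierarchy."""
        return {s for s, supers in HIERARCHY.items() if s == t or t in supers}

    def glb(a, b):
        """Greatest lower bound (most general common subtype), if unique."""
        common = subtypes(a) & subtypes(b)
        maximal = [t for t in common
                   if all(t == o or t in HIERARCHY[o] for o in common)]
        return maximal[0] if len(maximal) == 1 else None

    def unify(fs1, fs2):
        """Unify two flat feature structures of the form (type, {FEATURE: value}).
        Returns None when the types have no common subtype or values clash."""
        t = glb(fs1[0], fs2[0])
        if t is None:
            return None                      # inconsistent types
        feats = dict(fs1[1])
        for f, v in fs2[1].items():
            if f in feats and feats[f] != v:
                return None                  # clashing atomic values
            feats[f] = v
        return (t, feats)

    print(unify(("physobj", {"PHYSICAL-STATE": "solid"}),
                ("ind_obj", {"FORM": "individuated"})))
    # ('ind_obj', {'PHYSICAL-STATE': 'solid', 'FORM': 'individuated'})
    print(unify(("ind_obj", {}), ("substance", {})))
    # None -- ind_obj and substance share no subtype, so unification fails

Feature values are treated as atomic here for brevity; in the full system a value may itself be a feature structure and is unified recursively.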
We can give further structure to physform, as in (3) and (4).

    (3) [ind_obj FORM = [physform SHAPE = individuated]]

    (4) [substance FORM = [physform SHAPE = unindividuated]]

There are several options for representing these structures in DIMAP:

(1) They can be represented directly. That is, there would be entries for #ind_obj and #substance, each with one sense having the feature form, the former with the value "+physform (shape individuated)" and the latter with the value "physform (shape unindividuated)".

(2) The two different values of shape can be entered in (the feature component of) two different senses of an entry for #physform. Then, in the feature component of #ind_obj and #substance, enter the values of form as "physform 1" and "physform 2", respectively. In this schema, we would have to know that the entry at #physform contains the information necessary to expand (or fully specify, to use Flickinger's terminology) the entries for #ind_obj and #substance. This option might have the advantage that the expansion can be accomplished recursively. That is, if the entry #physform has feature values that are also non-atomic, a program developing the fully-specified form for #ind_obj and #substance would automatically continue to move up the hierarchy specified through these links.

(3) Use ISA links. If we merely specify the value of form as "+physform" and then link this sense hierarchically to #physform 1, we could pick up the nesting chain necessary to build the fully-specified entry.

(4) Use role links. Instead of using the feature component, the role component can be used more directly. In this case, form would be the role name, #physform would be the role value, and the particular sense could also be identified.

It seems that options 1 and 2 are the better choices. In selecting an option, it is necessary to ensure that you have a clean structure that will permit unification and other operations with feature structures. These operations will need to deal with several idiosyncrasies of feature structures. For example, in (5) and (6), we would have to recognize scalar as referring to whole integers, gender as being dichotomous with values "M" and "F" (or three-valued), and boolean as having the values "0" or "1" (or "+" or "-").

    (5) [creature AGE = scalar
                  SEX = gender]

    (6) [animal EDIBLE = boolean]

In (7), (8), (9), and (10), the feature structures of the top level of a lexical hierarchy are presented. The relations identified by '<=' would be represented using the SUPERCONCEPT link in DIMAP. (See entries in DICT4 for #lex-sign, #noun, #count-noun, and #mass-noun. In these entries, note particularly the SUPERCONCEPT and FEATURE values and how generic values such as string or boolean can serve as type constraints. RQS stands for "relativized qualia structure"; nomrqs stands for "nominal relativized qualia structure".)

    (7) lex-sign <= top
        [lex-sign ORTH = string]

    (8) noun <= lex-sign
        [noun COUNT = boolean
              RQS = nomrqs]

    (9) count-noun <= noun
        [noun COUNT = +]

    (10) mass-noun <= noun
         [noun COUNT = -]

Thus, a lexical entry for the word haddock (see DICT4) would need to identify links to appropriate places in hierarchies that should be unified, to result in the expanded feature structure shown in (11). In terms of the feature structures noted above, all that we would have to give would be pointers to #count-noun and #animal, in order to obtain the fully-specified form in (11).
    (11) [count-noun ORTH = "haddock"
                     SYNTAX = [COUNT = +]
                     RQS = [animal SEX = gender
                                   AGE = scalar
                                   EDIBLE = boolean
                                   PHYSICAL-STATE = solid
                                   FORM = [physform SHAPE = individuated]]]

A lexical rule is a feature structure of type lexical_rule, specified as follows:

    (12) [lexical_rule 0 = lex-sign
                       1 = lex-sign]

Every lexical rule must have the features 0 and 1, each of which must have a value of type #lex-sign. A new lexical sign is generated by taking a lexical entry which satisfies the specifications in the <1> feature path (that is, unifies with that feature structure) and creating the feature structure in the <0> feature path. For example, a rule for #grinding (making an individuated object into a substance) can be specified as follows:

    (13) grinding <= lexical_rule
         [grinding 1 = [count-noun ORTH = $var0
                                   RQS = ind_obj]
                   0 = [mass-noun ORTH = $var0
                                  RQS = substance]]

Specific contexts (predicational and syntactic) will force the application of the lexical rule (coercion). Moreover, the typed framework of the lexicon allows us to identify those lexical items to which a lexical rule can apply. For example, we can have the more specific 'grinding' lexical rule shown in (14). This rule would be identified in DIMAP as #animal-grinding with a SUPERCONCEPT link to #grinding. The RQS (relativized qualia structure) in (14) is consistent with and more specific than the RQS structures in (13). If we next take the entry for haddock shown in (11), we see that the #animal-grinding rule is applicable to it. To establish the applicability, we unify #animal-grinding with #grinding, with the primary effect being to add the RQS of #ind_obj to that of #animal. (Since these two stand in an ISA relationship, the unification is immediate, except that the more specific entry may have default values that override what may be present in entries higher in the hierarchy.) The entry at (11) then easily unifies to give the result in (15).

    (14) animal-grinding
         [grinding 1 = [RQS = [animal EDIBLE = +]]
                   0 = [RQS = food_substance]]

    (15) [mass-noun ORTH = "haddock"
                    COUNT = -
                    RQS = [food_substance TELIC = [formula PRED = eat]]]

The result of applying a lexical rule to a specific sense is notated sense + rule_name, as in

    lamb_2 < lamb_1 + #animal-grinding

This sense extension can be represented directly in DIMAP, if desired. It can be handled like the representation for past tenses of verbs, using ISA links. In this case, "lamb" (sense 2) would have two ISA links, one to lamb (sense 1) and one to #animal-grinding (which presumably has only one sense). Alternatively, there is no need to mention such links explicitly, but rather to assume that they would be brought into play during processing. If an individuated sense of lamb or haddock did not satisfy the context, the lexical rules associated with those entries could be invoked.

REFERENCES AND BIBLIOGRAPHY

Ahlswede, T. (1985). "A Toolkit for Lexicon Building." Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics, 268-75.

Allen, J. (1987). Natural Language Understanding. Menlo Park, CA: The Benjamin/Cummings Publishing Company, Inc.

Amsler, R. E. (1980). The Structure of the Merriam-Webster Pocket Dictionary, TR-164. Austin, TX: Department of Computer Science, University of Texas.

Atkins, B. T. S. (1991). "Building a Lexicon: The Contribution of Lexicography." International Journal of Lexicography, 4(3), 167-204.

Atkins, B. T. S., J. Kegl, and B. Levin. (1988). "Anatomy of a Verb Entry: from Linguistic Theory to Lexicographic Practice." International Journal of Lexicography, 1(2), 84-126.
Boguraev, B. K. (1991). "Building a Lexicon: The Contribution of Computers." International Journal of Lexicography, 4(3), 227-60.

Bresnan, J. (ed.). (1982). The Mental Representation of Grammatical Relations. Cambridge, MA: MIT Press.

Briscoe, T. and A. Copestake. (1991). "Sense Extensions as Lexical Rules." Proceedings of the IJCAI Workshop on Computational Approaches to Non-Literal Language (also as ESPRIT BRA-3030 ACQUILEX Working Paper No. 22).

Carlson, L. and S. Nirenburg. (1990). World Modeling for NLP, Technical Report CMU-CMT-90-121. Pittsburgh, PA: Carnegie Mellon University, Center for Machine Translation.

Chodorow, M. S. and R. J. Byrd. (1985). "Extracting Semantic Hierarchies from a Large On-Line Dictionary." Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics. Chicago, IL: Association for Computational Linguistics, 299-304.

Copestake, A. A. and E. J. Briscoe. (1991). "Lexical Operations in a Unification Based Framework." Proceedings of the ACL SIGLEX Workshop on Lexical Semantics and Knowledge Representation, 88-101.

Cruse, D. A. (1986). Lexical Semantics. Cambridge: Cambridge University Press.

Evens, M. and R. N. Smith. (1978). "A lexicon for a computer question-answering system." American Journal of Computational Linguistics, Microfiche 81, 1-99.

Evens, M. W. (ed.). (1988). Relational Models of the Lexicon: Representing Knowledge in Semantic Networks. Cambridge: Cambridge University Press.

Fellbaum, C. (1990). "English Verbs as a Semantic Net." International Journal of Lexicography, 3(4), 278-301.

Fikes, R. E. and T. Kehler. (1985). "The Role of Frame-Based Representation in Reasoning." Communications of the ACM, 28(9), 904-20.

Flickinger, D. P. (1987). Lexical Rules in the Hierarchical Lexicon, PhD Dissertation. Stanford, CA: Stanford University.

Flickinger, D., C. Pollard, and T. Wasow. (1985). "Structure-sharing in lexical representation." Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics. Chicago, IL: Association for Computational Linguistics, 262-7.

Frawley, W. (1988). "Relational models and metascience," in Evens, M. W. (ed.), Relational Models of the Lexicon: Representing Knowledge in Semantic Networks. Cambridge: Cambridge University Press, 334-72.

Gross, D. and K. Miller. (1990). "Adjectives in WordNet." International Journal of Lexicography, 3(4).

Holub, A. I. (1990). Compiler Design in C. Englewood Cliffs, NJ: Prentice Hall.

Ilson, R. and I. A. Mel'čuk. (1989). "English BAKE Revisited (BAKE-ing an ECD)." International Journal of Lexicography, 2(4), 326-45.

Jackendoff, R. (1983). Semantics and Cognition. Cambridge, MA: The MIT Press.

Jackendoff, R. (1987). "The Status of Thematic Relations in Linguistic Theory." Linguistic Inquiry, 18, 369-411.

Jackendoff, R. (1990). Semantic Structures. Cambridge, MA: The MIT Press.

Levin, B. (1991a). "Building a Lexicon: The Contribution of Linguistics." International Journal of Lexicography, 4(3), 205-26.

Levin, B. (forthcoming). "Approaches to Lexical Semantic Representation," in D. Walker, A. Zampolli, and N. Calzolari (eds.), Automating the Lexicon, I: Research and Practice in a Multilingual Environment. Oxford: Oxford University Press.

Levin, B. and T. R. Rapoport. (1988). "Lexical Subordination." Proceedings of the 24th Annual Meeting of the Chicago Linguistic Society, Part One: The General Session, 275-289.

Litkowski, K. C. (1978). "Models of the Semantic Structure of Dictionaries." American Journal of Computational Linguistics, Microfiche 81, 25-74.
Mel'čuk, I. (1988). "Semantic Description of Lexical Units in an Explanatory Combinatorial Dictionary: Basic Principles and Heuristic Criteria." International Journal of Lexicography, 1(3), 165-188.

Meyer, I., B. Onyshkevych, and L. Carlson. (1990). Lexicographic Principles and Design for Knowledge-Based Machine Translation, Technical Report CMU-CMT-90-118. Pittsburgh, PA: Carnegie Mellon University, Center for Machine Translation.

Miller, G. A. (1990). "Nouns in WordNet: A Lexical Inheritance System." International Journal of Lexicography, 3(4).

Miller, G. A., R. Beckwith, C. Fellbaum, D. Gross, and K. Miller. (1990). Five Papers on WordNet, CSL Report 43. Princeton, NJ: Cognitive Science Laboratory.

Nirenburg, S. and C. Defrise. (forthcoming). "Practical Computational Linguistics," in R. Johnson and M. Rosner (eds.), Computational Linguistics and Formal Semantics. Cambridge: Cambridge University Press.

Pereira, F. C. N. and S. M. Shieber. (1987). Prolog and Natural-Language Analysis. (CSLI Lecture Notes, No. 10). Stanford, CA: Center for the Study of Language and Information.

Pollard, C. and I. A. Sag. (1987). Information-Based Syntax and Semantics: Volume 1 - Fundamentals. (CSLI Lecture Notes, No. 13). Menlo Park, CA: Center for the Study of Language and Information.

Polovina, S. and J. Heaton. (1992). "An Introduction to Conceptual Graphs." AI Expert, 7(5), 36-43.

Proudian, D. and C. Pollard. (1985). "Parsing Head-Driven Phrase Structure Grammar." Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics, 167-71.

Pustejovsky, J. (1991). "The Generative Lexicon." Computational Linguistics, 17(4), 409-41.

Schank, R. and R. Abelson. (1977). Scripts, Plans, Goals, and Understanding. Hillsdale, NJ: Lawrence Erlbaum Associates.

Schank, R. C. (1972). "Conceptual dependency: A theory of natural language understanding." Cognitive Psychology, 3(4), 552-631.

Schank, R. C. (1973). "Identification of Conceptualizations Underlying Natural Language," in R. C. Schank and K. M. Colby (eds.), Computer Models of Thought and Language. San Francisco, CA: W. H. Freeman.

Sowa, J. F. (1984). Conceptual Structures: Information Processing in Mind and Machine. Menlo Park, CA: Addison-Wesley.

Sowa, J. F. (ed.). (1991). Principles of Semantic Networks: Explorations in the Representation of Knowledge. San Mateo, CA: Morgan Kaufmann.

Velardi, P., M. T. Pazienza, and M. Fasolo. (1991). "How to Encode Semantic Knowledge: A Method for Meaning Representation and Computer-Aided Acquisition." Computational Linguistics, 17(2), 153-70.