A PRIMER ON COMPUTATIONAL LEXICOLOGY

Kenneth C. Litkowski
Copyright 1992

CONTENTS

SAMPLE DICTIONARIES AND COMPUTATIONAL LEXICOLOGY
1 BACKGROUND AND ORIENTATION FOR DIMAP LEXICOLOGY
2 LEXICOGRAPHIC PRINCIPLES FOR ORGANIZING A COMPUTATIONAL LEXICON
  2.1 Main entries and headwords
  2.2 Grouping and ordering of senses
  2.3 Pseudoentries (linguistic regularities)
3 PART OF SPEECH (CATEGORY) INFORMATION
4 ORTHOGRAPHY, PHONOLOGY, MORPHOLOGY, AND ADMINISTRATIVE INFORMATION
5 LEXICOGRAPHIC INFORMATION, INCLUDING DEFINITIONS
6 SYNTACTIC FEATURES
7 SYNTACTIC STRUCTURES
  7.1 A Dictionary for Unification Grammars
  7.2 Lexical Functional Grammar
  7.3 Diathesis Alternations
8 SEMANTICS
  8.1 Semantic Roles and Their Relationship to Syntactic Structure
  8.2 Semantic representation, a word's type, and selectional restrictions
9 ONTOLOGIES AND TYPE HIERARCHIES
  9.1 Theoretical structure for an ontology
  9.2 Basic ontological modeling choices
  9.3 WordNet - An ontological database
10 LEXICAL RELATIONS
  10.1 Semantic networks and conceptual graphs
  10.2 Collocational functions
  10.3 Lexical subordination and qualia structures
  10.4 Lexical rules

SAMPLE DICTIONARIES AND COMPUTATIONAL LEXICOLOGY

DIMAP is a toolkit for lexicon building that has its roots in computational linguistics and natural language processing. As the lexicon assumes an ever-increasing importance in these fields, these roots give DIMAP a particular flexibility, one easily adaptable to emerging trends. More significantly, the effort to encode current formalisms in DIMAP has facilitated the sharper identification of their commonalities and differences. These topics form the basis for this chapter of the users manual, providing a brief introduction to principles of computational lexicology, showing how to make use of DIMAP's tools and data structures, and presenting sample lexicons.

This chapter is organized primarily around the theoretical perspective and actual use of the several elements comprising a lexical entry. After a brief introduction describing the background and genesis of DIMAP, the following topics are addressed: (1) lexicographical principles underlying a computational lexicon; (2) grammatical category or part-of-speech; (3) orthographical, phonological, morphological, and administrative components of an entry; (4) lexicographical information, including the definition; (5) syntactic features; (6) syntactic structures; (7) semantics (including meaning representation and selectional restrictions); (8) type hierarchies and ontologies; and (9) lexical, derivational, and collocational relations.
In describing these topics, references are made to sample dictionaries included with DIMAP and to the corresponding literature, including
-- basic principles of lexicon development from Allen; Atkins; Atkins, Kegl, and Levin; and Mel'čuk;
-- grammar formalisms that rely heavily on the lexicon, including lexical functional grammar (Meyer et al.) and head-driven phrase structure grammar (HPSG) (Flickinger);
-- type hierarchies and ontologies (Allen; Carlson and Nirenburg; WordNet; Meyer et al.);
-- collocational patterns (Mel'čuk; Velardi et al.); and
-- semantic relations within a dictionary and the lexicon, including the notions of lexical subordination (Levin and Rapoport), qualia structures (Pustejovsky), semantic networks (Sowa, 1984), and lexical rules and derivations (Copestake and Briscoe; Flickinger).

This literature is not comprehensive, nor is there full agreement on the issues it discusses. However, it provides a useful vehicle for discussing many issues relevant for computational lexicons.

1 BACKGROUND AND ORIENTATION FOR DIMAP LEXICOLOGY

Three seminal papers in studies of ordinary dictionaries were published in 1978. Amsler and Litkowski provided descriptions of the network of definitions present in ordinary dictionaries. Amsler focused on the taxonomies, while Litkowski provided a graph-theoretic model for the taxonomic and other relations. Meanwhile, Evens and Smith provided details on lexical-semantic relations. The DIMAP data structure is especially designed to contain fields for specifying hierarchical and other semantic links between (distinct senses of) the lexical entries.

These papers and many others, both before and since, demonstrate the highly interwoven nature of ordinary dictionaries. The different meanings (senses) of many words provide evidence that many concepts are derivative, made up of more primitive components. There have been many papers discussing the nature of semantic and syntactic primitives. Regardless of the conclusions of such discussions, it is well-accepted that definitions are related to one another. When toy lexicons are developed for computational linguistics applications and natural language processing, they seldom include such derivational links. As you develop lexicons for your own applications, you will find that, at the beginning, very few links are present. DIMAP was designed with such relationships in mind, so it will facilitate their identification and specification.

The more immediate genesis of DIMAP is to be found in Ahlswede, which described a set of interactive routines to create, maintain, and update lexicons. Ahlswede emphasized that the lexicon produced in his system was based on lexical-semantic relations, but easily extended to other models of the lexicon structure. The sample dictionary available to you when first starting DIMAP includes the word "aphasia", which appears in Ahlswede's paper. (This lexicon is also backed up as DICT1 and its associated index files. See chapter 7 for a full description of a suite of dictionary files.)

DIMAP includes the basic facilities described by Ahlswede, but also extends these routines in several directions. The first expansion is mostly transparent to the user, in providing variable-length rather than fixed-length fields for the many components. You do not have to worry about a limited amount of space for any part of a DIMAP entry. The system is designed to handle these variable-length records through random access routines.
DIMAP does not as yet provide facilities for automatically generating the data (that is, hierarchical and other lexical-semantic relations) in the entries. However, you can establish such links, and by a judicious structuring of a hierarchy you can store information parsimoniously.

You can perform a considerable amount of lexicological research using DIMAP, since it has the attached CED. You can, for example, convert a selected set of definitions from the CED to DIMAP entries (following the procedures described in chapter 5). You can then convert these entries to an ASCII file, perhaps containing only the definitions from the CED (following the procedures described in chapter 6). You can then edit these definitions, perhaps excluding function words, or parse the definitions using your own parser to identify the head words of the definitions, finally creating a set of words with which to perform a batch download (again following procedures in chapter 5). Based on your analysis of these definitions, you may want to create a special set of links in the DIMAP entries. Although the CED is not the most comprehensive dictionary, it does contain sufficient information for serious investigation of specialized vocabularies or of lexical relations.

2 LEXICOGRAPHIC PRINCIPLES FOR ORGANIZING A COMPUTATIONAL LEXICON

The dictionary available when you first start DIMAP contains entries for all the lexical items appearing in the first 250 pages of Allen. Each word used in his sample lexicons has been included, along with any features that are part of his entries. This sample dictionary is useful not only for demonstrating how individual components of DIMAP can be used, but also for integrating this lexicon with parsers that are capable of working his examples. The entries in this dictionary provide a useful starting point for presenting principles of organizing a computational lexicon.

2.1 Main entries and headwords

Entries in a computational lexicon generally contain orthographic, phonological, and morphological information; syntactic features and syntactic structure information; and semantic and pragmatic information. In DIMAP, the basic organizational unit is the main entry, which is comparable to what Ilson and Mel'čuk and Meyer et al. call a SUPERENTRY. The lexical units in a dictionary are intended to ensure the lexicalization of the meaning, uniting bundles and configurations of semantic elements into actual lexical units and supplying syntactic and lexical co-occurrence information, thus providing all information associated with the behavior of the lexical unit. In ordinary dictionaries or in some computational lexica, there may be several ENTRIES corresponding to homographs; each of these may be called a LEXEME. As will be seen, there is no necessity for making each lexeme a main entry in DIMAP.

Each entry should be viewed as a complete frame data structure, intended eventually to allow structure-sharing, where entries containing the same information in a particular subframe will point to the same structure. In DIMAP, this is generally handled by encoding a distinct entry containing the repeated information and having a pointer to that entry in the SUPERCONCEPT, INSTANCE, or ROLE fields. Pointers can also be encoded in the FEATURE field, as will be shown below.

What constitutes a main entry headword is subject to various criteria. In Meyer et al., the main entry headword can only be a solid word or a hyphenated compound.
Proper nouns are intended to be put into a separate knowledge structure (although with the same format). Idioms (including true idioms, noun-noun compounds, non-compositional compounds, and verbs with particles) are entered under the syntactic head of the idiom. In DIMAP, all of these should be entered as the main entry headword. The principle reflected in DIMAP is that recognition proceeds from left to right, so that any compound or idiom is recognized as beginning with its first word. In DIMAP, this is captured by the entry type, 'r' for regular and 'i' for idiom. If an entry is coded with an 'i', then the dictionary contains another entry which begins with this entry. This accounts for the presence of an entry "have a" coded in DICT1, where this phrase has no meaning in itself, but is the initial phrase of the idiom "have a ball", also an entry in DICT1.

2.2 Grouping and ordering of senses

The creation of senses for a computational lexicon has important consequences for the commitment to parsing that is implemented. In toy lexicons implemented in texts on natural language processing, the entries are simplistic and oriented toward the grammatical component, with the semantic component playing a subordinate and trivial role. Entries seldom have more than one sense, and when they do, they generally reflect different parts of speech or senses with widely different usages. As more information is reposed in the lexicon, the structure of an entry assumes greater importance, particularly the manner in which the senses relate to one another.

The principles for identifying a sense in DIMAP should follow those used in ordinary lexicography, although some interesting possibilities are available with a computational lexicon. In general, the principles stated in Mel'čuk and Meyer et al. are appropriate. In summary, they are: (1) if, for a suggested lexical unit, two possible mappings to the ontology can apply, then two lexical units must be created (that is, create two senses if you wish to have separate meanings pointing to different parts of a type hierarchy); (2) if there are incompatible selectional restrictions for a suggested lexical unit, there should be two senses; (3) if there are two incompatible co-occurrence sets (morphological, syntactic such as subcategorization frames, or lexical such as collocations), two senses should be created; and (4) if there are two possible readings of a word, two senses should be created. These principles will become clearer as the components of an entry are described below.

In Mel'čuk, a vocable is the set of all lexical units (senses) for which the lexicographic definitions are linked with a semantic bridge. A semantic bridge between lexical units L1 and L2 is a component common to their definitions, which formally expresses a semantic link. A basic lexical unit of a vocable is a lexical unit which has a semantic bridge with the majority of the other lexical units of the vocable. A semantic field is the set of all lexical units that share an explicitly distinguished non-trivial semantic component. A lexical field is the set of all vocables whose basic lexical units belong to the same semantic field.

Although Mel'čuk uses a vocable to group similar senses under a SUPERENTRY, any main entry can have any number of sense groupings under it. In DIMAP, there is no real need for separate entries for homographs. In computational linguistics, parsing (recognition) must always begin with the least common denominator, the spelling of a word.
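To make this organization concrete, the following sketch shows a lexicon keyed by spelling, with homograph senses grouped under one main entry and the 'r'/'i' entry-type coding for the idiom "have a ball". The Python rendering, the field names, and the "bank" senses are illustrative assumptions, not DIMAP's actual record layout.

# Illustrative only: main entries keyed by spelling, each holding all senses,
# with the entry type 'r' (regular) or 'i' (initial phrase of an idiom).
# Field names are hypothetical, not DIMAP's record layout.

lexicon = {
    "bank": {
        "entry_type": "r",
        "senses": [
            {"id": 1, "cat": "noun", "definition": "a financial institution"},
            {"id": 2, "cat": "noun", "definition": "the sloping side of a river"},
            {"id": 3, "cat": "verb", "definition": "to deposit money in a bank",
             "superconcept": [("bank", 1)]},    # sense-to-sense link for grouping
        ],
    },
    "have a": {"entry_type": "i", "senses": []},        # flags a longer idiom entry
    "have a ball": {
        "entry_type": "r",
        "senses": [{"id": 1, "cat": "verb",
                    "definition": "to enjoy oneself greatly"}],
    },
}

def lookup(spelling):
    """Recognition always starts from the spelling, the least common denominator."""
    return lexicon.get(spelling)

entry = lookup("have a")
if entry and entry["entry_type"] == "i":
    # a longer idiom beginning with "have a" (here, "have a ball") should be tried next
    longer = [w for w in lexicon if w.startswith("have a") and w != "have a"]
    print(longer)   # ['have a ball']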
Sets of senses within a main entry may be linked in any grouping. Mel'čuk's terminology provides concepts that may be useful in thinking about such groups. Mel'čuk articulates the decomposition principle that the definition of a lexical unit must contain only terms that are semantically simpler than the lexical unit. Further, through his semantic bridge principle, the definitions of any two lexical units of the same vocable must be explicitly linked, whether by a semantic bridge or by a sequence of semantic bridges. These principles should be followed in constructing a lexicon and ensuring its internal consistency. Most importantly, these principles should be applied in determining the relationship between one definition and the rest of the lexicon, including other definitions of the same main entry. The importance of these principles is considered below in discussions of issues such as lexical subordination, type coercion, lexical rules and relations, and lexical inheritance.

Mel'čuk makes six observations pertinent to grouping and ordering the senses of an entry: (1) grouping into one polysemous vocable has a semantic motivation, namely that all lexemes must share at least one important semantic component; (2) division into sense groups is also semantically based; (3) ordering is based on semantic proximity; (4) ordering is based on which entry is semantically simpler (for example, the change of state of the same thing is simpler than the change of state into something else); (5) an intransitive sense is placed before a transitive sense, again based on semantic simplicity, in that the transitive includes a causal component (the transitive is defined in terms of the intransitive); and (6) sometimes the placement is not at issue but only whether a distinct sense is needed (this can be somewhat arbitrary).

In DIMAP, these kinds of groupings and orderings can be accomplished through SUPERCONCEPT links. Instead of a link to another main entry, the links are made to other senses of the same main entry.

2.3 Pseudoentries (linguistic regularities)

The preceding discussion focuses on lexical entries that characterize the world around us. A distinct group of lexical entries can be encoded to characterize linguistic and lexical generalities. For this purpose, DIMAP introduces entries that begin with the symbol '#'. These are called pseudoentries because they encode only grammatical or semantic abstractions. They constitute metalinguistic entries in the lexicon. Pseudoentries vary in importance with the grammatical theory.

The sample dictionary containing entries for the words used by Allen (DICT1) also contains pseudoentries (encoding primarily semantic interpretation rules) that Allen identifies as necessary for a consistent framework for performing compositional semantics. Hence, the prefix gives these entries a metalinguistic stature. These entries include:

#abstract, #action, #animate, #anything, #assert, #automata, #command, #event, #inf, #inst/action, #legal-entity, #living, #location, #name, #non-animate, #non-living, #obj/action, #org, #past, #person, #physobj, #present, #pro, #time, #to/action, #vegetative, #wh, #wh-query, #y-n-query

These entries correspond to nodes in the various type hierarchies Allen presents to describe grammatical entities. The power captured in these formalisms is still in a nascent stage in computational linguistics. In the unification-based HPSG (Pollard and Sag), a considerable amount of syntactic generalization resides in the lexicon.
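Before turning to the details, a minimal sketch may help to picture how such generalizations can live in the lexicon itself: '#'-prefixed pseudoentries act as classes whose properties ordinary entries inherit. The class names below echo two of the DICT2 word classes described next, but the feature bundles assigned to them, the field names, and the Python rendering are all illustrative assumptions.

# A minimal sketch (not DIMAP's storage format) of '#'-prefixed pseudoentries
# acting as classes whose properties ordinary entries inherit.  The feature
# bundles assigned to the classes are invented for illustration.

entries = {
    "#main-verb": {"inherits": [], "features": {"CAT": "Verb"}},
    "#3rd-sing":  {"inherits": [], "features": {"person": "third", "number": "singular"}},
    "walks":      {"inherits": ["#main-verb", "#3rd-sing"], "features": {}},
}

def inherited_features(word):
    """Collect the features of an entry and of every class it inherits from."""
    entry = entries[word]
    features = {}
    for parent in entry["inherits"]:
        features.update(inherited_features(parent))
    features.update(entry["features"])        # locally specified features win
    return features

print(inherited_features("walks"))
# {'CAT': 'Verb', 'person': 'third', 'number': 'singular'}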
Flickinger describes the role of the lexicon in HPSG in detail. He presents a word class hierarchy which makes a commitment to HPSG, so that the syntactic properties defined for each word class are consistent within that framework. A class is defined by its set of discrete properties making up a lexical entry, which is inherited by each member of the class. In this scheme, a given lexical item may belong to many classes and hence inherit the particular syntactic, morphological, or semantic properties (or cluster of properties) of each class.

A sample dictionary (DICT2) consists solely of pseudoentries encoding the syntactic regularities that Flickinger would relegate to the lexicon. The focus in the sample DIMAP dictionary (and indeed in Flickinger) is on syntactic regularities and properties. The sample dictionary contains entries for the word classes elaborated in chapter 2 of Flickinger (pp.17-56). These include the following:

#3-1, #3rd-sing, #adj, #adv, #anaphoric, #anomalous-equi, #aux, #base, #c-conj, #c-noun, #comp, #complementation, #complete, #control, #copula, #det, #det-number, #det-plu, #det-sing, #det-type, #ditrans, #ditrans-to, #each-type, #equi, #every-type, #expletive, #fin, #incomplete, #lex-np, #main-verb, #major, #mass, #minor, #name, #non-3rd-sing, #non-past, #non-reflexive, #noun, #noun-type, #number, #numeral, #object-equi, #object-raising, #part-of-speech, #passive, #past, #past-part, #perf, #plu, #prep, #pres-part, #pron, #prop, #raising, #reflexive, #s-inf-it, #s-inf-norm, #s-it, #s-norm, #sing, #transitive, #verb, #verb-form, #verb-ger, #verb-type, #word-class

Although the terminology adopted in creating these entries (in DIMAP) and in their definitions (in Flickinger) is not self-evident, their tenor is captured by the above list. Examining the contents of these entries, along with a knowledge of HPSG, will make the details clear.

Despite the attempt to place as much information as possible into the lexicon, there still remain issues of what separation must be maintained between the grammar and the lexicon. Ilson and Mel'čuk discuss several lexico-grammatical problems:

Quasi-passives vs. real passives - Quasi-passives should constitute lexemes separate from their actives, whereas true passives are grammatical forms of the same lexeme. Quasi-passives are not possible with all verbs, whereas all active transitive verbs can be passivized by general rules. It can be argued that real passives should not be described as separate entries in the dictionary entry proper.

Diathesis alternations - Two government patterns for a verb sense may have the same meaning. Therefore, it can be argued, only one sense is necessary in the dictionary.

Subject/object complements - Some of these are obligatory and must be included among the arguments of the corresponding verbs, while others are optional and freely added. Thus, it can be argued, recognition should be treated in the grammar and not result in distinct entries.

In each of these cases, some information can be placed in the dictionary. Perhaps the key distinction is one of processing efficiency: place information in the lexicon if it can be accessed and used more efficiently than backtracking through several paths in a parser. The separation between the grammar and the lexicon is an active area of research today.
With the development of lexical rules, derivational rules, and collocational functions that can be placed in the lexicon itself, it is difficult to determine exactly where to leave off in the creation of dictionary entries.

3 PART OF SPEECH (CATEGORY) INFORMATION

The first piece of information required for NLP is a word's part of speech. The possible values for any given parser will vary. DIMAP provides a list of categories ordinarily used. If CED entries are converted to DIMAP format, the category is always included in the conversion. A standard list of categories is used in DIMAP, one of which is chosen for each sense. This list includes a category 'none', an option that is particularly applicable to pseudoentries. An alternative representation in DIMAP would be to use the category 'none' in the required part-of-speech or category field of a sense and then to identify a CAT feature name, with any desired value from the list one wishes to employ. The only requirement is user-imposed, to ensure consistency of the entries.

4 ORTHOGRAPHY, PHONOLOGY, MORPHOLOGY, AND ADMINISTRATIVE INFORMATION

In Meyer et al., distinct fields are used to record variants, abbreviations, unpredictable phonology, irregular forms, stem variants, and declension classes. In DIMAP, there are no specific fields intended to capture these types of information. These data may be regarded as "instances" of a specific entry. This information is generally recorded in DIMAP by creating a distinct entry for the variant, abbreviation, irregular form, and so forth, containing only pointers through the SUPERCONCEPT links to the entry where more detailed syntactic and semantic information is available. At that entry, an INSTANCE pointer can be used to provide the reverse links to the abbreviations, variant spellings, declension forms, and irregular forms, as desired.

When CED entries are converted to DIMAP entries, inflected forms, variants, irregular forms, and abbreviations are identified as distinct main entries with links to the base form through the SUPERCONCEPT field. The reverse links are not created in the interests of saving space. Inflected forms also contain pointers to pseudoentries for the specific form; the sample dictionary DICT3 contains these entries and may be merged with other dictionaries.

The use of this information generally depends on the specific system the user has in mind. As a result, the amount of information included in an entry is highly application-dependent. Additional information that may be viewed as residing under this general heading could include when the entry was created and by whom, along with the time and nature of updates. In general, this type of information is not used computationally.

5 LEXICOGRAPHIC INFORMATION, INCLUDING DEFINITIONS

Information that appears in an ordinary dictionary (particularly including definitions, usage notes, examples, and status labels) is not used computationally. However, this material can be analyzed for the purpose of establishing representations that will be employed computationally. Indeed, this is the main reason why the CED is included with DIMAP. In DIMAP, the primary fields directly available to the user appear under the dictionary information category and include specific fields for the definition of a sense, usage notes as usually transcribed in a dictionary, and status or usage label (such as "archaic" or "chiefly Australian" or "Astronomy" or "(of a person)").
Although DIMAP is intended primarily as a tool for developing computational lexicons, it can serve as a tool for lexicography as well. But if DIMAP is viewed more broadly as the basis for a new kind of lexicography, without the limitations of a printed dictionary, there is an opportunity for greater thoroughness in the treatment of ordinary lexical information, particularly the definitions themselves.

Atkins provides a set of lexicographic rules for building a template MRD structure. These rules can be elaborated for building a lexicon using DIMAP. There are several principles that need to be followed in considering the template structure:
-- regular polysemy (lexical rules);
-- modulation ("the ways in which the effective semantic contribution of a word form may vary under the influence of different contexts");
-- a very general or 'major' sense (or a series of major senses) for each headword (hence, a hierarchical structure within the senses of an individual headword); and
-- basing the structure for each lexical item on appropriate theory.

Atkins views this template as maintaining the traditional notion that a definition should be viewed as consisting of a genus term and differentiae. The genus comes from the core word in a definition (perhaps after analysis of lexicographical defining conventions), either directly or from a defining formula (for example, 'object used for' or 'object on which' for device or 'used to hold' for container). Differentiae come from analysis of any material other than the core word and may incorporate such central notions (for nouns) as whether the object is free-standing or in some meronymous relationship ('part-of' or 'attached-to'), use of a device, its form ('cylindrical'), and domain specificity. An important defining mechanism in her view is the extension of a sense through a link-rule (called a derivational rule or a rule of lexical subordination by others). These notions are explored further from a computational point of view in section 10 of this chapter.

In this spirit of bringing a new level of rigor to the definitions themselves, Mel'čuk views a definition in an explanatory and combinatory dictionary (ECD) as having at least the following three important features: (1) the definiendum (or main entry) is a propositional form consisting of the lexical unit in question, variables representing its semantic arguments (with the same variables appearing in the definition), and structural elements (such as prepositions with, out of, ...) linking the variables to the lexeme; (2) the definition must render explicit the semantic invariants found in the definiendum; and (3) the definition must avoid idiomatic expressions. Mel'čuk offers several heuristic criteria for the formation of definitions that satisfy these principles. In Ilson and Mel'čuk, these are summarized as follows (and constitute the semantic zone of an entry).
The definition is a paraphrase of the propositional form satisfying several formal requirements:
-- Substitutability - the definition must be substitutable for the propositional form in all possible environments,
-- Decomposition - the definition paraphrase is formulated in terms of lexemes such that each one of them--identified by a sense number--is semantically simpler than the definiendum, with each semantic component playing one of three roles, specifying either a relation between arguments, semantic restrictions on its arguments, or a modification of another component, and
-- Inheritance of Arguments - a component of the definition brings to its host ALL its own actants, which must be explicitly accounted for in the definition.

This makes ECD definitions satisfying all formal requirements equivalent to semantic networks.

6 SYNTACTIC FEATURES

The next piece of information usually described in elementary textbooks on NLP is the feature. This field is used to record syntactic features associated with each particular sense. Each word sense may have any number of features that characterize its syntactic and semantic properties. Many features have been used in NLP; here, we recount the ones mentioned in Allen, Flickinger, and Meyer et al., which generally follow the literature. A feature is represented by giving its name and its value (with multiple values allowed for a given word).

Allen describes useful feature systems (pp.89ff) that include (at least for English) the following:

number: singular, plural
person: first, second, third (can be combined with number)
verb forms: infinitive, present, present participle, past participle, past
auxiliary verb type: be, do, have, modal
verb subcategorization (complement) type: none, direct object, indirect object, adjective, prepositional phrase, infinitive, "for-to" complement, "to" infinitive, "that" complement, wh-complement, gerundial complement
mood: declarative, yes-no question, wh-question, imperative, embedded
voice: active, passive

The features above generally suffice for syntactic NLP. In DIMAP, they are best handled in the field defined specifically for them, that is, the feature component of the definition structure. You can enter as many features as you like for each sense of an entry word. To enter this information in the change routines (see chapter 4), you need to enter a feature name and a feature value. You may provide multiple values for a feature, either by simply separating each value by a space or by enclosing them with braces (as does Allen). The choice is up to you and depends on how you want to process them. The names and values are also arbitrary. However, it is well to decide on these matters beforehand, to ensure consistency in your representation.

In parsing systems that give more importance to the lexicon, the capability that makes it possible to store syntactic generalizations in the lexicon is the system of attribute-value pairs (feature structures) that are encoded. In developing the lexicon for the HPSG formalism, Flickinger identifies two types of syntactic properties: a set of features and a set of subcategorization specifications. The features are separated into those with atomic values and those with category values (feature-value pairs). The atomic-value features are drawn from a small finite set where each feature has a limited set of possible atomic values.
The following lists the features and their values as present in Flickinger:

CAT (category): Noun, Verb, Preposition
VFORM (verb form): Finite, Infinitive, Base, Past
INVERTED: +, -
CASE: Accusative, Nominative
PFORM (preposition form): To, Neutral
NFORM (noun form): Normal, It
COMP (complement): For, That
DTYPE (determiner): Each, Every
PREDICATIVE: +, -
NTYPE (noun type): Common, Proper, Pronoun
AGREEMENT: Mass, Singular, Plural

To represent these features in DIMAP, use the feature field of the dictionary entry and enter the feature name from the left-hand column and one or more of the values from the right-hand column. The sample dictionary DICT2 makes use of these features in encoding its entries. No part of speech values are present in this sample dictionary, but note how this provides an alternative to identifying a part of speech from a predetermined set, as is done in DIMAP. If you wish to encode a different set of categories than the predetermined set, you can do so under the features.

Meyer et al. indicate that syntactic features can be inherited from a class. In DIMAP, this can be accomplished through the SUPERCONCEPT field; pseudoentries (entries prefixed with '#') can be used to record class information. For features that are not sufficiently regular to be captured in a class, entries can be made in the FEATURES field of a DIMAP sense.

This is the first place that the possibility of inheriting information has been mentioned specifically. It is well to consider carefully the feature combinations that are present within a lexicon, to remove as much redundancy as possible and to allow information in an entry to be inherited as much as possible. The immediate approach that is suggested envisions entries like '#plural' or '#masculine' or '#declarative'. Such entries might consist of only a single feature. But you should consider the possibility that pseudoentries can contain two or more features. For example, '#1st_sing' might encode a person feature and a number feature.

A generalized approach for identifying potential complex feature bundles is, first, to develop the features that make sense for a particular system. Second, encode these features and their values for particular entries. Third, examine several entries and factor their features and values (that is, determine their least common denominator), identifying the possibility for creating pseudoentries that bundle several feature and value combinations. This approach was followed by Flickinger in creating the word-class hierarchy contained in DICT2. The classes were carefully constructed to capture syntactic regularities that could be inherited within a parsimonious structure.

7 SYNTACTIC STRUCTURES

As expressed by Meyer et al., in parsing, information is to be obtained from the lexical entry for a word, as necessary to characterize the use of the word. This will include the syntactic features associated with the sense and also something about the syntactic structure. Each word sense should have only one permissible associated syntactic structure. The framework of the structure (which should be an underspecified representation--that is, with certain slots whose values need to be identified from the surrounding text) needs to be encoded in the sense.
The encoded information identifies the elements of the parse structure which the current lexeme requires as arguments (with verbs identifying all of their arguments, modifiers specifying requirements for their heads, prepositions identifying their objects and mode of attachment, and nouns perhaps indicating constraints on their verbs--see discussion of qualia structures in section 10). As with syntactic features, structural information can be inherited from a class; again, this can be captured by SUPERCONCEPT links in DIMAP. Otherwise, the structural information is encoded in the sense entry. The precise nature of what is to be encoded depends on the parser and knowledge representation that is desired. To that extent, the entry must be tailored. Systems in which very little syntactic structure resides in the lexicon do not identify the structure per se, but rather simply specify the subcategorizing properties of an entry. The responsibility for parsing lies in the recognition procedures built into the grammar rules for a particular subcategorization. For these systems, the use of syntactic features as described in the last section might suffice. 7.1 A Dictionary for Unification Grammars More recent parsing systems theorize that many idiosyncrasies of syntactic structure are intimately associated with specific items of the lexicon. Therefore, they attempt to encode the syntactic structure directly. The lexical, syntactic hierarchy constructed by Flickinger is included as a sample dictionary (DICT2). In this dictionary, extensive use is made of the feature component of the DIMAP dictionary structure to encode subcategorization specifications. This sample implements an approach to the lexicon that represents a major departure from earlier linguistic theories. It is claimed that, in HPSG, the number of phrase structure rules can be reduced to fewer than 20 very general ones (Proudian and Pollard; Flickinger, Pollard, and Wasow). Moreover, the syntactic and semantic regularities present in the lexicon can be structured into a lexical hierarchy that substantially eliminates redundant information in the entries by allowing the use of inheritance. For these reasons, it is particularly compelling to demonstrate how this can be accomplished in a DIMAP dictionary. The most important piece of syntactic information in an entry is the subcategorization information, which characterizes the pattern of complements and adjuncts permissible for the word sense. It is this information which combines with the grammar rules in HPSG in parsing. This is the location where the syntactic generalizations have been placed in the lexicon. In DIMAP, this information is set up under the FEATURE component of a dictionary entry. The feature name in the sample entries is either 'compl' (complements) or 'adjns' (adjuncts). The feature value consists of a category specification (set of feature-value pairs) and semantic properties (thematic role assignments). The feature value in these entries is itself a list of the thematic roles and feature-value pairs. In DIMAP, a feature value is limited to 128 characters, so there is some limitation on the amount of information that may be contained in any one feature value. However, as can be seen in some of the sample entries, there can be several features with the name 'compl' or 'adjns', so that all the necessary information can be portrayed. Each feature value in DICT2 is enclosed in brackets. 
The first element inside the bracket is the thematic role; it is followed by a set of feature-value pairs which together make up the syntactic restrictions (or subcategory specification) imposed by the lexical item on its complements and adjuncts. Each of these is nothing more than a feature-value pair taken from the list of features and their possible values, as identified in the previous section on syntactic features. The sample dictionary is only a small part of what would be needed in a robust parsing system. Nonetheless, it is a legitimate superstructure in itself for any parser that follows the HPSG approach. Close examination of this sample shows the flexibility of DIMAP and gives some ideas on how other lexicons based on similar unification grammars might be structured.

7.2 Lexical Functional Grammar

Meyer et al. follow principles of Lexical Functional Grammar (LFG), which is closely allied to HPSG. (The sample dictionary DICT4 contains examples encoded using this formalism; see particularly the entries for a, quickly, bright, smell, eat, and drop by.) In this case, the syntactic structure information encoded into the lexical entry will be part of an f-structure parse of a sentence (or other fragment), and is referred to as the "fs-pattern". In this pattern, variables are used to identify nodes in the f-structure with which the current lexeme has syntactic or semantic dependencies. The variable name (encoded as $var0, $var1, $var2, ...) is associated with the root node of a subtree in the representation. In the lexical entry, the word "root" is used to indicate an f-structure and the variable then identifies the syntactic or semantic relationship. The current lexeme is always encoded as $var0.

In parsing, a bottom-up active chart parser retrieves the (inherited or local) fs-pattern of each word in an input sentence (with phrase structure rules providing top-down expectation). The fs-pattern is unified with the f-structure produced by the syntactic parser. The unification serves to bind the variables identifying the lexemes falling in syntactic or semantic dependency to the lexeme in question. (Of course, the unification will fail if obligatory structures are missing, syntactically disallowed structures are present, or there is a mismatch in agreement features.)

7.3 Diathesis Alternations

A particularly important phenomenon for some verbs is that of the alternation in where their arguments might be placed. For example, "John broke the window" and "The window broke" involve the same essential sense of the verb "break", but vary in what arguments are used and where they are placed. These examples show that it is possible for two senses of a word to have the same meaning, but have different syntactic realizations. For Levin (1991a), these examples "reflect the interaction between a representation of the meaning of a verb and the principles that determine the syntactic realization of its arguments." For Pustejovsky, these examples lead to "qualia" structures for nouns, moving some of the lexical responsibility for determining the appropriate syntactic realization away from verbs. Each syntactic realization will have its own entry in the lexicon; in parsing, only one sense will emerge as having been recognized. The placement of such entries within a hierarchical lexicon raises some interesting considerations (including merging several senses into one), which will be discussed at more length in the next three sections on semantic representation, ontologies, and lexical rules.
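A rough sketch of how such an alternation might be recorded follows: two senses of "break" with different argument realizations but the same core concept. The field names, the role labels, and the '#change-of-state' concept are illustrative assumptions, not a prescription from DIMAP or from the works cited.

# Hypothetical sketch of the two syntactic realizations of "break" recorded as
# separate senses that share a core meaning.  Field names and the role/realization
# notation are illustrative.

break_senses = [
    {   # causative/transitive: "John broke the window"
        "id": 1, "cat": "verb",
        "roles": {"agent": "subject", "theme": "object"},
        "core_meaning": "#change-of-state",
    },
    {   # inchoative/intransitive: "The window broke"
        "id": 2, "cat": "verb",
        "roles": {"theme": "subject"},
        "core_meaning": "#change-of-state",   # same concept, different realization
    },
]

def realizations(senses, concept):
    """Collect the argument realizations that a single core concept allows."""
    return [s["roles"] for s in senses if s["core_meaning"] == concept]

print(realizations(break_senses, "#change-of-state"))
# [{'agent': 'subject', 'theme': 'object'}, {'theme': 'subject'}]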
8 SEMANTICS

The representation of the meaning of a word consists of two parts: (1) an identification of the relationship between a word's syntax and the semantic roles played by its arguments (if any), and (2) the compositional structures that the lexical item contributes to the representation of the meaning of a text fragment within which it is used. The first part is strongly linked to the syntactic structures discussed in the previous section, and is discussed in the first subsection below. The second part (discussed in the second subsection below) is strongly linked to the view of the world, and, as suggested by Meyer et al., would include (1) knowledge about how concepts may fit together (an ontology), (2) knowledge about the world, (3) knowledge about speaker/hearer intentions, (4) knowledge about speaker/hearer attitudes, and (5) knowledge about the structure of the discourse or text.

8.1 Semantic Roles and Their Relationship to Syntactic Structure

For Levin (1991b), representing semantic information means primarily capturing, for a verb, the number and types of arguments it requires and the semantic relation each of these arguments bears to the verb. This ignores other kinds of semantic relations, such as synonyms, antonyms, hyponyms, appropriate modifiers (which might be identified by Mel'čuk--see section 10 below), and other relations identified by Cruse. To Levin, the central concern "is the formulation of representation that makes explicit the semantic relations between a verb and its arguments, as well as other aspects of the meaning of a verb related to its status as an argument-taking lexical item." The representation should allow the placement of a word within the larger organizational schema of verbs.

-- Semantic representation (logical forms)

Allen provides the basic requirements for representing the semantic relation component of meaning. In Allen's semantic interpretation rules, the right hand side is the semantic representation. Allen intends (pp. 212ff) the semantic representation to be a logical form for a particular word sense that can then be composed with other logical forms. This approach is well accepted and provides the basis for knowledge representation and reasoning systems (which thus rest on the composition of meaning structures specified within the senses of an entry).

In Allen's notation, a logical form consists of 4 elements within parentheses (sketched after the list of operators below):
-- an operator, indicating the type of structure being used to describe an entity;
-- a name, that is, an arbitrary variable that is used to identify the specific instance of the entity being described;
-- the type of the entity; and
-- any modifiers of the entity, each of which is a logical form.

The operator generally corresponds to a syntactic entity or semantic case relation, including:
-- sentences, including declarative sentences (ASSERT), yes/no questions (Y/N-QUERY), wh-questions (WH-QUERY), and commands (COMMAND);
-- tense, including distinct operators for past (PAST), present (PRESENT), etc.;
-- embedded sentences, including compounds (BUT), relative clauses (EMBEDDED), and infinitive phrases (INF);
-- verb cases, including any case relations governed by a verb (such as AGENT, THEME, BENEFICIARY, TO-LOC, AT-LOC, and EXPERIENCER), where noun phrases are treated as modifiers of verbs;
-- adjectives, where the operator is the attribute name (COLOR for white, RACE for white); and
-- determiners, including DEF/SING, DEF/PLU, INDEF/SING, INDEF/PLU.
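The notation just listed can be pictured with a small sketch. The nesting below follows the four-element scheme only loosely (the handling of tense and determiners in particular is simplified), and the sentence, variable names, and type names are invented for illustration.

# Allen-style logical forms sketched as nested 4-tuples:
# (operator, name, type, modifiers), where each modifier is itself a logical form.
# The sentence, variables (e1, d1), and types are invented for illustration.

# "The dog barked."
lf = ("ASSERT", "e1", "#bark-event", [
        ("PAST", "e1", "?", []),                 # tense operator
        ("AGENT", "d1", "#dog", [                # noun phrase as a verb modifier
            ("DEF/SING", "d1", "?", []),         # determiner
        ]),
     ])

def modifiers_of(form):
    """Return the modifier logical forms of a logical form."""
    operator, name, entity_type, modifiers = form
    return modifiers

for m in modifiers_of(lf):
    print(m[0], m[2])     # PAST ?  /  AGENT #dog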
Verbs (but other parts of speech as well) require specific complement patterns, frequently articulated as case relations. These are the constituents for which the verb subcategorizes. Allen calls these the inner cases of the verb, to distinguish them from constituents that are not obligatory. To distinguish these obligatory constituents, Allen uses brackets ('[', ']') rather than parentheses around a modifier.

In DIMAP, the logical form for a sense in a dictionary entry is encoded in the right hand side of the "semantic interpretation rule" field. This information is entered through prompts which ask for operator, name, type, and modifier information, as described in chapter 4. Usually the name field is filled with a '*' for instantiation during parsing. The operator and type fields are specified with any user-entered string or perhaps with a question mark ('?') to indicate that no value is given for the field (for later composition with other elements).

In Allen's logical forms, the operator indicates a syntactic, case, or semantic relation and the type characterizes the meaning of a component. The type is filled with the most specific applicable concept in a type hierarchy or ontology of all concepts. In Allen, when the type is unknown, it is specified by T; in DIMAP, it is indicated by !T. Thus, T(MAIN-V) is intended to retrieve the type of the main verb. Alternatively, Allen uses the notation V (in DIMAP, this is indicated by !V), followed by some syntactic structure relation, to retrieve the semantics of the item. Thus, V(OBJ) is supposed to retrieve the meaning or semantics of the object of the verb. Exactly what should be retrieved is considered in more depth in the next section, in the discussion of type hierarchies and ontologies.

Instead of using the SEMANTIC INTERPRETATION RULE field, this information can be entered in DIMAP by encoding the case relation in the FEATURE field, where the feature name is the case relation and the feature value is tied to a particular syntactic relation. Thus, for a verb, we can have a feature AGENT with value V(SUBJ). The type or meaning of an item can be implicitly expressed in its SUPERCONCEPT links.

To Meyer et al., there is a similar importance given to the interaction between the information contained in the semantic fields and the syntactic structure fields. The mapping rules that are specified in the syntactic structure are unified with the f-structure, thereby binding the variables and enforcing the constraints (this constitutes most of the parsing process and is discussed in the next subsection). Then, in the meaning pattern that is specified in the semantics, the meaning of the current lexeme is obtained by the "^" operator preceding the reference to that variable. The meaning pattern of the entire lexeme is the meaning of the variable $var0, that is, ^$var0. The case relations, however, are considered part of the semantics of the lexeme, and hence intimately part of the representation in the ontology.

For the remainder of this subsection, we consider specific case relations associated with particular types of verbs. This will provide an indication of what general structures are usually thought to be included in these representations.

-- Linking regularities

Arguments that are perceived to bear a particular semantic role are consistently expressed in the same way across a wide variety of verbs.
For the causative use of verbs of change of state (break, freeze, redden), the agent and patient arguments are expressed as the subject and object. These are referred to as agent-patient verbs, which describe actions where some generally animate entity, the agent, brings about a direct (usually physical) effect on or a change in the location of another entity, the patient. Subclasses of agent-patient verbs are verbs of contact-effect (cut, smash), ingesting (eat, drink), and causative uses of verbs of change of position (roll, move, rotate). The term theme, rather than patient, is used to refer to the argument that denotes the entity whose position changes (for verbs of motion) or whose position is specified (for verbs of position). Agent-patient verbs with more than two arguments that describe the placement or attachment of an entity at some location include verbs of placing (put, stand) and verbs of attaching (fasten, bolt).

With verbs of change of possession (including verbs of giving (sell, lend) and taking (buy, steal)), the agent argument is the subject and the argument denoting the entity transferred is the object. Verbs of psychological state (admire, astonish, like), cognition, desire, authority, and perception have the same arguments. Verbs of change of position (including directed motion (come, go, rise), manner of motion (dance, run, jog), placing (put, stand), and exerting force (push, pull, drag)) also require prepositional phrases specifying the trajectory that the theme travels, frequently known as path, source, and goal roles. Directional complements are also found with verbs belonging to more abstract domains that appear to involve a notion of transfer, including verbs of communication (talk, speak, whisper). For verbs with these linking regularities, the lexical semantic representation involves simply listing the arguments that a verb requires and identifying the semantic roles played by these arguments.

-- Verb classes

As mentioned earlier, diathesis alternations (alternations in the expression of arguments of verbs) involve several arguments that may or may not be realized. Further, these alternations identify systematic semantic-syntactic correspondences that reflect semantically coherent classes. Transitivity alternations involve a change in the verb's transitivity. Many verbs of change of state (break), and more generally, verbs of change of position (roll, move, turn) frequently experience a causative-inchoative alternation (for example, the difference between "John broke the window" and "the window broke"). Verbs of contact-effect (cut, slash, bite) do not experience this alternation, but do experience the conative alternation (for example, the difference between "John slashed the meat" and "John slashed at the meat").

According to Levin, generalizations involving diathesis alternations apparently refer to components of meaning that can in turn be used to induce a verb classification. Verbs of change of possession (buy, sell) and verbs of change of position (slide) both require the same set of semantic roles (agent, source, goal, and theme) but associate these roles with different syntactic realizations. Verbs of transfer of possession fall into two classes according to whether they pattern like buy (where the subject is both agent and goal) or sell (where the subject is both agent and source).
The subjects of some intransitive verbs, the unergative verbs (shout, smile), pattern like the subject of transitives (that is, an agent), while the subject of others, the unaccusative verbs (die, appear), pattern like the objects of transitives (that is, a patient or theme). The unaccusative class includes telic verbs (verbs denoting events with an inherent endpoint, primarily change of state and location verbs), and the unergative class includes atelic verbs (denoting events with no inherent endpoint, essentially activity verbs). Verbs of light emission (flicker, glow, shine), whose single argument is neither volitional nor animate, are classified as unergative verbs, suggesting that the distinguishing property of activity verbs is that of an event without an inherent endpoint, rather than taking an animate and volitional argument. -- Adjunct characteristics of verbs Adjuncts qualify the event or state denoted by the verb by adding information expressing when, where, how, and why the event took place or the state held. Benefactive adjuncts indicate a person who benefits from the action denoted by the verb; instrumental adjuncts indicate the tool used in performing the action; manner adjuncts indicate the manner in which an action is performed. Regularities (with constraints on distribution) are found in the expression of adjuncts. Some adjuncts are found only with verbs that take an agent argument, denoting an action that is controllable and able to be performed intentionally. Certain adjuncts permit alternate syntactic realizations. Benefactive adjuncts can be expressed as the indirect (first) object in a double object construction as well as by a for phrase. Indirect objects with a benefactive interpretation are only found with verbs of particular semantic classes: verbs of creation (make) or obtaining (buy), but not with verbs involving change of position (put). -- Lexical semantic representation The preceding discussion of types of arguments for verbs portrays the range of arguments that are generally found. Every noun phrase in a sentence should be associated with a semantic role; it would appear that in some systems of roles a noun phrase may be assigned two roles. Many sets of semantic roles have been proposed. Levin (1991b) discusses the difficulties of developing a theoretical framework that is consistent in the use of semantic roles. Basically, the issue is whether a set of roles can be proposed to cover all verbs or whether it is necessary to define roles with respect to individual verbs. The problem is that it is difficult to encode multiple relations in a semantic role list. This difficulty will emerge when the set of verbs is arranged hierarchically to take advantage of syntactic and semantic regularities. The solution seems to be the decomposition of a verb's meaning, but here the problem is the identification of primitive elements. This issue is discussed more fully below in the discussion of type hierarchies and ontologies. In a list of semantic roles, the elements are labels that identify arguments according to the semantic role they bear to the verb. The criteria used for adopting a particular set are usually based on (1) a brief description intended to capture the intuitive understanding of what qualifies as an instance of the role and (2) an examination of systematic semantic-syntactic correspondences. Minimally, the set of semantic roles is chosen in order to account for the entailment and paraphrase relations involving verbs and their arguments. 
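As a concrete, purely illustrative example of a semantic role list, the following sketch pairs a few verbs with the grammatical function that realizes each of their roles, following the regularities described above. The role inventory, the realization labels, and the verbs chosen are assumptions made for the example, not an inventory proposed by Levin.

# Illustrative role lists: for each verb, the grammatical function that realizes
# each required semantic role.  A function may carry more than one role
# (e.g., the subject of "buy" is both agent and goal).

role_lists = {
    "put":   {"subject": ["agent"], "object": ["theme"], "pp": ["goal"]},
    "buy":   {"subject": ["agent", "goal"], "object": ["theme"], "pp": ["source"]},
    "sell":  {"subject": ["agent", "source"], "object": ["theme"], "pp": ["goal"]},
    "die":   {"subject": ["theme"]},     # unaccusative: subject patterns like an object
    "shout": {"subject": ["agent"]},     # unergative: subject patterns like an agent
}

def verbs_assigning(role):
    """A crude verb-class query: which verbs assign the given semantic role?"""
    return [verb for verb, linking in role_lists.items()
            if any(role in roles for roles in linking.values())]

print(verbs_assigning("theme"))      # ['put', 'buy', 'sell', 'die']
print(verbs_assigning("source"))     # ['buy', 'sell']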
8.2 Semantic representation, a word's type, and selectional restrictions The interpretation of a sentence involves determining the appropriate meaning of each word and representing this meaning in a useful way. Each word may have several meanings or senses. Although syntactic features and parsing may enable some disambiguation (or identification of the appropriate sense), other semantic information provides further insights. A fundamental piece of information about a word sense is the 'type' of the concept it represents. The representation of the meaning of a word is closely related to its type; and, insofar as that meaning has arguments associated with it, there may be selectional restrictions for those arguments expressed in terms of 'types'. In this section, therefore, we shall be concerned with representing the essential meaning of a word and identifying selectional restrictions for its arguments. Both these aspects relate very strongly with one's view of the world, which needs to be expressed by an ontology, the subject of the next section. -- Kernels of meaning The meaning of a lexical item consists of the concept that it expresses and its argument structure. A concept is operationalized through words, but in general should not be viewed as a word or even a set of words. Concepts are related to one another. A grammar expresses how concepts, as expressed by words, may relate to each other syntactically. An ontology, the subject of the next section, groups concepts by their properties. But an ontology does not characterize the elements of meaning we are trying to represent. Instead, these elements of meaning must either be primitives, expressed as a word or short phrase, or may sometimes need to be expressed, not in words, but as an image or a sound or other sensation. (Images or sounds may conveniently be included in DIMAP as data files or programs attached to FEATUREs.) As described by Levin (1991b), decomposition assumes that meaning is composed of a number of primitive predicates. Jackendoff (1990) suggests that such entities as thing, event, state, action, place, path, property, and amount are primitives. Similarities in meaning are captured by attributing common elements to decompositions. Verbs group by sharing properties. The decomposition should predict and explain regularities in the expression and distribution of arguments and adjuncts. The basic requirement of this approach is the selection of an appropriate set of primitive elements. One criterion is the use of entailment. If the meaning of one word entails that of a second, the decomposition of the first is typically assumed to include that of the second. A representation that uses decomposition must specify the means of combining the elements that enter into the decomposition. Schank proposes functional composition. (See the discussion under ontology in section 9 for methods for representing elements in DIMAP.) Some decompositions are intended to be exhaustive, some are partial. Conceptual dependency diagrams are supposed to be exhaustive, while Jackendoff's system envisions the decompositions supplemented by modifiers which encode certain idiosyncratic aspects of the meaning. The distinction between linguistic knowledge and real world knowledge must be taken into account. Schank wants to elucidate the causal structure of the event denoted by a verb to facilitate drawing inferences; hence, chains of events are included if they are part of the meaning of a verb. 
Some decomposition approaches allow the introduction of constants to fill certain argument positions. Some works in lexical semantics assume that the notions of motion (events) and location (states) are the concepts around which predicates can be classified and their argument structures organized. Works with a lexical orientation assert that verbs fall into two major classes: verbs of location (taking arguments with the role of theme--the located object--and location) and verbs of motion (taking arguments with the roles of theme--the moving object, source, goal, and path). Under this approach, verbs that are not verbs of motion or location are viewed as verbs of motion or location by analogy. An integral part of this approach is the notion of fields. According to Jackendoff's Thematic Relations Hypothesis (Jackendoff 1983, p.188), the principal event-, state-, path-, and place-functions are a subset of those used for the analysis of spatial location and motion, differing according to what types of entities that may appear as theme or reference objects and what kind of relation assumes the role played by location in the field of spatial expressions. In this approach, there is also an independent causal dimension (Jackendoff 1987, 1990), the 'actional tier', to deal with causation, instruments, and related notions. -- Selectional restrictions As indicated in the last section, Allen's approach to semantic interpretation (p.197ff) is centered around the notion of a logical form: the meaning of each word sense is encoded according to a precise structure enabling that meaning to be incorporated (composed) with the meaning of other words (to which it is tied syntactically) to build the meaning of a larger unit of text. The composition (merging, unification) process involves determining which word sense to use and then performing the actual composition. Allen determines which sense(s) to use from the "if" part of semantic interpretation rules, which specifies the conditions when the rule is applicable. The "if" part (referred to as the left hand side, or LHS) specifies both syntactic and semantic criteria, encoded as patterns that describe the constraints on the phrases containing the lexical entry. The pattern identifies the syntactic position the lexical entry must occupy, along with the syntactic constructs surrounding it. The values in the surrounding context are usually not identified exactly, but rather are specified through a list of selectional restrictions. Thus, the adjective green (referring to the color) must appear in the following syntactic and semantic context: (NP ADJS green HEAD +physobj) while in the "inexperienced" context, we would require (NP ADJS green HEAD +human). Selectional restrictions encoded in this way can act as constraints enforced in parsing by using them in conjunction with a type hierarchy or ontology. The way this works is as follows: When a candidate for a particular slot is proposed (for example, for the HEAD slot), the candidate is checked for its position in the hierarchy. We determine whether it is possible to reach a node with the specified characterization from the entry definition of the head. (Negative restrictions can also be used--searching the type hierarchy beginning with a candidate and reaching a node that is the value following the negative sign means that the candidate should be rejected.) In DIMAP, selectional restrictions and syntactic context can be specified in the LHS of the SEMANTIC INTERPRETATION RULE field. 
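The check behind an LHS restriction such as +physobj can be pictured as a walk up the type hierarchy from the candidate filler. The following Python sketch shows the idea with a toy hierarchy; the types and function are assumptions for illustration, not DIMAP's internal routine:

    # Toy type hierarchy: each type points to its parent (None at the root).
    HIERARCHY = {
        "entity": None,
        "physobj": "entity",
        "animate": "physobj",
        "human": "animate",
        "lawn": "physobj",
        "recruit": "human",
    }

    def satisfies(candidate_type, restriction):
        """Check one selectional restriction of the form '+type' or '-type'.
        '+type': the candidate or one of its ancestors is `type`.
        '-type': reaching `type` from the candidate means the candidate is rejected."""
        wanted, negative = restriction[1:], restriction.startswith("-")
        node = candidate_type
        while node is not None:
            if node == wanted:
                return not negative
            node = HIERARCHY[node]
        return negative

    # green (the colour) wants a +physobj head; green ("inexperienced") wants +human.
    print(satisfies("lawn", "+physobj"))    # True  -> colour sense applicable
    print(satisfies("lawn", "+human"))      # False -> "inexperienced" sense rejected
    print(satisfies("recruit", "+human"))   # True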
This LHS is structured using slot-filler notation. In LISP format, the type of the syntactic constituent is identified directly after the opening parenthesis (usually S for sentence, NP for noun phrase, or PP for prepositional phrase). An arbitrary number of slot-filler pairs may then follow, as desired. Each pair consists of a slot name and a slot value. The slot names are arbitrary, but typically identify syntactic roles. The slot values may specify a lexical item or any number of selectional restrictions, each one prefixed by a plus or minus sign. DICT2 provides several examples using Allen's formalism. Complex semantic interpretation rules appear under several entries, particularly the pseudoentries (those prefixed with "#"). Also, see chapter 3 for details on viewing semantic interpretation rules and chapter 4 for details for entering this information into a DIMAP dictionary. -- Semantic composition within an ontology In Meyer et al., the representation of meaning is specified by a map into a separate ontology or into structures encoding attitudes or relations. The ontology provides a system of concepts (that is, identifying the concepts and any relations among them). Each word sense is linked to some concept in the ontology, which is expected to be independent of particular languages. In DIMAP, a separate ontology and a set of attitudes and relations can be created by using pseudoentries. Thus, the ontology, attitudes, and relations contain the general structure of the meaning, while the individual lexical entries contain specific values (or variables for values), and selectional restrictions on these values, to be placed into the slots of the general structure. DICT4 implements examples from Meyer et al. showing semantic representations that explore these notions, as discussed below. (See the entries for drop by, eat, smell, coffee, fresh-brewed, bright, delicious, by, in, and of, as well as #visit, #ingest, #voluntary-olfactory-event, #involuntary- olfactory-event, #olfactory-attribute, and #olfactory-sense.) The essence of the meaning is contained in the ontology, through its links within the hierarchy and the argument structure encoded within its feature list (see particularly the entries in DICT4 for #visit, #voluntary-olfactory-event, #involuntary- olfactory-event, and #ingest). For these examples, the argument structure in the ontology indicates that these concepts have agent, experiencer, theme, or instrument arguments and identifies the range of ontological concepts that may validly fill these positions. The lexical entries for drop by, eat, and smell (both a noun and a verb sense), on the other hand, only identify the syntactic variable that will fit into those argument positions, in some cases with additional selectional restrictions. In general, there will not be a one-to-one mapping between a word sense and an ontological concept. For more general words, it is to be expected that several lexemes will map to a single concept in the ontology. In a terminological lexicon, there will be a tendency for nomenclature to correspond to conceptual objects precisely (such as chemical compounds, machinery, and electronic components). If there is not a single concept in the ontology to which a lexeme is mapped, the mapping is taken to the concept that is the most specific concept that is still more general than (that is, that subsumes) the meaning of the lexeme in question. Once this concept is determined, constraints/information are specified in appropriate slots in the lexicon. 
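A minimal sketch of this division of labor may be helpful: the ontology holds the argument structure and the restrictions on fillers, while the lexical entry only names the concept and says which syntactic variables land in which slots. The concept and entry below are assumed names in the spirit of the DICT4 examples, not their actual contents:

    # Ontological concept: argument structure plus restrictions on fillers (assumed values).
    ONTOLOGY = {
        "#ingest": {"agent": "+animal", "theme": "+foodstuff", "instrument": "+tool"},
    }

    # Lexical entry: names the concept and maps syntactic variables into its slots.
    LEXICON = {
        ("eat", 1): {"concept": "#ingest",
                     "map": {"agent": "$subject", "theme": "$object"}},
    }

    def skeleton(word, sense):
        """Build the unfilled meaning structure for a word sense: the concept's
        slots, each paired with its restriction and the syntactic variable (if
        any) that will supply the filler during composition."""
        entry = LEXICON[(word, sense)]
        slots = ONTOLOGY[entry["concept"]]
        return {slot: {"restriction": restr, "variable": entry["map"].get(slot)}
                for slot, restr in slots.items()}

    print(skeleton("eat", 1))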
The information can be the value for the slot, either a default, a constant, or a reference to a variable that will be filled during semantic processing and conceptual dependency building. The constraints constitute selectional restrictions on what the value for a slot is allowed to be. The selectional restrictions may be any concept (or boolean combination of concepts) from the ontology, where the value must then be a descendant node of (or equal to) the concept in the ontology. The ontological category to which a lexeme is mapped does not have to be of the same syntactic category. Thus, a verb can map to an ATTRIBUTE ("the flower smells sweet"), rather than an EVENT. A verb can map to a RELATION (e.g., OWNER-OF or CONTAINS); a noun to an EVENT (e.g., discussion or process), rather than an object; a noun to an ATTRIBUTE (e.g., age, temperature, color, or size); a noun to a RELATION (e.g., ownership or possession); and an adjective to a complex mapping, such as RELATION (as head concept) + OBJECT (value of RELATION slot) (e.g., wooden ==> (MADE-OF (WOOD)). In DICT4, the entry for smell contains one noun sense and one verb sense that map into the ontological category #voluntary-olfactory- event. In some cases, the meaning of a word may indicate the modification of another concept, or the semantic relationship in which the meaning of other words stand relative to each other. Many adjectives do not instantiate a concept, but specify a particular value of an attribute of some other concept, typically the concept which corresponds to the meaning of the word which the adjective modifies syntactically. Therefore, instead of a link to an ontological concept, the variable representing the head ("^$var1") is used, indicating that the concept instantiated by the lexeme which is bound to $var1 in the syntactic structure fs-pattern will have the particular characteristic represented by the adjective. If the slot and value being added are already specified in the concept in question, then this new information overrides that information, since it is more specific. The adjectives fresh-brewed, bright, and delicious in DICT4 show how this is implemented. Note also that since both bright and delicious are adjectives that can be used both attributively and predicatively, but the nouns they modify will have the same characteristics, an intermediate entry is created in DIMAP to avoid the duplication of information. The selectional restrictions on the noun they modify are actually contained in the entries bright_1 and delicious_1, since otherwise this information would be repeated in both senses of bright and delicious. Variables are also used in the semantic representation of prepositions. In this case, selectional restrictions are placed on both the head of the phrase and the object of the preposition. (See the examples in DICT4 for the prepositions in, by, and of, where the semantics instantiates a relation between two other concepts, in these cases identifying instrument, location, destination, ownership, and domain relations between the two concepts.) In addition to representing the conceptual meaning of a text, Meyer et al. also encode knowledge about speaker/hearer attitudes and structure of the discourse or text (which they term domain and textual relations). Attitudes encode belief (epistemic), value (evaluative), deontics, expectations, volitions, and importance (saliency). Each of these is encoded in a quintuple consisting of a type, a value, an attributed-to slot, a scope, and a time. 
In DIMAP, these are encoded as pseudoentries with slots and values (see #evaluative and #attitude in DICT4) that may also be filled in particular lexical entries (see sense 8 for smell in DICT4). 9 ONTOLOGIES AND TYPE HIERARCHIES As humans, we frequently categorize the world around us in our attempt at understanding. So too in computational linguistics. We categorize the meaning of a word sense through specification of its type: What type of object is represented by a word and where does that concept fit within the world of concepts. The types or kinds of objects or concepts can usually be arranged into a hierarchy. However, defining or specifying these types (or ontology) needs to be done carefully. (See Allen, pp.195-7, for some discussion of these issues.) In general, types should be characterized using words. DIMAP specifically includes mechanisms for representing types and placing them into a hierarchy. The principal component for handling this hierarchy is the SUPERCONCEPT field of an entry. This field is designed to point to a parent (or superconcept or genus or hypernym or AKO (a-kind-of) link). Each sense of an entry can be linked to one or more superconcepts: You simply enter the word you wish to use as the type. Since it is presumed that the type is a word, it is likewise presumed that the type will have its own entry in the dictionary, along with one or more senses. Therefore, you are requested to identify which sense of the genus term you wish to be identified as the parent of the sense you are creating. You may specify only one sense; if you wish to specify all senses, DIMAP suggests using zero ("0") as the sense link. If you wish to specify more than one, but not all, senses of a genus term, create intermediate entries to which the several senses are linked. 9.1 Theoretical structure for an ontology For computational purposes, it is important that the ontology be structured in a way that facilitates its use in both computational linguistics and knowledge representation. In Meyer et al. and Carlson and Nirenburg, the ontological concepts are developed according to the following principles: (1) whenever possible, scientific rather than lay terms are used; (2) consistency in the naming of ontological concepts going down a subtree is maintained; (3) an indication of some distinguishing characteristic of the ontological concept (that is, a characteristic distinguishing the concept from its sister-concepts) is included in the name; and (4) definitions are provided for all concepts, so that when the name of a concept corresponds to a polysemous word, the intended meaning is clear. Carlson and Nirenburg identify "ontological" links between world model elements. The basic top-level ontological classification divides all concepts into free-standing entities and properties. Properties (whose semantics is relational in nature) are described in terms of constraints on entity classes that they can relate. When this structure is superimposed on knowledge representation, the meanings from the ontology trickle down to text meaning representation as instances of world model entities, sometimes somewhat modified. This representation may include speaker attitudes and domain and textual relations (hence such information must be included in the lexicon and perhaps the ontology as well). The ontology is a knowledge base in which world model elements are specified (in theory, independent of any specific language). The model is formulated in the syntax of the knowledge representation language. 
The knowledge base in this language thus consists of a collection of frames, where a frame is a set of slots and fillers or values. A filler can be any symbol, a function call, or even a data file. Frames are used to represent concepts. A concept is the basic building block of the ontology. Slots are interpreted as a subset of concepts called properties. Fillers are (1) names of elements of the ontology, (2) expressions consisting of ontology elements and modifiers of the elements, (3) collections of the elements and modifiers, (4) demons or lambda expressions, or (5) special-purpose symbols and strings. Mostly, fillers or values will simply be other elements of the ontology. The modifiers can be facets used to identify the status of the values of various properties; they can include actual values of the property referred to by the slot name, a function describing a range of values, or defaults listing the most typical value(s) for the given property of the given concept. The fillers or values can be expressed as semantic constraints (selectional restrictions); they refer to ontological concepts (and their subclasses) from which the fillers of the value and default facets must be selected. A filler can be a string, symbol, number, or a (numerical or symbolic) range. A symbol in a filler can be an ontological concept, signifying that the actual filler can be either the concept in question or any of the concepts that are defined as its subclasses. Symbolic (disjunctive or conjunctive) value sets may be given. Numbers, numerical ranges, and symbolic ranges are also legal fillers. One syntactic convention in the representation is to prepend an ampersand (&) to symbolic value set members in order to distinguish them from ontological entity names.

9.2 Basic ontological modeling choices

The specific development of an ontology may vary from person to person; there is unlikely to be agreement for some time on the precise elements to be included. However, a specific example is instructive. In Carlson and Nirenburg, the concepts OBJECT, EVENT, and PROPERTY are SUBCLASSES of the concept ALL, which serves as the root of the network:

(ALL (SUBCLASSES (value +property +object +event)))

In DIMAP, this root is implemented as the entry #all in DICT4, with the subclasses identified as INSTANCES. The SUBCLASSES property and its inverse, IS-A, are the major classifying relations in the world model. (In general, each relation has an inverse.) The three subclasses here are interpreted as a disjunctive set. To override this default interpretation, it is necessary to prefix the set with "and" or "or". Properties are described with two special slots, domain and range. These are special properties that apply to other properties and specify the beginning and end points, respectively, of the links that the properties represent. Thus, we can have the following entry (see #is-a in DICT4):

(IS-A (DOMAIN +all) (RANGE +all))

When the value for a slot is the name of an ontological concept X, its semantics is "X and all entities that are descendants of X in the ontology." The filler +all in both slots of the #is-a frame stands for any concept in the ontology. Based on the semantic constraint on the filler of the range slot in a property, properties are classified into two large classes--attribute and relation. Relations have references to concepts in their range slots; attributes have references to values from value sets.
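A frame of this kind can be sketched as a nested mapping in which each slot carries facets such as value, sem (the semantic constraint), and default. The Python below paraphrases the ALL and IS-A examples above, plus one assumed property frame, and is only an illustration, not the knowledge-representation language itself:

    # Frames as nested mappings: slot -> facet -> filler(s).  Illustrative only.
    FRAMES = {
        "all":  {"subclasses": {"value": ["property", "object", "event"]}},
        "is-a": {"domain": {"sem": ["all"]}, "range": {"sem": ["all"]}},
        "age":  {"domain": {"sem": ["object"]}, "range": {"sem": ["number"]},
                 "default": {"value": [0]}},        # assumed attribute-like property
    }

    def classify_property(name):
        """Relations have concepts in their range; attributes have value sets
        (here, anything that is not itself a frame in the ontology)."""
        rng = FRAMES[name]["range"]["sem"]
        return "relation" if all(f in FRAMES for f in rng) else "attribute"

    print(classify_property("is-a"))   # relation
    print(classify_property("age"))    # attribute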
Objects can typically have identifiable parts or constitute a part of some other objects, can belong to somebody, can be at a specifiable location and can have the time of its coming into existence specified. Events can have identifiable component events or can be a component of another event, can take place at specifiable times and/or in specifiable locations, can be caused by another event and cause another event, can have effects and can be instantiatable only provided certain conditions are met. The basic formats for their frames are (see also #object and #event in DICT4): (OBJECT (IS-A +all) (HAS-AS-PART +object) (PART-OF +object) (BELONGS-TO +human +organization) (AGE > 0) (measuring-unit +year) (LOCATION +place)) (EVENT (IS-A +all) (SUBEVENTS +event) (SUBEVENT-OF +event) (TIME > 0) (measuring-unit +second) (LOCATION +place) (CAUSED-BY +event) (CAUSES +event) (PRECONDITION +event) (EFFECT +event)) Case relations in the ontology are different from case relations in a grammar, that is, they do not describe the predicate- argument structure of the verbs of a particular language. Rather, they are conceptual roles typically associated with events and objects. (Notwithstanding the separation from a grammar--and as was described in the last section, a lexical entry that refers to an ontological entry indicates how the syntactic structure maps into these case roles.) Case roles must have a domain and a range. The domain specifies the types of frames in which a particular case-role can occur as a slot, while the range specifies what the fillers of the slot can be. An example of a case role frame is: (AGENT (IS-A +case-role) (DOMAIN +event) (RANGE +animal +force (default +intentional-agent)) (INVERSE +agent-of) (DEFINITION "the entity that causes or is responsible for an action")) The entry for #agent in DIMAP-2 in DICT4 has a SUPERCONCEPT link to #case-role in the ontology. The domain and range slots are entered as FEATURES, while the inverse slot is entered as a ROLE. The entries for #is-a, #object, #event, and #agent extend the ontology beyond a simple delineation of the objective world. The inclusion of these metalinguistic entities makes it possible to characterize a great deal of linguistic processing itself. Their inclusion imposes a discipline on the ontology to ensure that it is self-consistent. This is very important. The basic inventory of case roles included by Carlson and Nirenburg are: agent, theme, experiencer, beneficiary, instrument, location, source, goal, and path (see definitions on pp. 7-8). They can be extended to include: co-agent, co-theme, and "modifying" properties of events (spatiotemporal relations and conditions). Spatiotemporal relations specify the general time and location of an event. (See also Allen, pp. 198-206.) Carlson and Nirenburg next turn to a breakdown of the top-level object and event subtrees. Objects are broken down into physical [discrete (animate, inanimate), mass-like, places], mental [abstract, representational-objects (mathematical objects, language- related objects, icons, pictorial objects)], and social [geopolitical entities, organizations]. Events are subdivided similarly into mental [cognitive, communicative, perceptual], physical [change-location, perceptual], and social [communicative (speech-act)]. Further detail is provided for the perceptual events to demonstrate the processes of concept identification and delineation of conceptual boundaries. 
This generally involves a careful examination of what might be a discrete and exhaustive subset of a particular concept. One result is that different case roles are assigned to each subconcept; this is ultimately the distinguishing factor. An entry in the ontology can include properties other than those that might be directly considered as linguistic in nature. Conditions specify the causal and intentional structure of events in terms of preconditions and postconditions, and include relations such as purpose, cause, effect, presupposition. With the addition of these slots, the role of the ontology moves beyond computational linguistics into knowledge representation. These slots serve as the basis for conceptual information processing as described by Schank, including the establishment of causal structures (Schank) and scripts, plans, and goals (Schank and Abelson). An essential component to enabling these relations to serve larger knowledge representation is the establishment of a formalism for representing and manipulating these conceptual structures. As described by Meyer et al. and Nirenburg and Defrise, there are three categories of relations, each with their types and subtypes: -- Domain relations (connecting events, states, and objects), including causal (volitional, nonvolitional, reason, enablement, purpose, and condition), conjunction (addition, enumeration, adversative, concessive, and comparison), alternation (inclusive-or and exclusive-or), coreference, temporal (at, after, and during), and spatial (in-front-of, left-of, above, in, on, and around); -- Textual relations (connecting elements of text), including particular, reformulation, and conclusion; and -- Intention-domain relations (connecting speaker intentions to the events described in the text). Each subtype (e.g., #enablement) would be a main entry (in an ontology or DIMAP), linked through its type (e.g., #causal) to its category (e.g., #domain-relations), and to the overarching ontology entry (that is, #relations). Each relation has a set of arguments, which can be named (such as is the case for arguments like agent or theme) or simply identified by position (such as first or second, with or without any positional significance). A relation comes into play through some discourse clue, established by parsing or other reasoning component to trigger the relation. Alternatively, a particular lexical item can overtly result in the filling of a slot in an entry linked to a relation. This would be accomplished in DIMAP by having a SUPERCONCEPT link to a relation, with one or more features, identified by name or position as the feature name and a variable name as the feature value. The lexical item can have slots for particular preconditions, effects, and subevents. These slots will have values that are other entries in the ontology--what Carlson and Nirenburg call ontological instances. These ontological instances will be linked somewhere in the ontology, for example, to script names, that will be activated thereby with slots filled as a result of the activation of the lexical item. These slots will be filled starting with the lexical item, into some place in the ontology, thence downward to ontological instances, and then upward to scripts or other reasoning components. The sample entry #teach in DICT4 provides an example of how this would work. 
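As a rough sketch of this chain (lexical item, to ontological entry, downward to ontological instances, and upward to scripts), the Python below uses hypothetical frame and slot names that loosely parallel the #teach entry walked through next; the dotted frame.slot paths anticipate the coreference device described there, and none of this is the DICT4 encoding itself:

    # Ontological entry with precondition/subevent slots whose values are
    # ontological instances; instance slots refer back via "frame.slot" paths.
    ONTOLOGY = {
        "#teach":          {"agent": "$subj", "theme": "$obj",
                            "precondition": ["#teach-know-1"],
                            "subevents": ["#teach-describe"]},
        "#teach-know-1":   {"experiencer": "#teach.agent", "theme": "#teach.theme"},
        "#teach-describe": {"agent": "#teach.agent", "theme": "#teach.theme"},
    }

    def resolve(path, bindings):
        """Follow a 'frame.slot' path, then any syntactic variable binding,
        so that coreferenced slots end up with the same filler."""
        while isinstance(path, str) and "." in path:
            frame, slot = path.split(".", 1)
            path = ONTOLOGY[frame][slot]
        return bindings.get(path, path)

    bindings = {"$subj": "the tutor", "$obj": "algebra"}
    print(resolve("#teach-know-1.experiencer", bindings))   # the tutor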
The ontological entry #teach is triggered by the word "teach" or some other lexical item (e.g., "instruct"), whose entries include features for such slots as agent, theme, and location linked to syntactic variables. The ontological entry #teach also has feature slots that include preconditions, postconditions, and subevents. The values for these slots are ontological instances, each of which has its own ontological entry, in this case, #teach-know-1, #teach-know-2, #teach-describe, #teach-request-info, and #teach-answer. Each of these instances has reference to other ontological classes or scripts through its SUPERCONCEPT link, but also has links to the ontological entry for #teach via its features. Their feature slots are the usual types of semantic relations (experiencer, theme, agent), but their values are unusual in that they reference features of the ontological entry for #teach. Thus, for example, the experiencer slot of #teach-know-1 has the value #teach.agent. So, whatever value is attached to the agent slot of #teach is coreferenced to #teach-know-1. Finally, the SUPERCONCEPT links indicate which other ontological entries or scripts are triggered in developing the representation of a particular piece of knowledge.

Complex events include named sets of component events; the frame for a complex action therefore includes a slot for subevents. The representation of a complex action is typically a set of frames, not a single frame. Conditions and components are described through a reference to particular instances of certain events. These component actions are not just any instances of their respective types but instances specifically constrained in the ways necessary for the description of the main event. Even though their descriptions contain a reference to their class, their major allegiance is to the concept in whose definition they appear. Introducing ontological instances is parallel to further constraining the values of some fillers. The difference between the two approaches is that a modified concept introduced as an ontological instance can be referred to explicitly and lexically, whereas a concept that is merely further constrained as a filler value cannot. Ontological instances also allow paths to be used, with the notation frame.slot meaning "the filler of the given slot in the given frame." This achieves coreferentiality.

Information recorded in the ontology relates to all objects, events, and properties, but not to any particular instance of any such class. Developing the representation for and reasoning about entities in the world always involves dealing with instantiations of these ontological types. The lexicon thus provides static knowledge about the ontology and text meaning representation fragments. During analysis (parsing), the source text is converted into this meaning representation based on world and contextual knowledge (including pragmatics and discourse).

9.3 WordNet - An ontological database

WordNet (Miller et al.) is an on-line lexical reference system in which nouns, verbs, and adjectives are organized into synonym sets, each representing one concept and linked by different relations. This database, while not computationally oriented, can be used as the basis for seeding data into an ontology of the type described in the previous two sections. In addition, several principles and observations that emerged in its construction are important in the design and implementation of a computational ontology.
Each synonym set consists of a set of words and a periphrastic expression (definition). The overall structure of WordNet is a directed graph of nodes, with each node representing a concept, and with primitive nodes labeled by several lexicalizations (words), corresponding to the structure described in Litkowski.

-- Nouns

Miller describes the nouns in WordNet. He notes that nouns are usually conveniently defined with a superordinate and distinguishing features. WordNet is organized into a lexical inheritance hierarchy of synonym sets linked by pointers to the superordinate and by pointers involving the distinguishing features of attributes (modification), parts (meronymy), and functions (predication), although only the meronymic relations are implemented. The nouns are partitioned into 25 "beginners" as follows: {act, action, activity}, {animal, fauna}, {artifact}, {attribute, property}, {body, corpus}, {cognition, knowledge}, {communication}, {event, happening}, {feeling, emotion}, {food}, {group, collection}, {location, place}, {motive}, {natural object}, {natural phenomenon}, {person, human being}, {plant, flora}, {possession}, {process}, {quantity, amount}, {relation}, {shape}, {state, condition}, {substance}, and {time}. These categories can be grouped into a hierarchy using {thing, entity} as the top, and {living thing, organism} and {non-living thing, object} as the second level. The hierarchies involved in these groups seldom go more than 10 or 11 levels deep, and the deeper levels consist mostly of technical terms. Distinguishing features are attached to each level and are inherited. Attributes are given by adjectives, parts by nouns, and functions by verbs. Short explanatory phrases are attached to each synonym set. Attributes associated with a noun are reflected in the adjectives that can normally modify it. Several part-whole relations are observable, including component-object, member-collection, portion-mass, stuff-object, feature-activity, place-area, and phase-process. A functional feature of a noun describes what instances of the concept normally do or what is normally done with or to them; nouns play various semantic roles as arguments of the verbs with which they co-occur (instruments, materials, products, containers, etc.). Functional information should be included by pointers to verb concepts. (See also the discussion of lexical relations in section 10 of this chapter.)

-- Adjectives

Gross and Miller describe the adjectives in WordNet. Adjectives modify, modulate, or elaborate the meanings of nouns and verbs. In WordNet, it is assumed that adjectives modifying verbs are simply adjectives to which the suffix -ly has been added. Syntactically, adjectives can appear immediately before the noun they modify (attributive or prenominal position) or in the predicate of a sentence after a copular verb (predicative position). An adjective is usually marked when it can fill only particular positions. A qualifying adjective (heavy, old) gives a value to a particular attribute of a noun (weight, age). The organizing principle of adjectives is the antonymy relation; this occurs within a dimension based upon the attribute. In WordNet, synonym sets for adjectives are related to one another on the basis of a "focal adjective," one that has a direct antonym in another adjective. Other adjectives that seem to have no direct antonym are related to particular focal adjectives by a 'similarity' pointer. A large class of nonpredicative adjectives (musical, atomic) seems to play a role similar to that of a modifying noun.
These relational adjectives can be joined to nouns but not with qualifying adjectives, are not gradable, cannot be nominalized, and do not have direct antonyms. These adjectives are maintained in WordNet by pointers to the corresponding nouns. Gradation is an important semantic relation organizing lexical memory for adjectives; however, this relation is not included in WordNet. Since most attributes have an orientation, they tend to be anchored at a point of origin which is the expected or default value; deviation from this default is called the marked value of the attribute; it is not coded in WordNet. Color adjectives can be graded, nominalized, and conjoined with other qualifying adjectives. Only one color attribute is coded in WordNet--light/dark and white/black; the opposition chromatic/achromatic is used to introduce the names of color. Finally, adjectives are selective about the nouns they modify (except in figurative or idiomatic use), so that the noun must have the attribute whose value is expressed by the adjective; these are assumed to be computed as needed and are not prestored in WordNet. (Further discussion of these issues is presented below under lexical relations.) Qualifying adjectives are coded in WordNet into bipolar clusters, with the direct antonyms listed as the "head synonym set" of two clusters, one headed by each, giving pointers to the similarity set of each. Each of the members of the similarity set has its own synonym set, with a link back to the direct antonym set. If an adjective has a syntactic limitation (to prenominal, immediately postnominal, or predicative position), a code for this limitation is given. -- Verbs Fellbaum describes the semantic network of English verbs in WordNet. Verbs change meaning easily, based on the nouns with which they co-occur or by different elaborations of one or two common core components shared by most senses of the verb. Verbs are divided into 15 groups, largely on the basis of semantic criteria, including verbs of bodily care and functions, change, cognition, communication, competition, consumption, contact, creation, emotion, motion, perception, possession, social interaction, weather verbs, and verbs referring to states. These groups reflect the major conceptual categories event and state. These groups derive their names from the topmost verbs, or 'unique beginners,' which head these groups. These topmost verbs resemble 'core components', the unelaborated concepts from which the verbs constituting the semantic field are derived via semantic relations. Within each verb group are hundreds of 'synsets' or closely synonymous sets of verbs, often with a periphrastic expression (rather than a lexicalized synonym) that gives subtle meaning differences and selectional restrictions. The expression shows the way in which the verb has become lexicalized by showing constituents that have been conflated in the verb. WordNet is not organized with sets of components, but rather through semantic relations linking verbs to each other. (This relation can be expressed explicitly; the components can probably also be expressed explicitly. Together, this explication can serve to act as seeds into an explicit computational inheritance hierarchy. As given, WordNet cannot be used computationally.) Some semantic relations (or primitive components) that are embodied in WordNet include cause, opposition, path, and manner. The components constitute subpredicates and correspond to root verbs, or topmost 'unique beginners,' heading semantic fields. 
(A given verb may have two such components.) This componential analysis can be viewed in terms of entailment, in that a verb V1 that is a component of another verb V2 must be entailed by V2. The semantic relations among verbs in WordNet all interact with entailment. Lexical entailment is a semantic relation between two verbs V1 and V2 that holds when the sentence Someone V1 logically entails the sentence Someone V2. Lexical entailment is a unilateral relation. Negation reverses the direction of entailment. The converse of entailment is contradiction. The entailment relation between verbs resembles meronymy between nouns, but meronymy is better suited to nouns than to verbs. Any acceptable statement about part-relations among verbs always involves the temporal relation between the activities that the two verbs denote. One activity or event is part of another activity or event only when it is part of, or a stage in, its temporal realization. Some activities can be broken down into sequentially ordered subactivities. These are complex activities that are said to be mentally represented as scripts. They tend not to be lexicalized. The analysis into lexicalized sub-activities is not available for the majority of simple verbs in English. Yet there are some and the reason lies in the kinds of entailments that hold between the verbs. The sets of verbs related by entailment have in common that one member temporally includes the other. A verb V1 will be said to include a verb V2 if there is some stretch of time during which the activities denoted by the two verbs co-occur, but no time during which V2 occurs and V1 does not. If there is a time during which V1 occurs but V2 does not, V1 will be said to properly include V2. The sentence frame used to test hyponymy between nouns, An x is a y, is not suitable for verbs. The semantic distinction between two verbs is different from the features that distinguish two nouns in a hyponymic relation. For verbs, lexicalization involves many kinds of semantic elaborations across different semantic fields. The many different kinds of elaborations that distinguish a 'verb hyponym' from its superordinate have been merged into a manner relation (dubbed troponymy). The troponymy relation between two verbs can be expressed by the formula To V1 is to V2 in some particular manner. 'Manner' is interpreted here very loosely. Troponyms can be related to their superordinates along many semantic dimensions. Subsets of particular kinds of manners tend to cluster within a given semantic field. Troponymy is a particular kind of entailment, in that every troponym V1 of a more general verb V2 also entails V2. The activities referred to by a troponym and its more general superordinate are always temporally co-extensive. Troponymy therefore represents a special case of entailment: pairs that are related by troponymy are also always temporally co-extensive and related by entailment. Verbs related by entailment and proper temporal inclusion cannot be related by troponymy. Verb Taxonomies Verbs cannot easily be arranged into the kind of tree structures onto which nouns are mapped. Within a single semantic field it is frequently the case that not all verbs can be grouped under a single unique beginner; some semantic fields must be represented by several independent trees. Motion verbs, for example, have two top nodes, {move, make a movement} and {move, travel}. 
Verb hierarchies tend to have a more shallow, bushy structure than nouns; in few cases does the number of hierarchical levels exceed four. Moreover, virtually every verb taxonomy shows a bulge, that is, a level far more richly lexicalized than the other levels in the same hierarchy. In most hierarchies, the level below the most richly lexicalized one has few members. For the most part, they tend not to be independently lexicalized, but are compounded from their superordinate verb and a noun or noun phrase. As one descends in a verb hierarchy, the variety of nouns that the verbs on a given level can take as potential arguments decreases. This seems to be a function of the increasing elaboration and meaning specificity of the verb. Opposition Relations between Verbs After synonymy and troponymy, opposition is the most frequently coded semantic relation. Much of the opposition is based on the morphological markedness of one member of an opposed pair. Some pairs are conceptually opposed, but are not direct antonyms. Many deadjectival verbs formed with a suffix such as -en or -ify inherit opposition relations from their root adjectives. These are, for the most part, verbs of change and would decompose into "become + adjective" or "make + adjective." A variety of negative morphological markers attach to verbs to form their respective opposing members. The semantics of this morphological opposition is not simple negation. Some pairs are gradables (terms are points on a scale) that can be modified by degree adverbs. Some direct antonyms are associated with each other rather than with verbs that are synonyms of their respective opposites and that express the same concept as that opposite. These pairs are illustrative of an opposition relation that is found between co- troponyms (troponyms of the same superordinate verb). They constitute an opposing pair because the direction of motion, upward or downward, is opposed or the manners are opposed (slow or fast) in ways that distinguish each troponym from its superordinate. Converses are opposites that do not have a common superordinate or entailed verb; they occur within the same semantic field: they refer to the same activity, but from the viewpoint of different participants. Most antonymous verbs are stative or change-of-state verbs that can be expressed in terms of attributes. There are many opposition relations among stative verbs: live/die, exclude/include, differ/equal, wake/sleep. Many verb pairs are not only in an opposition relation, but also share an entailed verb (hit and miss entail aim). These verbs are not related by temporal inclusion. The relation between the entailing and the entailed verbs is one of backward presupposition, where the activity denoted by the entailed verb always precedes in time the activity denoted by the entailing verb. Entailment via backward presupposition also holds between certain verb pairs related by a result or purpose relation. A verb V1 that is entailed by another verb V2 via backward presupposition cannot be said to be a part of V2. (The set of verbs related by entailment can be classified exhaustively into two mutually exclusive categories on the basis of temporal inclusion.) Some opposition relations interact with the entailment relation in a systematic way. One member of these pairs constitutes a 'restitutive'; this kind of opposition also always includes entailment, in that the restitutive verb always presupposes what one might call the 'deconstructive' one (damage/repair). 
Many reversive un- or de- verbs also presuppose their unprefixed, opposed member (tie/untie). The Causal Relation The causative relation picks out two verb concepts, one causative (like give), the other what might be called the 'resultative' (like have). The subject of the causative verb usually has a referent that is distinct from the subject of the resultative; the subject of the resultative must be an object of the causative verb, which is therefore necessarily transitive. The causative member of the pair may have its own lexicalization, distinct from the resultative. English does not have many lexicalized causative-resultative pairs; it has an analytic, or periphrastic, causative, formed with cause to/make/let/have/get to, that is used productively. A periphrastic causative is not semantically equivalent to a lexicalized causative, but refers to a more indirect kind of causation than the direct, lexicalized form. WordNet recognizes only lexicalized causative-resultative pairs. The synonyms of the members of such a pair inherit the cause relation, indicating that this relation holds between the entire concepts rather than between individual word forms only. Unlike entailment, the causation relation is not inherited by the troponyms. Causative verbs have the sense of cause to be/become/happen/have or cause to do. They relate transitive verbs to either states or actions. In both cases, causation can be seen as a kind of change. Many verbs clearly have the semantics of such a causative change, but they do not have lexicalized resultatives. There are many verbs in English that have both a causative and an anticausative usage. Most of them cluster in WordNet among the verbs of change, where many verbs alternate between a transitive causative form and an intransitive anticausative (or unaccusative, or inchoative) form. Most anticausative verbs imply either an animate agent or an inanimate cause. A few verbs are compatible only with an inanimate cause. The causative relation also shows up systematically among the motion verbs. Causation is a specific kind of entailment: if V1 necessarily causes V2, then V1 also entails V2. The entailing verb denotes the causation of the state or activity identified by the entailed verb. The entailment between these verbs is also characterized by the absence of temporal inclusion. But unlike backward presupposition, the entailed verb precedes the entailing verb in time: A must first bequeath something to B before B owns it. The causative relation is unidirectional. Syntactic Properties and Semantic Relations Considering research that analyzes the constraints on verbs' argument taking properties in terms of their semantic make-up, based on the assumption that the distinctive syntactic behavior of verbs and verb classes arises from their semantic components (see particularly the discussions of Levin in the previous sections), WordNet does not incorporate all of a speaker's knowledge about semantic and syntactic properties of verbs. To cover at least the most important syntactic aspects of verbs, therefore, WordNet includes for each verb synonym set one or several sentence frames, which specify the subcategorization features of the verbs in the set by indicating the kinds of sentences they can occur in. 
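A toy sketch of how verb synsets, relation pointers, and sentence frames might be stored and queried for the kind of exploration described next may be useful; the synsets, pointers, and frame labels below are assumed examples in Python and bear no relation to WordNet's actual file format:

    # Toy verb synsets: members, relation pointers, and sentence-frame labels.
    SYNSETS = {
        "move.v.1":  {"members": ["move", "travel"], "troponym_of": None,
                      "frames": ["Somebody ----s", "Something ----s"]},
        "walk.v.1":  {"members": ["walk"], "troponym_of": "move.v.1",
                      "frames": ["Somebody ----s", "Somebody ----s PP"]},
        "march.v.1": {"members": ["march"], "troponym_of": "walk.v.1",
                      "frames": ["Somebody ----s", "Somebody ----s PP"]},
        "snore.v.1": {"members": ["snore"], "troponym_of": None,
                      "entails": "sleep.v.1", "frames": ["Somebody ----s"]},
        "sleep.v.1": {"members": ["sleep"], "troponym_of": None,
                      "frames": ["Somebody ----s"]},
    }

    def shares_frames(a, b):
        """Sentence frames two synsets have in common."""
        return set(SYNSETS[a]["frames"]) & set(SYNSETS[b]["frames"])

    def superordinates(synset):
        """Follow troponym pointers upward (troponymy implies entailment)."""
        chain = []
        while SYNSETS[synset]["troponym_of"]:
            synset = SYNSETS[synset]["troponym_of"]
            chain.append(synset)
        return chain

    print(shares_frames("walk.v.1", "march.v.1"))
    print(superordinates("march.v.1"))   # ['walk.v.1', 'move.v.1']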
One can search among the verbs for the kinds of semantic-syntactic regularities found in the literature, search for all the synonym sets that share one or more sentence frames and compare their semantic properties, or start with a number of semantically similar verb synonym sets and see whether they exhibit the same syntactic properties. An exploration of the syntactic properties of co-troponyms occasionally provides the basis for distinguishing semantic subgroups of troponyms.

10 LEXICAL RELATIONS

The type hierarchies and ontologies described in the previous sections instantiate the principal relations between entries in a computational lexicon. In addition, by following principles of lexical inheritance, these hierarchies will reduce duplication in representation. By caching the top-level nodes of these hierarchies, a fully-specified entry can be quickly reconstructed during computation. These hierarchies, however, do not exhaust the range of relations that might exist between lexical entries and that might be used during computation. Several formalisms are explored in this section to provide further options for incorporation into the lexicon. (The user is also strongly encouraged to consider the extensive discussions of relations in Evens, Cruse, and Sowa. Relations play a key role in both computational linguistics and knowledge representation.)

10.1 Semantic networks and conceptual graphs

The type hierarchies and ontologies represent an initial linking of the entries into a semantic network. Further linking and specification of a semantic network in DIMAP can be achieved using the role field of the entries. Some of this linking was described in the first section of this chapter in connection with the word aphasia. A further demonstration of the linking is given in the entry for #action, where cases associated with the entry are identified as roles and where the links are equated to selectional restrictions.

-- Conceptual graphs

Polovina and Heaton provide a simplified introduction to conceptual graphs based on the work of Sowa (1984). Conceptual graphs are based upon the general form [CONCEPT_1] -> (RELATION) -> [CONCEPT_2], which may be read as "A RELATION of a CONCEPT_1 is a CONCEPT_2". Thus, [Mammal] -> (part) -> [Trunk] reads "A part of a mammal is a trunk." All concepts have referents, which refer to a particular individual of that concept. The concept [Mammal: Clyde] indicates that Clyde is a mammal. A concept that appears without an individual referent has a generic referent, [: *]. The generic concept [Trunk] may take on an individual referent with a unique number, as in [Trunk: #1234]. Larger graphs may be constructed. Sowa presents a conceptual catalog including such relations as agent, object, instrument, part, and material. In representing a complex conceptual graph, it may be necessary for a concept to have several relations attached to it, with the second concepts having further relations. Thus, we can have the graph:

[Spoon] -
   (instrument) <- [Eat] -
      (object) -> [Walnut] -> (part) -> [Shell: *n]
      (agent) -> [Monkey],
   (material) -> [Shell: *n].

This represents "A monkey eating a walnut with a spoon made out of the walnut's shell." The hyphen indicates that the relations of a concept are continued on a subsequent line. The comma terminates the part of a graph that relates to the last hyphen. Any part of the graph following the comma relates directly back to the hyphen before the last hyphen. The * is a coreferent marker for concepts. The period terminates the whole graph.
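For computation, the linear notation above can be captured as a set of concept-relation-concept arcs plus a table of referents; the small Python encoding of the walnut graph below is only one possible representation, not a standard conceptual-graph API:

    # The monkey/walnut/spoon graph as (concept, relation, concept) arcs.
    # Referent markers (*n, #1234) become shared node identifiers.
    NODES = {"e1": "Eat", "m1": "Monkey", "w1": "Walnut",
             "sh1": "Shell", "sp1": "Spoon"}
    ARCS = [
        ("e1", "agent", "m1"),
        ("e1", "object", "w1"),
        ("e1", "instrument", "sp1"),
        ("w1", "part", "sh1"),
        ("sp1", "material", "sh1"),   # *n coreference: the same Shell node
    ]

    def neighbours(node, relation):
        """All concepts reachable from `node` along `relation`."""
        return [NODES[tgt] for src, rel, tgt in ARCS if src == node and rel == relation]

    print(neighbours("e1", "instrument"))   # ['Spoon']
    print(neighbours("w1", "part"))         # ['Shell']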
Note that some of this conceptual graph is factual and not definitional. The factual part should not be represented in a dictionary. Rather, we might expect that the concepts that make up this graph would be composed during the process of representing a sentence or other phrase. In DIMAP, the basic conceptual graph that is considered definitional would have an entry for concept_1. This entry would have a ROLE relation, with a link to concept_2. In conceptual graphs, type labels fall into a type hierarchy. Thus, Mammal < Animal. These relationships are represented as before, using the SUPERCONCEPT links in DIMAP, and can fit within an ontology as described in the previous sections. These conceptual graphs have certain properties (based on the lattice that is formed) and are suitable for further computation to identify generalizations, combination, and inference (for which, see the references).

Velardi et al. use conceptual graphs to provide the formalism for collocative meaning representation. Conceptual meaning provides the cognitive content of words--it can be expressed by features or primitives (as described in previous sections). Collocative meaning provides the associations between words or word classes--describing the uses of a word, but not attempting an explanation of word associations in terms of meaning relations between a lexical item and other items or classes. Collocative meaning corresponds to the syncategorematic distinction in psychology: such words are "almost entirely defined by their pattern of use." Humans may more naturally describe word senses by their syntactic feature and structure characteristics and their relations with other words than by conceptual kinship and other internal features. In principle, the inferential power of collocative, or surface, meaning representation is lower than that of conceptual meaning. Nonetheless, representing word senses and sentences with surface semantics is useful for many NLP applications. Velardi et al.'s system is able to acquire syncategorematic concepts by learning and interpreting patterns of use from text exemplars. The input to the system is:

-- a list of syntactic collocates (subject-verb, verb-object, noun-preposition-noun, noun-adjective, etc., extracted through morphologic and syntactic analysis of the selected corpus). This can be accomplished with syntactic parsing of sentence parts, using some context-dependent heuristics to cut sentences into clauses.

-- a semantic bias. The semantic bias is the kernel of any learning algorithm, as no system can learn much more than what it already knows. It consists of: a domain-dependent concept hierarchy (a many-to-many mapping from words to word sense names and an ordered list of conceptual categories, to whatever extent a type hierarchy has been developed); a set of domain-dependent conceptual relations, and a many-to-many mapping between syntactic relations and the corresponding conceptual relations; and a set of coarse-grained selectional restrictions on the use of conceptual relations, represented by concept-relation-concept (CRC) triples.

The system produces two types of output: (1) a set of fine-grained CRCs that are clustered around concepts or around conceptual relations; and (2) an average-grained semantic knowledge base, organized in CRC triples. Fine-grained CRCs are those in which concepts directly map into content words (e.g., [COW] <- (PATIENT) <- [BREED]). These CRCs are true because they are observed in the domain subworld.
Average-grained CRCs are those in which concepts are ancestors of content-word concepts. These CRCs are typically true, but they may have a limited number of exceptions observed in the domain sublanguage (e.g., [ANIMAL] <- (PATIENT) <- [BREED] is typically true, even though breeding mosquitoes is quite odd). Coarse-grained CRCs are those in which concepts are at a higher level in the taxonomy (e.g., [ACTION] -> (BENEFICIARY) -> [ANIMATE_ENTITY]). They state necessary, but not sufficient, conditions on the use of conceptual relations. An algorithm is given to acquire syncategorematic knowledge on concepts. This algorithm is based on machine learning principles of the type enunciated by Langley. Several research problems are described, along with an evaluation of the adequacy of the technique. The research is compared to research on concept formation and lexical acquisition. To represent CRCs in DIMAP, for example,

[ACTIVITY] -> (LOCATION) -> [PLACE]
[CHANGE] -> (FINAL_STATE) -> [PRODUCT]
[ARTIFACT] -> (MATTER) -> [MATTER]
[ACTIVITY] -> (FIGURATIVE_LOCATION) -> [AMBIT]
[FARMING] -> (LOCATION) -> [GREENHOUSE]
[AGRICULTURAL_ACTIVITY] -> (LOCATION) -> [BUILDING_FOR_CULTIVATION]

the ROLE field should be used. The first member of each CRC would be the main entry, with the relation coded as the role name, and the third member of the relation would be entered as a role link.

10.2 Collocational functions

The collocational relations identified by Velardi et al. focus on the semantic categories of their arguments. A slightly different view focuses on the relations themselves. Mel'čuk, in developing the combinatory facet of the explanatory and combinatory dictionary, is concerned with identifying the lexical collocations of the headword of a lexical entry. These collocational functions include both syntagmatic and paradigmatic information (about synonyms, superordinates, antonyms, and conversives) and information about derivatives and compounds. The arguments of these lexical functions are themselves lexical units and therefore will also be dictionary entries. The lexical combinatoric describes the syntax and meaning of those idiomatic or semi-idiomatic expressions containing the lexeme. There are about 50 elementary lexical functions (with specific syntactic roles) whose terms, taken either alone or in combination, can express the meanings of many semi-idiomatic expressions. Lexical functions also include a set of "substitution functions," which express semantic or syntactic relations between lexemes. Furthermore, these functions can have semantic constraints (selectional restrictions) on their arguments. The functions describe relations to other lexemes; such functions include the ISA, AKO, and instance links. In DIMAP, these functions can be represented in ROLE structures, with the function name identified as the role name, the argument to the function as the entry word, and the value of the function as the role value. The value of the function should be another entry in the lexicon. The following lexical functions are in general use.
Sample Lexical Functions

A0 - adjective derived from entry word: A0(dog) = doggy
A1, A2 - typical adjective for numbered participant: A1(suspicion) = full of
Able1, Able2 - ability of numbered participant: Able1(fear) = fearful; Able2(fear) = fearsome
Adv0 - adverb from entry word: Adv0(happy) = happily
Adv1, Adv2 - adverb from numbered participant: Adv1(fear) = fearfully
Anti - antonym (exact or near): Anti(happy) = sad
Bon - standard praise for entry: Bon(advice) = sound
Caus - cause: Caus(sit) = set
Centr - center of: Centr(city) = heart
Cont - continue: Cont(go) = keep, keep on
Contr - non-antonymic contrast: Contr(chair) = table
Conv (with participant-number subscripts) - conversive (opposite where participants switch roles): Conv321(buy) = sell
Culm - culmination of: Culm(ability) = peak
Degrad - degradation of: Degrad(marriage) = fall apart
Epit - standard epithet (representing a part of entry): Epit(body) = physique
Excess - excessive functioning of: Excess(eyelid) = flutter
Fact0, Fact1, Fact2 - verb meaning "the realization of," with the entry as grammatical subject and the participants as objects: Fact0(suspicion) = confirm
Figur - metaphor of the entry: Figur(love) = fire
Fin - stopping of: Fin(fly) = land
Func0, Func1, Func2 - verb which takes the entry as subject of the first participant, second participant, etc.: Func1(idea) = come to
Gener - generic word: Gener(blue) = color
Germ - the core of: Germ(problem) = crux
Imper - the command associated with: Imper(care) = Watch out!
Involv - verb meaning non-participant involvement: Involv(scent) = fill
Incep - the beginning of: Incep(fly) = take off
Labor (with participant-number subscripts) - verb which takes the numbered participants as subject and object and the entry as secondary object: Labor12(esteem) = hold (i.e., x holds y in esteem)
LabReal (with participant-number subscripts) - verb meaning "the realization of," with the first two participants as subject and object and the entry as secondary object (a combination of Labor and Real): LabReal12(mind) = bring to (x brings y to mind)
Liqu - the elimination of: Liqu(group) = disband
Loc-in - preposition for "in": Loc-in(house) = in
Loc-ab - preposition for "from"
Loc-ad - preposition for "to"
Magn - intensity: Magn(hatred) = deep
Manif - is manifest in, with the entry as subject: Manif(tear) = well up
Minus - less of: Minus(wind) = slacken
Mult - a regular aggregate of: Mult(paper) = ream
Nocer - to harm, injure, or impair: Nocer(access) = cut off
Obstr - to function with difficulty: Obstr(justice) = obstruct
Oper1, Oper2, Oper3 - verb which takes the numbered participant as subject and the entry as object: Oper1(party) = throw
Perm - permit or allow: Perm(go) = let
Plus - more of: Plus(joy) = grow
Pos1, Pos2, Pos3 - positive attributes of numbered participants: Pos1(game) = skilled
Pred - copula for nouns and adjectives: Pred(prey) = fall, be
Propt - preposition for "because of": Propt(greed) = out of
Prox - to be on the verge of: Prox(disaster) = on the brink of
Qual1, Qual2, Qual3 - highly probable qualities of numbered participants: Qual1(theft) = sneaky
Real1, Real2, Real3 - verb meaning "to realize," with the entry as object and the numbered participant as subject: Real1(ambition) = realize
S0 - noun for entry: S0(hate) = hatred
S1, S2, S3 - typical noun for numbered participant: S2(crime) = victim
S-inst - typical instrument
S-loc - typical location: S-loc(house) = yard
S-med - typical means
S-mod - typical mode
S-res - typical result
Sing - one instance of: Sing(paper) = sheet
Son - to emit a typical sound: Son(frog) = croak
Sympt - to be a physical symptom of: Sympt(fire) = smoke
Syn - synonym and near synonym: Syn(happy) = glad
V0 - verb for entry: V0(song) = sing
Ver - true, correct, or proper: Ver(ruling) = fair

In the functions followed by a number (0, 1, 2, or 3), the numbered participant corresponds to an argument identified in the definition
The entry for smell (senses 6 and 8) in the sample dictionary DICT4 shows examples of the lexical functions Gener and Magn. As we have seen, many of these lexical functions can be represented in DIMAP using the SUPERCONCEPT, FEATURE, and INSTANCE structures. However, these functions provide further insights into the use of language and the relationships that may exist between lexical items. They have not yet been well incorporated into computational models, having been viewed primarily as capturing relationships between lexical items, but their potential is considerable. Moreover, just as agent was made a lexical entry in Carlson and Nirenburg's ontology, so too can these lexical functions be made into lexical entries.

10.3 Lexical subordination and qualia structures

Lexical items are frequently related to one another by derivation. The process of lexical extension involves the ability of a lexical item in one semantic class to take on an extended use in a second, existing class; the second use generally stands in some specific relationship to the first. Thus, a verb of attachment is used as a verb of creation; this might be tied to the fact that the process of attaching can be the means of creating something. This is an instance of a more general pattern of extension that arises when the action denoted by verbs in one class is a means of achieving the action denoted by verbs in the second class. Thus, verbs of gesture frequently show extended uses as verbs of expressing feelings.

Levin and Rapoport introduce the notion of lexical subordination, which they describe as responsible for extended meanings of words: an existing verb productively takes on a new and predictable meaning, that is, a meaning that may be considered to have been derived from an existing meaning of the verb. Some examples are

-- resultative construction: the state denoted by an adjective holds of a noun phrase as a result of the action denoted by the verb, including transitive verbs (hammer the metal flat), unergative intransitive verbs (laugh herself silly), and certain verb-particle constructions (scrape the putty off);

-- conflation of cause, manner, motion, and path components: the extension involves "causing a change of location" (John floated a bottle into the cave);

-- gesture-expression: expressing by means of a gesture, with the transitive use of a class of typically intransitive verbs (smiled her thanks); and

-- "one's way" and "a hole" constructions: results brought about by the action denoted by their verb (explained his way past the guard and kicked a hole in the fence).

These constructions do not simply involve a verb plus preposition, particle, adjective, or noun. Rather, both lexical units make independent contributions to the meaning of the construction, compositionally. Lexical subordination operates at the level of lexical conceptual structure. It takes a verb in its original, or basic, sense and subordinates it under a lexical predicate. In the 'result' construction, the new representation involves the addition of a variable in the new verb sense, a variable not present in the lexical conceptual structure of the original, unsubordinated verb. (The reason the term 'subordination' is used is that the new lexical conceptual structure contains the original sense as a subordinate clause.)
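To make the subordination operation concrete, here is a minimal sketch (hypothetical Python, not DIMAP code and not Levin and Rapoport's notation) in which the lexical conceptual structure of the basic verb sense is embedded as a subordinate clause of the derived resultative sense:

    # Toy lexical conceptual structures (LCS) as nested tuples.
    # Basic sense: x hammers y.
    basic_hammer = ("HAMMER", "x", "y")

    def subordinate(basic_lcs, result_state):
        """Derive the resultative LCS from the basic one.  The derived sense
        adds a result-state variable absent from the original LCS, and the
        original LCS survives as a subordinate 'BY' (means) clause."""
        return ("CAUSE", "x",
                ("BECOME", "y", result_state),   # y comes to be in the result state
                ("BY", basic_lcs))               # ...by means of the basic action

    # "hammer the metal flat"
    print(subordinate(basic_hammer, ("FLAT", "y")))
    # ('CAUSE', 'x', ('BECOME', 'y', ('FLAT', 'y')), ('BY', ('HAMMER', 'x', 'y')))

The predicate names CAUSE, BECOME, and BY are placeholders; the point is simply that the derived sense both contains the basic sense as a subordinate clause and introduces a variable (the result state) that the basic sense lacks.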
Pustejovsky builds upon these notions of lexical subordination by proposing structured forms (or templates) that embody many of the derivational forms within a single sense, rather than having several senses, each with its own set of features. The particular part that is extracted from the single sense depends on the syntactic and semantic context. Thus, when considering the several senses of break, one allowing an agent subject, another having a theme subject, and another allowing an instrument subject, he takes advantage of the fact that each sense involves the central notion of "breaking". To accomplish this, Pustejovsky would give each lexical item a qualia structure, specifying four aspects of its meaning:

-- the relation between it and its constituent parts (constitutive role);
-- that which distinguishes it within a larger domain (its physical characteristics) (formal role);
-- its purpose and function (telic role); and
-- whatever it brings about (agentive role).

This minimal semantic distinction is given expressive force when combined with a theory of event types (the event structure). Since the lexical semantic representation of a word is not an isolated expression, but is in fact linked to the rest of the lexicon, the semantics for a lexical item is integrated through the different qualia associated with a word (the lexical inheritance structure). Finally, part of the meaning of a word is translated from its underlying semantic representation into expressions that are utilized by the syntax (the argument structure).

Argument Structure

The construction of a lexical entry begins with a simple listing of the parameters or arguments associated with a predicate. To this extent, the entry would be constructed in a way similar to that described earlier in this chapter. However, because several senses are to be folded into one, the argument structure will become more sophisticated.

Event Structure

One level of semantic description involves an event-based interpretation of a word or phrase, recursively defined on the syntax, so that it is also a property of phrases and sentences. There are three classes of events: states (eS), processes (eP), and transitions (eT). These events may be decomposed into other events, as needed (hence, a subeventual analysis).

Qualia Structure

Many senses for a word are derived from a base sense, hence implying a richer notion of compositionality. The expressions that behave as arguments to a function are not simple, passive objects, but are active in the semantics. Certain complements add to the basic meaning by virtue of what they denote, through a process of semantic type coercion (a semantic operation that converts an argument to the type that is expected by a function, where it would otherwise result in a type error). Thus, processes can shift their event type to become a transition event, process verbs can participate in a resultative construction, or a subpart or related part of an object can stand for the object itself (metonymy). There is a system of relations that characterizes the semantics of nominals (nouns), where the qualia structure of the noun determines its meaning as much as the list of arguments determines a verb's meaning.
-- Constitutive Role: the relation between an object and its constituents, or proper parts (material, weight, parts and component elements);
-- Formal Role: that which distinguishes the object within a larger domain (orientation, magnitude, shape, dimensionality, color, position);
-- Telic Role: purpose and function of the object (purpose that an agent has in performing an act, built-in function or aim that specifies certain activities); and
-- Agentive Role: factors involved in the origin or "bringing about" of an object (creator, artifact, natural kind, causal chain).

If we begin with a verb's lexical entry specifying the type of its complements, we can search the lexical entry of the noun complements for values matching the specified type. Each of the qualia roles can be viewed as a partial function from a noun denotation into its subconstituent denotations. When one of these functions is applied, it returns the value of a particular qualia role. If the complement does not match and needs to be coerced, type coercion requires the complement to conform to the type specification, and its qualia roles are searched for an appropriate type. If there is none, a type error is produced. There may be several readings available, sometimes resulting in ambiguity.

An example is provided by the words novel and began in DICT4. A sentence like John began a novel is problematic for the word began, which requires that its object describe a transition event. However, the word novel does not fit this type, and so we would require a different sense of the word began; this sense does not exist. By enlarging the representation for the word novel to include different types of roles, the problem can be solved. The entry for novel would have the following components:

    novel ($var0)
        const = narrative ($var0)
        form = book ($var0)
        telic = read (T, $var1, $var0)
        agentive = (*OR* artifact ($var0) write (T, $var1, $var0))

There are several ways of viewing a novel: as constituting a narrative, in the form of a book, to be read or written, and as an artifact of someone's effort. Only when viewed as something to be read or written does novel fit the context. (The 'T' in the telic and agentive positions indicates that these are transition events.) The entry for began would thus have a theme feature with the selectional restriction that it be a transition event.

A more complex example is provided by the entry for the word bake, where the object may change the sense of bake that is selected. For sense 1, the superconcept link would be in the hierarchy involving an agent causing a change of state (when the object is a potato); for sense 2, the superconcept link would be to verbs of creation (when the object is a cake). It is only when the verb interacts with the possible roles of the object that it is determined that a potato is changed or a cake is created.

Once semantic weight is given to lexical items other than verbs, the semantic distinctions that are possible are quite wide-ranging. We can think of certain modifiers as modifying only a subset of the qualia for a noun. Pustejovsky distinguishes the following systems and the paradigms that lexical items fall into:

-- count/mass alternations;
-- container/containee alternations;
-- figure/ground reversals;
-- product/producer diathesis;
-- plant/fruit alternations;
-- process/result diathesis;
-- object/place reversals;
-- state/thing alternations; and
-- place/people alternations.
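Returning to the John began a novel example, the coercion step can be sketched roughly as follows (a toy Python illustration with assumed role names; it is not Pustejovsky's formalism or DIMAP's internal representation):

    # Toy qualia structure for 'novel': role -> value, where event-valued
    # roles are tagged with their event type.
    NOVEL_QUALIA = {
        "const": "narrative",
        "formal": "book",
        "telic": ("transition", "read"),      # a novel is for reading
        "agentive": ("transition", "write"),  # ...and comes about by being written
    }

    def coerce_to_event(qualia, required="transition"):
        """Return the event readings hidden in a noun's qualia roles.
        Raises a type error if no role supplies the required event type."""
        readings = [value[1] for value in qualia.values()
                    if isinstance(value, tuple) and value[0] == required]
        if not readings:
            raise TypeError("no coercion to a %s event is available" % required)
        return readings

    # 'began' demands a transition-event complement; 'novel' supplies two.
    print(coerce_to_event(NOVEL_QUALIA))   # ['read', 'write']

The two readings correspond to beginning to read the novel and beginning to write it, the kind of ambiguity noted above; a noun with no event-valued qualia role would produce the type error instead.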
Lexical Inheritance

The flexibility that arises when a word's meaning can be generated by composition can be placed within a global knowledge base, capturing the inheritance relations between concepts and how the concepts are integrated into a coherent expression in a given sentence. There are two inheritance mechanisms: fixed and projective. The first consists of a fixed network of relations, which is traversed to discover existing related and associated concepts (e.g., hyponyms and hypernyms). Projective inheritance operates generatively from the qualia structure of a lexical item to create a relational structure for ad hoc categories; it can deal with what is usually assumed to be commonsense knowledge.

With a fixed inheritance structure, we can identify a sequence Q1, P1, ..., Pn as an inheritance path, which can be read as the conjunction of ordered pairs <xi, yi>. The conclusion space of a set of sequences σ is the set of all pairs <Q, P> such that a sequence Q, ..., P appears in σ. The traditional is-a relation relates the pairs by a generalization operator (ordering the concepts in a lattice), as well as by other relations.

In addition to these fixed relational structures, we can dynamically create arbitrary concepts through the application of certain transformations on lexical meanings. For example, for any predicate Q (the value of a qualia role), we can generate its opposition, ¬Q. A projective transformation, π, on a predicate Q1 generates a predicate Q2 such that π(Q1) = Q2, where Q2 need not itself appear in the conclusion space. The set of transformations includes: ¬ (negation), < (temporal precedence), > (temporal succession), = (temporal equivalence), and act (an operator adding agency to an argument). The space of concepts traversed by the application of such operators will be related expressions in the neighborhood of the original lexical item. A series of applications of transformations, π1, ..., πn, generates a sequence of predicates, Q1, ..., Qn, called the projective expansion of Q1, written P(Q1). The projective conclusion space, P(σR), is the set of projective expansions generated from all elements of the conclusion space σ on role R of predicate Q; that is, P(σR) = { P(Q1), ..., P(Qn) | Q1, ..., Qn ∈ σR }.

For example, we can have the concept of "being confined" with its opposite "not being confined"; these concepts can be related temporally, with an operator arising from the transition event of "escaping". Thus, "the prisoner escaped" shows a closer association between subject and verb than "the prisoner ate": "escaping" falls within the conclusion space for the telic role of prisoner. Generating the projective conclusion space as a graph, we can take those graphs that result in no contradictions to be the legitimate semantic interpretations of the entire sentence.

10.4 Lexical rules

Flickinger describes lexical rules relating lexical entries both for inflection and for derivation. The present sample dictionary does not include the lexical rules, but they can easily be added using typed feature structures (see Copestake and Briscoe). Atkins uses the term LINK-RULE, while Levin talks about meaning extension; both assume that the basic sense and the derived sense exist within the dictionary. Copestake and Briscoe formalize these notions by calling them lexical rules and making them a component of a unification-based lexicon employing (default) inheritance and typed feature structures.
In many cases, there might not be a derived sense in the dictionary; rather, the derivation exists through some sort of coercion during syntactic and semantic interpretation (for example, when a metaphorical interpretation is adopted). Even in these cases, lexical rules characterize what is occurring. Lexical rules can cover a variety of situations: derivational morphological processes, change of syntactic class (conversion), argument structure of the derived predicate, affixation, and metonymic sense extensions. Establishment of lexical rules within the lexicon must also take into account possible blocking (where a lexeme already exists that expresses the sense that would otherwise be derived). Thus, lexical rules should "express sense extension processes, and indeed derivational ones, as fully productive processes which apply to finely specified subsets of the lexicon, defined in terms of both syntactic and semantic properties expressed in the type system underlying the organization of the lexicon."

Copestake and Briscoe present a lexical representation system using typed feature structures, which are necessary to formalize the notions of an ontology, as described earlier. Feature structures must be well-formed with respect to types. Particular features will only be appropriate to specified types and their subtypes. Types are hierarchically ordered. Constraints can be associated with types to allow non-default inheritance. Default inheritance consists of default unification of feature structures ordered by an inheritance hierarchy. The type system constrains both default inheritance and lexical rule application.

The type system defines a partial ordering on the types, thus identifying which types are consistent. Only feature structures whose types have a common subtype can be unified; if two types are unordered in the hierarchy and share no subtype, they are inconsistent. Every consistent set of types has a unique greatest lower bound. Thus, when two feature structures of types a and b are unified, the type of the result will be the greatest lower bound of a and b, which must be unique if it exists. If no greatest lower bound exists, the types are inconsistent and unification fails.

Every type must have a feature structure to provide the constraints necessary to identify the type and establish the range of features that are appropriate for that type. This makes a well-formed feature structure. Constraints are inherited by all subtypes of a type, but a subtype may introduce new features (which will be inherited by all its subtypes). The constraints for a type must be mutually consistent. This inheritance of constraints allows concise definitions of all lexical entries.

A type is a bare atom naming the feature structure (and would be entered as an ontological entry in DIMAP--see DICT4), as the word #artifact in (1). The "artifact" type has a telic feature whose value is a feature structure of type formula. An atomic type would have only a name, but no features. A feature value can either be atomic or be a feature structure, perhaps given just through the mention of the type, as the word formula in (1).

    (1) [artifact TELIC = formula]

    (2) [physobj FORM = physform
                 PHYSICAL-STATE = solid]

To represent the structure in (1), we would have one sense, with no part of speech, and with only an entry in the feature slot, with feature name telic and feature value formula. In (2), the same principles of representation hold. Here, solid is an atomic feature structure and physform is a complex feature structure.
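As a rough illustration of these constraints (a toy Python sketch; the type names ind_obj and substance anticipate (3) and (4) below, and a real system such as Copestake and Briscoe's performs recursive unification and enforces per-type constraints), typed unification can be pictured as finding the greatest lower bound of the two types and merging their features:

    # Toy type hierarchy: each type maps to the set of its (proper) supertypes.
    HIERARCHY = {
        "top": set(),
        "physobj": {"top"},
        "ind_obj": {"physobj", "top"},
        "substance": {"physobj", "top"},
    }

    def subtypes(t):
        """All types at or below t in the hierarchy."""
        return {s for s, supers in HIERARCHY.items() if s == t or t in supers}

    def glb(a, b):
        """Greatest lower bound (most general common subtype), if unique."""
        common = subtypes(a) & subtypes(b)
        maximal = [t for t in common
                   if all(t == o or t in HIERARCHY[o] for o in common)]
        return maximal[0] if len(maximal) == 1 else None

    def unify(fs1, fs2):
        """Unify two flat feature structures of the form (type, {FEATURE: value}).
        Returns None when the types have no common subtype or values clash."""
        t = glb(fs1[0], fs2[0])
        if t is None:
            return None                      # inconsistent types
        feats = dict(fs1[1])
        for f, v in fs2[1].items():
            if f in feats and feats[f] != v:
                return None                  # clashing atomic values
            feats[f] = v
        return (t, feats)

    print(unify(("physobj", {"PHYSICAL-STATE": "solid"}),
                ("ind_obj", {"FORM": "individuated"})))
    # ('ind_obj', {'PHYSICAL-STATE': 'solid', 'FORM': 'individuated'})
    print(unify(("ind_obj", {}), ("substance", {})))
    # None -- ind_obj and substance share no subtype, so unification fails

Feature values are treated as atomic here for brevity; in the full system a value may itself be a feature structure and is unified recursively.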
We can give further structure to physform, as in (3) and (4).

    (3) [ind_obj FORM = [physform SHAPE = individuated]]

    (4) [substance FORM = [physform SHAPE = unindividuated]]

There are several options for representing these structures in DIMAP:

(1) They can be represented directly. That is, there would be entries for #ind_obj and #substance, each with one sense having the feature form, the former with the value "+physform (shape individuated)" and the latter with the value "physform (shape unindividuated)".

(2) The two different values of shape can be entered in (the feature component of) two different senses of an entry for #physform. Then, in the feature component of #ind_obj and #substance, enter the values of form as "physform 1" and "physform 2", respectively. In this schema, we would have to know that the entry at #physform contains the information necessary to expand (or fully specify, to use Flickinger's terminology) the entries for #ind_obj and #substance. This option might have the advantage that the expansion can be accomplished recursively. That is, if the entry #physform has feature values that are also non-atomic, a program developing the fully-specified form for #ind_obj and #substance would automatically continue to move up the hierarchy specified through these links.

(3) Use ISA links. If we merely specify the value of form as "+physform" and then link this sense hierarchically to #physform 1, we could pick up the nesting chain necessary to build the fully-specified entry.

(4) Use role links. Instead of using the feature component, the role component can be used more directly. In this case, form would be the role name, #physform would be the role value, and the particular sense could also be identified.

It seems that options 1 and 2 are the better choices. In selecting an option, it is necessary to ensure that you have a clean structure that will permit unification and other operations with feature structures. These operations will need to deal with several idiosyncrasies of feature structures. For example, in (5) and (6), we would have to recognize scalar as referring to whole integers, gender as being dichotomous with values "M" and "F" (or three-valued), and boolean as having the values "0" or "1" (or "+" or "-").

    (5) [creature AGE = scalar
                  SEX = gender]

    (6) [animal EDIBLE = boolean]

In (7), (8), (9), and (10), the feature structures of the top level of a lexical hierarchy are presented. The relations identified by '<=' would be represented using the SUPERCONCEPT link in DIMAP. (See entries in DICT4 for #lex-sign, #noun, #count-noun, and #mass-noun. In these entries, note particularly the SUPERCONCEPT and FEATURE values and how generic values such as string or boolean can serve as type constraints. RQS stands for "relativized qualia structure"; nomrqs stands for "nominal relativized qualia structure".)

    (7) lex-sign <= top
        [lex-sign ORTH = string]

    (8) noun <= lex-sign
        [noun COUNT = boolean
              RQS = nomrqs]

    (9) count-noun <= noun
        [noun COUNT = +]

    (10) mass-noun <= noun
         [noun COUNT = -]

Thus, a lexical entry for the word haddock (see DICT4) would need to identify links to appropriate places in hierarchies that should be unified, to result in the expanded feature structure shown in (11). In terms of the feature structures noted above, all that we would have to give would be pointers to #count-noun and #animal, in order to obtain the fully-specified form in (11).
    (11) [count-noun ORTH = "haddock"
                     SYNTAX = [COUNT = +]
                     RQS = [animal SEX = gender
                                   AGE = scalar
                                   EDIBLE = boolean
                                   PHYSICAL-STATE = solid
                                   FORM = [physform SHAPE = individuated]]]

A lexical rule is a feature structure of type lexical_rule, specified as follows:

    (12) [lexical_rule 0 = lex-sign
                       1 = lex-sign]

Every lexical rule must have the features 0 and 1, each of which must have a value of type #lex-sign. A new lexical sign is generated by taking a lexical entry which satisfies the specifications in the <1> feature path (that is, unifies with that feature structure) and creating the feature structure in the <0> feature path. For example, a rule for #grinding (making an individuated object into a substance) can be specified as follows:

    (13) grinding <= lexical_rule
         [grinding 1 = [count-noun ORTH = $var0
                                   RQS = ind_obj]
                   0 = [mass-noun ORTH = $var0
                                  RQS = substance]]

Specific contexts (predicational and syntactic) will force the application of the lexical rule (coercion). Moreover, the typed framework of the lexicon allows us to identify those lexical items to which a lexical rule can apply. For example, we can have the more specific 'grinding' lexical rule shown in (14). This rule would be identified in DIMAP as #animal-grinding with a SUPERCONCEPT link to #grinding. The RQS (relativized qualia structure) in (14) is consistent with and more specific than the RQS structures in (13). If we next take the entry for haddock shown in (11), we see that the #animal-grinding rule is applicable to it. To establish the applicability, we unify #animal-grinding with #grinding, with the primary effect being to add the RQS of #ind_obj to that of #animal. (Since these two stand in an ISA relationship, the unification is immediate, except that the more specific entry may have default values that override what may be present in entries higher in the hierarchy.) The entry at (11) then easily unifies to give the result in (15).

    (14) animal-grinding
         [grinding 1 = [RQS = [animal EDIBLE = +]]
                   0 = [RQS = food_substance]]

    (15) [mass-noun ORTH = "haddock"
                    COUNT = -
                    RQS = [food_substance TELIC = [formula PRED = eat]]]

The result of applying a lexical rule to a specific sense is notated sense + rule_name, as in

    lamb_2 < lamb_1 + #animal-grinding

This sense extension can be represented directly in DIMAP, if desired. It can be handled like the representation for past tenses of verbs, using ISA links. In this case, "lamb" (sense 2) would have two ISA links, one to lamb (sense 1) and one to #animal-grinding (which presumably has only one sense). Alternatively, there is no need to mention such links explicitly, but rather to assume that they would be brought into play during processing. If an individuated sense of lamb or haddock did not satisfy the context, the lexical rules associated with those entries could be invoked.

REFERENCES AND BIBLIOGRAPHY

Ahlswede, T. (1985). "A Toolkit for Lexicon Building." Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics, 268-75.

Allen, J. (1987). Natural Language Understanding. Menlo Park, CA: The Benjamin/Cummings Publishing Company, Inc.

Amsler, R. E. (1980). The Structure of the Merriam-Webster Pocket Dictionary, TR-164. Austin, TX: Department of Computer Science, University of Texas.

Atkins, B. T. S. (1991). "Building a Lexicon: The Contribution of Lexicography." International Journal of Lexicography, 4(3), 167-204.

Atkins, B. T. S., J. Kegl, and B. Levin. (1988). "Anatomy of a Verb Entry: from Linguistic Theory to Lexicographic Practice." International Journal of Lexicography, 1(2), 84-126.
Boguraev, B. K. (1991). "Building a Lexicon: The Contribution of Computers." International Journal of Lexicography, 4(3), 227-60.

Bresnan, J. (ed.). (1982). The Mental Representation of Grammatical Relations. Cambridge, MA: MIT Press.

Briscoe, T. and A. Copestake. (1991). "Sense Extensions as Lexical Rules." Proceedings of the IJCAI Workshop on Computational Approaches to Non-Literal Language (also as ESPRIT BRA-3030 ACQUILEX Working Paper No. 22).

Carlson, L. and S. Nirenburg. (1990). World Modeling for NLP, Technical Report CMU-CMT-90-121. Pittsburgh, PA: Carnegie Mellon University, Center for Machine Translation.

Chodorow, M. S. and R. J. Byrd. (1985). "Extracting Semantic Hierarchies from a Large On-Line Dictionary." Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics. Chicago, IL: Association for Computational Linguistics, 299-304.

Copestake, A. A. and E. J. Briscoe. (1991). "Lexical Operations in a Unification Based Framework." Proceedings of the ACL SIGLEX Workshop on Lexical Semantics and Knowledge Representation, 88-101.

Cruse, D. A. (1986). Lexical Semantics. Cambridge: Cambridge University Press.

Evens, M. and R. N. Smith. (1978). "A lexicon for a computer question-answering system." American Journal of Computational Linguistics, Microfiche 81, 1-99.

Evens, M. W. (ed.). (1988). Relational Models of the Lexicon: Representing Knowledge in Semantic Networks. Cambridge: Cambridge University Press.

Fellbaum, C. (1990). "English Verbs as a Semantic Net." International Journal of Lexicography, 3(4), 278-301.

Fikes, R. E. and T. Kehler. (1985). "The Role of Frame-Based Representation in Reasoning." Communications of the ACM, 28(9), 904-20.

Flickinger, D. P. (1987). Lexical Rules in the Hierarchical Lexicon, PhD Dissertation. Stanford, CA: Stanford University.

Flickinger, D., C. Pollard, and T. Wasow. (1985). "Structure-sharing in lexical representation." Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics. Chicago, IL: Association for Computational Linguistics, 262-7.

Frawley, W. (1988). "Relational models and metascience," in Evens, M. W. (ed.), Relational Models of the Lexicon: Representing Knowledge in Semantic Networks. Cambridge: Cambridge University Press, 334-72.

Gross, D. and K. Miller. (1990). "Adjectives in WordNet." International Journal of Lexicography, 3(4).

Holub, A. I. (1990). Compiler Design in C. Englewood Cliffs, NJ: Prentice Hall.

Ilson, R. and I. A. Mel'čuk. (1989). "English BAKE Revisited (BAKE-ing an ECD)." International Journal of Lexicography, 2(4), 326-45.

Jackendoff, R. (1983). Semantics and Cognition. Cambridge, MA: The MIT Press.

Jackendoff, R. (1987). "The Status of Thematic Relations in Linguistic Theory." Linguistic Inquiry, 18, 369-411.

Jackendoff, R. (1990). Semantic Structures. Cambridge, MA: The MIT Press.

Levin, B. (1991a). "Building a Lexicon: The Contribution of Linguistics." International Journal of Lexicography, 4(3), 205-26.

Levin, B. (forthcoming). "Approaches to Lexical Semantic Representation," in D. Walker, A. Zampolli, and N. Calzolari (eds.), Automating the Lexicon, I: Research and Practice in a Multilingual Environment. Oxford: Oxford University Press.

Levin, B. and T. R. Rapoport. (1988). "Lexical Subordination." Proceedings of the 24th Annual Meeting of the Chicago Linguistic Society, Part One: The General Session, 275-289.

Litkowski, K. C. (1978). "Models of the Semantic Structure of Dictionaries." American Journal of Computational Linguistics, Microfiche 81, 25-74.
Mel'čuk, I. (1988). "Semantic Description of Lexical Units in an Explanatory Combinatorial Dictionary: Basic Principles and Heuristic Criteria." International Journal of Lexicography, 1(3), 165-188.

Meyer, I., B. Onyshkevych, and L. Carlson. (1990). Lexicographic Principles and Design for Knowledge-Based Machine Translation, Technical Report CMU-CMT-90-118. Pittsburgh, PA: Carnegie Mellon University, Center for Machine Translation.

Miller, G. A. (1990). "Nouns in WordNet: A Lexical Inheritance System." International Journal of Lexicography, 3(4).

Miller, G. A., R. Beckwith, C. Fellbaum, D. Gross, and K. Miller. (1990). Five Papers on WordNet, CSL Report 43. Princeton, NJ: Cognitive Science Laboratory.

Nirenburg, S. and C. Defrise. (forthcoming). "Practical Computational Linguistics," in R. Johnson and M. Rosner (eds.), Computational Linguistics and Formal Semantics. Cambridge: Cambridge University Press.

Pereira, F. C. N. and S. M. Shieber. (1987). Prolog and Natural-Language Analysis. (CSLI Lecture Notes, No. 10). Stanford, CA: Center for the Study of Language and Information.

Pollard, C. and I. A. Sag. (1987). Information-Based Syntax and Semantics: Volume 1 - Fundamentals. (CSLI Lecture Notes, No. 13). Menlo Park, CA: Center for the Study of Language and Information.

Polovina, S. and J. Heaton. (1992). "An Introduction to Conceptual Graphs." AI Expert, 7(5), 36-43.

Proudian, D. and C. Pollard. (1985). "Parsing Head-Driven Phrase Structure Grammar." Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics, 167-71.

Pustejovsky, J. (1991). "The Generative Lexicon." Computational Linguistics, 17(4), 409-41.

Schank, R. and R. Abelson. (1977). Scripts, Plans, Goals, and Understanding. Hillsdale, NJ: Lawrence Erlbaum Associates.

Schank, R. C. (1972). "Conceptual dependency: A theory of natural language understanding." Cognitive Psychology, 3(4), 552-631.

Schank, R. C. (1973). "Identification of Conceptualizations Underlying Natural Language," in R. C. Schank and K. M. Colby (eds.), Computer Models of Thought and Language. San Francisco, CA: W. H. Freeman.

Sowa, J. F. (1984). Conceptual Structures: Information Processing in Mind and Machine. Menlo Park, CA: Addison-Wesley.

Sowa, J. F. (ed.). (1991). Principles of Semantic Networks: Explorations in the Representation of Knowledge. San Mateo, CA: Morgan Kaufmann.

Velardi, P., M. T. Pazienza, and M. Fasolo. (1991). "How to Encode Semantic Knowledge: A Method for Meaning Representation and Computer-Aided Acquisition." Computational Linguistics, 17(2), 153-70.