(Under development - latest 9/7/00)
Output from DIMAP's parsing functions (Search|Parse Definitions and Resources|Parse Text) is presented as a bracketed parse tree in list format (with balanced parentheses), with constituent phrases indented below each non-terminal node. The terminal nodes identify the part of speech and the root word.
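As a rough illustration of the list format described above, the following Python sketch reads a balanced-parenthesis parse string into nested tuples and prints each constituent indented below its parent. The sample string and the function names (`tokenize`, `read_tree`, `show`) are hypothetical. They are not taken from DIMAP, and its actual output may differ in spacing and detail. The sketch also assumes that words never contain parentheses.

```python
def tokenize(text):
    """Split a bracketed parse string into '(', ')' and bare-word tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def read_tree(tokens):
    """Consume the tokens of one '(label child ...)' group; return (label, children)."""
    if tokens.pop(0) != "(":
        raise ValueError("expected '('")
    label = tokens.pop(0)
    children = []
    while tokens and tokens[0] != ")":
        if tokens[0] == "(":
            children.append(read_tree(tokens))  # nested constituent
        else:
            children.append(tokens.pop(0))      # the word in a terminal node
    if not tokens:
        raise ValueError("unbalanced parentheses")
    tokens.pop(0)  # discard the closing ')'
    return (label, children)


def show(node, depth=0):
    """Return indented lines, one per node, with constituents below their parent."""
    label, children = node
    words = [c for c in children if isinstance(c, str)]
    lines = ["  " * depth + " ".join([label] + words)]
    for child in children:
        if not isinstance(child, str):
            lines.extend(show(child, depth + 1))
    return lines


# Hypothetical sample in the general shape described above; not real DIMAP output.
sample = "(SEN (PHP (PH (SUBJ (NP (det the) (noun dog))) (VP (verb bark)))) (epunct .))"
tree = read_tree(tokenize(sample))
print("\n".join(show(tree)))
```

In this representation a terminal node is a tuple whose children are plain strings, so the part of speech is the label and the root word is the single child.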
The grammar is defined in the file "st" and can be modified by adding rules that fit the style shown there (a starting node, a condition to be satisfied, and a terminal node). There are many intermediate nodes shown in the set of grammar rules, but only the following will appear as non-terminals in the final parse output: ADV, ADVS, AP, AS, AUC, CONJ, ENBR, HUN, NBR, NP, NPN, NPP, NPPM, NPPX, NPX, OCON, PART, PAUX, PH, PHN, PHP, PHX, PN, PRP, PS, PSV, PU, PX, PY, QUOTE, QWHICH, SBC, SCON, SEN, STUB, SUBC, SUBJ, TMP, TODO, THAT, VGER, VINF, VP, VPAP, VTO, WHAT, WHICH, WHO, WTHAT, WTODO
| Constituent Name | Description |
| ADV | An adverb (adv) or interjection (interj) |
| ADVS | One or more adverbs (adv) or interjections (interj) in an (SBC) |
| AP | A predicative adjective phrase following a linking verb, a complex transitive verb + noun + adjective, or a complex transitive verb + noun + "as" + noun or adjective, consisting of possibly leading adverbs and adjectives that can be used predicatively or a verb present or past participle that can be used as an adjective. |
| AS | An "as" phrase following a complex transitive verb + noun + "as" + noun or adjective, consisting of "as" followed by AP or NP |
| AUC | An auxiliary verb phrase, consisting of an auxiliary verb (v-aux), a (v-be), a (v-have), a (v-do), and possibly including the literals "have", "be", "been", "being", "to" |
| CONJ | A conjunctive phrase, which may consist of any other type of phrase |
| ENBR | A number phrase, consisting of digits (NUMBER), perhaps including a comma (for thousands) or a decimal point |
| HUN | A phrase consisting of number words (tens, tys, and teens), perhaps including the literal "hundred", but not including hyphenated numbers like "eighty-five", which are handled by a hyphenation routine |
| NBR | A number phrase, which may consist of ENBR phrases, HUN phrases, and the literals "billion", "million", "thousand", and "hundred" |
| NP | A noun phrase, which may come in a large number of varieties: (1) a determiner (det) or determiner and number (NBR) followed by NPA, (2) a possessive pronoun (prpos) followed by NP3, (3) "all" and a possessive pronoun followed by NPA, (4) a number (NBR) and a MORE word followed by NPA, (5) "the", an ordinal (nbrth), and a number (NBR) followed by NPA, (6) a QUOTE, (7) a question word (qmod) followed by a determiner and a noun phrase, (8) a pronoun that takes a following "of" (pron-of), the "of", and a noun phrase, (9) a pronoun (pron) followed by a leading noun phrase (NPN), (10) a gerundial (VGER), (11) "too", an adjective (adj), and "a" followed by a noun phrase, (12) a noun phrase beginning with a number or a dollar sign and a number followed by a noun phrase, and (13) any other noun phrase |
| NPN | Leading noun phrases in a larger noun phrase (NP), such as conjoined nouns or noun genitive determiner phrases |
| NPP | Leading "noun phrases" in a larger noun phrase (NP), such as nouns, abbreviations, intransitive present-participle verbs, transitive past-participle verbs, and letters |
| NPPM | A noun phrase post modifier, which may consist of appositives (NPPX), a parenthesized expression, or a reflexive pronoun (refl) |
| NPPX | An appositive, which may be a noun phrase (NP), a what clause (WHAT), a "to" clause (TODO), a prepositional phrase (PRP), a WTHAT clause (WTHAT), a VPAP clause (VPAP), a VGER phrase (VGER), or a NAMELY word followed by a noun phrase (NP) |
| NPX | Clausal noun phrases (beginning with "to do", "what", verb gerundials, and "that") |
| OCON | A coordinating conjunction "phrase", consisting of an optional comma and a coordinating conjunction (oconj) |
| PART | A particle phrase, consisting only of a particle (part) |
| PAUX | An interruption, such as a string of adverbs, a reduced subordinate clause (SBC, including "as" and "than" clauses), or a regular subordinating clause (SUBC), following an auxiliary phrase (AUC) |
| PH | The main sentence phrase |
| PHN | Phrases after "as", "than", the complementizer of a relative clause (THAT, WHAT, WHO, WHICH), or a subordinating conjunction introducing full subordinate clauses (fsuc); these may begin with an optional SBC, ADV, or PRP and consist of a noun phrase and a verb clause (PHX) |
| PHP | Full sentence phrases, not including the final punctuation, which may include a leading subordinate clause (SUBC), adverbial phrase (ADV), or prepositional phrase (PRP), followed by the sentence phrase itself (PH) |
| PHX | Continuing a phrase after an opening complementizer (see PHN) consisting of a noun phrase (NP) followed by PS |
| PN | An interruption, such as a string of adverbs, a reduced subordinate clause (SBC, including "as" and "than" clauses), or a regular subordinating clause (SUBC), following another interruption (PU) |
| PRP | A prepositional phrase, which may begin with a FROM word (for double prepositions) and may continue in several ways (including phrases without a leading preposition), including: (1) "a", a unit of time measurement (meatime) word, and "ago", (2) a date phrase (TMP) without a leading preposition, (3) a WHEN word, (4) a weekday word possibly followed by "night", "afternoon", "morning", or "evening", (5) "on" followed by a date phrase (TMP) or phrase type 4, (6) "in" followed by a year (NUMBER) or a month with a year (NUMBER), (7) a word that may introduce relative time phrases (reltime) followed by phrase type 4, a month, or a unit of time measurement (meatime) word, (8) an ALL word (all, each), with a word representing time periods (tper), "the" and "time" or a NUMBER, or "the" and a word representing time periods (tper), (9) a fillchk(prep), and finally, (10) a preposition followed by a noun phrase (NP), a gerundial (VGER), or a (WHAT) clause. |
| PS | Auxiliary continuations to WHAT, TODO, and THAT clauses |
| PSV | The general non-terminal for what follows a verb in a sentence, to cover cases where the non-terminals generated by the verb's subcategorization patterns are not satisfied, particularly covering verb conjunctions |
| PU | An interruption, such as a string of adverbs, a reduced subordinate clause (SBC, including "as" and "than" clauses), or a regular subordinating clause (SUBC), following a noun phrase (NP), particularly question noun phrases |
| PX | A phrase after the "to" in a TODO clause, consisting of a v-have or v-be followed by PU, an ADV followed recursively by PX, or a VP |
| PY | A gerund post modifier (an auxiliary, v-be, followed by PU, or a VP) |
| QUOTE | A quoted phrase, beginning with a backquote and terminated by a single quote |
| QWHICH | A question determiner (qwhich) element followed by a noun phrase (NP) |
| SBC | A reduced subordinate clause, typically a sentence modifier, not containing a finite verb |
| SCON | A conjunction phrase, consisting of (1) a comma and a conjunction, (2) a conjunction by itself, (3) a comma, the literals "that" and "is", and a comma, (4) the literals "as", "well", and "as", and (5) the literals "or" and "else" |
| SEN | The start state for the grammar, the main non-terminal, consisting of a full sentence (PHP) followed by an ending punctuation (epunct), a colon followed by a noun phrase (NP), or a dash. |
| STUB | A non-terminal, not included in the grammar, but rather indicating that its constituents could not be characterized as a non-terminal in the grammar. Its constituents are interpreted as well as possible by the grammar. This is the mechanism for ensuring that the parser is robust. |
| SUBC | A subordinate clause, consisting of subordinating conjunctions followed by specific other phrase types (fsuc and PHN, isuc and VINF, psuc and VPAP, gsuc and PY, tsuc and TODO, tsuc2 and an NP and a TODO, or "but" and PRP) |
| SUBJ | A noun phrase (NP) or noun-equivalent that has been relabeled as the subject of a sentence or clause |
| TMP | A date phrase consisting of a month, a numerical day, a comma, and a year |
| TODO | An infinitive verb, usually consisting of "to" and a verb in the infinitive, possibly preceded by an adverb (ADV) |
| THAT | A "that" clause, consisting of "that" and a relative clause (PHN), including cases without the "that" |
| VGER | A gerundial phrase, consisting of possibly leading adverbs (ADV) and a verb clause beginning with a present-participle verb (PX) |
| VINF | A verb phrase (VP) with the verb in the infinitive |
| VP | A verb followed by a conjunction (SCON) and the usual verb complements (PSV) |
| VPAP | A verb phrase (VP) in which the verb is a past participle |
| VTO | A prepositional phrase following a double transitive verb consisting of "to" followed by a noun phrase (NP) |
| WHAT | A who phrase (WHO) followed by a complementizer (PHN) or by "whether" or "if" followed by a TODO clause |
| WHICH | A restrictive relative pronoun (QTHAT) or "which" determiner noun phrase (QWHICH) followed by a (PHN) |
| WHO | A "who" noun phrase, beginning with a question-how (qhow), a preposition (prep), the word "how", or another question word (qwho, qwhich) |
| WTHAT | A "that" appositive clause (or a 'that'-less clause) which may or may not have an overt subject, consisting of an optional "that" and a complementizer (PHN) or a noun phrase (NP) followed by auxiliary continuations (PS) |
| WTODO | An infinitive clause (TODO) as a post-modifier to a noun, a number and a percent sign or a MEASIZE word, or a bare number |
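To make the surface shapes of two of the simpler constituents concrete, here is a small Python sketch that approximates the ENBR and TMP descriptions in the table with regular expressions. This is only an illustration: the grammar itself matches dictionary parts of speech and literals rather than character patterns. The names `ENBR_RE`, `TMP_RE`, and `MONTHS` are hypothetical, and the list of full English month names is an assumption about what the `month` class contains.

```python
import re

# ENBR: digits, optionally grouped by commas for thousands, optionally with a
# decimal point. This is an approximation of the description in the table.
ENBR_RE = re.compile(r"\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?|\.\d+")

# Assumed contents of the "month" word class (full English month names only).
MONTHS = ("January February March April May June July August "
          "September October November December").split()

# TMP: a month, a numerical day, a comma, and a year.
TMP_RE = re.compile(r"(?:%s) \d{1,2}, \d{4}" % "|".join(MONTHS))


def looks_like_enbr(text):
    """True if the whole string has the shape of an ENBR number phrase."""
    return ENBR_RE.fullmatch(text) is not None


def looks_like_tmp(text):
    """True if the whole string has the shape of a TMP date phrase."""
    return TMP_RE.fullmatch(text) is not None


print(looks_like_enbr("1,250"), looks_like_tmp("September 7, 2000"))
```

A string such as "7 September 2000" is rejected by `looks_like_tmp` because the table describes TMP as month first, then day, comma, and year.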
These are parts of speech used in the parsing dictionary, frequently identifying a closed class of words. In some cases, literals matched as part of a grammar rule appear in the part of speech position in a leaf node. The following are the parts of speech currently used: abbr, adj, adv, clet, conj, det, epunct, FROM, fsuc, gsuc, interj, isuc, MANY, MEASIZE, meatime, month, MORE, NAMELY, nbrth, noun, NUMBER, oconj, part, prep, pron, pron-of, proper, prpos, psuc, qhow, qmod, QTHAT, qwhich, qwho, refl, reltime, SIZE, teens, tens, tper, tsuc, tsuc2, tys, v-aux, v-be, v-do, v-have, verb, weekday.
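When post-processing parse output, it can help to tell constituent names apart from parts of speech. The Python sketch below copies the two lists given on this page into sets and classifies a node label accordingly. The function name `classify` and the three result strings are hypothetical. The sketch assumes labels are case-sensitive exactly as listed, which matters because pairs such as ADV/adv, CONJ/conj, and PART/part differ only in case.

```python
# Non-terminals that can appear in the final parse output (copied from this page).
NONTERMINALS = set("""
ADV ADVS AP AS AUC CONJ ENBR HUN NBR NP NPN NPP NPPM NPPX NPX OCON PART PAUX
PH PHN PHP PHX PN PRP PS PSV PU PX PY QUOTE QWHICH SBC SCON SEN STUB SUBC
SUBJ TMP TODO THAT VGER VINF VP VPAP VTO WHAT WHICH WHO WTHAT WTODO
""".split())

# Parts of speech used in the parsing dictionary (copied from this page).
PARTS_OF_SPEECH = set("""
abbr adj adv clet conj det epunct FROM fsuc gsuc interj isuc MANY MEASIZE
meatime month MORE NAMELY nbrth noun NUMBER oconj part prep pron pron-of
proper prpos psuc qhow qmod QTHAT qwhich qwho refl reltime SIZE teens tens
tper tsuc tsuc2 tys v-aux v-be v-do v-have verb weekday
""".split())


def classify(label):
    """Classify a node label; lookup is case-sensitive (an assumption)."""
    if label in NONTERMINALS:
        return "constituent"
    if label in PARTS_OF_SPEECH:
        return "part of speech"
    # Literals matched by a grammar rule can also occupy the part-of-speech slot.
    return "literal or unknown"


for label in ("ADV", "adv", "to"):
    print(label, "->", classify(label))
```

Anything found in neither set falls into the last category, which covers literals matched by a grammar rule as well as labels this sketch does not know about.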