Refining the frame element hierarchy: Application of initial steps

Several proposed steps for refining the frame element hierarchy have now been performed. At present, the frame element digraph contains 1011 frame elements, of which 261 are primitives (down from an original 476): 202 primitives are not used as hypernyms, and 59 are used as hypernyms in deriving the remaining 750 frame elements. The current hierarchy is more rigorous, in that each frame element can be traced back to its primitive. This post describes the steps that have been taken thus far.

I describe the application of each change proposed in an earlier post, with the approximate number of affected entries in the frame element dictionary. While each change was performed for all cases to which it applied, some changes gave rise to others of a type that had already been completed. So, the number of cases for each type of change is only approximate.

  1. Removal of multiple senses: This affected entries that had two senses arising from different capitalization in the frame element names, such as Affected_Party and Affected_party. In these cases, the two senses were combined, any entries using the alternate capitalizations as hypernyms were made consistent with a single spelling, and the occurrence counts of the two senses in frames were added together.
  2. Misspellings and editorial variation: These changes were applied to misspellings (depictive and depicitive) and editorial variation (Duration_of_end_state and Duration_of_endstate, Entity_1 and Entity1). For cases involving editorial variation, a choice was made as to the “more correct” underlying form. In addition, four frame elements were deleted (Re-encoding, Sub-region, State-of-affairs, and Hot/Cold_source) after combining them with more standard forms (Re_encoding, Sub_region, State_of_affairs, and Hot_cold_source); these were deleted because the program used to create the digraph image was splitting the hyphenated forms into two nodes.
  3. Frame elements with a number: Although 48 frame elements had been identified with a number “1” or “2” in their names (e.g., Entity_1 and Entity_2), changes were made to only 15, using a base form without a number. In the remaining cases, the entries already contained a hypernym link, so these were not changed.
  4. Plural forms: There are 51 frame elements in a plural form (ending in “s” or “a”). Of these, 21 did not have a singular form (e.g., Tools). Another 19 had a singular form, but already had a hypernym induced from the frame-to-frame relations (e.g., Members was already linked to Individuals). Only 10 frame elements were linked to a singular frame element (e.g., Recipients linked to Recipient).
  5. Underscore hypernyms: Approximately 500 frame elements have underscores in their names (e.g., Dangerous_entity and Location_of_protagonist). These frame elements can be interpreted as hypernymic (e.g., Dangerous_entity is a kind of Entity and Location_of_protagonist is a kind of Location). Approximately 150 distinct potential hypernyms were identified and each was examined. Two criteria were used for the addition of a hypernym link: (a) the frame element had no existing hypernym and (b) the putative hypernym is a frame element. For example, of 18 frame elements ending in _entity, Entity was added as a hypernym link in 14. In all, 167 frame elements were given 82 hypernym links.
  6. Circularity elimination: In the first digraph generated from the frame element dictionary, there were 6 strong components affecting 13 frame elements. Because of the above changes, an additional 4 strong components emerged. In a separate post, I describe how each of the circularities was broken.
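
Steps 5 and 6 above can be illustrated with a minimal Python sketch. It is not the program used for this work: the function names, the toy data, and the choice to try the last token of a name before the first are all assumptions made for illustration. The first function applies only the two stated criteria, (a) no existing hypernym and (b) the candidate is itself a frame element; the second finds strong components with Tarjan's algorithm, so that any component with more than one member signals a circularity. In the actual work, each candidate hypernym was examined by hand.

```python
def propose_hypernym(name, elements, hypernyms):
    """Propose a hypernym for an underscore name, or return None.

    elements:  set of all frame element names (toy data below)
    hypernyms: dict mapping a frame element to its existing hypernym
    """
    if "_" not in name or name in hypernyms:    # criterion (a)
        return None
    parts = name.split("_")
    # Assumed heuristic: try the last token, then the first.
    for token in (parts[-1], parts[0]):
        candidate = token.capitalize()
        if candidate in elements:                # criterion (b)
            return candidate
    return None


def strong_components(graph):
    """Tarjan's algorithm; graph maps a node to a list of successors."""
    index, low = {}, {}
    stack, on_stack, result = [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:                   # v is the root of a component
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            result.append(component)

    for v in list(graph):
        if v not in index:
            visit(v)
    return result


# Hypothetical toy dictionary, using names mentioned in this post.
elements = {"Entity", "Location", "Dangerous_entity",
            "Location_of_protagonist", "Members", "Individuals"}
hypernyms = {"Members": "Individuals"}

proposed = {}
for name in sorted(elements):
    candidate = propose_hypernym(name, elements, hypernyms)
    if candidate is not None:
        proposed[name] = candidate

# Add an artificial back-link to show what a circularity looks like.
links = {**hypernyms, **proposed, "Individuals": "Members"}
graph = {child: [parent] for child, parent in links.items()}
circular = sorted(sorted(c) for c in strong_components(graph) if len(c) > 1)
```

With this toy data, `proposed` links Dangerous_entity to Entity and Location_of_protagonist to Location, and `circular` contains the single component made up of Individuals and Members. The token heuristic is crude (it would pick Protagonist over Location if Protagonist were a frame element), which is one reason manual examination of each candidate remains necessary.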

As indicated above, applying these changes has produced the current digraph of 1011 frame elements, of which 261 are primitives (down from an original 476): 202 primitives are not used as hypernyms, and 59 are used as hypernyms in deriving the remaining 750 frame elements.
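
The relationship among these counts (202 + 59 = 261 primitives, and 261 + 750 = 1011 frame elements) follows from how primitives are defined. The sketch below shows one way such a summary could be computed and how a frame element could be traced back to its primitive. It assumes, purely for illustration, that each frame element has at most one hypernym stored in a plain dictionary; the function names and toy data are hypothetical and not taken from the frame element dictionary itself.

```python
def summarize(elements, hypernyms):
    """Count primitives and derived elements.

    A primitive is a frame element with no hypernym link.
    """
    derived = {e for e in elements if e in hypernyms}
    primitives = elements - derived
    used = primitives & set(hypernyms.values())
    return {
        "elements": len(elements),
        "primitives": len(primitives),
        "primitives_used_as_hypernyms": len(used),
        "primitives_not_used": len(primitives - used),
        "derived": len(derived),
    }


def trace_to_primitive(name, hypernyms):
    """Follow hypernym links upward; return the path ending at a primitive."""
    path = [name]
    seen = {name}
    while path[-1] in hypernyms:
        parent = hypernyms[path[-1]]
        if parent in seen:                  # guard against a leftover cycle
            raise ValueError(f"circular hypernym chain at {parent}")
        path.append(parent)
        seen.add(parent)
    return path


# Hypothetical toy data, using names mentioned in this post.
elements = {"Entity", "Location", "Tools", "Recipient", "Recipients",
            "Dangerous_entity", "Location_of_protagonist"}
hypernyms = {"Dangerous_entity": "Entity",
             "Location_of_protagonist": "Location",
             "Recipients": "Recipient"}

summary = summarize(elements, hypernyms)
path = trace_to_primitive("Dangerous_entity", hypernyms)
```

In this toy example there are 4 primitives (3 used as hypernyms, 1 not) and 3 derived frame elements, for 7 in total, mirroring the structure of the counts reported above.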

When viewing the current frame element hierarchy, many of the links induced by the frame-to-frame relations seem awkward. The next steps will focus on reducing the 261 primitives to a much smaller set, on the order of 20 core frame elements. This will be done by a closer examination of the frame element definitions, as they are given in the characterization of each frame in which they appear. This effort will be guided by the semantic relation inventory developed by O’Hara and Wiebe (2009).

