Analyzing the frame element digraph: Initial steps

The main objectives of analyzing the frame element digraph, whose derivation was described in a previous post, are to identify the primitive frame elements and to show the derivational hierarchy of each of the other frame elements. There are 1015 frame elements in the frame element dictionary. The hypernym relationships among them yield a digraph of 1028 nodes with 476 primitives. The primitives fall into three classes:

  • 400 frame elements with no hypernyms and not used as the hypernym for any other frame element
  • 70 frame elements with no hypernyms and that are used as the primitives in defining other frame elements
  • 6 strong components consisting of 2 or 3 frame elements (constituting circular definitional paths) and that are used in defining other frame elements

The 76 base nodes in the digraph are used in defining 552 other frame elements.
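The three classes can be illustrated with a small sketch. This is not the DIMAP implementation. It assumes the dictionary is available as a plain mapping from each frame element to its list of hypernyms, and the names `strong_components`, `classify_primitives` and the toy data are hypothetical. Strong components are found with Kosaraju's algorithm, and a component counts as primitive when none of its members has a hypernym outside the component.

```python
from collections import defaultdict


def strong_components(nodes, hypernyms):
    """Kosaraju's algorithm, written iteratively to avoid recursion limits."""
    reverse = defaultdict(set)
    for child, parents in hypernyms.items():
        for parent in parents:
            reverse[parent].add(child)

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(sorted(hypernyms.get(start, ()))))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(sorted(hypernyms.get(nxt, ())))))
                    break
            else:
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in reverse completion order.
    components, assigned = [], set()
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, todo = set(), [start]
        while todo:
            node = todo.pop()
            component.add(node)
            for nxt in reverse[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    todo.append(nxt)
        components.append(component)
    return components


def classify_primitives(hypernyms):
    """Split primitives into isolated nodes, root nodes and circular components."""
    nodes = set(hypernyms)
    for parents in hypernyms.values():
        nodes.update(parents)
    used_as_hypernym = {p for parents in hypernyms.values() for p in parents}

    isolated, roots, cycles = set(), set(), []
    for component in strong_components(sorted(nodes), hypernyms):
        outside = {p for n in component for p in hypernyms.get(n, ())} - component
        if outside:
            continue  # defined in terms of something else, so not primitive
        if len(component) > 1:
            cycles.append(component)
        else:
            (node,) = component
            (roots if node in used_as_hypernym else isolated).add(node)
    return isolated, roots, cycles


# Toy data only; these links are invented for illustration.
toy = {
    "Agent": [],
    "Helper": ["Agent"],
    "Purpose": ["Reason"],
    "Reason": ["Purpose"],
    "Goal": ["Purpose"],
    "Depictive": [],
}
isolated, roots, cycles = classify_primitives(toy)
print(sorted(isolated), sorted(roots), [sorted(c) for c in cycles])
```

On the toy data, Depictive is isolated, Agent is a root primitive, and Purpose and Reason form a circular component, while Helper and Goal are derived elements.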

As can be seen, there appears to be a basic inconsistency between the number of frame elements and the number of nodes in the digraph. This inconsistency results not from a faulty algorithm for creating the digraph, but from various problems in the underlying data. A large number of problems arise from simple editorial inconsistency. The most notable one affecting this digraph analysis is inconsistent capitalization of frame element names in the frame-to-frame relations from which the dictionary was developed. For example, the frame-to-frame relations use both Affected_Party and Affected_party. This gives rise to two senses in the frame element dictionary for 15 entries, when there should be only one sense for each frame element.

Beyond this, there are several other editorial differences that affect the total number of frame elements as well as the paths between them, including misspellings (depictive and depicitive) and editorial variation (Duration_of_end_state and Duration_of_endstate, Entity_1 and Entity1).
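Capitalization and underscore variants of this kind can be surfaced mechanically. The following is a minimal sketch, assuming only a flat list of frame element names. The function `variant_groups` and its normalization rule (lower-case the name and drop underscores) are illustrative choices, not part of DIMAP or FrameNet.

```python
from collections import defaultdict


def variant_groups(names):
    """Group names that differ only in letter case or underscore placement."""
    groups = defaultdict(set)
    for name in names:
        key = name.replace("_", "").lower()  # normalization is an assumption
        groups[key].add(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


names = [
    "Affected_Party", "Affected_party",
    "Duration_of_end_state", "Duration_of_endstate",
    "Entity_1", "Entity1",
    "Agent",
]
for group in variant_groups(names):
    print(group)
```

Misspellings such as depicitive are not caught by this normalization. Finding them would need approximate string matching (Python's standard `difflib` module is one option), which this sketch leaves out.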

Once the editorial variations have been considered, the substantive work of analyzing and modifying the digraph begins. Essentially, this consists in making changes to the hypernymic link within each frame element. It is important to observe that such a change is a local decision, as opposed to a global design change. A local change nevertheless reverberates immediately throughout the full digraph. Rerunning the digraph analysis within the DIMAP frame element dictionary takes only a few seconds. Thus, the primary task is to lay out clearly the steps and rationales for making changes to the hypernymic links.

In making changes to the frame element dictionary, a primary rule of thumb is that the list of frame elements should not change. When I originally developed the digraph analysis technique for analyzing defining paths in a dictionary, I conceived it as a mechanism that could be used by lexicographers to make changes in the dictionary. This requires that a change be made to the underlying dictionary or dictionary source rather than to the derived data used for the digraph analysis. For example, the problem with the entry depicitive should not be solved by deleting this entry, but rather by creating a hypernymic link from it to depictive. When the underlying data from the FrameNet frame-to-frame relations is corrected, a new frame element dictionary will not contain the misspelled frame element. That is, the digraph analysis will highlight where changes to the underlying data are needed.
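The depicitive case can be sketched as follows. This assumes the same kind of name-to-hypernyms mapping as a stand-in for the dictionary. The function `link_variant` is hypothetical, and the internal assertion simply encodes the rule of thumb that the list of frame elements stays the same.

```python
def link_variant(hypernyms, variant, base):
    """Point `variant` at `base` instead of deleting the variant entry."""
    if variant not in hypernyms or base not in hypernyms:
        raise KeyError("both entries must already exist in the dictionary")
    before = set(hypernyms)
    hypernyms[variant] = sorted(set(hypernyms[variant]) | {base})
    assert set(hypernyms) == before  # the list of frame elements is unchanged
    return hypernyms


# Toy dictionary: both spellings start out as unlinked entries.
entries = {"depictive": [], "depicitive": []}
link_variant(entries, "depicitive", "depictive")
print(entries)
```

After the call, both entries are still present, and the misspelled form is reachable from the digraph through its link to depictive.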

As indicated, every hypernymic link change made to the frame element dictionary needs to be clearly documented, so that it can be re-applied when the underlying data changes and so that others can assess its validity. Some initial ideas for changes are the following:

  • Analyze the six circular strong components to eliminate them (e.g., Purpose and Reason need to be separated into two nodes to eliminate the circularity).
  • For entries with editorial variations, create appropriate links that tie these frame elements to a base form.
  • There are 48 frame elements with a “1” or “2” in their names. Most of these have a corresponding frame element without a number; the ones with numbers can be linked to those without the number. E.g., Entity_1 and Entity_2 can be linked to Entity.
  • Many frame elements are plurals (e.g., Entities) and there is another frame element (e.g., Entity) in a singular form. Create a link from the plural form to the singular form (with an implicit “singular_of” relation from the plural to the singular).
  • Examine the definitions of the frame elements within each of their frames (e.g., there are 125 frame element definitions for Agent), particularly for the 400 frame elements that currently have no hypernymic links. This analysis can include a simple examination of the frame element names (e.g., Focal_entity can be considered in relation to Entity).
  • Examine the mapping developed by O’Hara & Wiebe (2009) to create links between frame elements.
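The numbered-name and plural ideas in the list above lend themselves to rule-based proposals that carry their own rationale. The sketch below is only an illustration under several assumptions: `LinkProposal` and `propose_links` are hypothetical names, only a trailing “1” or “2” is handled, and the plural rules are naive English suffix rules. Each proposal would still need to be reviewed before it is applied to the dictionary.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkProposal:
    source: str     # frame element that would receive the new hypernymic link
    target: str     # proposed base form
    rationale: str  # recorded so the change can be assessed and re-applied


def propose_links(names):
    """Suggest links from numbered or plural names to an existing base form."""
    known = set(names)
    proposals = []
    for name in sorted(known):
        # Numbered forms such as Entity_1 or Entity1 (trailing digit only).
        match = re.fullmatch(r"(.+?)_?[12]", name)
        if match and match.group(1) in known:
            proposals.append(LinkProposal(name, match.group(1), "numbered variant"))
            continue
        # Plural forms, using two naive suffix rules.
        for suffix, replacement in (("ies", "y"), ("s", "")):
            if name.endswith(suffix):
                singular = name[: -len(suffix)] + replacement
                if singular in known:
                    proposals.append(LinkProposal(name, singular, "plural form"))
                    break
    return proposals


names = ["Entity", "Entity_1", "Entity_2", "Entities", "Agent", "Focal_entity"]
for proposal in propose_links(names):
    print(proposal.source, "->", proposal.target, f"({proposal.rationale})")
```

Names such as Focal_entity are deliberately left alone here, since relating them to Entity is a judgment that depends on the frame element definitions rather than on the spelling.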

The implementation of these analysis steps will be described in future posts, along with their effect on the resulting digraph. It is likely that further steps will emerge. If you have any suggestions, I look forward to considering them.
