In a recent posting to CORPORA on the topic of semantic primitives, John Sowa says,
The so-called primitives are the result of analysis by adults who have learned how to write dissertations about language. I believe there are no primitives that are truly primitive in the sense that they cannot be analyzed in different ways by different adults with different biases.
While I won’t argue with John, I do believe such statements can have a discouraging effect on useful research. Throughout the 1970s and 1980s, research on machine-readable dictionaries (MRDs) was quite the rage. However, in 1991, Jean Veronis and Nancy Ide wrote a paper, “An Assessment of Semantic Information Automatically Extracted from Machine Readable Dictionaries.” They concluded that 55 to 70 percent of the data was garbled in some way. This paper had a similarly discouraging effect on MRD research. I have been engaged in MRD research for 40 years and would like to suggest that the search for primitives is not without value.
I recently developed an overview of the tasks in SemEval (the series of semantic evaluations conducted under the auspices of the ACL SIGLEX). The nice thing about this exercise was that it put semantic analysis into a larger perspective, where it becomes clearer what is lacking. The overview groups the tasks into dictionary issues and issues involving how sentence and textual elements fit together, the fruits of which are then available for application areas. After the first Senseval (the precursor to SemEval) was conducted, with a focus on word-sense disambiguation (WSD), the question was raised as to what purpose WSD served. The same question can be asked about all the other tasks. Attempting to answer this question may help not only to identify further tasks needed in SemEval, but also to show how the various pieces of information may be used in different application areas. In what follows, I offer some opinions, particularly trying to identify other research that is relevant to the SemEval tasks.
The Number Sense: How the Mind Creates Mathematics (1999) and Reading in the Brain (2009), by Stanislas Dehaene, provide insights that can aid in the construction of computational lexicons. Dehaene describes how both reading and mathematics recruit structures of the brain that evolved for other purposes (the neuronal recycling hypothesis). There is a visual recognition process that progressively extracts graphemes, syllables, prefixes, suffixes, word roots, and numbers. After this process, two routes in parallel activate speech creation and look-up in a mental lexicon. For both reading and mathematics, the processes are different from the computational processes implemented in computers (e.g., mathematical algorithms and parsing). Rather than attempting to optimize computational mechanisms for such processes, we can take a slightly different route by following the steps used by the brain to perform these tasks, i.e., accessing fragments of meaning in the mental lexicon.
I have been involved in the development of a frame element hierarchy or taxonomy, based on FrameNet’s frame-to-frame relations and frame element definitions. Since I know that this taxonomy is not perfect and can be improved, I need to consider the types of operations that might be involved in making changes. Although this may seem a trivial task, a substantial amount of rigor needs to be maintained. Many other systems (particularly ontologies) also involve some sort of hierarchical relationships, principally the ISA relationship. The operations I consider will embrace these as well.
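To make the idea of change operations concrete, here is a minimal sketch in Python of a single-parent ISA hierarchy with three operations: adding a node, moving a node under a new parent, and removing a node while reattaching its children. Everything in it is an assumption made for illustration: the class name Taxonomy, the method names, and the labels "Entity", "Agent", and "Speaker" are invented and are not taken from FrameNet or from the taxonomy described above. The point is only to show where rigor comes in, e.g., a move must be rejected if it would make a node its own ancestor.

```python
class Taxonomy:
    """Toy ISA hierarchy in which each node has at most one parent."""

    def __init__(self):
        # Maps each node to its parent; a root maps to None.
        self.parent = {}

    def add(self, node, parent=None):
        """Add a new node, optionally under an existing parent."""
        if node in self.parent:
            raise ValueError(f"{node!r} already exists")
        if parent is not None and parent not in self.parent:
            raise KeyError(f"unknown parent {parent!r}")
        self.parent[node] = parent

    def ancestors(self, node):
        """Return the chain of ancestors, nearest first."""
        chain = []
        current = self.parent[node]
        while current is not None:
            chain.append(current)
            current = self.parent[current]
        return chain

    def move(self, node, new_parent):
        """Re-attach node under new_parent, refusing to create a cycle."""
        if node not in self.parent:
            raise KeyError(f"unknown node {node!r}")
        if new_parent is not None:
            if new_parent not in self.parent:
                raise KeyError(f"unknown parent {new_parent!r}")
            if new_parent == node or node in self.ancestors(new_parent):
                raise ValueError("move would create a cycle")
        self.parent[node] = new_parent

    def remove(self, node):
        """Delete node and attach its children to its former parent."""
        grandparent = self.parent.pop(node)
        for child, p in self.parent.items():
            if p == node:
                self.parent[child] = grandparent


# Hypothetical labels, chosen only for illustration.
tax = Taxonomy()
tax.add("Entity")
tax.add("Agent", "Entity")
tax.add("Speaker", "Agent")
print(tax.ancestors("Speaker"))
```

Even in this toy version, each operation has to check preconditions (the node exists, the parent exists, no cycle results) before touching the structure. A hierarchy that allows multiple parents, as many ontologies do, would need a different representation and additional checks.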