Lexicologic Insights from Cognitive Neuroscience

The Number Sense: How the Mind Creates Mathematics (1999) and Reading in the Brain (2009), both by Stanislas Dehaene, provide insights that can aid in the construction of computational lexicons. Dehaene describes how both reading and mathematics recruit brain structures that evolved for other purposes (the neuronal recycling hypothesis). A visual recognition process progressively extracts graphemes, syllables, prefixes, suffixes, word roots, and numbers. After this process, two parallel routes activate speech production and look-up in a mental lexicon. For both reading and mathematics, these processes differ from the computational processes implemented in computers (e.g., mathematical algorithms and parsing). Rather than attempting to optimize computational mechanisms for such processes, we can take a slightly different route by following the steps the brain uses to perform these tasks, i.e., accessing fragments of meaning in the mental lexicon.

Dehaene suggests that the number sense can be viewed as a sense just like the sense of smell or taste, and it can be found in many animals besides humans. Although the number sense gave rise to mathematics, it is crucially different from the rigor and logic of mathematics; that is, the mind-as-computer metaphor does not hold up. If we look at how mathematics is performed from a cognitive neuroscientific perspective, we observe a combination of serial and parallel processing. This processing does not mirror the kind of processing used in looking up a word in an alphabetic dictionary and obtaining a definition. If we take this alternative route, we need not follow the logic of such things as ontologies, which suffer from the same fatal flaw as axiomatic mathematics: logical incompleteness.

Dehaene describes the use of electroencephalograms (EEGs) to discern the order in which brain regions are activated. He particularly examines mathematical tasks, such as comparing two numbers to see which is larger. The main steps are planning, sequential ordering, decision making, and error correction; these steps are under the control of the “executive areas” of the brain, which call the necessary modules into play.

Dehaene’s methods provide the following general scenario: some task is to be performed, and different modules in the brain are activated in some order. Ten or twenty areas of the brain are activated in tasks such as reading words, examining their meaning, viewing a scene, or performing a calculation. Each region performs an elementary operation, such as constructing a pronunciation or identifying the part of speech. The modules are generally very specific and very fine-grained. Our task is to identify and characterize the distinct fragments of meaning. Initially, we do not know exactly what fragments there are, and we run the risk of imposing preconceptions on the process.

Dehaene suggests that reading first involves a “letterbox,” which performs some recognition of a word (graphemes, syllables, prefixes, suffixes, word roots, and numbers). In examining the processing of strings, Dehaene identified a top-down sequence in assessing the meaning of a word. He used the strings EIGHTEEN, EINSTEIN, EXECUTE, and EKLPSGQI. Initially, visual areas are activated; then, within a quarter of a second, actual words are discriminated from meaningless strings. At this point, activations differ for number words, proper nouns, and verbs and other words. There is a difference in response for major categories, i.e., access to the meaning of a word. In The Number Sense, Dehaene summarizes these findings to indicate that the full characterization of all these nuances was only beginning; he noted that the number of questions could go on and on.

After presenting the details of these observations, Dehaene goes on to consider the nature of mathematics in light of these discoveries. He concludes first that the brain-computer metaphor is not a good model of the data: the brain is not a logical machine. Dehaene then examines axiomatic systems in mathematics (e.g., attempts by logicians such as Peano, Frege, and Russell to build a consistent basis for mathematics), leading up to Gödel’s incompleteness theorems. He concludes that mathematics has been subject to evolution, with increasing efficiency in its ability to express mathematical ideas. He suggests that this results from mathematical intuitions (e.g., Chinese words for numbers are much shorter to pronounce than English number words, leading to greater efficiency in making simple calculations). The point of all this is that it is very easy to get hung up on and locked into mathematical formalisms, which may become problematic because of some ultimate inconsistency.

Mathematicians, particularly around the early 1900s, were very keen on building a logical structure for mathematics. Most notable was Bertrand Russell’s Principia Mathematica. These efforts were derailed by Gödel’s incompleteness theorems, which carried over to Turing machines and the advent of the digital computer. Lessons from these efforts should apply to those attempting to develop ontologies. They will always be incomplete. In addition, the efforts to develop ontologies may obscure important aspects of our attempts to build dictionaries. They are focused too much on hierarchical representations (i.e., following the hypernymic backbone) and do not take into account all the many activations that may occur when we are confronted with bringing to bear knowledge about a word. (See the Suggested Upper Merged Ontology, SUMO, the Cyc ontology, WordNet, and the Semantic Web.)

The problem is that there are too many pieces of information associated with a word: all the context that needs to be brought to bear (e.g., as studied in corpus linguistics), syntactic knowledge, semantic knowledge, relations with other parts of the lexicon, culture, etc. In the parser I use, the lexicon is designed for rapid access to syntactic information. It uses a hashing technique to access a word (i.e., it does not proceed by alphabetic lookup) and stores the word’s information in lists. These lists are nested, with syntactic categories at the first level and possibly other information as sublists providing limited amounts of context, subcategorization patterns, or various irregularities.
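The nested-list organization described above can be sketched in Python, where a dict is itself a hash table, so access is by hashing rather than alphabetic search. The entries, category names, and helper functions here are hypothetical illustrations, not the actual contents of the parser’s lexicon.

```python
# A minimal sketch of a hash-accessed lexicon: nested lists with
# syntactic categories at the first level and sublists carrying
# subcategorization patterns or irregularities. All entries are
# illustrative, not the real lexicon's contents.
LEXICON = {
    "record": [
        ["noun", ["countable"]],
        ["verb", ["subcat: NP", "subcat: NP PP"]],
    ],
    "ate": [
        ["verb", ["irregular: past of eat"]],
    ],
}

def lookup(word):
    """Hashed access to a word's nested entry (no alphabetic scan)."""
    return LEXICON.get(word.lower(), [])

def categories(word):
    """The first element of each sublist is a syntactic category."""
    return [entry[0] for entry in lookup(word)]
```

For example, `categories("record")` would return both `noun` and `verb`, while an unknown string simply yields an empty list, deferring the nonword decision to later processing.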

There is a growing field of computational neurolinguistics (see upcoming workshop), as well as attention being paid to the optimal organization of the lexicon (another upcoming workshop). At the moment, it seems that cognitive neuroscience is focused primarily on comparing models of neural activity with various language resources. Important studies in this area include Mitchell et al. (2008) and Murphy et al. (2009). The latter study in particular makes use of EEG data, but it was designed primarily to determine whether a priori semantic features were correlated with activation of particular brain regions. There are many fragments of meaning associated with individual words, so this kind of study is only a first step.

As I continue to investigate developments in these areas, I will be attempting to identify mechanisms that can be used in the design of computational lexicons. As an example, consider the first step of recognizing words, the hashing step mentioned above. In my parser, the look-up phase computes a hash value for each word and accesses the location of its definition in the dictionary. The first step is to create an intermediate memory of all the parts of speech associated with the word, including the possibility that the string is merely a meaningless string and not a word at all. An important question is whether this is the most efficient form of access. It is this kind of question that will be informed by findings in cognitive neuroscience. I will draw upon these findings in later posts.
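A rough sketch of that look-up phase, using Dehaene’s test strings from above: each token is hashed into a small dictionary, and the intermediate memory records the candidate parts of speech, with unrecognized strings explicitly marked as possible nonwords. The dictionary contents and the "nonword?" marker are assumptions for illustration, not the parser’s actual representation.

```python
# Illustrative dictionary mapping word forms to parts of speech.
DICTIONARY = {
    "eighteen": ["number"],
    "einstein": ["proper-noun"],
    "execute": ["verb"],
}

def scan(tokens):
    """Build an intermediate memory of part-of-speech candidates.

    Hashed access (dict lookup) replaces alphabetic search; strings
    absent from the dictionary are flagged as possible nonwords
    rather than rejected outright.
    """
    memory = {}
    for token in tokens:
        pos = DICTIONARY.get(token.lower())
        memory[token] = pos if pos else ["nonword?"]
    return memory
```

Running `scan(["EIGHTEEN", "EKLPSGQI"])` would mark the first as a number word and the second as a candidate nonword, loosely mirroring the quarter-second discrimination Dehaene observed.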



  • Ken says:

    I agree with these comments. Even as I was finalizing my post, I was not only lumping quite distinct systems together but also appearing to be critical of them. I knew that I needed to correct this perception; I am grateful for this comment, which does so very effectively.

    To address the why question, first of all I should apologize for focusing on ontological hierarchies. I was trying to capture Dehaene’s concern about reliance on formal systems; he showed this was counterintuitive for elementary mathematics, and he particularly excoriated the set-theoretic basis of the “new math” curriculum.

    I’m hoping for a data-driven approach to building computational lexicons, and I’m afraid that by beginning from systems like SUMO, WordNet, and Cyc, we might miss some important fragments of meaning. I’ve spent considerable time pursuing a data-driven organization of FrameNet’s frame elements (see my most recent post), which still has many problems. But this approach contrasts with all the theorizing about semantic roles over the last 40+ years. I wonder whether a study like Murphy et al. (2009) works from a priori notions (e.g., tools vs. mammals) and doesn’t capture some very fine-grained fragments of meaning.

  • Adam Pease says:

    I think it’s not accurate to lump together formal ontology, the semi-formal taxonomic information of the semantic web and WordNet. You state,

    “They will always be incomplete.” Of course, but so will just about anything, including dictionaries.

    “In addition, the efforts to develop ontologies may obscure important aspects of our attempts to build dictionaries.”

    I wonder why. They are different efforts with different goals.

    “They are focused too much on hierarchical representations (i.e., following the hypernymic backbone) and do not take into account all the many activations that may occur when we are confronted with bringing to bear knowledge about a word.”

    Formal ontologies such as SUMO are not focused on the hierarchy as less formal “ontologies” or frame systems are. While representation languages such as OWL retain the focus on hierarchy that description logic entails, ontologies that use at least first-order logic have no such orientation or limitation. In addition, formal ontologies do not focus on words but on concepts. We have related, but kept distinct, the concepts in SUMO and the synsets in WordNet, to help make exactly that distinction.

