ACL SIGLEX Resource Links

Special Interest Group on the Lexicon of the Association for Computational Linguistics

Treebanks

Databanks of text annotated with part-of-speech tags and labeled constituent structures (e.g., noun phrase, adverbial phrase, coordinate clause).


  • UCREL Treebanks: Treebanks available from the Lancaster University Centre for Computer Corpus Research on Language (UCREL).
  • ICAME Treebanks: Treebanks available from ICAME (International Computer Archive of Modern and Medieval English) in Bergen, Norway.
  • Two morphologically analyzed and disambiguated Turkish texts (about 12,000 words) are available online. The morphological parse is presented hierarchically: the inflectional features that follow the last derived form appear at the top-most level, and the nesting levels indicate the derivations in the lexical form. The disambiguation process also preprocesses the morphologically analyzed text to group all lexicalized and non-lexicalized collocations. Send comments and corrections to Kemal Oflazer (http://www.cs.bilkent.edu.tr/~ko/ko.html), Bilkent University Computer Engineering Department, Bilkent, ANKARA, 06533 TURKIYE.