Footnotes

1. DIctionary MAintenance Programs, available from CL Research at https://www.clres.com.
2. WordNet definitions were not parsed. In an experiment, the semantic relations identifiable through parsing were frequently inconsistent with those already given in WordNet, so it was decided not to confound the disambiguation.
3. Several other functions were implemented only in stub form at the time of the test runs, to evaluate: type restrictions (e.g., transitivity), presence of accompanying grammatical constituents (e.g., infinitive phrase or complements), form restrictions (such as number and participial), grammatical role (e.g., as a modifier), and selectional restrictions (such as subject, object, modificand, and internal arguments).
5. Entries included all parts of speech; disambiguation was required to identify the part of speech as well.
6. Note that a mapping from WordNet to NODE is likely to generate similar mismatch statistics.
SENSEVAL Word-Sense Disambiguation Using a Different Sense Inventory and Mapping to WordNet

Kenneth C. Litkowski
CL Research
9208 Gue Road
Damascus, MD 20872
ken@clres.com

Abstract

In SENSEVAL-2, CL Research's unsupervised word-sense disambiguation system officially attained a coarse-grained precision of 0.367 for the lexical sample task and 0.460 for the all-words task using the WordNet sense inventory. Subsequently, an experiment investigated the viability of mapping another dictionary (The New Oxford Dictionary of English) into WordNet, disambiguating with this dictionary, and then using the maps to produce WordNet senses. The precision obtained with this intermediating dictionary was 0.402 for the lexical sample task and 0.418 for the all-words task, despite considerable mismatch in the entries and only about 70 percent mapping (which is inaccurate) for the senses in the matching entries. Results suggest that disambiguation using the dictionary was considerably better against its sense inventory than WordNet, with a better opportunity for improvement with its lexicographically-based information. Details of the mapping process provide significant insights into the issue of reuse of lexical inventories.
Introduction

The significance of the sense inventory for word-sense disambiguation (WSD) cannot be overstated. For any natural language processing (NLP) application that relies on a representation of meaning, the ability to disambiguate against the sense inventory will affect how effective the end result will be. Considered as an end in itself, such as in SENSEVAL-2, the effectiveness of WSD may depend on the quality of the sense inventory (Kilgarriff, 2001). However, a major difficulty for sense inventories (Atkins, 1991) is that no two will be similar, for a variety of reasons, not the least of which is what counts as a sense (Kilgarriff, 1997). Finally, since any sense inventory used in WSD is ultimately a machine-readable dictionary (MRD), we must consider whether unsupervised systems relying on the MRDs can achieve the same results as other methods (Ide & Veronis, 1993).

While CL Research's participation in SENSEVAL-2 was designed primarily to (1) extend WSD techniques from SENSEVAL-1 (Litkowski, 2000) and (2) generalize WSD mechanisms to rely on a full dictionary rather than a small set of entries where individual crafting might intrude, we also wanted to investigate how well WSD could be performed using an MRD of a traditional dictionary. The availability of a reference set of texts disambiguated by lexicographically trained judges against a well-developed sense inventory (WordNet) provided a suitable opportunity. While our initial goals were met (Litkowski, 2001), we found that we were able to achieve results comparable to our official submission even with the step of using a fuzzy intermediary (i.e., the mapping between the MRD and WordNet).

CL Research's WSD functionality is implemented in DIMAP (footnote 1), designed primarily for creation and
maintenance of lexicons for NLP. In particular, DIMAP is designed to make MRDs tractable and to create semantic networks (similar to WordNet (Fellbaum, 1998) and MindNet (Richardson, 1997)) automatically by analyzing and parsing definitions.

To place our mapping findings in perspective, we first describe the dictionary preparation techniques for WordNet and NODE (The New Oxford Dictionary of English, 1998) for use in SENSEVAL-2 (section 1). We then describe the WSD techniques used in SENSEVAL-2 (section 2) and present our results, including those achieved through mapping (section 3). In section 4, we describe the mapping from NODE to WordNet and several investigations we were able to perform in seeking to understand our performance. In section 5, we consider how these efforts correspond to other research and, in section 6, present our conclusions and future steps for improvement.

1. Dictionary Preparation

DIMAP is intended to disambiguate any text against WordNet or any other dictionary converted to DIMAP, with a special emphasis on corpus instances for specific lemmas (that is, lexical samples). The dictionaries used for disambiguation operate in the background (as distinguished from the foreground development and maintenance of a dictionary), with rapid b-tree lookup to access and examine the characteristics of multiple senses of a word after a sentence has been parsed. DIMAP allows multiple senses for each entry, with fields for the definitions, usage notes, hypernyms, hyponyms, arbitrary other semantic relations, and feature structures containing arbitrary information, any of which can be used in disambiguation.

WordNet is already integrated in DIMAP in several ways, but for SENSEVAL-2, WordNet was entirely converted to alphabetic format for use as the disambiguation dictionary. In this conversion, all WordNet information (e.g., verb frames and glosses) and relations are retained. Glosses are analyzed into definition, examples, usage or subject labels, and usage notes (e.g.,
used with 'of'). Verb frames are used to build collocation patterns, typical subjects and objects, and grammatical characterizations (e.g., transitivity). WordNet file and sense numbers are converted into a unique identifier for each sense. Since the glosses were intended only to serve as reminders for those constructing WordNet (Miller, 2001) and were not prepared according to well-specified guidelines, the analysis into the different components is frequently inexact.

A separate phrase dictionary was constructed from all noun and verb multiword units (MWUs), using WordNet's sense index file. For noun MWUs, an entry was created for the last word (i.e., the head), with the first word(s) acting as a hyponymic indicator; an entry was also created for the first word, with the following word(s) acting as a collocation pattern (e.g., work of art is a hyponym of art and a collocation pattern under work, written "~ of art"). For verb MWUs, an entry was created for the first word, with a collocation pattern (e.g., keep an eye on is entered as a collocation pattern "~ an eye on" under keep). In disambiguation, this dictionary was examined first for a match, with the full phrase then used to identify the sense inventory rather than a single word.

NODE was prepared in a similar manner, with several additions. A conversion program transformed the MRD files into various fields in DIMAP, the notable difference being the much richer data and more formal structure (e.g., subject labels, lexical preferences, grammar fields, and subsensing) contained in well-defined tagged fields. Conversion also considerably expanded the number of entries by making headwords of all variant forms (fully duplicating the other lexical information of the root form) and phrases run on to single-lemma entries. For example, "(as) happy as a sandboy (or Larry or a clam)" under happy was converted into six headwords (based on the alternatives indicated by the parentheses), as well as a collocation pattern for a sense under happy, written "(as|?) ~ as (a sandboy|Larry|a clam)", with the tilde marking the
target word and the question mark indicating a null.

NODE was then subjected to definition processing and parsing. Definition processing consists of further expansion of the print dictionary: (1) grabbing the definitions of cross-references and (2) assigning parts of speech to phrases based on analysis of their definitions. Definition parsing puts the definition into a sentence frame appropriate to the part of speech, making use of typical subjects, objects, and modificands. The sentence parse tree was then analyzed to extract various semantic relations, including synonyms, superordinates or hypernyms, holonyms, meronyms, satellites, telic roles, and frame elements, with these elements added to the dictionary. After parsing was completed, a phrase dictionary was also created for NODE (footnote 2).

2. Disambiguation Techniques

The SENSEVAL tasks were run separately against the WordNet and NODE sense inventories as described above, with the WordNet results submitted. The lexical sample and all-words texts were modified slightly. Satellite tags were removed and entity references were converted to an ASCII character. In the all-words texts, contraction and quotation mark discontinuities were undone. These changes made the texts more like normal text-processing conditions.

The texts were next reduced to sentences. For the lexical sample, a sentence was assumed to consist of the last single line in the text (not always the case). For the all-words texts, a sentence splitter identified the sentences, which were next submitted to the parser. The DIMAP parser produced a parse tree for each sentence, with bottom-up constituent phrases when the sentence was not parsable with the grammar, allowing the WSD phase to continue.

The first step in the WSD used the part of speech of the tagged word to select the appropriate sense inventory. Nouns, verbs, and adjectives were looked up in the phrase dictionary; if the tagged word was part of an MWU, the word was changed to the MWU and the MWU's sense inventory was used instead. The dictionary entry for the word was then accessed. Before evaluating the senses, the topic area of the context provided by the sentence was established (only for NODE), by tallying subject labels for all senses of all content words in the context.

Each sense of the target was then evaluated. Senses in a different part of speech were dropped from consideration. The different pieces of information in the sense were assessed: collocation patterns, contextual clue words, contextual overlap with definitions and examples, and topical area matches. Points were given to each sense and the sense with the highest score was selected; in case of a tie, the first sense in the dictionary was selected (footnote 3).

Collocation pattern testing (requiring an exact match with surrounding text) was given the largest number of points (10), sufficient in general to dominate sense selection. Contextual clue words (a particle or preposition) were given a small score (2 points). Each content word of the context added two points if present in the sense's definition or examples, so that considerable overlap could become quite significant. For topic testing, a sense having a subject label matching one of the context topic areas was awarded one point for each word in the context that had a similar subject label (e.g., if four words in the context had a medical subject label, four points would be awarded if the instant sense also had a medical label).
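
The point scheme described in this section can be illustrated with a minimal sketch. The point values (10, 2, 2, and one per topic word) and the tie-breaking rule come from the text; the `Sense` record, the function names, and the decision to award the clue-word points once per sense are assumptions made for illustration and do not describe DIMAP's actual data structures.

```python
from dataclasses import dataclass, field

# Point values as stated in the text; everything else is an assumption.
COLLOCATION_POINTS = 10
CLUE_WORD_POINTS = 2
OVERLAP_POINTS = 2


@dataclass
class Sense:
    """Hypothetical, simplified sense record (not DIMAP's actual structure)."""
    sense_id: str
    pos: str
    collocation: tuple = ()                        # exact word sequence required in context
    clue_words: set = field(default_factory=set)   # particles or prepositions
    gloss_words: set = field(default_factory=set)  # words of the definition and examples
    subject_label: str = ""


def contains_sequence(words, pattern):
    """True if `pattern` occurs as a contiguous run inside `words`."""
    n = len(pattern)
    return n > 0 and any(
        tuple(words[i:i + n]) == tuple(pattern)
        for i in range(len(words) - n + 1)
    )


def score_sense(sense, context, content_words, topic_counts):
    """Score one sense against a tokenized context sentence."""
    score = 0
    if contains_sequence(context, sense.collocation):
        score += COLLOCATION_POINTS
    if sense.clue_words & set(context):
        score += CLUE_WORD_POINTS  # assumption: awarded once, not per clue word
    score += OVERLAP_POINTS * len(set(content_words) & sense.gloss_words)
    # One point per context word carrying the same subject label as the sense.
    score += topic_counts.get(sense.subject_label, 0)
    return score


def choose_sense(senses, target_pos, context, content_words, topic_counts):
    """Return (best sense, score); ties go to the earlier sense in the entry."""
    best, best_score = None, -1
    for sense in senses:  # dictionary order
        if sense.pos != target_pos:
            continue  # senses in another part of speech are dropped
        s = score_sense(sense, context, content_words, topic_counts)
        if s > best_score:  # strict comparison keeps the earlier sense on a tie
            best, best_score = sense, s
    return best, best_score
```

In this sketch, `topic_counts` maps a subject label to the number of context words carrying that label, so a sense labelled as medical in a context with four medical words gains four points, as in the example above.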
3. SENSEVAL-2 Results

As shown in Table 1, using WordNet as the disambiguation dictionary resulted in an official precision of 0.293 at the fine-grained level and 0.367 at the coarse-grained level. The official results were actually recall, since our system erroneously generated a result in cases where it should not have (such as the cases where the assumption about the last sentence containing the tagged word was not true); the actual precision was 0.311 and 0.390, respectively. Since CL Research did not use the training data in any way, running the training data also provided another test of the system. The results are remarkably consistent, both overall and for each part of speech. Since submission, various other bug fixes and changes to the disambiguation routines have increased the precision to 0.340 and 0.429 for the two grains. It is expected that further implementation of stub routines will increase our results, although it is not clear whether we can attain the 0.67 coarse-grained precision attained in SENSEVAL-1 (Litkowski, 2000). Kilgarriff (2001) suggests that use of WordNet as the sense inventory could have affected performance by about 0.14.

Using NODE as the disambiguation dictionary and mapping its senses into WordNet senses achieved comparable levels of precision, although recall was somewhat lower, as indicated by the difference in the number of items on which the precision was calculated. Senses were identified for all instances using NODE, but only about 75% of the senses were mapped into WordNet in the first run immediately after the official WordNet submission. Subsequent improvements in the mapping (described in the next section) have improved recall to about 87.5% without changing precision.

Table 1. Lexical Sample Precision

                    Adjectives            Nouns                 Verbs                 Total
Run                 Items  Fine   Coarse  Items  Fine   Coarse  Items  Fine   Coarse  Items  Fine   Coarse
WordNet Test          768  0.354  0.354    1726  0.338  0.439    1834  0.225  0.305    4328  0.293  0.367
WordNet Test (R)      768  0.422  0.422    1726  0.397  0.503    1834  0.252  0.362    4328  0.340  0.429
NODE Test             420  0.288  0.288    1403  0.402  0.539    1394  0.219  0.305    3217  0.308  0.405
NODE Test (R)         636  0.434  0.434    1568  0.393  0.520    1589  0.198  0.273    3793  0.318  0.402
WordNet Training     1533  0.365  0.365    3455  0.334  0.444    3623  0.219  0.299    8611  0.291  0.369
NODE Training         864  0.116  0.116    2848  0.366  0.483    2567  0.227  0.315    6249  0.276  0.365

For the all-words task, the disambiguation results were significantly higher than for the lexical sample, with a precision (and recall) of 0.460 for the WordNet coarse-grained level. For NODE, about 70% were mapped into WordNet (indicated by the reduced number of items), with precision on the mapped items only slightly less (footnote 4).

Table 2. All-Words

Run       Items  Fine   Coarse
WordNet    2473  0.451  0.460
NODE       1727  0.416  0.418

4. Mapping Procedures and Explorations

To investigate the viability of mapping for WSD, subdictionaries were created for each of the lexical sample words and for each of the all-words texts. For the lexical sample words, the subdictionaries consisted of the main word and all entries identifiable from the phrase dictionary for that word. For example, bar, in NODE, had 13 entries where bar was the first word in an MWU and 50 entries where it was the head noun, compared to 16 and 40, respectively, in WordNet; for begin, there was only one entry in each dictionary. For the all-words texts, a list was made of all the task words to be disambiguated (including some phrases) and a subdictionary constructed from this list. For both tasks, the creation of these subdictionaries was fully automatic; no hand manipulation was involved, except for cases in the all-words entries where verb roots were added when only an inflected form appeared in the text.

Table 3. Lexical Sample Entries

Part of Speech  WordNet  NODE
Adjectives          245   244
Nouns               648   491
Verbs               469   987
Total              1362  1722
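
The sub-dictionary construction described at the start of this section can be sketched as follows. This is only an illustration under simplifying assumptions: the function name and the flat list of phrase strings are hypothetical, and the toy phrase list mixes MWUs mentioned in this paper with an invented one ("bar code"); it is not the DIMAP procedure itself.

```python
def build_subdictionary(lemma, phrase_entries):
    """Collect the main word plus the MWUs in which it is first word or head.

    `phrase_entries` is assumed to be a plain list of space-separated
    multiword units taken from a phrase dictionary.
    """
    first_word, head = [], []
    for phrase in phrase_entries:
        words = phrase.split()
        if len(words) < 2:
            continue  # single words are not MWUs
        if words[0] == lemma:
            first_word.append(phrase)  # lemma opens the MWU
        elif words[-1] == lemma:
            head.append(phrase)        # lemma is the last word (the head)
    return {"main": lemma, "first_word": first_word, "head": head}


# Toy phrase list; "bar code" is an invented example.
phrases = ["chocolate bar", "nougat bar", "heel bar", "bar code", "work permit"]
sub = build_subdictionary("bar", phrases)
print(sub["first_word"], sub["head"])
```

Counting the two lists for a real phrase dictionary would give figures of the kind reported above for bar (first-word entries versus head-noun entries).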
Table 4. All-Words Entries

Text   WordNet  NODE
d00        493   444
d01        496   439
d02        498   454
Total     1487  1356

As shown in Table 3, there were considerable differences in the number of entry words between WordNet and NODE. Furthermore, there were some discrepancies between the number of entries for individual lexical sample words even in WordNet compared to the official number of entries (Kilgarriff, 2001). For example, the entry list for day constructed using our methods found 136 MWUs compared to 82 in the official inventory.

The 73 dictionaries for the lexical sample words gave rise to 1372 WordNet entries and 1722 NODE entries (footnote 5). Only 491 entries were common (i.e., no mappings were available for the remaining 1231 NODE entries); 881 entries in WordNet were therefore inaccessible through NODE. For the entries in common, there was an average of 5.6 senses, of which only 64% were mappable into WordNet. The a priori probability of successful mapping into the appropriate WordNet sense is 0.064, the baseline for assessing WSD via another dictionary mapped into the WordNet sense-tagged keys (footnote 6). Even though this assumes a clearly incorrect equal likelihood of each sense appearing in the lexical sample, the low probability discouraged submitting a run using NODE mapped into WordNet.
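
The entry-level comparison summarized above amounts to set arithmetic over the two headword lists. The sketch below is a hypothetical illustration (the function name and the toy headword sets are assumptions); only the final check uses figures reported in the text.

```python
def compare_entries(source_entries, target_entries):
    """Split two headword lists into common, source-only, and target-only sets."""
    source, target = set(source_entries), set(target_entries)
    return {
        "common": source & target,       # entries for which sense mapping is possible
        "source_only": source - target,  # source entries with no mapping available
        "target_only": target - source,  # target entries inaccessible via the source
    }


# Toy headword lists, chosen only for illustration.
node_entries = {"bar", "heel bar", "nature trail"}
wordnet_entries = {"bar", "chocolate bar", "nougat bar"}
result = compare_entries(node_entries, wordnet_entries)
print(sorted(result["common"]), sorted(result["source_only"]), sorted(result["target_only"]))

# Figures reported in the text: 1722 NODE entries, 491 in common,
# leaving 1231 NODE entries without any mapping.
assert 1722 - 491 == 1231
```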
4.1 Basic Mapping Procedures

The NODE dictionaries were mapped into the WordNet dictionaries using a selection in DIMAP for comparing definitions. This functionality allows several options. The mapping can be performed for entire dictionaries or for selected entries; the mapping can be saved to a file, where it was used with Perl scripts that took the NODE WSD results and mapped them into their corresponding WordNet senses in the SENSEVAL-2 format, suitable for scoring against the keys.

The mapping functionality uses three measures of fit: (1) word overlap between definitions (with or without a stop list to exclude frequent words, and using exact matches rather than reducing inflected forms to their root forms), (2) a componential analysis that examines where a given sense fits within a semantic network (i.e., the hypernyms and other semantic relations in WordNet and the comparable relations generated during processing and parsing NODE definitions), and (3) the edit distance between two definitions (i.e., the number of insertions, deletions, and replacements necessary to convert one definition into another). (See (Litkowski, 1999) for more details on the first two methods and (Sierra & McNaught, 2000) for details on aligning definitions using edit distance, which has been implemented in DIMAP.) Edit distance was not used in the NODE-to-WordNet mapping.

The methods from (Litkowski, 1999) were refined considerably in working with a dictionary publisher (Macquarie Pty Ltd of Australia) who has a set of 15 dictionaries derived from its main dictionary (The Macquarie Dictionary, 1997). These dictionaries, ranging from thumbnail to children's to junior and concise versions, were developed at various times over the past 20 years, initially based on previous versions of the main dictionary. Because changes to the main dictionary were not always filtered down to the smaller dictionaries, there was some drift in the phraseology over time. We used our mapping functionality to create links between these smaller dictionaries and the main dictionary so that a tighter linkage could be maintained. We were able to develop the functionality to achieve mappings considered sufficiently accurate (around 90 percent) to map all the dictionaries in preparation for editorial work (Tardif, 2000), using word overlap with a stop list.
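
The first measure of fit, word overlap with a stop list and exact (unstemmed) matching, can be sketched as below. This is a minimal illustration, not the DIMAP implementation: the stop list, the tokenizer, the function names, and the sense identifiers are all assumptions. The definitions in the usage example are the ones quoted elsewhere in this paper for nature, trail, nature trail, and graceful.

```python
import re

# Illustrative stop list; the actual list used in DIMAP is not given in the text.
STOP_WORDS = {"a", "an", "the", "of", "or", "and", "in", "to", "for", "as"}


def content_words(definition, stop_words=STOP_WORDS):
    """Lower-cased word forms of a definition, minus stop words (no stemming)."""
    return {w for w in re.findall(r"[a-z']+", definition.lower())
            if w not in stop_words}


def overlap(def_a, def_b):
    """Number of exact word forms shared by two definitions."""
    return len(content_words(def_a) & content_words(def_b))


def map_senses(source_senses, target_senses, min_overlap=1):
    """Map each source sense id to the best-overlapping target sense id, or None."""
    mapping = {}
    for sid, sdef in source_senses.items():
        best_id, best = None, 0
        for tid, tdef in target_senses.items():
            n = overlap(sdef, tdef)
            if n > best:  # strict comparison: the first target wins a tie
                best_id, best = tid, n
        mapping[sid] = best_id if best >= min_overlap else None
    return mapping


# Sense identifiers below are hypothetical labels.
source = {"nature": "the countryside, especially when picturesque"}
targets = {
    "trail": "a beaten path through rough country such as a wood or moor",
    "nature trail": "a signposted path through the countryside designed "
                    "to draw attention to natural features",
}
print(map_senses(source, targets))
```

Because matching is exact, "country" and "countryside" do not count as overlapping, and a short definition such as the one for graceful ("having or showing grace or elegance") shares no words with either WordNet gloss, so no mapping is produced by this measure alone.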
Subsequently, we combined the componential analysis method with the word overlap method. The extension, however, is limited to cases where no mapping is generated by the word overlap method, so that we have cases where a mapping is achieved by componential analysis alone. This was the core functionality that was used in our initial mapping between NODE and WordNet, performed immediately after we had submitted our official SENSEVAL-2 runs.

Table 5. Lexical Sample Sense Mappings

Part of Speech  Total  Mapped  Percent
Adjectives        429     215    50.1%
Nouns             652     421    64.6%
Verbs            1679    1134    67.5%
Total            2760    1770    64.1%

Table 6. All-Words Sense Mappings

Text   Total  Mapped  Percent
d00     7195    3811    53.0%
d01     6611    3368    50.9%
d02     6175    3027    49.0%
Total  19981   10206    51.1%
Tables 5 and 6 show how many senses of the common entries were mapped from NODE to WordNet. The much larger number of senses in the all-words case, where the number of total entries was about the same as the lexical sample case, reflects the fact that many of the words were very common words with many senses (e.g., make and give) and, in fact, were also present in each of the three texts.

Of the 1770 mappings for the lexical sample senses, 39% were based on the word overlap alone, 20.5% on the componential analysis alone, 16.7% used word overlap and were confirmed by the componential analysis, and 23.7% used word overlap but were disconfirmed by the componential analysis. No attempt has yet been made to determine the accuracy of the mappings. Cursory inspection suggests that the accuracy is much less than 100 percent. The large percentage where word overlap and componential analysis disagreed may be indicative, but there are cases where there is agreement from both methods and hand mapping would provide a different mapping.

For WordNet, there were 2516 senses in the entries that had been created (15.6% adjectives, 33.3% nouns, and 51.1% verbs). Thus, at least 746 senses were inaccessible; in fact, the number is somewhat more, since not infrequently, several NODE senses mapped into a single WordNet sense, particularly among the verbs.

Many of these mappings are not directly relevant to the lexical sample task. As indicated above, the subdictionary creation made no distinction as to part of speech. For example, the entries for the work subdictionary included noun senses for the primary word and included entries where work was a noun constituent of an MWU (such as social work or work permit). While the
absolute number of these entries and senses that fall outside the part of speech of the individual tasks has not been developed, these situations apply to both WordNet and NODE and it is unlikely that the percentages quoted above would change dramatically.

Considering these mapping statistics, with many WordNet entries and senses inaccessible and many mappings likely to be incorrect, the initial results that were achieved seem quite surprising. The recall of 75 percent for the lexical sample task and 70 percent for the all-words task are a direct reflection of an inability to map either an entry or a sense that resulted from disambiguation using NODE. The precision of 40 percent also reflects inaccurate mappings and so would likely be improved quite a bit through hand manipulation of the mappings. The remaining shortfall in precision would then be attributable to failure of our disambiguation. On the other hand, the fact that we were able to achieve a level of precision comparable to what was attained using WordNet suggests the most frequent senses of the lexical sample words were able to be disambiguated and mapped correctly into WordNet.

4.2 Mapping at the Entry Level

The significant discrepancy between the entries (1231 entries in NODE not in WordNet and 871 entries in WordNet not in NODE) in part reflects the usual editorial decisions that would be found in examining any two dictionaries. However, since WordNet is not lexicographically based, many of the differences are indicative of the idiosyncratic development of WordNet. Many entries arise from the needs of placing concepts into a semantic net that are not actually realized in common language (e.g., the synset {animality, animal nature} as a hyponym of nature or natural event), where it is unlikely that animal nature would occur in normal language use. Since WordNet used freely available sources, many of these may be outdated or may have been compiled from highly technical sources and not reflect current or common usage (e.g., free pardon or the 14 varieties of yew). WordNet may identify several types of an entity (e.g., apricot bar, nougat bar, and chocolate bar), where NODE may use one sense ("an amount of food or another substance
formed into a regular narrow block") without creating separate entries that follow this regular lexical rule.

NODE, on the other hand, is based more on lexicographic principles and also has a strong corpus base. Thus, there is an entry for heel bar ("a small shop or stall where shoes are repaired, especially while the customer waits"), which would not be productively formed by any lexical rule. More significantly, this difference is reflected in idiomatic verb phrases, which account for many of the entries in NODE not in WordNet. For the most part, verb phrases containing particles are equally present in both dictionaries (e.g., draw out and draw up), but NODE contains several more nuanced phrases (e.g., draw in one's horns, draw someone aside, keep one's figure, and pull oneself together). NODE also contains many idioms where a noun is used in a verb phrase (e.g., call it a day, keep one's mouth shut, and go back to nature).

These discrepancies between WordNet and NODE are not unique to SENSEVAL-2, but would exist when using any two sense inventories, so it would be useful to understand their implications and methods for dealing with them. For MWU entries that do not reflect common usage or have been created to provide nodes in a semantic network, no problem is likely to arise in WSD, since such entries are unlikely to occur in real-world language use. The fact that such entries are inaccessible in the NODE mapping has no effect on WSD in NODE or WordNet.

For MWUs in WordNet that are inaccessible because they are not present in the source dictionary (such as for yew or bar), a correct disambiguation in NODE is not given credit. This situation is actually a matter of grain: Western yew is an instance of yew and this could be captured in the answer key either directly by making the appropriate sense of yew one of the answers in the key (followed in a couple of instances, e.g., for common sense), or preferably, by identifying Western yew as a subsense of yew in the sense hierarchies so that at the coarse grain, yew would be a correct answer. In general, this suggests that MWUs in a dictionary should pay particular attention to seeing if the MWU is an instance of its head (usually by seeing if the head is the genus
term of the MWU's definition). The potential number of these cases in SENSEVAL-2 was less than 100, affecting answers for bar, chair, channel, church, circuit, day, facility, holiday, material, and a few other isolated cases. The effect on our scores would have been quite small, since some of these MWUs were also in NODE.

The reverse situation, an MWU in NODE that is not present in WordNet, occurred much more often and has a definite effect on our scores. In these cases, since there is no mapping into WordNet, an answer is not generated; these cases are a subset of our NODE-disambiguated answers that lessened our recall (1111 cases in the original run with NODE). For example, 9 of 15 unknown answers for nature recognized phrases in the NODE disambiguation (human nature, nature trail, by nature); these answers are correctly disambiguated in NODE (usually with such MWUs having only one sense). Some of these cases (about 50) are due to the corresponding phrases not having been identified by the SENSEVAL-2 lexicographers (e.g., by nature and United Nations), but there are at least several hundred such MWUs recognized when using NODE as the sense inventory.

Thus, in the first instance, we can say that our WSD was probably about 5 to 10 points higher than our NODE-to-WordNet score. Further, since we had several bugs in connection with our phrase recognition routines, we can expect a further few points improvement in this area. In the second instance, we would like to be able to do more than just make a claim that our disambiguation would have been higher, since the problem of differing MWU sets is always going to be prevalent and problematic.

For a phrase like nature trail, it is possible to decompose the phrase for analysis. In this solution, the definition ("a signposted path through the countryside designed to draw attention to natural features") would be examined for its correspondence to both constituent words. One NODE definition of nature ("the countryside, especially when picturesque") would have a strong match with the definition of nature trail and a definition of trail ("a beaten path through rough country such as a wood or moor") would match. In this case, since we are interested in mapping nature, we would use the mapping for the identified sense as
the basis for identifying a WordNet sense. This would yield the correct answer. (One of the SENSEVAL-2 lexicographers raised a query about the feasibility of this approach, "Even in the case of genuine compounds, the dictionary should further identify where one or other element can also be assigned to a main sense (eg. 'natural history' to sense 1 of 'natural'), as that should aid the algorithm's global 'understanding' of the context." (Williams, 2001)) We have not explored the extent to which our example would generalize; however, the mapping approach outlined here does seem worthy of further exploration.

The case with verb phrases may be somewhat more problematic in devising a successful mapping strategy, particularly when they have become so idiomatic as to have lost any tie to the words that comprise them. For the phrase keep one's mouth shut ("not say anything, especially not reveal a secret"), there are some ties to various senses of mouth as an organ of speech that may give a clue to an appropriate sense in the target dictionary. However, in general, this is not the case.

4.3 Mapping at the Sense Level

Similar issues arise in mapping individual senses: WordNet senses may be inaccessible since there is no NODE sense that maps into them and NODE senses may not map into any WordNet sense. Additional complexities arise when the mapping is incorrect; in these cases, the identified WordNet answers are wrong, even when the disambiguation in NODE may have been correct.

We examined our mappings in detail to see where improvements might be possible. We found a bug in our mapping routine for adjectives that did not pick up properly those mappings where an adjective sense was a satellite of another adjective; this accounted for a considerable portion of our increase in recall (from 55% to 83%) and precision (from 0.288 to 0.434) in the revised run shown in Table 1. We were able to make several other changes that improved the overall recall from 74.3% to 87.6%.

We observed many cases where there was only one sense for an entry in WordNet, but the mapping did not yield any hits based on the word overlap or the componential analysis. Almost all these cases were for MWUs; these changes improved recall.

As noted earlier in describing the entry for happy, many definitions in NODE had an associated collocation pattern. We had not taken this into account in our initial mapping. We modified our mappings for these senses since they required that the exact pattern be present in WordNet. In a large number of cases, this actually removed mappings that had been based on the headword. For example, under call, one sense required the collocation call collect; the original mapping picked a sense of call in WordNet that pertained to calling by telephone; this mapping was undone. The review of these cases led to the removal, the addition, or the revision in the mapping. It is difficult to assess the overall effect of these changes. In some cases, it appears that a removal or revision may have changed the WordNet sense from a correct one to an incorrect one. In general, though, there was an improvement in recall, but it appears that these instances may account for the slightly lower precision.

Another change involved cases where no mapping had been made. NODE has a shallow hierarchy, so that there is a main sense and perhaps several subsenses. Although the wording of the subsenses may bear no clear relation to the supersense, we changed the
no mapping to the mapping of the supersense when the supersense had a mapping. This is analogous to using a coarse-grained answer. In some instances, where a supersense had no mapping but had only one subsense or all its subsenses had the same mapping, we used this mapping for the supersense as well. Overall, this had the effect of improving recall and precision.

We experimented only a little with hand mapping. Mapping for adjectives was particularly difficult, since their definitions were very short and hence had a lower likelihood of finding exact wording in WordNet. For simple, only 1 of 12 NODE definitions was mapped into one of the 7 WordNet definitions. Our initial recall was only 16 out of 66 and this one mapping had the bug for adjectives mentioned above, so our precision was 0.00. After correcting the bug and making hand mappings, our recall went to 65 out of 66 and our precision increased to 0.258. For material, the primary sense was not mapped; our initial recall was only 25 of 69, with coarse-grained precision of 0.232; making this one change in the map increased recall to 100 percent and precision to 0.594. These cases strongly indicate that our disambiguation performance with NODE was much higher than the scores obtained after the mapping.

Of most significance to the sense mapping is the classical problem of "lumping" (attaching more significance to similarities than to differences) and "splitting" (attaching more importance to differences than to similarities). A single sense in NODE may correspond to several senses in WordNet (e.g., NODE has one sense of yew for both the tree and the wood, while WordNet has
two); several senses in NODE may correspond to a single sense in WordNet. The latter problem generally does not affect our results, since it will be scored as correct regardless of which sense in NODE is identified. When a NODE definition corresponds to more than one sense in WordNet, we may disambiguate correctly in NODE, but receive no score since we have mapped into the wrong definition; if the definitions in WordNet have been related hierarchically, we may receive credit at the coarse grain, but not at the fine grain. Otherwise, it would be necessary for the lexicographer to have tagged more than one sense as being correct (e.g., when it appeared that both senses of yew were activated by a given context).

In the case of graceful, there was only one sense in NODE ("having or showing grace or elegance"), and it did not map into either of the two senses in WordNet ("characterized by beauty of movement, style, form etc.; not awkward" and "suggesting taste, ease, and wealth"). As a result, despite having disambiguated correctly in NODE, our precision and recall in WordNet were 0.00. Choosing the first sense in WordNet would give a precision of 0.793; choosing the second sense, 0.310; and choosing both senses, 0.552 (in three instances, the lexicographers chose both senses as correct).

To examine this issue in more detail, we performed a mapping from WordNet to NODE for the word develop. NODE has 8 definitions and WordNet has 21. The first definition in NODE is a supersense with 5 subsenses; the other two are major senses. We were able to map 7 of the 8 senses into WordNet. In WordNet, there are 8 major senses, with one having 6 subsenses and another 4 subsenses. We were able to map 13 of the 21 senses into NODE. By performing this reverse mapping, we are able to identify all senses of WordNet that had positive scores in the
mapping from NODE to WordNet, rather than just the one that was selected. Thus, instead of using only a single sense for the mapping, we can use all WordNet senses that mapped into the same NODE sense. In this way, the first sense of NODE was mapped into 4 WordNet senses, another sense was mapped into 3 WordNet senses, two other senses mapped into 2 WordNet senses, three others into 1 WordNet sense, and the final sense was hand-mapped into another WordNet sense that had not been mapped into any NODE sense. Since our mapping process is generally intended to operate on the basis of identifying components of meaning, the reverse mapping allows us to group senses together that have similar meaning potentials (see (Hanks, 2000) for further details).

Using this one-to-many mapping, the process begins by disambiguating the lexical sample for develop with NODE to identify a single sense located within the NODE hierarchy. If the selected sense maps into more than one WordNet sense, the multiple WordNet senses are returned as the answers to be judged against the answer key. When, for example, 4 WordNet senses are given as the answer, each is assumed to have a weight of 0.25 (unless specifically given different weights). At the fine-grained level, we would receive a score of only 0.25 if one of the multiple answers is correct, whereas if we had only a single answer that is correct, we would receive 1.00 (see (Kilgarriff & Rosenzweig, 2000) for SENSEVAL scoring). By diluting our answer, we receive a lower score when we have the correct sense; however, when we have a wrong answer, we may receive partial credit for having multiple answers. At the coarse grain, with multiple answers, we are more likely to hit upon one of the correct answers, even though our score for each may be diluted somewhat. Using WordNet, we had 13 and 24 out of 69 instances correct at the two grains; using NODE mapped into WordNet with only a single sense chosen, we had 3 and 12 correct; using NODE with multiple WordNet senses, our scores were 3.5 and 11.1. Tentatively, then, it appears as if this method of activating multiple senses does not make a significant difference, but this conclusion warrants further investigation.
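The dilution effect of the one-to-many mapping can be illustrated with a minimal sketch. It assumes uniform weights over the mapped senses and scores an instance as the sum of the weights placed on senses in the answer key; the sense identifiers and the function names expand_answer and instance_score are hypothetical, and this is not the SENSEVAL scoring software or our mapping code.

```python
from typing import Dict, List, Set


def expand_answer(node_sense: str,
                  node_to_wn: Dict[str, List[str]]) -> Dict[str, float]:
    """Spread a unit weight uniformly over every WordNet sense
    that the selected NODE sense maps into (empty if unmapped)."""
    targets = node_to_wn.get(node_sense, [])
    if not targets:
        return {}
    weight = 1.0 / len(targets)
    return {sense: weight for sense in targets}


def instance_score(answer: Dict[str, float], key: Set[str]) -> float:
    """Fine-grained score: total weight placed on senses in the answer key."""
    return sum(w for sense, w in answer.items() if sense in key)


# Hypothetical one-to-many map for two NODE senses of "develop".
node_to_wn = {
    "develop.node.1": ["wn.a", "wn.b", "wn.c", "wn.d"],  # maps into 4 senses
    "develop.node.2": ["wn.e"],                          # maps into 1 sense
}

diluted = expand_answer("develop.node.1", node_to_wn)
single = expand_answer("develop.node.2", node_to_wn)

print(instance_score(diluted, {"wn.b"}))  # 0.25: correct, but diluted
print(instance_score(single, {"wn.e"}))   # 1.0: single correct answer
print(instance_score(diluted, {"wn.e"}))  # 0: none of the answers is in the key
```

Summing such per-instance scores over a lexical sample is what produces fractional totals like the 3.5 and 11.1 reported above.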
5. Discussion and Conclusions

In general, mapping from NODE to WordNet has been shown to be viable. The availability of a large reference set (the SENSEVAL-2 corpora) has enabled the investigation of disambiguation with another sense inventory. With scores based on an imperfect mapping that are comparable to those achieved using WordNet as the sense inventory, we can be confident that our disambiguation with this other sense inventory is better, perhaps significantly so. We can also be confident that improving our disambiguation with this other sense inventory is likely to achieve even better results.

As mentioned earlier, we were unable to implement many routines because of time constraints. Many of these routines are intended to take advantage of detailed lexical information contained in NODE. As we develop these routines, we can use our existing mapping to convert our disambiguations in NODE into WordNet senses and know that any improvements in our scores will be legitimate. As indicated above, these routines will examine type restrictions (e.g., transitivity), presence of accompanying grammatical constituents (e.g., infinitive phrase or complements), form restrictions (such as number and participial), grammatical role (e.g., as a modifier), and selectional restrictions (such as subject, object, modificand, and internal arguments).

Much of this information is either not available in WordNet, available only in an unstructured way, only implicitly present, or present only inconsistently. (Delfs, 2001) describes a sense for begin that has an infinitive complement, but it is present only in an example sentence and not explicitly encoded with a verb frame commonly used in WordNet. Similarly, for train, two sentences were
"tagged to transitive senses despite being intransitive because again we were dealing with an implied direct object, and the semantics of the sense that was chosen fit; we just pretended that the object was there." In NODE, it is not uncommon for a verb sense to be labelled explicitly as having both a transitive and an intransitive realization. In implementing further disambiguation routines, it will be much more difficult to glean the appropriate criteria for sense selection in WordNet without this explicit information than to obtain it in NODE and map it into WordNet.

Having demonstrated the feasibility of mapping for WSD, we can now examine many issues in comparing entries and definitions across lexical resources. Most importantly, using SENSEVAL-2 data, it is possible to gauge changes in mapping procedures by the changes in the WSD results. We have examined several aspects of the mapping, and many more possible avenues of investigation are available. We can examine reasons for failure at both the entry and sense levels and cases where the componential analysis method gave results different from the word overlap method. We can explore the effect of using or not using stop lists, reducing inflected forms to their root forms, and the relative weighting of different methods (including edit distance, which was not used in the mapping). For example, by aligning definitions based on edit distance, it is possible to examine substitution of synonyms or alternative phraseologies.
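Two of the comparisons mentioned above, word overlap with or without a stop list and edit distance between definitions, can be sketched as follows. This is only an illustration under stated assumptions: the stop list, the tokenizer, and the function names are hypothetical, the edit distance is a plain word-level Levenshtein distance, and none of this is the mapping code used in the experiments.

```python
from typing import List, Set

# Hypothetical stop list; a real one would be much longer.
STOP_WORDS: Set[str] = {"a", "an", "and", "by", "of", "or", "the", "not", "etc"}


def tokens(definition: str) -> List[str]:
    """Lowercase a definition and split it on anything that is not a letter or digit."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in definition)
    return cleaned.split()


def overlap(def_a: str, def_b: str, use_stop_list: bool = True) -> Set[str]:
    """Words shared by two definitions, optionally ignoring stop words."""
    a, b = set(tokens(def_a)), set(tokens(def_b))
    if use_stop_list:
        a, b = a - STOP_WORDS, b - STOP_WORDS
    return a & b


def token_edit_distance(a: List[str], b: List[str]) -> int:
    """Word-level Levenshtein distance (insert, delete, substitute cost 1 each)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


# The NODE and first WordNet definitions of "graceful" quoted in section 4.3.
node_def = "having or showing grace or elegance"
wn_def = "characterized by beauty of movement, style, form etc.; not awkward"
print(overlap(node_def, wn_def))  # set(): no shared words

# Hypothetical definition pair showing a substitution plus a deletion.
print(token_edit_distance(tokens("call on the telephone"),
                          tokens("call by telephone")))  # 2
```

The graceful pair shares no words at all, which is consistent with the failed mapping reported for that entry; an alignment recovered from the edit-distance table would expose substitutions such as "on" for "by" in the second pair.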
6. Other Research

(Atkins, 1991) characterizes the difficulty in comparing lexical resources. She identifies some problems with current dictionary definitions, particularly lumping versus splitting, similar sense overlaps, ambiguity of components, and fuzzy sense boundaries. She suggests approaches to sense differentiation and identifies elements that need to be accounted for, including modulation (the way in which a sense is modified by context), regular polysemy (presence of lexical rules), and use of a hierarchy within the senses of a single word. She developed an experimental framework for implementing these notions within a single dictionary in the Hector project and data (Atkins, 1993), used in SENSEVAL-1 and serving as a precursor to the development of NODE (Hanks, 2001). The FrameNet project (Fillmore & Baker, 2001) also traces its roots in part to this work, specifically in the characterization of meaning components, implemented in frame semantics.

Our efforts at mapping between lexical resources are strongly motivated by this earlier work, specifically attempting to use characterizations of components of meaning and the placement of an entry and sense within a sense hierarchy and semantic network as the basis for mapping. While these characterizations are still at an early stage, they have now proved sufficient for use in large-scale NLP applications.

At a recent WordNet workshop on resource integration, (Daude, et al., 2001), (Green, et al., 2001), (Burgun & Bodenreider, 2001), and (Asanoma, 2001) present various investigations of mapping between lexical resources. Our algorithms are most similar to those of (Daude, et al., 2001) and (Asanoma, 2001). We investigated our mapping between WordNet 1.6 and WordNet 1.7 for one of the subdictionaries created here (call, with 46 entries). As indicated earlier, our mapping first examines definitions (glosses) and then the structural relationships. Our mappings were quite robust, much stronger than for NODE to WordNet, as might be expected since the changes between the two versions of WordNet would be small relative to the overall size. Specifically, our mapping identified five additional entries (distress call, call
attention, call into question, call the shots, and call the tune). There were 103 senses, of which 101 were mapped and 2 senses identified that were not in WordNet 1.6. Almost 90 percent of the senses mapped identically both in their glosses and in their positions in the hierarchy. The remainder were disconfirmed by our structural mapping, indicating that these senses had only been moved in their hierarchical position between WordNet 1.6 and WordNet 1.7. We expect that these results would hold in general, indicating that our mappings between the two versions (or any applications where a conversion from one version to the other would be important) would be highly reliable.

Summary

Using the gold standard sense-disambiguated dataset provided by SENSEVAL-2, we have shown how it is possible to achieve wider generality in the use of lexical resources rather than relying solely on WordNet. It is first necessary to carefully prepare the lexical resources that will be used, so that they may be mapped successfully into WordNet. Analysis of the disambiguation results after the mapping can identify the source of failures, whether they reside in the disambiguation itself or in the mapping. Such an analysis provides a deeper understanding of the disambiguation process. In addition, the mappings themselves provide a rich source of understanding about lexical semantics when examined within the context of a specific task, namely, disambiguation. We have been able to examine more closely semantic issues that have long been the source of much difficulty. Since our methods have shown many avenues for further exploration, we can expect to characterize these issues even better.

Acknowledgements

I wish to thank Oxford University Press for making NODE available (Patrick Hanks and Rob Scriven) and for many useful discussions (Glynnis Chantrell and Judy Pearsall).

References

Asanoma, N. (2001, June 3-4). Alignment of Ontologies: WordNet and Goi-Taikei. In WordNet and Other Lexical Resources: Applications, Extensions and Customizations. NAACL 2001 SIGLEX Workshop. Pittsburgh, PA: Association for Computational Linguistics.

Atkins, B. T. S. (1991). Building a lexicon: The contribution of lexicography. International Journal of Lexicography, 4(3), 167-204.

Atkins, S. (1993). Tools for computer-aided lexicography: The Hector project. In Papers in Computational Lexicography. COMPLEX '93. Budapest.

Burgun, A., & Bodenreider, O. (2001, June 3-4). Comparing terms, concepts and semantic classes in WordNet and the Unified Medical Language System. In WordNet and Other Lexical Resources: Applications, Extensions and Customizations. NAACL 2001 SIGLEX Workshop. Pittsburgh, PA: Association for Computational Linguistics.

Daude, J., Padro, L., & Rigau, G. (2001, June 3-4). A Complete Wn1.5 to Wn1.6 Mapping. In WordNet and Other Lexical Resources: Applications, Extensions and Customizations. NAACL 2001 SIGLEX Workshop. Pittsburgh, PA: Association for Computational Linguistics.

Delfs, L. (2001, 6 Sep). Verb keys.

Fellbaum, C. (1998). WordNet: An electronic lexical database. Cambridge, Massachusetts: MIT Press.

Fillmore, C. J., & Baker, C. F. (2001, June 3-4). Frame Semantics for Text Understanding. In WordNet and Other Lexical Resources: Applications, Extensions and Customizations. NAACL 2001 SIGLEX Workshop. Pittsburgh, PA: Association for Computational Linguistics.

Green, R., Pearl, L., Dorr, B. J., & Resnik, P. (2001, June 3-4). Lexical Resource Integration across the Syntax-Semantics Interface. In WordNet and Other Lexical Resources: Applications, Extensions and Customizations. NAACL 2001 SIGLEX Workshop. Pittsburgh, PA: Association for Computational Linguistics.

Hanks, P. (2000). Do Word Meanings Exist? Computers and the Humanities, 34(1-2), 205-15.

Hanks, P. (2001, May 29). Word Sense Disambiguation.

Ide, N., & Veronis, J. (1993). Extracting knowledge bases from machine-readable dictionaries: Have we wasted our time? KB&KS 93. Tokyo.

Kilgarriff, A. (1997). "I Don't Believe in Word Senses." Computers and the Humanities, 31(2), 91-113.

Kilgarriff, A. (2001, July). English Lexical Sample Task Description. In SENSEVAL-2. Association for Computational Linguistics SIGLEX Workshop. Toulouse, France.

Kilgarriff, A., & Rosenzweig, J. (2000). Framework and Results for English SENSEVAL. Computers and the Humanities, 34(1-2), 15-48.

Litkowski, K. C. (2000). SENSEVAL: The CL Research Experience. Computers and the Humanities, 34(1-2), 153-158.

Litkowski, K. C. (2001). Use of Machine Readable Dictionaries for Word-Sense in SENSEVAL-2. In SENSEVAL-2. Association for Computational Linguistics SIGLEX Workshop. Toulouse, France.

Litkowski, K. C. (1999, 21-22 June). Towards a Meaning-Full Comparison of Lexical Resources. Association for Computational Linguistics Special Interest Group on the Lexicon Workshop. College Park, MD.

The Macquarie Dictionary (A. Delbridge, J. R. L. Bernard, D. Blair, S. Butler, P. Peters, & C. Yallop, Eds.) (3rd ed.). (1997). Australia: The Macquarie Library Pty Ltd.

Miller, G. A. (2001, June). Toward WordNet 2. In WordNet and Other Lexical Resources: Applications, Extensions and Customizations. NAACL 2001 Workshop. Pittsburgh, PA.

The New Oxford Dictionary of English (J. Pearsall, Ed.). (1998). Oxford: Clarendon Press.

Richardson, S. D. (1997). Determining similarity and inferring relations in a lexical knowledge base [Diss], New York, NY: The City University of New York.

Sierra, G., & McNaught, J. (2000). Extracting Semantic Clusters from MRDs for an Onomasiological Search Dictionary. International Journal of Lexicography, 13(4), 264-86.

Tardif, R. (2000, Oct. 20). Mapped databases.

Williams, J. (2001, June 29). The Hector data.