The Evolution Of Machine-Tractable Dictionaries

Authors

  • Cheng-ming Guo
  • Changning Huang
  • Junping Gong
  • Jin Li
Abstract

1. Introduction

The usefulness of Machine-Tractable Dictionaries (MTDs) in automatic sense tagging of running text is clearly demonstrated in Tong, et al. (1993). While previous work (Yarowsky, 1992; Gale, et al., 1992, 1993) relies heavily on the role of statistics, Tong's system makes use of one MTD as well as two Machine Readable Dictionaries (MRDs) in an example-based reasoning process in tagging sense numbers to novel words, compound words, and phrases in the input text. The hit rate of correct sense tagging runs as high as 90%. The lowest hit rate ever recorded was 70%. The system is considered a necessary mechanism for the construction of annotated Chinese Monitor Corpora (Sinclair, 1991) from running Chinese text. Earlier work (Wilks, et al., 1990) identified a distinction between MRDs and MTDs. This paper recognizes a further distinction between two types of MTDs, i.e., Type I and Type II MTDs. Type I MTDs are built on the definition primitives of one particular MRD, whereas Type II MTDs are built on the semantic primitives of one particular natural language. The MTD functioning in Tong's system belongs to the Type I category. The evolution from Type I to Type II MTDs represents an evolution of the primitives the MTDs are constructed on. The authors believe that, although it is feasible to derive a natural set of definition primitives from one particular MRD, a descriptive set of semantic primitives (Wilks, 1977) of one particular natural language is preferably derived from more than one natural source such as a dictionary.

This position represents a backoff from the previous one that claims the derivation of a natural set of semantic primitives from one particular MRD, particularly the Longman Dictionary of Contemporary English (LDOCE) (Guo, 1989). Three Type I MTDs completed at Tsinghua University will be presented. These include MINI-LDOCE, constructed for Longman Dictionaries; the MTD version of The Modern Chinese Dictionary of General Chinese Characters (MTD-XIANTONG) (Fu, 1987); and the MTD version of The Multi-functional Dictionary of Modern Chinese Words (MTD-DUOGONGNENG) (Feng & Zhou, 19??). Efforts to derive natural sets of semantic primitives from Type I MTDs are then described. These include work on detecting circular definitions within one particular MTD, work on computing the compatibility of word senses between two monolingual dictionaries, and the most recent step at deriving a natural set of Chinese semantic primitives from Type I MTDs of Chinese. The work was done on Sun-4 workstations with the C computer language.
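
As an illustration only, the idea of a circular definition mentioned in the abstract can be sketched in Python. The sketch assumes a toy dictionary in which each headword maps to the set of words used in its definition. The entries, the data layout, and the function name `find_circular_definitions` are hypothetical; they are not taken from the paper or from any of the MTDs it describes, and the depth-first search shown here is a generic technique, not necessarily the authors' method.

```python
from typing import Dict, List, Set


def find_circular_definitions(definitions: Dict[str, Set[str]]) -> List[List[str]]:
    """Return definition cycles found by depth-first search.

    `definitions` maps a headword to the set of words used to define it
    (a hypothetical, simplified view of a dictionary). One cycle is
    reported for every back edge the search encounters.
    """
    cycles: List[List[str]] = []
    finished: Set[str] = set()  # headwords whose definitions are fully explored

    def visit(word: str, path: List[str]) -> None:
        if word in path:
            # The word is defined, directly or indirectly, in terms of itself.
            cycles.append(path[path.index(word):])
            return
        if word in finished or word not in definitions:
            return
        path.append(word)
        for used in sorted(definitions[word]):
            visit(used, path)
        path.pop()
        finished.add(word)

    for headword in sorted(definitions):
        visit(headword, [])
    return cycles


if __name__ == "__main__":
    # Toy entries, invented for this example.
    toy = {
        "big": {"large"},
        "large": {"big"},
        "huge": {"very", "big"},
        "very": set(),
    }
    print(find_circular_definitions(toy))  # [['big', 'large']]
```

In this toy case "big" and "large" define each other, so neither can serve as a primitive without breaking the loop; words such as "very", which are used in definitions but lead to no cycle, are left alone.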

Similar resources

Machine tractable dictionaries as tools and resources for NL processing

This paper discusses three different but related large-scale computational methods for the transformation of machine readable dictionaries (MRDs) into machine tractable dictionaries, i.e., MRDs converted into a format usable for natural language processing tasks. The MRD used is The Longman Dictionary of Contemporary English.

Building a Semantic-Primitive-Based Lexical Consultation System

The paper describes the design of a semantic-primitive-based lexical consultation system and the possible processes which will be performed on a machine-readable dictionary (MRD) and corpus to produce a machine-tractable dictionary (MTD) and tractable corpus automatically. Linguistic tools and resources are created during or after the processes.

A Historical Lexical Database of Swedish. The O.S.A Project

Large historical dictionaries have sometimes been called information graves because of the difficulty to perform systematic searches in the material. Recently, there have been efforts to make these dictionaries machine tractable. The O.S.A project is carrying out the computerization of the largest historical dictionary of Swedish, Svenska Akademiens ordbok (SAOB). This paper describes the main ...

Detection of inconsistencies in concept classifications in a large dictionary: Toward an improvement of the EDR electronic dictionary (LREC 2006 Proceedings)

The EDR electronic dictionary is a machine-tractable dictionary developed for advanced computer-based processing of natural language. This dictionary comprises eleven sub-dictionaries, including a concept dictionary, word dictionaries, bilingual dictionaries, co-occurrence dictionaries, and a technical terminology dictionary. In this study, we focus on the concept dictionary and aim to revise t...

Example-Based Sense Tagging of Running Chinese Text

This paper describes a sense tagging technique for the automatic sense tagging of running Chinese text. The system takes as input running Chinese text, and outputs sense-disambiguated text. Whereas previous work (Yarowsky, 1992; Gale, et al., 1992, 1993) relies heavily on the role of statistics, the present system makes use of Machine Readable/Tractable Dictionaries (Wilks, et al., 1990; Guo,...

Publication date: 1994