Assigning Function Tags to Parsed Text

Authors

  • Don Blaheta
  • Eugene Charniak
Abstract

It is generally recognized that the common nonterminal labels for syntactic constituents (NP, VP, etc.) do not exhaust the syntactic and semantic information one would like about parts of a syntactic tree. For example, the Penn Treebank gives each constituent zero or more ‘function tags’ indicating semantic roles and other related information not easily encapsulated in the simple constituent labels. We present a statistical algorithm for assigning these function tags that, on text already parsed to a simple-label level, achieves an F-measure of 87%, which rises to 99% when considering ‘no tag’ as a valid choice.
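The gap between the two F-measure figures comes from how untagged constituents are scored. The sketch below is not the authors' evaluation code. It is a minimal illustration with invented toy data, under the simplifying assumption that each constituent carries at most one function tag (the Penn Treebank allows several). The names `NO_TAG` and `f_measure` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

NO_TAG = None  # hypothetical marker for "this constituent has no function tag"


def f_measure(
    gold: Sequence[Optional[str]],
    pred: Sequence[Optional[str]],
    count_no_tag: bool = False,
) -> Tuple[float, float, float]:
    """Return (precision, recall, F) for one-tag-per-constituent labeling.

    If count_no_tag is False, only non-null tags enter the counts.
    If count_no_tag is True, 'no tag' is scored like any other label,
    so every constituent contributes to precision and recall.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    correct = predicted = expected = 0
    for g, p in zip(gold, pred):
        if count_no_tag or p is not NO_TAG:
            predicted += 1  # the system proposed a scorable label here
        if count_no_tag or g is not NO_TAG:
            expected += 1   # the gold standard has a scorable label here
        if g == p and (count_no_tag or g is not NO_TAG):
            correct += 1

    precision = correct / predicted if predicted else 0.0
    recall = correct / expected if expected else 0.0
    total = precision + recall
    f = 2 * precision * recall / total if total else 0.0
    return precision, recall, f


# Invented toy data: most constituents carry no function tag.
gold = ["SBJ", None, "TMP", None, None, "LOC", None, None]
pred = ["SBJ", None, "LOC", None, None, None, None, "TMP"]

print(f_measure(gold, pred))                     # null tags ignored
print(f_measure(gold, pred, count_no_tag=True))  # 'no tag' counted as a label
```

Because most constituents are untagged and a system tends to get those right, counting ‘no tag’ as a valid choice raises the score, which is the direction of the difference the abstract reports.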

Similar resources

An Integrated Tool for Annotating Historical Corpora

E-Dictor is a tool for encoding, applying levels of editions, and assigning part-of-speech tags to ancient texts. In short, it works as a WYSIWYG interface to encode text in XML format. It comes from the experience during the building of the Tycho Brahe Parsed Corpus of Historical Portuguese and from consortium activities with other research groups. Preliminary results show a decrease of at leas...

Chinese Function Tag Labeling

Function tag assignment has been studied for English and Spanish. In this paper, we address the question of assigning function tags to parsed sentences in Chinese. We show that good performance for Chinese function tagging can be achieved by using a labeling method, extending the work of Blaheta (2004). In this method, the objects being modeled are syntax trees which require some mechanism to con...

A Comparative Study of the Effect of Part-of-Speech Tagging on Parsing in Automatic Processing of the Persian Language

In this paper, the role of Part-of-Speech (POS) tagging for parsing in automatic processing of the Persian language is studied. To this end, the impact of the quality of POS tagging as well as the impact of the quantity of information available in the POS tags on parsing are studied. To reach the goals, three parsing scenarios are proposed and compared. In the first scenario, the parser assigns...

Studying impressive parameters on the performance of Persian probabilistic context free grammar parser

In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...

Comparative Study of Naïve Bayesian Classifier and Transformation-Based Learning for Myanmar Function Tagging

This paper describes the use of two machine learning techniques, Naive Bayesian classifier (NB) and transformation-based learning (TBL), to address the task of assigning function tags to Myanmar sentences. Function tagging is a process of assigning syntactic categories like subject, object, time and location to each word in the text document. It is an important step in Natural Language Processi...

Publication date: 2000