How Much Processing Is Required for Cross-Document Coreference?

Authors

  • Amit Bagga
  • Breck Baldwin
Abstract

Cross-document coreference occurs when the same person, place, event, or concept is discussed in more than one text source. Computer recognition of this phenomenon is important because it helps break "the document boundary" by allowing a user to examine information about a particular entity from multiple text sources at the same time. Cross-document coreference was identified as one of the potential tasks for the Sixth Message Understanding Conference (MUC-6) but was not included as a formal task because it was considered too ambitious (Grishman, 94). In this paper we describe and compare our position with that of the MUC-6 organizing committee regarding the amount of processing needed to resolve cross-document coreferences. We back our position by providing details of our cross-document coreference system.

Similar resources

Person Cross Document Coreference with Name Perplexity Estimates

Person Cross Document Coreference systems depend on context for making decisions on the possible coreferences between person name mentions. The amount of context required is a parameter that varies from corpus to corpus, which makes it difficult for usual disambiguation methods. In this paper we show that the amount of context required can be dynamically controlled on the basis of the...

Name Perplexity

The accuracy of a Cross Document Coreference system depends on the amount of context available, which is a parameter that varies greatly from corpus to corpus. This paper presents a statistical model for computing name perplexity classes. For each perplexity class, the prior probability of coreference is estimated. The amount of context required for coreference is controlled by the prior core...

Corpus based coreference resolution for Farsi text

"Coreference resolution", or finding all expressions in a text that refer to the same entity, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...

Enhancing Cross Document Coreference of Web Documents with Context Similarity and Very Large Scale Text Categorization

Cross Document Coreference (CDC) is the task of constructing the coreference chain for mentions of a person across a set of documents. This work offers a holistic view of using document-level categories, sub-document level context and extracted entities and relations for the CDC task. We train a categorization component with an efficient flat algorithm using thousands of ODP categories and over...

A Methodology for Cross-Document Coreference

Amit Bagga (General Electric CRD, PO Box 8, Schenectady, NY 12301) and Alan W. Biermann (Dept. of Computer Science, Duke University, Durham, NC 27708). Cross-document coreference occurs when the same person, place, event, or concept is discussed in more than one text source. Computer recognition of this phenomenon is important because it helps break "the document boundary" ...

Publication date: 1995