Event Recognition from News Webpages through Latent Ingredients Extraction

Authors

  • Rui Yan
  • Yu Li
  • Yan Zhang
  • Xiaoming Li
Abstract

We investigate the novel problem of event recognition from news webpages. “Events” are basic text units containing news elements. We observe that a news article always consists of more than one event; these events are the Latent Ingredients (LIs) that form the whole document, and event recognition aims to extract them. Researchers have tackled related problems before, such as discourse analysis and text segmentation, but with different goals and methods. The challenge is to detect event boundaries accurately from plain contexts, where the boundary decision is affected by multiple features. Event recognition can benefit topic detection by offering finer granularity and better accuracy. In this paper, we present two novel event recognition models based on LI extraction and exploit a set of useful features: context similarity, distance restriction, entity influence from a thesaurus, and temporal proximity. We conduct thorough experiments on two real datasets, and the promising results indicate the effectiveness of these approaches.
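
To make the idea of boundary detection concrete, the following is a minimal sketch that uses only one of the features named in the abstract, context similarity. It is not the authors' model: the abstract does not describe how the two models work, so the function names, the stopword list, the bag-of-words cosine measure, and the threshold of 0.1 are all assumptions chosen for illustration. Distance restriction, entity influence, and temporal proximity are left out.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration only.
STOPWORDS = {"the", "a", "an", "at", "of", "by", "on", "after", "were", "was", "in", "to"}


def bag_of_words(sentence):
    """Lowercased word counts for one sentence, without stopwords."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_boundaries(sentences, threshold=0.1):
    """Return each index i where a boundary is placed before sentences[i].

    A boundary is assumed wherever adjacent sentences share too little
    vocabulary (cosine similarity below the threshold).
    """
    vectors = [bag_of_words(s) for s in sentences]
    return [
        i
        for i in range(1, len(vectors))
        if cosine(vectors[i - 1], vectors[i]) < threshold
    ]


def split_events(sentences, threshold=0.1):
    """Group sentences into contiguous segments at the detected boundaries."""
    segments, start = [], 0
    for boundary in find_boundaries(sentences, threshold):
        segments.append(sentences[start:boundary])
        start = boundary
    segments.append(sentences[start:])
    return segments
```

Calling `split_events` on a list of sentences returns a list of sentence groups, one per candidate event. A single-feature threshold like this is fragile; combining several features, as the abstract proposes, is presumably what makes the boundary decision more reliable.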

Similar resources

Event Extraction using Ontology Directed Semantic Grammar

The task of extracting and constructing a knowledge base from news is still a subject of ongoing research. The resulting knowledge base is useful for many applications, such as question answering systems. Football news always attracts enormous interest from football fan clubs. Hence, there is a need to extract certain information from this news in a timely fashion. This paper proposes a new appro...

A Two-Stage Extraction Method for Events and Their Relationship in Injurious Incidents Monitoring

The monitoring of injurious incidents such as violence, natural disasters and infectious disease outbreaks plays an important role in national security and social stability. One feasible solution to the task is to gather information from the latest news webpages. This paper presents a two-stage method capable of accurately and efficiently extracting injurious incidents information and the r...

Mining Event Temporal Boundaries from News Corpora through Evolution Phase Discovery

Currently, a flood of news spreads throughout the web. The techniques of Event Detection and Tracking make it feasible to gather and structure text information into events, which are constructed online automatically and updated temporally. Users are usually eager to browse the whole event evolution. With the huge quantity of documents, it is almost impossible for users to read all of them. In this pa...

News Thread Extraction Based on Topical N-Gram Model with a Background Distribution

Automatic thread extraction for news events can help people understand different aspects of a news event. In this paper, we present an extraction method that uses a topical N-gram model with a background distribution (TNB). Unlike most topic models, such as Latent Dirichlet Allocation (LDA), which rely on the bag-of-words assumption, our model treats words in their textual order. Each news report is ...

Enhancing Event Descriptions through Twitter Mining

We describe a simple IR approach for linking news about events, detected by an event extraction system, to messages from Twitter (tweets). In particular, we explore several methods for creating event-specific queries for Twitter and provide a quantitative and qualitative evaluation of the relevance and usefulness of the information obtained from the tweets. We showed that methods based on utili...

Publication date: 2010