The Elements of Automatic Summarization

Authors

  • Daniel Jacob Gillick
  • Thomas Griffiths
Abstract

The Elements of Automatic Summarization, by Daniel Jacob Gillick. Doctor of Philosophy in Computer Science, University of California, Berkeley. Professor Nelson Morgan, Chair.

This thesis is about automatic summarization, with experimental results on multi-document news topics: how to choose a series of sentences that best represents a collection of articles about one topic. I describe prior work and my own improvements on each component of a summarization system, including preprocessing, sentence valuation, sentence selection and compression, sentence ordering, and evaluation of summaries. The centerpiece of this work is an objective function for summarization that I call "maximum coverage". The intuition is that a good summary covers as many important facts or concepts in the original documents as possible. It turns out that this objective, while computationally intractable in general, can be solved efficiently for medium-sized problems and has reasonably good fast approximate solutions. Most importantly, the use of an objective function marks a departure from previous algorithmic approaches to summarization.
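The coverage intuition in the abstract can be illustrated with a small greedy heuristic. This is only a sketch under stated assumptions, not the thesis's own solver: the function name `greedy_max_coverage`, the representation of each sentence as a `(length, concepts)` pair, and the per-concept `weights` dictionary are all hypothetical choices made for illustration. The heuristic repeatedly picks the sentence that adds the most not-yet-covered concept weight per unit of length until the length budget is used up, which is one common way to approximate a coverage objective quickly.

```python
def greedy_max_coverage(sentences, weights, budget):
    """Greedy approximation to a weighted coverage objective.

    sentences: list of (length, set_of_concepts) pairs, with length > 0
               (hypothetical representation, assumed for this sketch)
    weights:   dict mapping each concept to its importance weight
    budget:    maximum total length of the selected sentences
    Returns (indices of chosen sentences, total weight of covered concepts).
    """
    chosen, covered, used = [], set(), 0
    while True:
        best, best_gain = None, 0.0
        for i, (length, concepts) in enumerate(sentences):
            # Skip sentences already selected or too long for the remaining budget.
            if i in chosen or used + length > budget:
                continue
            # Marginal gain: weight of newly covered concepts per unit of length.
            gain = sum(weights.get(c, 0.0) for c in concepts - covered) / length
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            # No remaining sentence fits the budget and adds new coverage.
            break
        chosen.append(best)
        covered |= sentences[best][1]
        used += sentences[best][0]
    return chosen, sum(weights.get(c, 0.0) for c in covered)
```

Each concept counts once no matter how many selected sentences contain it, so redundant sentences earn no extra credit. A greedy pass like this does not guarantee an optimal selection; it only shows the shape of the objective.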


Similar resources

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to decipher through such massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...


Systematic literature review of fuzzy logic based text summarization

"Information Overload" is not a new term but with the massive development in technology which enables anytime, anywhere, easy and unlimited access; participation & publishing of information has consequently escalated its impact. Assisting users' informational searches with reduced reading surfing time by extracting and evaluating accurate, authentic & relevant information are the primary c...


Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization

Given the increasing number of documents, sites, online sources, and the users' desire to quickly access information, automatic textual summarization has caught the attention of many researchers in this field. Researchers have presented different methods for text summarization as well as a useful summary of those texts including relevant document sentences. This study select...


BEwT-E: Basic Elements with Transformations for Evaluation

This paper describes BEwT-E (Basic Elements with Transformations for Evaluation), an automatic system for evaluating text summarization or machine translation tasks. BEwT-E is a new, more sophisticated implementation of the BE framework that uses transformations to match short, syntactically well-defined units called Basic Elements (BEs) that are lexically different yet semantically sim...


Automatic Text Summarization Using a Machine Learning Approach

In this paper we address the automatic summarization task. Recent research works on extractive-summary generation employ some heuristics, but few works indicate how to select the relevant features. We will present a summarization procedure based on the application of trainable Machine Learning algorithms which employs a set of features extracted directly from the original text. These features a...



Publication date: 2011