HMM Content Model for TAC2010 Summarization Challenge

Authors

  • Darla Magdalena Shockley
  • Michael Strube
Abstract

We present the HITS submission for the 2010 TAC Guided Summarization Task. We focus on the main multi-document summarization task rather than the update task. We implement a baseline extractive summarization system from the literature (Barzilay and Lee, 2004), which uses a Hidden Markov Model to assign content (topic) labels to sentences, predicts which topics are most likely to appear in the summary, and constructs the summaries from these topics. We find that this model performs worse than the results reported in previous work would suggest. These differences may be attributed to the changes we made to the algorithm to accommodate the multi-document summarization task and to the lack of human-annotated domains for the training data.
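The pipeline named in the abstract (label sentences with HMM topics, then build the summary from topics likely to appear in summaries) can be illustrated with a minimal sketch. This is not the authors' implementation: the functions `viterbi` and `select_sentences`, the two-topic setup, and every probability below are hypothetical toy values chosen only to show the decoding and selection steps.

```python
import math

def viterbi(emission_logprobs, transition_logprobs, start_logprobs):
    """Return the most likely topic sequence for a list of sentences.

    emission_logprobs[t][s]:   log P(sentence t | topic s)
    transition_logprobs[p][s]: log P(topic s | previous topic p)
    start_logprobs[s]:         log P(first topic is s)
    """
    n_states = len(start_logprobs)
    scores = [start_logprobs[s] + emission_logprobs[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(emission_logprobs)):
        new_scores, pointers = [], []
        for s in range(n_states):
            # Best previous topic for reaching topic s at position t.
            best_prev = max(range(n_states),
                            key=lambda p: scores[p] + transition_logprobs[p][s])
            pointers.append(best_prev)
            new_scores.append(scores[best_prev]
                              + transition_logprobs[best_prev][s]
                              + emission_logprobs[t][s])
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path backwards from the best final topic.
    state = max(range(n_states), key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]

def select_sentences(sentences, topics, topic_summary_prob, max_sentences):
    """Keep sentences whose topic is most likely to appear in a summary,
    returning them in their original document order."""
    ranked = sorted(range(len(sentences)),
                    key=lambda i: -topic_summary_prob[topics[i]])
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]

# Toy data (assumed values, not taken from the paper).
log = math.log
sentences = ["s0", "s1", "s2", "s3"]
emissions = [[log(0.9), log(0.1)],
             [log(0.8), log(0.2)],
             [log(0.2), log(0.8)],
             [log(0.1), log(0.9)]]
transitions = [[log(0.7), log(0.3)],
               [log(0.3), log(0.7)]]
start = [log(0.5), log(0.5)]
topic_summary_prob = [0.2, 0.9]  # assumed P(topic appears in summary)

topics = viterbi(emissions, transitions, start)
summary = select_sentences(sentences, topics, topic_summary_prob, max_sentences=2)
```

In a real system the emission and transition probabilities would be estimated from training documents, and the per-topic summary probabilities from document/summary pairs; here they are fixed by hand to keep the sketch short.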


Similar resources

TMSP: Topic Guided Manifold Ranking with Sink Points for Guided Summarization

Guided summarization is an extension of query-focused multidocument summarization. We proposed a novel ranking algorithm, Topic Guided Manifold Ranking with Sink Points (TMSP) for guided summarization tasks of TAC2010. TMSP is a topic extended version of Manifold Ranking with Sink Points (MRSP), which handles the Update Summarization tasks of TAC2009 well. We adopt the TMSP and MRSP methods to ...


CLASSY Query-Based Multi-Document Summarization

Our summarizer is based on an HMM (Hidden Markov Model) for sentence selection within a document and a pivoted QR algorithm to generate a multi-document summary. Each year, since we began participating in DUC in 2001, we have modified the features used by the HMM and have added linguistic capabilities in order to improve the summaries we generate. Our system, called “CLASSY” (Clustering, Lingui...


Automatic Segmentation and Summarization of Spoken Lectures

The ever-increasing number of online lectures has created an unprecedented opportunity for distance learning. Most online lectures are presented as unstructured text, audio and/or video files which make it difficult for students to locate relevant lectures and browse through them. In this thesis, we investigated several automatic lecture segmentation and summarization algorithms. Automatic lectur...


Extractive Chinese Spoken Document Summarization Using Probabilistic Ranking Models

The purpose of extractive summarization is to automatically select indicative sentences, passages, or paragraphs from an original document according to a certain target summarization ratio, and then sequence them to form a concise summary. In this paper, in contrast to conventional approaches, our objective is to deal with the extractive summarization problem under a probabilistic modeling fram...


Learning to Model Domain-Specific Utterance Sequences for Extractive Summarization of Contact Center Dialogues

This paper proposes a novel extractive summarization method for contact center dialogues. We use a particular type of hidden Markov model (HMM) called Class Speaker HMM (CSHMM), which processes operator/caller utterance sequences of multiple domains simultaneously to model domain-specific utterance sequences and common (domainwide) sequences at the same time. We applied the CSHMM to call summar...




Publication date: 2010