Centrality Measures for Non-Contextual Graph-Based Unsupervised Single Document Keyword Extraction

نویسنده

Natalie Schluter

چکیده

The manner in which keywords fulfill the role of being central to a document is frustratingly still an open question. In this paper, we hope to shed some light on the essence of keywords in scientific articles and thereby motivate the graph-based approach to keyword extraction. We identify the document model captured by the text graph generated as input to a number of centrality metrics, and overview what these metrics say about keywords. In doing so, we achieve state-of-the-art results in unsupervised non-contextual single document keyword extraction.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Effects of Graph Generation for Unsupervised Non-Contextual Single Document Keyword Extraction

Abstract. This paper presents an exhaustive study on the generation of graph input to unsupervised graph-based non-contextual single document keyword extraction systems. A concrete hypothesis on concept coordination for documents that are scientific articles is put forward, consistent with two separate graph models : one which is based on word adjacency in the linear text–an approach forming th...

متن کامل

Keyword and Keyphrase Extraction Using Centrality Measures on Collocation Networks

Keyword and keyphrase extraction is an important problem in natural language processing, with applications ranging from summarization to semantic search to document clustering. Graph-based approaches to keyword and keyphrase extraction avoid the problem of acquiring a large in-domain training corpus by applying variants of PageRank algorithm on a network of words. Although graph-based approache...

متن کامل

A Graph Degeneracy-based Approach to Keyword Extraction

We operate a change of paradigm and hypothesize that keywords are more likely to be found among influential nodes of a graph-ofwords rather than among its nodes high on eigenvector-related centrality measures. To test this hypothesis, we introduce unsupervised techniques that capitalize on graph degeneracy. Our methods strongly and significantly outperform all baselines on two datasets (short a...

متن کامل

Keyword Extraction from a Single Document Using Centrality Measures

Keywords characterize the topics discussed in a document. Extracting a small set of keywords from a single document is an important problem in text mining. We propose a hybrid structural and statistical approach to extract keywords. We represent the given document as an undirected graph, whose vertices are words in the document and the edges are labeled with a dissimilarity measure between two ...

متن کامل

Graph-Based Keyword Extraction for Single-Document Summarization

In this paper, we introduce and compare between two novel approaches, supervised and unsupervised, for identifying the keywords to be used in extractive summarization of text documents. Both our approaches are based on the graph-based syntactic representation of text and web documents, which enhances the traditional vector-space model by taking into account some structural document features. In...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2014

Centrality Measures for Non-Contextual Graph-Based Unsupervised Single Document Keyword Extraction

نویسنده

چکیده

منابع مشابه

Effects of Graph Generation for Unsupervised Non-Contextual Single Document Keyword Extraction

Keyword and Keyphrase Extraction Using Centrality Measures on Collocation Networks

A Graph Degeneracy-based Approach to Keyword Extraction

Keyword Extraction from a Single Document Using Centrality Measures

Graph-Based Keyword Extraction for Single-Document Summarization

عنوان ژورنال:

اشتراک گذاری