lexical segmentation

Search results for: lexical segmentation

Number of results: 95920

Cross-genre Feature Comparisons for Spoken Sentence Segmentation

2007

SEBASTIEN CUENDET DILEK HAKKANI-TUR JAMES FUNG BENOIT FAVRE ELIZABETH SHRIBERG

Automatic sentence segmentation of spoken language is an important precursor to downstream natural language processing. Previous studies combine lexical and prosodic features, but can impose significant computational challenges because of the large size of feature sets. Little is understood about which features most benefit performance, particularly for speech data from different speaking...

Segmenting Broadcast News Streams using Lexical Chains

2002

Nicola Stokes Joe Carthy Alan F. Smeaton

In this paper we propose a coarse-grained NLP approach to text segmentation based on the analysis of lexical cohesion within text. Most work in this area has focused on the discovery of textual units that discuss subtopic structure within documents. In contrast, our segmentation task requires the discovery of topical units of text, i.e. distinct news stories from broadcast news programmes. Our sy...

Unsupervised Topic Segmentation Based on Word Co-occurrence and Multi-Word Units for Text Summarization

2005

Gaël Dias Elsa Alves

Topic Segmentation is the task of breaking documents into topically coherent multi-paragraph subparts. In particular, Topic Segmentation is extensively used in Passage Retrieval and Text Summarization to provide more coherent results by taking into account raw document structure. However, most methodologies are based on lexical repetition that shows evident reliability problems or rely on harves...

Semantic Feature Structure Extraction from Documents Based on Extended Lexical Chains

2017

Terry Ruas William Grosky

The meaning of a sentence in a document is more easily determined if its constituent words exhibit cohesion with respect to their individual semantics. This paper explores the degree of cohesion among a document's words using lexical chains as a semantic representation of its meaning. Using a combination of diverse types of lexical chains, we develop a text document representation that can be u...

Minimally Supervised Learning of Semantic Knowledge from Query Logs

2008

Mamoru Komachi Hisami Suzuki

We propose a method for learning semantic categories of words with minimal supervision from web search query logs. Our method is based on the Espresso algorithm (Pantel and Pennacchiotti, 2006) for extracting binary lexical relations, but makes important modifications to handle query log data for the task of acquiring semantic categories. We present experimental results comparing our method wit...

Do statistical segmentation abilities predict lexical-phonological and lexical-semantic abilities in children with and without SLI?

Journal: Journal of Child Language 2013

Morphological pre-processing for Turkish to English statistical machine translation

2009

Arianna Bisazza Marcello Federico

We tried to cope with the complex morphology of Turkish by applying different schemes of morphological word segmentation to the training and test data of a phrase-based statistical machine translation system. These techniques allow for a considerable reduction of the training dictionary, and lower the out-of-vocabulary rate of the test set. By minimizing differences between lexical granularitie...
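
The out-of-vocabulary effect mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the segmentation table `TOY_SEGMENTATIONS`, the helper names, and the tiny word lists are all hypothetical, and a real setup would use a morphological analyzer rather than a hand-written lookup.

```python
# Hypothetical, hand-written segmentations of a few Turkish word forms.
# A real system would obtain these from a morphological analyzer.
TOY_SEGMENTATIONS = {
    "evler": ["ev", "+ler"],
    "evde": ["ev", "+de"],
    "evlerde": ["ev", "+ler", "+de"],
    "kitaplar": ["kitap", "+lar"],
}


def segment(word):
    """Return the toy segmentation of a word, or the word itself if unknown."""
    return TOY_SEGMENTATIONS.get(word, [word])


def tokenize(words, use_segmentation):
    """Flatten a word list into tokens, optionally splitting into morphemes."""
    if not use_segmentation:
        return list(words)
    return [morpheme for word in words for morpheme in segment(word)]


def oov_rate(train_words, test_words, use_segmentation):
    """Fraction of test tokens that never occur among the training tokens."""
    vocab = set(tokenize(train_words, use_segmentation))
    test_tokens = tokenize(test_words, use_segmentation)
    if not test_tokens:
        return 0.0
    unseen = sum(1 for token in test_tokens if token not in vocab)
    return unseen / len(test_tokens)


train_set = ["evler", "evde", "kitaplar"]
test_set = ["evlerde", "kitap"]
print(oov_rate(train_set, test_set, use_segmentation=False))
print(oov_rate(train_set, test_set, use_segmentation=True))
```

In this toy data both unsegmented test words are unseen, while every morpheme of the segmented test words already occurs in the segmented training data.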

Comparative Discourse Analysis of Parallel Texts

Journal: CoRR 1994

Pim van der Eijk

A quantitative representation of discourse structure can be computed by measuring lexical cohesion relations among adjacent text elements. These representations have previously been proposed to deal with sub-topic text segmentation. In a parallel corpus, similar representations can be derived for versions of a text in various languages. These can be used for parallel segmentation and as an alte...
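
As a rough illustration of measuring lexical cohesion among adjacent text elements, the Python sketch below scores each pair of neighbouring units by the cosine similarity of their word counts. The choice of cosine similarity, the tokenizer, and the function names are assumptions made for this example; the abstract does not specify the cohesion measure used in the paper.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cohesion_profile(units):
    """Similarity of each text unit to the next one; low values hint at a boundary."""
    bags = [bag_of_words(u) for u in units]
    return [cosine(bags[i], bags[i + 1]) for i in range(len(bags) - 1)]


units = ["the cat sat", "the cat ran", "stocks fell sharply"]
print(cohesion_profile(units))
```

A dip in the resulting profile, such as the drop to zero between the second and third units here, marks a candidate segment boundary. For a parallel corpus, one such profile could be computed per language version and the profiles compared.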

Simultaneous Dialogue Act Segmentation and Labelling using Lexical and Syntactic Features

2009

Ramón Granell Stephen G. Pulman Carlos D. Martínez-Hinarejos

Segmentation of utterances and annotation as dialogue acts can be helpful for several modules of dialogue systems. In this work, we study a statistical machine learning model to perform these tasks simultaneously using lexical features and incorporating deterministic syntactic restrictions. There is a slight improvement in both segmentation and labelling due to these restrictions.

Unsupervised and Semi-supervised Myanmar Word Segmentation Approaches for Statistical Machine Translation

2013

Ye Kyaw Thu Andrew Finch Eiichiro Sumita Yoshinori Sagisaka

In statistical machine translation (SMT), word segmentation is generally a necessary step for languages that do not naturally delimit words. For many low-resource languages there are no word segmentation tools, and research on word segmentation for these languages is often quite scarce. In this paper, we study several plausible methods for Myanmar word segmentation for machine translation in or...
