Search results for: lexical segmentation

Number of results: 95920

2007
SEBASTIEN CUENDET DILEK HAKKANI-TUR JAMES FUNG BENOIT FAVRE ELIZABETH SHRIBERG

Automatic sentence segmentation of spoken language is an important precursor to downstream natural language processing. Previous studies combine lexical and prosodic features, but can impose significant computational challenges because of the large size of feature sets. Little is understood about which features most benefit performance, particularly for speech data from different speaking...

2002
Nicola Stokes Joe Carthy Alan F. Smeaton

In this paper we propose a coarse-grained NLP approach to text segmentation based on the analysis of lexical cohesion within text. Most work in this area has focused on the discovery of textual units that discuss subtopic structure within documents. In contrast, our segmentation task requires the discovery of topical units of text, i.e. distinct news stories from broadcast news programmes. Our sy...

2005
Gaël Dias Elsa Alves

Topic Segmentation is the task of breaking documents into topically coherent multi-paragraph subparts. In particular, Topic Segmentation is extensively used in Passage Retrieval and Text Summarization to provide more coherent results by taking into account raw document structure. However, most methodologies are based on lexical repetition that show evident reliability problems or rely on harves...

2017
Terry Ruas William Grosky

The meaning of a sentence in a document is more easily determined if its constituent words exhibit cohesion with respect to their individual semantics. This paper explores the degree of cohesion among a document's words using lexical chains as a semantic representation of its meaning. Using a combination of diverse types of lexical chains, we develop a text document representation that can be u...

2008
Mamoru Komachi Hisami Suzuki

We propose a method for learning semantic categories of words with minimal supervision from web search query logs. Our method is based on the Espresso algorithm (Pantel and Pennacchiotti, 2006) for extracting binary lexical relations, but makes important modifications to handle query log data for the task of acquiring semantic categories. We present experimental results comparing our method wit...

2009
Arianna Bisazza Marcello Federico

We tried to cope with the complex morphology of Turkish by applying different schemes of morphological word segmentation to the training and test data of a phrase-based statistical machine translation system. These techniques allow for a considerable reduction of the training dictionary, and lower the out-of-vocabulary rate of the test set. By minimizing differences between lexical granularitie...

Journal: CoRR 1994
Pim van der Eijk

A quantitative representation of discourse structure can be computed by measuring lexical cohesion relations among adjacent text elements. These representations have previously been proposed to deal with sub-topic text segmentation. In a parallel corpus, similar representations can be derived for versions of a text in various languages. These can be used for parallel segmentation and as an alte...

2009
Ramón Granell Stephen G. Pulman Carlos D. Martínez-Hinarejos

Segmentation of utterances and annotation as dialogue acts can be helpful for several modules of dialogue systems. In this work, we study a statistical machine learning model to perform these tasks simultaneously using lexical features and incorporating deterministic syntactic restrictions. There is a slight improvement in both segmentation and labelling due to these restrictions.

2013
Ye Kyaw Thu Andrew Finch Eiichiro Sumita Yoshinori Sagisaka

In statistical machine translation (SMT), word segmentation is generally a necessary step for languages that do not naturally delimit words. For many low-resource languages there are no word segmentation tools, and research on word segmentation for these languages is often quite scarce. In this paper, we study several plausible methods for Myanmar word segmentation for machine translation in or...
