نتایج جستجو برای: lexical segmentation

تعداد نتایج: 95920  

1998
Helen Gaylard Peter Hancox

A major objection to top-down accounts of lexical recognition has been that they are incompatible with an account of acquisition, it being argued that bottom-up segmentation must precede lexical acquisition. We counter this objection by presenting a top-down account of lexical acquisition. This is made possible by the adoption of a flexible criterion as to what may constitute a lexical item dur...

1994
Chu-Ren Huang Keh-Jiann Chen Yun-yan Yang

This paper describes a characters-based Chinese collocation system and discusses the advantages of it over a traditiolml word-based systcm. Since wordbreaks are not conventionally marked in Chinese text corpora, a character-based collocation system has the dual advantages of avoiding pre-proccssing distortion and directly accessing sub-lexical information. Furthermore, word-based collocational ...

Journal: :Computer Speech & Language 2012
Camille Guinaudeau Guillaume Gravier Pascale Sébillot

Transcript-based topic segmentation of TV programs faces several difficulties arising from transcription errors, from the presence of potentially short segments and from the limited number of word repetitions to enforce lexical cohesion, i.e., lexical relations that exist within a text to provide a certain unity. To overcome these problems, we extend a probabilistic measure of lexical cohesion ...

2013
Anca Simon Guillaume Gravier Pascale Sébillot

A probabilistic segment model combining lexical cohesion and disruption for topic segmentation Identifying topical structure in any text-like data is a challenging task. Most existing techniques rely either on maximizing a measure of the lexical cohesion or on detecting lexical disruptions. A novel method combining the two criteria so as to obtain the best trade-off between cohesion and disrupt...

Journal: :Journal of Speech, Language, and Hearing Research 2000

2005
Shi Wuguang

Chinese word segmentation is a fundamental and important issue in Chinese information processing. In order to find a unified approach for Chinese word segmentation, the author develop a Chinese lexical analyzer PCWS using direct maximum entropy model. The paper presents the general description of PCWS, as well as the result and analysis of its performance at the Second International Chinese Wor...

2015
Yancui Li Hongyu Feng Wenhe Feng

This paper addresses Chinese discourse segmentation based on punctuation mark. Particularly, we propose various kinds of lexical, syntactic, position and punctuation features to train classifiers for Chinese discourse segmentation. Experimental results on CDTB (Chinese Discourse Treebank) show that our method based on punctuation mark is appropriate for Chinese discourse segmentation with 89.2%...

2011
Mimi Lu Cheung-Chi Leung Lei Xie Bin Ma Haizhou Li

This paper proposes to perform probabilistic latent semantic analysis (PLSA) for broadcast news (BN) story segmentation. PLSA exploits a deeper underlying relation among terms beyond their occurrences thus conceptual matching can be employed to replace literal term matching. Different from text segmentation, lexical based BN story segmentation has to be carried out over LVCSR transcripts, where...

2011
Yugo Murawaki Sadao Kurohashi

A key factor of high quality word segmentation for Japanese is a high-coverage dictionary, but it is costly to manually build such a lexical resource. Although external lexical resources for human readers are potentially good knowledge sources, they have not been utilized due to differences in segmentation criteria. To supplement a morphological dictionary with these resources, we propose a new...

نمودار تعداد نتایج جستجو در هر سال

با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید