Search results for: lexical segmentation
Number of results: 95920
Story segmentation of news broadcasts has been shown to improve the accuracy of subsequent processes such as question answering and information retrieval. In previous work, a decision tree trained on automatically extracted lexical and acoustic features was used to predict story boundaries, using hypothesized sentence boundaries to define potential story boundaries. In this paper, we emp...
This research aims at validating a methodology for the study of segmentation markers in large corpora. Two indices signalling a thematic break in a text are proposed. The first is based on the presence of a paragraph mark and employs the odds ratio to identify the best markers. The second takes into account lexical cohesion between sentences via an index resulting from latent semantic analysis....
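As a rough illustration of the first index in the abstract above, the sketch below scores candidate markers by an odds ratio against paragraph-initial position. The function name `marker_odds_ratios`, the input format (pairs of a sentence's first word and a paragraph-start flag), and the additive smoothing constant are assumptions made for this example; the abstract does not say how the authors compute the ratio.

```python
from collections import Counter


def marker_odds_ratios(sentences, smoothing=0.5):
    """Score each sentence-initial word by its odds ratio of opening a paragraph.

    sentences: list of (first_word, starts_paragraph) pairs (hypothetical format).
    smoothing: constant added to every cell of the 2x2 table to avoid division
               by zero (an assumption, not taken from the abstract).
    """
    total_start = sum(1 for _, starts in sentences if starts)
    total_other = len(sentences) - total_start
    start_counts = Counter(w for w, starts in sentences if starts)
    other_counts = Counter(w for w, starts in sentences if not starts)

    ratios = {}
    for word in set(start_counts) | set(other_counts):
        a = start_counts[word] + smoothing                # word present, paragraph start
        b = other_counts[word] + smoothing                # word present, elsewhere
        c = total_start - start_counts[word] + smoothing  # word absent, paragraph start
        d = total_other - other_counts[word] + smoothing  # word absent, elsewhere
        ratios[word] = (a * d) / (b * c)
    return ratios
```

A ratio well above 1 suggests that the word tends to open paragraphs and is therefore a candidate marker of a thematic break; a ratio below 1 suggests the opposite.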
This paper extends existing word segmentation models to take non-linguistic context into account. It improves the token F-score of a top-performing segmentation model by 2.5% on a 27k-utterance dataset. We posit that word segmentation is easier in-context because the learner is not trying to access irrelevant lexical items. We use topics from a Latent Dirichlet Allocation model as a proxy for...
In this article we address the task of automatic text structuring into linear and nonoverlapping thematic episodes at a coarse level of granularity. In particular, we deal with topic segmentation on multi-party meeting recording transcripts, which pose specific challenges for topic segmentation models. We present a comparative study of two probabilistic mixture models. Based on lexical features...
Topic Segmentation is the task of breaking documents into topically coherent multiparagraph subparts. In particular, Topic Segmentation is extensively used in Text Summarization to provide more coherent results by taking into account raw document structure. However, most methodologies are based on lexical repetition that shows evident reliability problems or rely on harvesting linguistic resourc...
The Dotplotting method has been widely used for text segmentation for its merits in detecting lexical repetition in global context. However, a theoretical analysis of its segmentation criterion function finds several deficiencies. The original function cannot make full use of the text structure features and does not suit the text segmentation task very well. We propose an improved model (MMD m...
The increasing quantity of video material requires methods to help users navigate such data, among them topic segmentation techniques. The goal of this article is to improve ASR-based topic segmentation methods to deal with peculiarities of professional-video transcripts (transcription errors and lack of repetitions) while remaining generic enough. To this end, we introduce confidence measures ...
Infants develop different kinds of long-term linguistic representation as early as in their first year of life. We examined the interaction of early lexical access and prosodic processing. It is proposed that familiar word forms are stored in a protolexicon before linking any concepts to them, enabling early (proto)lexical segmentation from fluent speech. Additionally, previous results strength...
A Bayesian model of continuous speech recognition is presented. It is based on Shortlist (D. Norris, 1994; D. Norris, J. M. McQueen, A. Cutler, & S. Butterfield, 1997) and shares many of its key assumptions: parallel competitive evaluation of multiple lexical hypotheses, phonologically abstract prelexical and lexical representations, a feedforward architecture with no online feedback, and a lex...
This thesis examines an important issue in spoken word recognition: how the perceptual system segments connected speech into lexical units or words. Research on this topic has investigated the role of different sources of information in dividing up the speech stream: acoustic cues in the speech signal, statistical regularities in the structure of the language or through the identification of in...