Search results for: vocabulary coverage

Number of results: 111577

2016
Matti Varjokallio Dietrich Klakow

This work explores the use of unsupervised morph segmentation along with statistical language models for the task of vocabulary expansion. Unsupervised vocabulary expansion has large potential for improving vocabulary coverage and performance in different natural language processing tasks, especially in less-resourced settings on morphologically rich languages. We propose a combination of unsupe...

2011
Katarina Heimann Mühlenbock Mats Lundälv

A corpus of easy-to-read texts in combination with a base vocabulary pool for Swedish was used in order to build a basic vocabulary. The coverage of these entries by symbols in an existing AAC database was then assessed. We finally suggest a method for enriching the expressive power of the AAC language by combining existing symbols and in this way illustrate additional concepts.

2011
Markus Nußbaum-Thom Amr El-Desoky Mousa Ralf Schlüter Hermann Ney

Compound words are a difficulty for German speech recognition systems since they cause high out-of-vocabulary and word error rates. State-of-the-art approaches augment the language model with fragments of compounds in order to increase lexical coverage and lower the perplexity and out-of-vocabulary rate. The fragments are tagged in order to concatenate subsequent equally tagged fragments in the ...

2005

To find an easy-to-use, automated tool to identify technical vocabulary applicable to learners at various levels, nine statistical measures were applied to the 7.3-million-word ‘commerce and finance’ component of the British National Corpus. The resulting word lists showed that each statistical measure extracted a different level of specialized vocabulary as measured by word length, vocabulary ...

Farjami, Hadi

This paper provides a fairly detailed corpus-based vocabulary profile of the Iranian EFL books used in public schools. To this end, the WordPerfect files of all the seven books were converted to text format to get rid of the formatting features and be compatible with the software used for analysis. The software tools used were the Compleat Lexical Tutor suite, version 6.2 (Cobb, 2011), AntConc ...

2006
Mark Davies

Certainly one of the primary goals in developing materials for second language learners should be to create materials that reflect vocabulary and grammar that these learners are likely to encounter in the “real world”. There is little to be gained from having students memorize long lists of vocabulary in a textbook, if the learners never again encounter these words once they venture out in the ...

1999
Ciro Martins João Paulo da Silva Neto Luís B. Almeida

To achieve an acceptable degree of generalization, current speech recognition systems work with large vocabularies, which, among other effects, result in higher search spaces and consequently lower system performance. For highly inflectional languages, such as Portuguese, a much larger vocabulary is required for the same task coverage and a much larger text corpus for extraction of word-base...

2007
Gopala Krishna Anumanchipalli

This report investigates issues of lexical coverage in Indian languages. More specifically, a parallel analysis of Out-of-Vocabulary words is made in Telugu and Tamil. Although generic, this study is focussed on understanding the morphological aspects in these languages as necessary for speech recognition. The observations reveal that morphological analysis and preprocessing can increase the le...

2014
Irina Illina Dominique Fohr Georges Linarès

Proper names are usually key to understanding the information contained in a document. Our work focuses on increasing the vocabulary coverage of a speech transcription system by automatically retrieving new proper names from contemporary diachronic text documents. The idea is to use in-vocabulary proper names as an anchor to collect new linked proper names from the diachronic corpus. Our assump...

Journal: Studies in Health Technology and Informatics 2004
Seung-Bin Han Miyoung Kwak Seunghee Kim Sooyoung Yoo Heekyung Park Jeun Kijoo Jeapil Kim Myoungsun Choi Jinwook Choi

The Unified Medical Language System (UMLS) is a rich source of knowledge in the biomedical domain. In this paper, we evaluated the coverage of UMLS as compared with Korean medical terms and identified differences in concept representation between two vocabulary sets. We measured the concept coverage by mapping clinical terms extracted from the discharge records of Seoul National University Hosp...
