Search results for: vocabulary coverage
Number of results: 111577
This work explores the use of unsupervised morph segmentation along with statistical language models for the task of vocabulary expansion. Unsupervised vocabulary expansion has large potential for improving vocabulary coverage and performance in different natural language processing tasks, especially in less-resourced settings on morphologically rich languages. We propose a combination of unsupe...
A corpus of easy-to-read texts in combination with a base vocabulary pool for Swedish was used in order to build a basic vocabulary. The coverage of these entries by symbols in an existing AAC database was then assessed. We finally suggest a method for enriching the expressive power of the AAC language by combining existing symbols and in this way illustrate additional concepts.
Compound words are a difficulty for German speech recognition systems since they cause high out-of-vocabulary and word error rates. State-of-the-art approaches augment the language model by the fragments of compounds in order to increase lexical coverage, lower the perplexity and out-of-vocabulary rate. The fragments are tagged in order to concatenate subsequent equally tagged fragments in the ...
To find an easy-to-use, automated tool to identify technical vocabulary applicable to learners at various levels, nine statistical measures were applied to the 7.3-million-word ‘commerce and finance’ component of the British National Corpus. The resulting word lists showed that each statistical measure extracted a different level of specialized vocabulary as measured by word length, vocabulary ...
This paper provides a fairly detailed corpus-based vocabulary profile of the Iranian EFL books used in public schools. To this end, the WordPerfect files of all the seven books were converted to text format to get rid of the formatting features and be compatible with the software used for analysis. The software tools used were the Compleat Lexical Tutor suite, version 6.2 (Cobb, 2011), AntConc ...
Certainly one of the primary goals in developing materials for second language learners should be to create materials that reflect vocabulary and grammar that these learners are likely to encounter in the “real world”. There is little to be gained from having students memorize long lists of vocabulary in a textbook, if the learners never again encounter these words once they venture out in the ...
To achieve an acceptable degree of generalization, current speech recognition systems work with large vocabularies, which, among other effects, result in higher search spaces and consequently lower system performance. For highly inflectional languages, such as Portuguese, a much larger vocabulary is required for the same tasks coverage and a much larger text corpus for extraction of word-base...
This report investigates issues of lexical coverage in Indian languages. More specifically, a parallel analysis of Out-of-Vocabulary words is made in Telugu and Tamil. Although generic, this study is focussed on understanding the morphological aspects in these languages as necessary for speech recognition. The observations reveal that morphological analysis and preprocessing can increase the le...
Proper names are usually key to understanding the information contained in a document. Our work focuses on increasing the vocabulary coverage of a speech transcription system by automatically retrieving new proper names from contemporary diachronic text documents. The idea is to use in-vocabulary proper names as an anchor to collect new linked proper names from the diachronic corpus. Our assump...
The Unified Medical Language System (UMLS) is a rich source of knowledge in the biomedical domain. In this paper, we evaluated the coverage of UMLS as compared with Korean medical terms and identified differences in concept representation between two vocabulary sets. We measured the concept coverage by mapping clinical terms extracted from the discharge records of Seoul National University Hosp...