Language model selection based on the analysis of Japanese spontaneous speech on travel arrangement task
Abstract
This paper addresses language model selection based on an analysis of spontaneous Japanese speech data collected for the travel arrangement task, which contains five different subtasks. The procedure for transcribing and segmenting the Japanese spontaneous speech in Romanized transcription is described. Separate topic-dependent language models were evaluated by calculating perplexity and by applying them to Japanese speech recognition on the travel arrangement task corpus. Using the subtopic language models reduced perplexity and improved speech recognition performance.
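As a rough illustration of the perplexity comparison described in the abstract, the sketch below scores one held-out utterance under a pooled (general) model and under per-subtask models. Everything specific here is an assumption made for illustration: the subtask names `hotel` and `flight`, the Romanized toy utterances, and the use of unigram models with add-one smoothing. The abstract does not state which model order or smoothing the authors used.

```python
import math
from collections import Counter


def train_unigram(sentences, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary (illustrative choice)."""
    counts = Counter(tok for sent in sentences for tok in sent)
    total = sum(counts.values())
    return {w: (counts[w] + 1) / (total + len(vocab)) for w in vocab}


def perplexity(model, sentences):
    """Perplexity = 2 ** (negative average log2 probability per token)."""
    log_prob = 0.0
    n_tokens = 0
    for sent in sentences:
        for tok in sent:
            log_prob += math.log2(model[tok])
            n_tokens += 1
    return 2 ** (-log_prob / n_tokens)


# Hypothetical toy corpus: subtask names and utterances are invented for this sketch.
corpus = {
    "hotel": [
        ["heya", "o", "yoyaku", "shitai"],
        ["heya", "wa", "arimasu", "ka"],
    ],
    "flight": [
        ["hikouki", "no", "kippu", "o", "kaitai"],
        ["hikouki", "wa", "nanji", "desu", "ka"],
    ],
}

vocab = {tok for sents in corpus.values() for sent in sents for tok in sent}

# General model: trained on all subtasks pooled together.
general_model = train_unigram(
    [sent for sents in corpus.values() for sent in sents], vocab
)
# Topic-dependent models: one per subtask.
topic_models = {topic: train_unigram(sents, vocab) for topic, sents in corpus.items()}

# Held-out utterance assumed to belong to the "hotel" subtask.
held_out = [["heya", "o", "yoyaku", "shitai", "desu"]]

pp_general = perplexity(general_model, held_out)
pp_by_topic = {topic: perplexity(m, held_out) for topic, m in topic_models.items()}

print(f"general: {pp_general:.2f}")
for topic, pp in pp_by_topic.items():
    print(f"{topic}: {pp:.2f}")
```

On this toy data the matching subtask model gives the lowest perplexity and the mismatched one the highest, which is the kind of effect that motivates selecting a language model per subtask. The numbers themselves carry no meaning beyond the toy corpus.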
Similar resources
Data Collection and Transliteration of Japanese Spontaneous Database in the Travel Arrangement Task Domain
This paper describes the method used to construct and transcribe Japanese spontaneous speech data for VERBMOBIL, the German research project on speech translation. A spontaneous spoken dialogue database is the basis for developing speech and language processing for dialogue systems such as speech translation systems. The extended data of human-to-human spoken dialogue in the scenario of travel arrange...
Toward translating Korean speech into other languages
This paper describes research activities of ETRI in multi-lingual spontaneous speech translation. We have developed Korean-to-English and Korean-to-Japanese speech translation system prototypes that include a 5,000-word spontaneous Korean speech recognizer, Korean-English and Korean-Japanese translators, and a Korean speech synthesizer with spontaneous prosody in the travel planning task. We utilize mu...
The relationship between task repetition and language proficiency
Task repetition is now considered as an important task-based implementation variable which can affect complexity, accuracy, and fluency of L2 speech. However, in order to move towards theorizing the role of task repetition in second language acquisition, it is necessary that individual variables be taken into account. The present study aimed to investigate the way task r...
Selection of Multi-Word Expressions from Web N-gram Corpus for Speech Recognition
This paper proposes a method for constructing a statistical language model with multi-word expressions (MWEs) selected from the Google Japanese Web N-gram. MWEs are concatenated words that consist of idiomatic expressions or long morpheme sequences used frequently. In this paper a method for selecting the effective MWEs that improve the language model based on co-occurrence probabilities of ...
The Relationship between Iranian EFL Learners’ Ambiguity Tolerance and the Accuracy of Their Task-based Oral Speech
Various individual differences, including ambiguity tolerance (AT), have gained momentum because of the influence they can exert on the process and product of learning, and thereby, on various aspects of the learner’s interlanguage system such as accuracy of oral speech. The present study was undertaken to examine the extent to which Iranian EFL learners’ AT was significantly correlated with th...
Publication date: 1999