Target Structured Cross Language Model Refinement
Abstract
The task of porting Automatic Speech Recognition (ASR) technology to many languages is hindered by a lack of transcribed acoustic data, which in turn prevents the development of the accurate acoustic models needed for recognition. To overcome this problem, recent research has sought to exploit the similarity of sounds across languages, using it to adapt models from one or more data-rich source languages for recognising data-poor target languages. Pronunciation variation and cross-language context mismatch combine to make the task more difficult than in a monolingual ASR application. In this paper, we examine the utility of recent pronunciation modelling approaches and evaluate their performance on Indonesian and Spanish. Finally, we introduce a novel technique which ensures that the state distributions developed using the source language data are more closely aligned with those in the target language, thus improving classification accuracy. This technique achieved an absolute improvement in word recognition accuracy of 19.5 percentage points compared with a standard knowledge-based cross-lingual mapping approach.
Similar resources
A Refinement Framework for Cross Language Text Categorization
Cross language text categorization is the task of exploiting labelled documents in a source language (e.g. English) to classify documents in a target language (e.g. Chinese). In this paper, we focus on investigating the use of a bilingual lexicon for cross language text categorization. To this end, we propose a novel refinement framework for cross language text categorization. The framework con...
Predicting Linguistic Structure with Incomplete and Cross-Lingual Supervision
Täckström, O. 2013. Predicting Linguistic Structure with Incomplete and Cross-Lingual Supervision. Acta Universitatis Upsaliensis. Studia Linguistica Upsaliensia 14. xii+215 pp. Uppsala. ISBN 978-91-554-8631-0. Contemporary approaches to natural language processing are predominantly based on statistical machine learning from large amounts of text, which has been manually annotated with the ling...
Noun phrases as building blocks for cross-language Search Assistance
This paper presents a Foreign-Language Search Assistant that uses noun phrases as fundamental units for document translation and query formulation, translation and refinement. The system (a) supports the foreign-language document selection task providing a cross-language indicative summary based on noun phrase translations, and (b) supports query formulation and refinement using the information...
Iranian EFL Teachers’ Cultural Identity in the Course of their Profession
Grounded on Hofstede's (1986) dichotomous model of collectivism/individualism, this study explored Iranian English as a foreign language (EFL) teachers' cultural identity. A sequential mixed methods procedure was adopted to examine their cultural orientation and the impact of length of experience on their degree of propensity to absorb the target language culture. A total of 120 female and male...
Using Structured Queries for Disambiguation in Cross-Language Information Retrieval
Bilingual transfer dictionaries are an important resource for query translation in cross-language text retrieval. However, term translation is not an isomorphic process, so dictionary-based systems must address the problem of ambiguity in language translation. In this paper, we claim that boolean conjunction (the AND operator) provides simple and automatic disambiguation in the target languag...
Publication date: 2004