Target Structured Cross Language Model Refinement

Authors

  • Terrence Martin
  • Kishan Thambiratnam
  • Sridha Sridharan
Abstract

The task of porting Automatic Speech Recognition (ASR) technology to many languages is hindered by a lack of transcribed acoustic data, which in turn prevents the development of the accurate acoustic models necessary for the recognition task. To overcome this problem, recent research has sought to exploit the similarity of sounds across languages, and to use this similarity to adapt models from one or more data-rich languages for use in recognising data-poor target languages. Pronunciation variation and cross language context mismatch combine to make the task more difficult than a monolingual ASR application. In this paper, we examine the utility of recent pronunciation modelling approaches and evaluate their performance on the Indonesian and Spanish languages. Finally, we introduce a novel technique which ensures that the state distributions developed using the source language data are more closely aligned with those in the target language, thus improving classification accuracy. This technique achieved an improvement in word recognition accuracy of 19.5% absolute when compared to a standard knowledge-based cross lingual mapping approach.


Similar resources

A Refinement Framework for Cross Language Text Categorization

Cross language text categorization is the task of exploiting labelled documents in a source language (e.g. English) to classify documents in a target language (e.g. Chinese). In this paper, we focus on investigating the use of a bilingual lexicon for cross language text categorization. To this end, we propose a novel refinement framework for cross language text categorization. The framework con...


Predicting Linguistic Structure with Incomplete and Cross-Lingual Supervision

Täckström, O. 2013. Predicting Linguistic Structure with Incomplete and Cross-Lingual Supervision. Acta Universitatis Upsaliensis. Studia Linguistica Upsaliensia 14. xii+215 pp. Uppsala. ISBN 978-91-554-8631-0. Contemporary approaches to natural language processing are predominantly based on statistical machine learning from large amounts of text, which has been manually annotated with the ling...


Noun phrases as building blocks for cross-language Search Assistance

This paper presents a Foreign-Language Search Assistant that uses noun phrases as fundamental units for document translation and query formulation, translation and refinement. The system (a) supports the foreign-language document selection task providing a cross-language indicative summary based on noun phrase translations, and (b) supports query formulation and refinement using the information...


Iranian EFL Teachers’ Cultural Identity in the Course of their Profession

Grounded on Hofstede's (1986) dichotomous model of collectivism/individualism, this study explored Iranian English as a foreign language (EFL) teachers' cultural identity. A sequential mixed methods procedure was adopted to examine their cultural orientation and the impact of length of experience on their degree of propensity to absorb the target language culture. A total of 120 female and male...


Using Structured Queries for Disambiguation in Cross-Language Information Retrieval

Bilingual transfer dictionaries are an important resource for query translation in cross-language text retrieval. However, term translation is not an isomorphic process, so dictionary-based systems must address the problem of ambiguity in language translation. In this paper, we claim that boolean conjunction (the AND operator) provides simple and automatic disambiguation in the target languag...




Publication date: 2004