Fast Calculation of Translation Model Score for Simultaneous Automatic Speech Recognition of Multilingual Audio Contents
Abstract
This paper addresses automatic speech recognition (ASR) for multilingual audio content, such as international conference recordings and broadcast news. Simultaneous ASR is a promising way to handle such content efficiently. Conventionally, ASR has been performed independently, language by language, even when multilingual speech is available, that is, utterances in several languages that express the same meaning. We previously proposed a bilingual speech recognition framework based on statistical ASR and machine translation, in which ASR for the two languages is performed simultaneously and complementarily. In this framework, the ASR systems use not only acoustic and language model scores but also a translation model (TM) score. In this study, we investigate an efficient method for calculating TM scores. A TM score represents how well a sentence in one language corresponds to a sentence in another. In general, a word can be translated into various words in another language, and word order also differs between languages. Given these characteristics, TM scores should be modeled statistically. In a statistical translation model, each word in the source language is assumed to have some probability of being translated into every word in the target language. For instance, matching (aligning) an n-word sentence with an m-word sentence yields n to the m-th power word alignments. A strict calculation of the statistical TM score first computes the probability of each alignment and then sums these probabilities. However, this calculation is too costly for a real-time system. In this study, we reduce the computational cost. Specifically, because almost all alignments have probabilities much smaller than the highest alignment probability, we regard the highest alignment probability as the TM score.
We compared the TM score calculation methods in terms of processing time and accuracy in Japanese ASR that uses English information, based on the bilingual recognition framework. The proposed method significantly reduced the processing time for TM score calculation without any degradation of ASR accuracy.
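The difference between the strict TM score (the sum over all alignments) and the approximation (the single best alignment) can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: the alignment probability is taken to be a plain product of word translation probabilities, as in a simple lexical translation model, and the table `t`, the constant `FLOOR`, the function names, and the toy word pairs are all hypothetical. The paper's actual model may differ.

```python
from itertools import product

FLOOR = 1e-9  # hypothetical floor probability for word pairs missing from the table


def alignment_prob(src, tgt, t, alignment):
    """Probability of one alignment: tgt[j] is aligned to src[alignment[j]].

    Assumes the probability factorizes into word translation probabilities.
    """
    p = 1.0
    for j, i in enumerate(alignment):
        p *= t.get((tgt[j], src[i]), FLOOR)
    return p


def tm_score_strict(src, tgt, t):
    """Strict score: enumerate all n**m alignments and sum their probabilities."""
    n, m = len(src), len(tgt)
    return sum(alignment_prob(src, tgt, t, a)
               for a in product(range(n), repeat=m))


def tm_score_best(src, tgt, t):
    """Approximate score: probability of the highest-scoring alignment only.

    Under the factorization assumed here, the best alignment can be found
    word by word, so the cost is n * m table lookups instead of n**m products.
    """
    p = 1.0
    for w in tgt:
        p *= max(t.get((w, s), FLOOR) for s in src)
    return p


# Toy English-to-Japanese table (values invented for illustration only).
t = {
    ("sono", "the"): 0.7, ("sono", "house"): 0.05,
    ("ie", "the"): 0.1, ("ie", "house"): 0.8,
}
src = ["the", "house"]
tgt = ["sono", "ie"]

strict = tm_score_strict(src, tgt, t)
best = tm_score_best(src, tgt, t)
print(f"strict={strict:.4f} best={best:.4f}")
```

The best-alignment score never exceeds the strict sum, and when one alignment dominates, as the abstract argues is typical, the two scores tend to rank competing hypotheses similarly. The brute-force enumeration is included only to mirror the n to the m-th power alignment count described above.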
Related articles
Automatic speech recognition framework for multilingual audio contents
Automatic speech recognition (ASR) for multilingual audio contents, such as international conference recordings and broadcast news, is addressed. For handling such contents efficiently, a simultaneous ASR is promising. Conventionally, ASR has been performed independently, namely language by language, although multilingual speech, which consists of utterances in several languages representing th...
Development of the "VoiceTra" Multi-Lingual Speech Translation System
This study introduces large-scale field experiments of VoiceTra, which is the world’s first speech-to-speech multilingual translation application for smart phones. In the study, approximately 10 million input utterances were collected since the experiments commenced. The usage of collected data was analyzed and discussed. The study has several important contributions. First, it explains system ...
A system for automatic broadcast news summarisation, geolocation and translation
An increasing amount of news content is produced in audio-video form every day. To effectively analyse and monitor this multilingual data stream, we require methods to extract and present audio content in accessible ways. In this paper, we describe an end-to-end system for processing and browsing audio news data. This fully automated system brings together our recent research on audio scene a...
Lecture Translator - Speech translation framework for simultaneous lecture translation
Foreign students at German universities often have difficulties following lectures as they are often held in German. Since human interpreters are too expensive for universities we are addressing this problem via speech translation technology deployed in KIT’s lecture halls. Our simultaneous lecture translation system automatically translates lectures from German to English in real-time. Other s...
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
To facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
Publication date: 2010