Uncertainty Decoding in Automatic Speech Recognition

Author

  • Reinhold Häb-Umbach
Abstract

The term uncertainty decoding has been coined for a class of robustness-enhancing algorithms in automatic speech recognition that replace point estimates and plug-in rules with posterior densities and optimal decision rules. While uncertainty can be incorporated in the model domain, in the feature domain, or in both, we concentrate here on feature domain approaches, as they tend to be computationally less demanding. We derive optimal decision rules in the presence of uncertain observations and discuss simplifications that result in computationally efficient realizations. The usefulness of the presented statistical framework is then exemplified for two types of real-world problems. The first is improving the robustness of speech recognition towards incomplete or corrupted feature vectors caused by a lossy communication link between the speech-capturing front end and the back-end recognition engine. The second is the well-known and extensively studied issue of improving the robustness of the recognizer towards environmental noise.
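The contrast between a plug-in rule and a decision rule based on a posterior density can be sketched for the simplest case. The example below is an illustration under stated assumptions, not the derivation from the paper: it assumes a diagonal-covariance Gaussian acoustic model and a Gaussian feature posterior with mean `x_hat` and variance `x_var`, in which case integrating the model likelihood over the posterior has a closed form that adds the feature variance to the model variance. All function names are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def plug_in_log_likelihood(x_hat, mean, var):
    """Plug-in rule: treat the point estimate x_hat as if it were exact."""
    return log_gauss_diag(x_hat, mean, var)


def uncertainty_log_likelihood(x_hat, x_var, mean, var):
    """Marginalize over an assumed Gaussian feature posterior N(x_hat, x_var).

    For Gaussians, integrating N(x; mean, var) * N(x; x_hat, x_var) over x
    gives N(x_hat; mean, var + x_var), so the model variance is inflated
    by the feature uncertainty in each dimension.
    """
    inflated = [v + u for v, u in zip(var, x_var)]
    return log_gauss_diag(x_hat, mean, inflated)


# Illustrative values only: an unreliable feature far from the model mean.
mean, var = [0.0], [1.0]
x_hat, x_var = [3.0], [8.0]
print(plug_in_log_likelihood(x_hat, mean, var))
print(uncertainty_log_likelihood(x_hat, x_var, mean, var))
```

With zero feature variance the two scores coincide. As the feature variance grows, an outlying estimate is penalized less, so unreliable frames contribute less to the decision between competing models.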

Similar resources

Uncertainty Decoding for Noise Robust Automatic Speech Recognition

This report presents uncertainty decoding as a method for robust automatic speech recognition for the Noise Robust Automatic Speech Recognition project funded by Toshiba Research Europe Limited. The effects of noise on speech recognition are reviewed and a general framework for noise robust speech recognition introduced. Common and related noise robustness techniques are described in the contex...

Uncertainty training and decoding methods of deep neural networks based on stochastic representation of enhanced features

Speech enhancement is an important front-end technique to improve automatic speech recognition (ASR) in noisy environments. However, erroneous noise suppression during speech enhancement often introduces additional distortions in the speech signal, which degrade ASR performance. To compensate for these distortions, ASR needs to consider the uncertainty of enhanced features, which can be achieved by using t...

Uncertainty Decoding with Adaptive Sampling for Noise Robust DNN-Based Acoustic Modeling

Although deep neural network (DNN) based acoustic models have obtained remarkable results, automatic speech recognition (ASR) performance remains low in noisy and reverberant conditions. To address this issue, a speech enhancement front-end is often used before recognition to reduce noise. However, the front-end cannot fully suppress noise and often introduces artifacts that are limit...

Investigations into Uncertainty Decoding Employing a Discrete Feature Space for Noise Robust Automatic Speech Recognition

This paper addresses the robustness of automatic speech recognition to environmental noise. In order to account for reliability of the clean feature estimate we employ the feature posterior density conditioned on observed noisy features to perform uncertainty decoding. We investigate two approaches to estimate the posterior using a discrete feature space, first conditioning only on the current ...

Issues with uncertainty decoding for noise robust automatic speech recognition

Interest is growing in a class of robustness algorithms that exploit the notion of uncertainty introduced by environmental noise. The majority of these techniques share the property that the uncertainty of an observation due to noise is propagated to the recogniser, resulting in increased model variances. Using appropriate approximations, efficient implementations may be obtained, with the goal...

A computational auditory scene analysis system for speech segregation and robust speech recognition

A conventional automatic speech recognizer does not perform well in the presence of multiple sound sources, while human listeners are able to segregate and recognize a signal of interest through auditory scene analysis. We present a computational auditory scene analysis system for separating and recognizing target speech in the presence of competing speech or noise. We estimate, in two stages, ...

Publication date: 2008