Training log-linear acoustic models in higher-order polynomial feature space for speech recognition

نویسندگان

  • Muhammad Ali Tahir
  • Heyun Huang
  • Ralf Schlüter
  • Hermann Ney
  • Louis ten Bosch
  • Bert Cranen
  • Lou Boves
چکیده

The use of higher-order polynomial acoustic features can improve the performance of automatic speech recognition. However, the dimensionality of the polynomial representation can be prohibitively large, making the training of acoustic models using polynomial features for large vocabulary ASR systems infeasible. This paper presents an iterative polynomial training framework for acoustic modeling, which recursively expands the current acoustic features into their second-order polynomial feature space. In each recursion the dimensionality is reduced by a linear projection, such that increasingly higher order polynomial information is incorporated while keeping the dimensionality of the acoustic models constant. Experimental results obtained for a large-vocabulary continuous speech recognition task show that the proposed method outperforms conventional mixture models.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Discriminative adaptation for log-linear acoustic models

Log-linear models have recently been used in acoustic modeling for speech recognition systems. This has been motivated by competitive results compared to systems based on Gaussian models, and a more direct parametrisation of the posterior model. To competitively use log-linear models for speech recognition, important methods, such as speaker adaptation, have to be reformulated in a log-linear f...

متن کامل

Log-Linear System Combination Using Structured Support Vector Machines

Building high accuracy speech recognition systems with limited language resources is a highly challenging task. Although the use of multi-language data for acoustic models yields improvements, performance is often unsatisfactory with highly limited acoustic training data. In these situations, it is possible to consider using multiple well trained acoustic models and combine the system outputs t...

متن کامل

Log-Linear Optimization of Second-Order Polynomial Features with Subsequent Dimension Reduction for Speech Recognition

Second order polynomial features are useful for speech recognition because they can be used to model class specific covariance even with a pooled covariance acoustic model. Previous experiments with second order features have shown word error rate improvements. However, the improvement comes at the price of a large increase in the number of parameters. This paper investigates the discriminative...

متن کامل

Discriminative speaker adaptation using articulatory features

This paper presents an automatic speech recognition system using acoustic models based on both sub-phonetic units and broad, phonological features such as Voiced and Round as output densities in a hidden Markov model framework. The aim of this work is to improve speech recognition performance particularly on conversational speech by using units other than phones as a basis for discrimination be...

متن کامل

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013