MLLR transform

Fast model selection based speaker adaptation for nonnative speech

Journal: IEEE Trans. Speech and Audio Processing, 2003

Xiaodong He, Yunxin Zhao

In this paper, the problem of adapting acoustic models of native English speech to nonnative speakers is addressed from the perspective of adaptive model complexity selection. The goal is to dynamically select model complexity for each nonnative talker so as to optimize the balance between model robustness to pronunciation variations and model detailedness for discrimination of speech sounds. A m...

Using HMM-based Classifier Adapted to Background Noises with Improved Sounds Features for Audio Surveillance Application

2008

Asma Rabaoui, Zied Lachiri, Noureddine Ellouze

The goal of our work is to discriminate between different classes of environmental sounds. A sound recognition system offers concrete potential for surveillance and security applications. The paper's first contribution to this research field is a thorough investigation of the applicability of state-of-the-art audio features in the domain of environmental sound rec...

A MAP-like weighting scheme for MLLR speaker adaptation

1999

Silke Goronzy, Ralf Kompe

This paper presents an approach for fast, unsupervised, online MLLR speaker adaptation using two MAP-like weighting schemes, a static and a dynamic one. While for the standard MLLR approach several sentences are necessary before a reliable estimation of the transformations is possible, the weighted approach shows good results even if adaptation is conducted after only a few short utterances. Ex...
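
As a rough illustration of the idea of weighting an MLLR estimate in a MAP-like way, the sketch below interpolates between the previous Gaussian mean and the MLLR-transformed mean. The abstract is truncated, so the interpolation form, the `dynamic_alpha` rule, and the names `alpha` and `tau` are assumptions made for illustration and are not taken from the paper.

```python
def mllr_mean(A, b, mu):
    """Apply an MLLR affine transform to a Gaussian mean: A * mu + b."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]


def weighted_update(mu_prev, mu_mllr, alpha):
    """Interpolate between the previous mean and the MLLR estimate.

    alpha = 1 gives plain MLLR; alpha = 0 keeps the previous mean.
    (Assumed form of the weighting, not the paper's exact scheme.)
    """
    return [alpha * new + (1.0 - alpha) * old
            for old, new in zip(mu_prev, mu_mllr)]


def dynamic_alpha(n_frames, tau):
    """Hypothetical dynamic weight that grows toward 1 with more data."""
    return n_frames / (n_frames + tau)


# Example: identity rotation with a bias, trusted at 25%.
mu_old = [1.0, 2.0]
mu_new = mllr_mean([[1, 0], [0, 1]], [0.5, -0.5], mu_old)
adapted = weighted_update(mu_old, mu_new, dynamic_alpha(10, 30))
```

A static scheme would pass a fixed `alpha`; a dynamic one would recompute it as adaptation data accumulate, so that short utterances move the model only slightly.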

Evaluation of several Maximum Likelihood Linear Regression Variants for Language Adaptation

2008

Míriam Luján-Mares, Carlos D. Martínez-Hinarejos, Vicent Alabau Gonzalvo

Multilingual Automatic Speech Recognition (ASR) systems are of great interest in multilingual environments. We studied the case of the Comunitat Valenciana where the two official languages are Spanish and Valencian. These two languages share most of their phonemes, and their syntax and vocabulary are also quite similar since they have influenced each other for many years. We constructed a syste...

Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR

2001

Masatsune Tamura, Takashi Masuko, Keiichi Tokuda, Takao Kobayashi

This paper describes a technique for synthesizing speech with arbitrary speaker characteristics using speaker-independent speech units, which we call “average voice” units. The technique is based on an HMM-based text-to-speech (TTS) system and the MLLR adaptation algorithm. In the HMM-based TTS system, speech synthesis units are modeled by multi-space probability distribution (MSD) HMMs which ca...

Linear feature space projections for speaker adaptation

2001

George Saon, Geoffrey Zweig, Mukund Padmanabhan

We extend the well-known technique of constrained Maximum Likelihood Linear Regression (MLLR) to compute a projection (instead of a full rank transformation) on the feature vectors of the adaptation data. We model the projected features with phone-dependent Gaussian distributions and also model the complement of the projected space with a single class-independent, speaker-specific Gaussian dist...

Unsupervised speaker adaptation based on sufficient HMM statistics of selected speakers

2001

Shinichi Yoshizawa, Akira Baba, Kanako Matsunami, Yuichiro Mera, Miichi Yamada, Kiyohiro Shikano

This paper describes an efficient method for unsupervised speaker adaptation. The method is based on (1) selecting a subset of speakers who are acoustically close to a test speaker, and (2) calculating adapted model parameters from the previously stored sufficient HMM statistics of the selected speakers’ data. In this method, only a small amount of unsupervised data from the test speaker is required for...
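
The two steps named in the abstract can be sketched as follows, under strong simplifying assumptions: each speaker is reduced to one similarity score and to the sufficient statistics of a single Gaussian (an occupancy count and a sum of observations). The function names, the score dictionary, and the toy numbers are hypothetical; the paper works with full HMM statistics.

```python
def select_closest(scores, n):
    """Step 1: return the n speaker ids with the highest similarity score."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


def adapted_mean(stats, speakers):
    """Step 2: pool stored (count, sum-of-observations) stats into one mean."""
    total_count = sum(stats[s][0] for s in speakers)
    dim = len(stats[speakers[0]][1])
    total_sum = [sum(stats[s][1][d] for s in speakers) for d in range(dim)]
    return [x / total_count for x in total_sum]


# Toy data: a higher score means acoustically closer to the test speaker.
scores = {"spk_a": -10.0, "spk_b": -50.0, "spk_c": -12.0}
stats = {
    "spk_a": (2.0, [2.0, 4.0]),
    "spk_b": (4.0, [40.0, 40.0]),
    "spk_c": (2.0, [6.0, 0.0]),
}
chosen = select_closest(scores, 2)
mean = adapted_mean(stats, chosen)
```

Because the statistics are stored in advance, the adapted parameters come from simple sums over the selected speakers rather than from retraining on their raw data.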

MLLR transforms as features in speaker recognition

2005

Andreas Stolcke, Luciana Ferrer, Sachin S. Kajarekar, Elizabeth Shriberg, Anand Venkataraman

We explore the use of adaptation transforms employed in speech recognition systems as features for speaker recognition. This approach is attractive because, unlike standard frame-based cepstral speaker recognition models, it normalizes for the choice of spoken words in text-independent speaker verification. Affine transforms are computed for the Gaussian means of the acoustic models used in a re...
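
To make the idea of "transforms as features" concrete, the sketch below flattens one affine transform (matrix `A` and bias `b`) into a single vector that a speaker classifier could consume. The row-by-row layout and the function name are assumptions for illustration; the abstract does not specify how the coefficients are arranged.

```python
def transform_to_feature(A, b):
    """Flatten the affine transform [A | b] row by row into one vector."""
    feature = []
    for row, bias in zip(A, b):
        feature.extend(row)    # coefficients of this output dimension
        feature.append(bias)   # followed by its bias term
    return feature


# A 2-dimensional transform yields 2 * (2 + 1) = 6 features.
A = [[1.0, 0.1],
     [0.0, 0.9]]
b = [0.5, -0.2]
feature = transform_to_feature(A, b)
```

For a d-dimensional feature space each transform contributes d(d + 1) coefficients, so the vector length depends on the number of transforms and not on how much speech was spoken.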

Speaker Adaptation of Various Components in Deep Neural Network based Speech Synthesis

2016

Shinji Takaki, SangJin Kim, Junichi Yamagishi

In this paper, we investigate the effectiveness of speaker adaptation for various essential components in deep neural network based speech synthesis, including acoustic models, acoustic feature extraction, and post-filters. In general, a speaker adaptation technique, e.g., maximum likelihood linear regression (MLLR) for HMMs or learning hidden unit contributions (LHUC) for DNNs, is applied to a...

Within-class covariance normalization for SVM-based speaker recognition

2006

Andrew O. Hatch, Sachin S. Kajarekar, Andreas Stolcke

This paper extends the within-class covariance normalization (WCCN) technique described in [1, 2] for training generalized linear kernels. We describe a practical procedure for applying WCCN to an SVM-based speaker recognition system where the input feature vectors reside in a high-dimensional space. Our approach involves using principal component analysis (PCA) to split the original feature sp...