MLLR transformation

Speaker and Noise Factorisation for Robust Speech Recognition

2012

Yongqiang Wang

Speech recognition systems need to operate in a wide range of conditions. Thus they should be robust to extrinsic variability caused by various acoustic factors, for example speaker differences, transmission channel and background noise. For many scenarios, multiple factors simultaneously impact the underlying “clean” speech signal. This paper examines techniques to handle both speaker and back...


Structural maximum a-posteriori linear regression for unsupervised speaker adaptation

2000

Tor André Myrvoll Olivier Siohan Chin-Hui Lee Wu Chou

In this paper we introduce an approach to transformation-based model adaptation. Previously published schemes like MLLR define a set of affine transformations to be applied to clusters of model parameters. Although it has been shown that this approach can yield good results when adaptation data is scarce, an inherent problem needs to be considered: the number of transformations used ...
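The idea of applying one affine transformation per cluster of model parameters can be illustrated with a minimal sketch. It assumes that the parameters being adapted are Gaussian mean vectors and that each mean has already been assigned to a cluster; the function name `apply_mllr_means`, the data layout, and the toy numbers are hypothetical and are not taken from the paper.

```python
def apply_mllr_means(means, cluster_of, transforms):
    """Return adapted means: mu' = A @ mu + b, with (A, b) chosen per cluster.

    means      -- list of mean vectors (lists of floats)
    cluster_of -- cluster index for each mean (assumed to be given)
    transforms -- dict: cluster index -> (A, b), A a square matrix, b a bias
    """
    adapted = []
    for mu, c in zip(means, cluster_of):
        A, b = transforms[c]
        # One matrix-vector product plus bias per Gaussian mean.
        adapted.append([
            sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)
        ])
    return adapted


# Toy example: three 2-D means sharing two transforms.
means = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
cluster_of = [0, 0, 1]
transforms = {
    0: ([[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5]),  # identity matrix, bias shift only
    1: ([[2.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),   # scales the first dimension
}
adapted = apply_mllr_means(means, cluster_of, transforms)
print(adapted)  # [[0.5, 0.5], [2.5, 2.5], [8.0, 5.0]]
```

Because every mean in a cluster shares the same `(A, b)`, the number of free parameters depends on the number of clusters rather than on the number of Gaussians, which is why the choice of that number matters when adaptation data is scarce.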


Refining phoneme segmentations using speaker-adaptive context dependent boundary models

2005

Yong Zhao Lijuan Wang Min Chu Frank K. Soong Zhigang Cao

Consistent phoneme segmentation is essential in building high-quality Text-to-Speech (TTS) voice fonts. In this paper we propose to adapt an existing well-trained Context Dependent Boundary Model (CDBM) for refining segment boundaries to a new speaker with limited, manually segmented data. Three adaptation approaches (MLLR, MAP, and a combination of the two) are studied. The combined one, MLLR+...


Factored MLLR Adaptation for HMM-Based Speech Synthesis in Naval-IT Fusion Technology

Journal: The Journal of Korea Information and Communications Society, 2013


Broadcast news transcription using HTK

1997

Philip C. Woodland Mark J. F. Gales David Pye Steve J. Young

This paper examines the issues in extending a large vocabulary speech recognition system designed for clean and noisy read speech tasks to handle broadcast news transcription. Results using the 1995 DARPA H4 evaluation data set are presented for different front-end analyses and use of unsupervised model adaptation using maximum likelihood linear regression (MLLR). The HTK system for the 1996 H...


A simulated-data adaptation technique for robust speech recognition

2006

Nattanun Thatphithakkul Boontee Kruatrachue Chai Wutiwiwatchai Sanparith Marukatat Vataya Boonpiam

This paper proposes an efficient acoustic model adaptation method based on the use of simulated data in maximum likelihood linear regression (MLLR) adaptation for robust speech recognition. Online MLLR adaptation is an unsupervised process which requires input speech with automatically transcribed phone labels. Instead of using only the input signal in adaptation, our proposed simulated data...


Phoneme Class Based Adaptation for Mismatch Acoustic Modeling of Distant Noisy Speech

2012

Seckin Uluskan John H. L. Hansen

A new adaptation strategy for distant noisy speech is created using phoneme-class based approaches for context-independent acoustic models. Unlike previous approaches such as MLLR-MAP adaptation, which adapt the acoustic model to the features, our phoneme-class based adaptation (PCBA) adapts the distant data features to our acoustic model, which has been trained on close-microphone TIMIT sentences. The ...


Architectural Style Classification Using Multinomial Latent Logistic Regression

2014

Zhe Xu Dacheng Tao Ya Zhang Junjie Wu Ah Chung Tsoi

Architectural style classification differs from standard classification tasks due to the rich inter-class relationships between different styles, such as re-interpretation, revival, and territoriality. In this paper, we adopt Deformable Part-based Models (DPM) to capture the morphological characteristics of basic architectural components and propose Multinomial Latent Logistic Regression (MLLR)...


Bilinear transformation space-based maximum likelihood linear regression frameworks

2009

Hwa Jeon Song Yongwon Jeong Hyung Soon Kim

This paper proposes two types of bilinear transformation space-based speaker adaptation frameworks. In the training session, transformation matrices for speakers are decomposed by a singular value decomposition-based algorithm into the style factor for speakers' characteristics and an orthonormal basis of eigenvectors to control the dimensionality of the canonical model. In the adaptation session, the style ...


Rapid speaker adaptation using MLLR and subspace regression classes

2001

Kwok-Man Wong Brian Kan-Wing Mak

In recent years, various adaptation techniques for hidden Markov modeling with mixture Gaussians have been proposed, most notably MAP estimation and MLLR transformation. When the amount of adaptation data is limited, adaptation can be done by grouping similar Gaussians together to form regression classes and then transforming the Gaussians in groups. The grouping of Gaussians is often determine...
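Grouping Gaussians into regression classes and transforming each group as a whole can be sketched as follows. The sketch makes several simplifying assumptions that are not from the paper: Gaussians are grouped by Euclidean distance of their means to hypothetical class centroids, all covariances are treated as identity, and the shared transform is a bias only (the simplest special case of an MLLR mean transform). The names `nearest` and `class_biases` are illustrative.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
    return dists.index(min(dists))


def class_biases(means, centroids, stats):
    """Estimate one shared bias per regression class.

    stats[g] = (occupancy, observation_sum) for Gaussian g: the summed
    posterior weight and the posterior-weighted sum of adaptation frames.
    Under the identity-covariance assumption, the bias for a class is the
    occupancy-weighted average of (observation - mean) over its Gaussians.
    """
    dim = len(means[0])
    occ = [0.0] * len(centroids)
    diff = [[0.0] * dim for _ in centroids]
    classes = [nearest(mu, centroids) for mu in means]
    for mu, c, (gamma, obs_sum) in zip(means, classes, stats):
        occ[c] += gamma
        for d in range(dim):
            diff[c][d] += obs_sum[d] - gamma * mu[d]
    # A class with no adaptation data keeps a zero bias.
    biases = [[x / n if n > 0 else 0.0 for x in row] for row, n in zip(diff, occ)]
    return classes, biases


# Toy example: the second Gaussian saw no adaptation data at all.
means = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]]
centroids = [[0.5, 0.0], [10.0, 10.0]]
stats = [(2.0, [1.0, 0.0]), (0.0, [0.0, 0.0]), (4.0, [44.0, 36.0])]

classes, biases = class_biases(means, centroids, stats)
adapted = [[m + b for m, b in zip(mu, biases[c])] for mu, c in zip(means, classes)]
print(classes)  # [0, 0, 1]
print(biases)   # [[0.5, 0.0], [1.0, -1.0]]
print(adapted)  # [[0.5, 0.0], [1.5, 0.0], [11.0, 9.0]]
```

The second Gaussian is still moved, because it borrows the bias estimated from the other member of its class. That sharing is what makes group-wise transformation useful when adaptation data is limited.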
