speaker transformation

Boosting Speaker Recognition Performance with Compact Representations

2011

Sibel Yaman Jason W. Pelecanos Mohamed Kamal Omar

This paper describes a speaker recognition system combination approach in which compact forms of MAP-adapted GMM supervectors are used to boost the performance of a high-dimensional supervector-based system or a combination of multiple systems. The compact supervector representations are subjected to a diagonal transformation to emphasize those dimensions that describe significant speaker in...

Cross-variety speaker transformation in HSMM-based speech synthesis

2013

Markus Toman Michael Pucher Dietmar Schabus

We present and compare different approaches for cross-variety speaker transformation in Hidden Semi-Markov Model (HSMM) based speech synthesis that allow an arbitrary speaker's voice to be transformed from one variety to another. The methods developed are applied to three different varieties, namely standard Austrian German, one Middle Bavarian (Upper Austria, Bad Goisern) and one Sout...

Prior parameter transformation for unsupervised speaker adaptation

2000

Guoqiang Li Limin Du Ziqiang Hou

In a strictly Bayesian approach, prior parameters are assumed known, based on common or subjective knowledge. A practical solution for maximum a posteriori adaptation methods, however, is to adopt an empirical Bayesian approach, where the prior parameters are estimated directly from the training speech data itself. This raises the problem of mismatch between training and testing conditions in the use of ...

Temporal structure constrained transformation for speaker adaptation

2003

Eric H. C. Choi Trym Holter Julien Epps Arun Gopalakrishnan

In this paper we suggest that rather than modeling speaker mismatch as an affine transform of the entire feature vector, it can be modeled by an affine transform of the static coefficients with additional constraints imposed by the temporal relationships of the streams of coefficients. This results in the different streams sharing the same rotation matrix, and thus reduces the complexity and me...
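
As a hedged illustration of why the streams can share one matrix, assume (this is not stated in the abstract) that dynamic coefficients are computed as a fixed linear combination of neighbouring static frames with weights $w_k$ that sum to zero, as in the usual regression-based deltas. The symbols $A$, $b$, $x_t$ and $w_k$ are introduced here only for the sketch:

```latex
\begin{align*}
\hat{x}_t &= A\,x_t + b
  && \text{(affine transform of the static coefficients)} \\
\Delta x_t &= \sum_{k=-K}^{K} w_k\, x_{t+k}, \qquad \sum_{k=-K}^{K} w_k = 0
  && \text{(assumed delta computation)} \\
\Delta \hat{x}_t &= \sum_{k=-K}^{K} w_k \left( A\,x_{t+k} + b \right)
  = A\,\Delta x_t + b \sum_{k=-K}^{K} w_k
  = A\,\Delta x_t
\end{align*}
```

Under these assumptions the delta stream inherits the same matrix $A$ with no bias term, so only one matrix has to be estimated and stored for all streams.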

Recursive Whitening Transformation for Speaker Recognition on Language Mismatched Condition

2017

Suwon Shon Seongkyu Mun Hanseok Ko

Recently in speaker recognition, performance degradation due to channel-domain mismatch has been actively addressed. However, mismatch arising from language has yet to be sufficiently addressed. This paper proposes an approach that employs a recursive whitening transformation to mitigate language mismatch. The proposed method is based on the multiple whitenin...

Text-independent voice conversion using speaker model alignment method from non-parallel speech

2014

Peng Song Yun Jin Wenming Zheng Li Zhao

In this paper, we propose a novel voice conversion method called speaker model alignment (SMA), which does not require parallel training speech. First, the source and target speaker models, each described by a Gaussian mixture model (GMM), are trained. Then, the transformation function of spectral features is learned by aligning the components of source and target speaker models iterat...

A new spectral transformation for speaker normalization

2003

Pierre L. Dognin Amro El-Jaroudi

This paper proposes a new spectral transformation for speaker normalization. We use the Bilinear Transformation (BLT) to introduce a new frequency warping resulting from the mapping of a prototype Band-Pass (BP) filter into a general BP filter. This new transformation, called the “Band-Pass Transform” (BPT), offers two degrees of freedom, enabling complex warpings of the frequency axis and different fro...
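
For context, the classic one-parameter bilinear (first-order all-pass) frequency warping can be sketched as below. This is not the paper's BPT, which the abstract says has two degrees of freedom; it is only the familiar baseline warp, written under one common sign convention. The function name `allpass_warp` and the parameter `alpha` are hypothetical names chosen for this sketch.

```python
import math


def allpass_warp(omega, alpha):
    """Warp a normalized frequency omega (radians, 0..pi).

    Uses the first-order all-pass (bilinear) map with |alpha| < 1:
        omega' = omega + 2 * atan(alpha * sin(omega) / (1 - alpha * cos(omega)))
    alpha = 0 leaves the axis unchanged; with this convention alpha > 0
    pushes interior frequencies upward.
    """
    if not -1.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy |alpha| < 1")
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


if __name__ == "__main__":
    # Print the warped axis at a few points for an illustrative alpha.
    for k in range(5):
        w = k * math.pi / 4
        print(f"{w:.3f} -> {allpass_warp(w, 0.3):.3f}")
```

The endpoints 0 and pi stay fixed, so a single parameter can only bend the axis in one direction, which is the limitation a two-parameter warp would relax.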

Transformation Sharing Strategies for MLLR Speaker Adaptation

2007

Arindam Mandal Mari Ostendorf Andreas Stolcke Jeffrey Bilmes

Maximum Likelihood Linear Regression (MLLR) estimates linear transformations of automatic speech recognition (ASR) parameters and has achieved significant performance improvements in speaker-independent ASR systems by adapting to target...
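
A minimal sketch of what sharing a transformation means in practice, assuming the common MLLR mean update `mu' = A * mu + b` applied to every Gaussian assigned to the same class. Estimating `A` and `b` from adaptation data is not shown, and the helper names `mllr_adapt_mean` and `adapt_class` are hypothetical.

```python
def mllr_adapt_mean(A, b, mu):
    """Return A @ mu + b for one Gaussian mean (plain Python lists)."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]


def adapt_class(means, A, b):
    """Apply one shared transform (A, b) to every mean in a class."""
    return [mllr_adapt_mean(A, b, mu) for mu in means]


if __name__ == "__main__":
    # Illustrative values only: a 2-D transform shared by two Gaussians.
    A = [[1.0, 0.0],
         [0.0, 2.0]]
    b = [0.5, -1.0]
    class_means = [[1.0, 2.0], [0.0, 0.0]]
    print(adapt_class(class_means, A, b))
```

Because all Gaussians in a class reuse the same `(A, b)`, how Gaussians are grouped into classes determines how many parameters must be estimated from the limited adaptation data.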

Voice conversion with smoothed GMM and MAP adaptation

2003

Yining Chen Min Chu Eric Chang Jia Liu Runsheng Liu

In most state-of-the-art voice conversion systems, the speech quality of converted utterances is still unsatisfactory. In this paper, the STRAIGHT analysis-synthesis framework is used to improve the quality. A smoothed GMM with MAP adaptation is proposed for spectrum conversion to avoid the over-smoothing seen in the traditional GMM method. Since frames are processed independently, the GMM based tr...
