HMM-based speech enhancement

Improved modelling of speech dynamics using non-linear formant trajectories for HMM-based speech synthesis

2010

Hongwei Hu, Martin J. Russell

This paper describes the use of non-linear formant trajectories to model speech dynamics. The performance of the non-linear formant dynamics model is evaluated using HMM-based speech synthesis experiments, in which the 12-dimensional parallel formant synthesiser control parameters and their time derivatives are used as the feature vectors in the HMM. Two types of formant synthesiser control par...


Combined speech enhancement and auditory modelling for robust distributed speech recognition

Journal: Speech Communication, 2008

Ronan Flynn, Edward Jones

The performance of Automatic Speech Recognition (ASR) systems in the presence of noise has attracted considerable research interest. Additive noise from interfering noise sources and convolutional noise arising from transmission channel characteristics both degrade the performance of ASR systems. This paper addresses the problem of robustness of speech recognitio...


Tree-structured noise-adapted HMM modeling for piecewise linear-transformation-based adaptation

2003

Zhipeng Zhang, Kiyotaka Otsuji, Sadaoki Furui

This paper proposes the application of tree-structured clustering to various noise samples or noisy speech in the framework of piecewise-linear transformation (PLT)-based noise adaptation. According to the clustering results, a noisy speech HMM is made for each node of the tree structure. Based on the likelihood maximization criterion, the HMM that best matches the input speech is selected by t...


Advances in subword-based HMM-DNN speech recognition across languages

Journal: Computer Speech & Language, 2021

We describe a novel way to implement subword language models in speech recognition systems based on weighted finite state transducers, hidden Markov models, and deep neural networks. The acoustic models are built on graphemes so that no pronunciation dictionaries are needed, and they can be used together with any type of subword language model, including character models. The advantages of short subword units are good lexical coverage, reduced data s...


HMM-based Finnish text-to-speech system utilizing glottal inverse filtering

2008

Tuomo Raitio, Antti Suni, Hannu Pulakka, Martti Vainio, Paavo Alku

This paper describes an HMM-based speech synthesis system that utilizes glottal inverse filtering for generating natural-sounding synthetic speech. In the proposed system, speech is first parametrized into spectral and excitation features using a method based on glottal inverse filtering. The parameters are fed into an HMM system for training and then generated from the trained HMM according to te...


A Hybrid HMM/BN Acoustic Model for Automatic Speech Recognition

2005

Konstantin MARKOV, Satoshi NAKAMURA

In current HMM-based speech recognition systems, it is difficult to supplement acoustic spectrum features with additional information such as pitch, gender, or articulator positions. On the other hand, Bayesian Networks (BN) allow for easy combination of different continuous as well as discrete features by exploring conditional dependencies between them. However, the lack of efficient algorit...


Parameterization of vocal fry in HMM-based speech synthesis

2009

Hanna Silén, Elina Helander, Jani Nurminen, Moncef Gabbouj

HMM-based speech synthesis offers a way to generate speech with different voice qualities. However, databases sometimes contain inherent voice qualities that need to be parametrized properly. One example is vocal fry, which typically occurs at the end of utterances. A popular mixed excitation vocoder for HMM-based speech synthesis is STRAIGHT. The standard STRAIGHT is optimized for ...


HMM-based visual speech synthesis using dynamic visemes

2015

Ausdang Thangthai, Barry-John Theobald

In this paper we incorporate dynamic visemes into hidden Markov model (HMM)-based visual speech synthesis. Dynamic visemes represent intuitive visual gestures identified automatically by clustering purely visual speech parameters. They have the advantage of spanning multiple phones and so they capture the effects of visual coarticulation explicitly within the unit. The previous application of d...


Analysis of spectral enhancement using global variance in HMM-based speech synthesis

2014

Takashi Nose, Akinori Ito

This paper analyzes the problem of the spectral enhancement technique using global variance (GV) in HMM-based speech synthesis. In conventional GV-based parameter generation, spectral enhancement with variance compensation is achieved by considering a GV pdf with fixed parameters for every output utterance throughout the generation process. Although the spectral peaks of the generated traject...


Speech recognition with support vector machines in a hybrid system

2005

Sven E. Krüger, Martin Schafföner, Marcel Katz, Edin Andelic, Andreas Wendemuth

While the temporal dynamics of speech can be represented very efficiently by Hidden Markov Models (HMMs), the classification of speech into single speech units (phonemes) is usually done with Gaussian mixture models, which do not discriminate well. Here, we use Support Vector Machines (SVMs) for classification by integrating this method into an HMM-based speech recognition system. In this hybrid SV...
