mel frequency cepstral coefficient

Sinusoidal model parameterization for HMM-based TTS system

2010

Slava Shechtman Alexander Sorin

A sinusoidal representation of speech is an alternative to the source-filter model. It is widely used in speech coding and unit-selection TTS, but is less common in statistical TTS frameworks. In this work we utilize Regularized Cepstral Coefficients (RCC) estimated in mel-frequency scale for amplitude spectrum envelope modeling within an HMM-based TTS platform. Improved subjective quality for ...


Perceptually Based Pitch Scales in Cepstral Techniques for Percussive Timbre Identification

2009

William Brent

Different types of cepstral analysis are compared in the context of a percussion instrument classification external for Pd. For raw cepstrum, mel frequency cepstrum, DCT-based cepstrum, and bark frequency cepstrum, various parameter settings are applied to a standardized test. Significant score improvement can be seen when moving from cepstrum to mel cepstrum, and further improvement is achieve...


Audio-based Event Recognition System for Smart Homes

2017

Anastasios Vafeiadis Konstantinos Votis Dimitrios Giakoumis Dimitrios Tzovaras Liming Chen Raouf Hamzaoui

Building an acoustic-based event recognition system for smart homes is a challenging task due to the lack of high-level structures in environmental sounds. In particular, the selection of effective features is still an open problem. We make an important step toward this goal by showing that the combination of Mel-Frequency Cepstral Coefficients, Zero-Crossing Rate, and Discrete Wavelet Transform...
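As a rough illustration of one of the features named in this abstract, the zero-crossing rate of a frame can be sketched as below. The function name and the convention of treating zero as non-negative are assumptions made here for illustration; the truncated abstract does not give the authors' exact definition.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Assumption: a sample equal to zero is treated as non-negative.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


if __name__ == "__main__":
    # An alternating frame changes sign between every pair of samples.
    print(zero_crossing_rate([1.0, -1.0, 1.0, -1.0]))
```

A noisy or fricative-like frame tends to give a high rate, while a low-pitched tonal frame gives a low one, which is why the feature is often paired with spectral features such as MFCCs.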


Modified MFCC Methods Based on KL-Transform and Power Law for Robust Speech Recognition

2014

John Sahaya Rani Alex Nithya Venkatesan

This paper presents robust feature extraction techniques, called Mel Power Karhunen-Loève Transform Coefficients (MPKC) and Mel Power Coefficients (MPC), for isolated digit recognition. This hybrid method involves Stevens’ Power Law of Hearing and the Karhunen-Loève (KL) Transform to improve noise robustness. We have evaluated the proposed methods on a Hidden Markov Model (HMM) based isolated digit r...


Speaker Recognition System Based On MFCC and DCT

2013

Garima Vyas Barkha Kumari

This paper examines and presents an approach to the recognition of speech signals using spectral information on the mel frequency scale, which is a dominant feature for speech recognition. Mel-frequency cepstral coefficients (MFCCs) are the coefficients that collectively represent the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear m...
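The MFCC definition quoted in this abstract (a cosine transform of a log power spectrum on a nonlinear mel scale) can be sketched for a single frame as follows. This is a minimal illustration and not the paper's implementation: the mel formula 2595·log10(1 + f/700) is one common convention, the filter and coefficient counts are arbitrary defaults, windowing and pre-emphasis are omitted, and a naive DFT is used to keep the example free of third-party libraries. All function names are hypothetical.

```python
import cmath
import math


def hz_to_mel(hz):
    # One common mel-scale convention (an assumption, not from the source).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive O(N^2) DFT power spectrum for bins 0..N/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(frame)
        )
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [
        mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)
    ]
    bin_hz = sample_rate / n_fft
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            f = k * bin_hz
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, num_filters=20, num_coeffs=12):
    """Log mel filterbank energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # A small floor avoids log(0) for filters that capture no energy.
    log_energies = [
        math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
        for row in bank
    ]
    return [
        sum(
            e * math.cos(math.pi * c * (m + 0.5) / num_filters)
            for m, e in enumerate(log_energies)
        )
        for c in range(num_coeffs)
    ]


if __name__ == "__main__":
    rate = 8000
    tone = [math.sin(2 * math.pi * 440 * t / rate) for t in range(256)]
    print(mfcc(tone, rate))
```

In practice a full front end would also apply a window to each frame and use an FFT; those steps are left out here to keep the sketch short.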


Evaluation of a speech recognition / generation method based on HMM and STRAIGHT

2002

Toshio Irino Yasuhiro Minami Tomohiro Nakatani Minoru Tsuzaki H. Tagawa

We propose a method for integrating speech recognition and generation within a unified framework. The method consists of STRAIGHT, warped-frequency DCT, and an HMM engine. The warped-frequency DCT is used to derive a kind of mel-cepstral coefficient from the smoothed spectrum of STRAIGHT, which is known as a high-quality vocoder. This analysis/synthesis method has potential to improve the perfo...


On the complementarity of short-time Fourier analysis windows of different lengths for improved language recognition

2014

Mireia Díez Mikel Peñagarikano Germán Bordel Amparo Varona Luis Javier Rodríguez-Fuentes

Previous works have shown that remarkable performance improvements can be attained in speaker and language recognition tasks by combining several heterogeneous systems that provide complementary information. In this work, the complementarity of several i-vector language recognition systems, using Mel-Frequency Cepstral Coefficient (MFCC) features computed on Short-Time Fourier Analysis windows o...


Articulation Rate Filtering of CQCC Features for Automatic Speaker Verification

2016

Massimiliano Todisco Héctor Delgado Nicholas W. D. Evans

This paper introduces a new articulation rate filter and reports its combination with recently proposed constant Q cepstral coefficients (CQCCs) in their first application to automatic speaker verification (ASV). CQCC features are extracted with the constant Q transform (CQT), a perceptually-inspired alternative to Fourier-based approaches to time-frequency analysis. The CQT offers greater freq...


Similarity Measurement for Speaker Identification Using Frequency of Vector Pairs

2014

Inggih Permana Agus Buono Bib Paruhum Silalahi

Similarity measurement is an important part of speaker identification. This study has modified the similarity measurement technique performed in previous studies. Previous studies used the sum of the smallest distance between the input vectors and the codebook vectors of a particular speaker. In this study, the technique has been modified by selecting a particular speaker codebook which has the...
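The baseline measure described in this abstract, the sum of the smallest distances between input vectors and one speaker's codebook vectors, might look like the sketch below. Euclidean distance, the rule of picking the speaker with the smallest sum, and all names are assumptions for illustration; the modified technique is cut off in the abstract and is not reproduced here.

```python
import math


def min_distance_sum(input_vectors, codebook):
    """Sum, over input vectors, of the distance to the nearest codebook vector.

    Assumption: Euclidean distance (the abstract does not name the metric).
    """
    return sum(
        min(math.dist(v, c) for c in codebook) for v in input_vectors
    )


def identify_speaker(input_vectors, codebooks):
    """Return the speaker whose codebook gives the smallest distance sum.

    `codebooks` is a hypothetical mapping from speaker name to a list of
    codebook vectors.
    """
    return min(
        codebooks,
        key=lambda name: min_distance_sum(input_vectors, codebooks[name]),
    )


if __name__ == "__main__":
    books = {
        "speaker_a": [(0.0, 0.0), (1.0, 1.0)],
        "speaker_b": [(5.0, 5.0), (6.0, 6.0)],
    }
    print(identify_speaker([(0.1, 0.0), (0.9, 1.0)], books))
```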


Sinusoidal model parameterization for HMM-based TTS system (Interspeech 2010)

2010

Slava Shechtman Alex Sorin

A sinusoidal representation of speech is an alternative to the source-filter model. It is widely used in speech coding and unit-selection TTS, but is less common in statistical TTS frameworks. In this work we utilize Regularized Cepstral Coefficients (RCC) estimated in mel-frequency scale for amplitude spectrum envelope modeling within an HMM-based TTS platform. Improved subjective quality for ...
