Search results for: mel frequency cepstral coefficient

Number of results: 644186

2010
Slava Shechtman Alexander Sorin

A sinusoidal representation of speech is an alternative to the source-filter model. It is widely used in speech coding and unit-selection TTS, but is less common in statistical TTS frameworks. In this work we utilize Regularized Cepstral Coefficients (RCC) estimated in mel-frequency scale for amplitude spectrum envelope modeling within an HMM-based TTS platform. Improved subjective quality for ...

2009
William Brent

Different types of cepstral analysis are compared in the context of a percussion instrument classification external for Pd. For raw cepstrum, mel frequency cepstrum, DCT-based cepstrum, and bark frequency cepstrum, various parameter settings are applied to a standardized test. Significant score improvement can be seen when moving from cepstrum to mel cepstrum, and further improvement is achieve...

2017
Anastasios Vafeiadis Konstantinos Votis Dimitrios Giakoumis Dimitrios Tzovaras Liming Chen Raouf Hamzaoui

Building an acoustic-based event recognition system for smart homes is a challenging task due to the lack of high-level structures in environmental sounds. In particular, the selection of effective features is still an open problem. We make an important step toward this goal by showing that the combination of Mel-Frequency Cepstral Coefficients, Zero-Crossing Rate, and Discrete Wavelet Transform...

2014
John Sahaya Rani Alex Nithya Venkatesan

This paper presents robust feature extraction techniques, called Mel Power Karhunen-Loeve Transform Coefficients (MPKC) and Mel Power Coefficients (MPC), for isolated digit recognition. This hybrid method involves Stevens’ Power Law of Hearing and the Karhunen-Loeve (KL) Transform to improve noise robustness. We have evaluated the proposed methods on a Hidden Markov Model (HMM) based isolated digit r...

2013
Garima Vyas Barkha Kumari

This paper examines and presents an approach to the recognition of speech signals using frequency spectral information on the mel frequency scale, a dominant feature for speech recognition. Mel-frequency cepstral coefficients (MFCCs) are the coefficients that collectively represent the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear m...
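The definition quoted in this abstract (a cosine transform of a log power spectrum on a mel scale) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the mel formula (the common 2595·log10(1 + f/700) variant), the triangular filterbank, the Hamming window, the unnormalized DCT-II, and all function names and parameter defaults below are assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f_hz):
    # One common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are equally spaced on the mel scale.
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, fft_freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i:i + 3]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def dct_ii(x, n_out):
    # Unnormalized DCT-II: the "linear cosine transform" step.
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    t = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * t + 1) / (2 * n))
    return x @ basis.T

def mfcc_frame(frame, sample_rate, n_filters=26, n_coeffs=13):
    # Short-term power spectrum -> mel filterbank -> log -> cosine transform.
    windowed = frame * np.hamming(frame.size)
    power = np.abs(np.fft.rfft(windowed)) ** 2
    bank = mel_filterbank(n_filters, frame.size, sample_rate)
    log_mel = np.log(bank @ power + 1e-10)  # small floor avoids log(0)
    return dct_ii(log_mel, n_coeffs)

# Usage: one 25 ms frame of a 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(400) / sr
coeffs = mfcc_frame(np.sin(2 * np.pi * 440.0 * t), sr)
```

A full front end would also slide the frame across the signal and often append delta features; those steps are omitted here.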

2002
Toshio Irino Yasuhiro Minami Tomohiro Nakatani Minoru Tsuzaki H. Tagawa

We propose a method for integrating speech recognition and generation within a unified framework. The method consists of STRAIGHT, warped-frequency DCT, and an HMM engine. The warped-frequency DCT is used to derive a kind of mel-cepstral coefficient from the smoothed spectrum of STRAIGHT, which is known as a high-quality vocoder. This analysis/synthesis method has potential to improve the perfo...

2014
Mireia Díez Mikel Peñagarikano Germán Bordel Amparo Varona Luis Javier Rodríguez-Fuentes

Previous works have shown that remarkable performance improvements can be attained in speaker and language recognition tasks by combining several heterogeneous systems that provide complementary information. In this work, the complementarity of several i-vector language recognition systems, using Mel-Frequency Cepstral-Coefficient (MFCC) features computed on Short-Time Fourier Analysis windows o...

2016
Massimiliano Todisco Héctor Delgado Nicholas W. D. Evans

This paper introduces a new articulation rate filter and reports its combination with recently proposed constant Q cepstral coefficients (CQCCs) in their first application to automatic speaker verification (ASV). CQCC features are extracted with the constant Q transform (CQT), a perceptually-inspired alternative to Fourier-based approaches to time-frequency analysis. The CQT offers greater freq...

2014
Inggih Permana Agus Buono Bib Paruhum Silalahi

Similarity measurement is an important part of speaker identification. This study has modified the similarity measurement technique performed in previous studies. Previous studies used the sum of the smallest distance between the input vectors and the codebook vectors of a particular speaker. In this study, the technique has been modified by selecting a particular speaker codebook which has the...
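The baseline described in this abstract (summing, over the input vectors, the smallest distance to a speaker's codebook vectors) might be sketched as follows. Only that baseline is shown, because the description of the modified technique is truncated. The Euclidean distance, the function names, and the dictionary layout are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def codebook_distortion(features, codebook):
    # features: (T, D) input vectors; codebook: (K, D) code vectors.
    # Euclidean distance is an assumption; the abstract does not name a metric.
    diffs = features[:, None, :] - codebook[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)      # (T, K) pairwise distances
    return float(dists.min(axis=1).sum())      # sum of nearest-code distances

def identify_speaker(features, codebooks):
    # codebooks: hypothetical mapping from speaker name to (K, D) array.
    scores = {name: codebook_distortion(features, cb) for name, cb in codebooks.items()}
    best = min(scores, key=scores.get)         # lowest total distortion wins
    return best, scores

# Usage with toy 2-D vectors standing in for MFCC frames.
codebooks = {
    "speaker_a": np.array([[0.0, 0.0], [1.0, 1.0]]),
    "speaker_b": np.array([[5.0, 5.0], [6.0, 6.0]]),
}
features = np.array([[0.1, 0.0], [0.9, 1.0]])
best, scores = identify_speaker(features, codebooks)
```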
