Mel Frequency Cepstral Coefficients (MFCC)

Digit Recognition based on Euclidean and DTW

2015

Sreeja Nair Milind Shah B. S. Atal A. A. M. Abushariah T. S. Gunawan O. O. Khalifa C. P. Lim S. C. Woo A. S. Loh

This paper describes the implementation of two isolated digit recognition techniques and compares the algorithms implemented. Any digit recognition system comprises mainly two stages: feature extraction and similarity evaluation. Here, two feature extraction techniques, namely linear predictive cepstral coefficients (LPCC) and mel frequency cepstral coefficients (MFCC), are implemented...
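The abstract is truncated, so the authors' exact pipeline is not known. As a rough illustration of the "similarity evaluation" stage named in the title, the sketch below matches an utterance against stored templates using dynamic time warping (DTW) with a Euclidean local distance. The function names, the template dictionary, and the toy one-dimensional features are all assumptions made for this example; real systems would use LPCC or MFCC vectors per frame.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Uses the basic step pattern (insertion, deletion, match) with no
    slope constraint or path-length normalization (a simplification).
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch seq_b
                                 cost[i][j - 1],      # stretch seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def recognize(utterance, templates):
    """Return the label of the template with the smallest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))

# Hypothetical toy templates: one feature per frame.
templates = {
    "one": [[0.0], [1.0], [2.0]],
    "two": [[5.0], [5.0], [4.0]],
}
utterance = [[0.0], [0.0], [1.0], [2.0], [2.0]]  # a slower "one"
print(recognize(utterance, templates))
```

DTW tolerates differences in speaking rate because repeated frames can be absorbed by the warping path, which a frame-by-frame Euclidean comparison of equal-length sequences cannot do.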

Emotion recognition in spontaneous speech using GMMs

2006

Daniel Neiberg Kjell Elenius Kornel Laskowski

Automatic detection of emotions has been evaluated using standard Mel-frequency Cepstral Coefficients, MFCCs, and a variant, MFCC-low, calculated between 20 and 300 Hz, in order to model pitch. Also plain pitch features have been used. These acoustic features have all been modeled by Gaussian mixture models, GMMs, on the frame level. The method has been tested on two different corpora and langu...
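To make the 20 to 300 Hz band of the MFCC-low variant concrete, the sketch below spaces filterbank edge frequencies evenly on the mel scale within that band. It uses one widely used mel formula, mel = 2595 log10(1 + f/700); the abstract does not say which formula or how many filters the authors used, so the filter count of 8 and all function names here are assumptions.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mel (one common formula; an assumption here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(f_low, f_high, n_filters):
    """Return n_filters + 2 edge frequencies (Hz), evenly spaced in mel.

    Consecutive triples of edges would define triangular filters.
    """
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + k * step) for k in range(n_filters + 2)]

# Band limits from the abstract; the number of filters is hypothetical.
edges = mel_band_edges(20.0, 300.0, 8)
print([round(f, 1) for f in edges])
```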

Predicting Formant Frequencies from MFCC Vectors

2005

Jonathan Darch Ben P. Milner Xu Shao Saeed Vaseghi Qin Yan

This work proposes a novel method of predicting formant frequencies from a stream of mel-frequency cepstral coefficients (MFCC) feature vectors. Prediction is based on modelling the joint density of MFCCs and formant frequencies using a Gaussian mixture model (GMM). Using this GMM and an input MFCC vector, two maximum a posteriori (MAP) prediction methods are developed. The first method predict...

Performance Evaluation of Bangla Word Recognition Using Different Acoustic Features

2010

Nusrat Jahan Lisa Qamrun Nahar Eity Ghulam Muhammad Mohammad Nurul Huda Chowdhury Mofizur Rahman

This paper describes the preparation of a medium-size Bangla speech corpus and compares the performance of different acoustic features for Bangla word recognition. Most Bangla automatic speech recognition (ASR) systems use a small number of speakers, but 40 speakers selected from a wide area of Bangladesh, where Bangla is used as a native language, are involved here. In the exp...

Speaker Identification Based on Log Area Ratio and Gaussian Mixture Models in Narrow-Band Speech: Speech Understanding / Interaction

2004

David Chow Waleed H. Abdulla

Log area ratio (LAR) coefficients derived from linear prediction coefficients (LPC) are a well-known feature extraction technique used in speech applications. This paper presents a novel way to use the LAR feature in a speaker identification system. Here, instead of using the mel frequency cepstral coefficients (MFCC), the LAR feature is used in a Gaussian mixture model (GMM) based speaker ident...

Emotion Recognition in Spontaneous Speech

2006

Daniel Neiberg Kjell Elenius Inger Karlsson Kornel Laskowski

Automatic detection of emotions has been evaluated using standard Mel-frequency Cepstral Coefficients, MFCCs, and a variant, MFCC-low, that is calculated between 20 and 300 Hz in order to model pitch. Plain pitch features have been used as well. These acoustic features have all been modeled by Gaussian mixture models, GMMs, on the frame level. The method has been tested on two different corpora...

Combining Evidence from Spectral and Source-Like Features for Person Recognition from Humming

2011

Hemant A. Patil Maulik C. Madhavi Keshab K. Parhi

In this paper, the hum of a person is used in a voice biometric system. In addition, a recently proposed feature set, i.e., Variable length Teager Energy Based Mel Frequency Cepstral Coefficients (VTMFCC), is found to capture perceptually meaningful source-like information from the hum signal. For person recognition, MFCC gives an EER of 13.14% and a %ID of 64.96%. A reduction in equal error rate (EER) by 0.2% ...
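For readers unfamiliar with the metric, the equal error rate (EER) is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below approximates it by sweeping a threshold over observed scores. The scores are made up for illustration and are unrelated to the paper's results; the function name and the "higher score means same person" convention are assumptions.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping a threshold over all observed scores.

    genuine:  scores for same-person trials (higher = more similar, assumed)
    impostor: scores for different-person trials
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)      # genuine rejected
        far = sum(s >= t for s in impostor) / len(impostor)   # impostor accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer

# Hypothetical scores, not taken from the paper.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
print(equal_error_rate(genuine_scores, impostor_scores))  # 0.25
```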

[Nonlinear acoustic analysis in the evaluation of occupational voice disorders].

Journal: Medycyna pracy 2013

Ewa Niebudek-Bogusz Jacek Grygiel Paweł Strumiłło Mariola Sliwińska-Kowalska

BACKGROUND: Over recent years, numerous papers have stressed that voice production is subject to nonlinear processes, which cause aperiodic vibrations of the vocal folds. These vibrations cannot always be characterized by means of conventional acoustic parameters, such as measurements of frequency and amplitude perturbations. Thus, special attention has recently been paid to nonlinear acoust...

加成性雜訊環境下運用特徵參數統計補償法於強健性語音辨識 (Feature Statistics Compensation for Robust Speech Recognition in Additive Noise Environments) [In Chinese]

2007

Tsung-hsueh Hsieh Jeih-Weih Hung

In this paper, we propose several compensation approaches to alleviate the effect of additive noise on speech features for speech recognition. These approaches are simple yet efficient noise reduction techniques that use online-constructed pseudo-stereo codebooks to evaluate the statistics in both clean and noisy environments. The process yields transforms for noise-corrupted speech features to...

Improving the Performance of MFCC for Persian Robust Speech Recognition

Journal: Journal of AI and Data Mining 2015

D. Darabian H. Marvi M. Sharif Noughabi

The mel frequency cepstral coefficients are the most widely used feature in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in automatic speech recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
