Mel-frequency cepstrum (MFCC)

Comparing MFCC and MPEG-7 audio features for feature extraction, maximum likelihood HMM and entropic prior HMM for sports audio classification

2003

Ziyou Xiong Regunathan Radhakrishnan Ajay Divakaran Thomas S. Huang

We present a comparison of 6 methods for classification of sports audio. For the feature extraction we have two choices: MPEG-7 audio features and Mel-scale Frequency Cepstrum Coefficients (MFCC). For the classification we also have two choices: Maximum Likelihood Hidden Markov Models (ML-HMM) and Entropic Prior HMM (EP-HMM). EP-HMM, in turn, has two variations: with and without trimming of the m...


An Efficient Speech Recognition System

2013

Suma Swamy K. V. Ramakrishnan

This paper describes the development of an efficient speech recognition system using different techniques such as Mel Frequency Cepstrum Coefficients (MFCC), Vector Quantization (VQ) and Hidden Markov Model (HMM). This paper explains how speaker recognition followed by speech recognition is used to recognize the speech faster, efficiently and accurately. MFCC is used to extract the characterist...


Robust speech/non-speech detection using LDA applied to MFCC

2001

Arnaud Martin Delphine Charlet Laurent Mauuary

In speech recognition, speech/non-speech detection must be robust to noise. In this work, a new method for speech/non-speech detection using a Linear Discriminant Analysis (LDA) applied to Mel Frequency Cepstrum Coefficients (MFCC) is presented. The energy is the most discriminant parameter between noise and speech. But with this single parameter, the speech/non-speech detection system detects...


MFCC and its applications in speaker recognition

2010

Vibha Tiwari

Speech processing has emerged as one of the important application areas of digital signal processing. Research fields in speech processing include speech recognition, speaker recognition, speech synthesis, speech coding, etc. The objective of automatic speaker recognition is to extract, characterize and recognize the information about speaker identity. Feature extraction is the first step ...
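
As a rough illustration of the MFCC feature-extraction step that this entry refers to, the following is a minimal NumPy sketch of a common textbook pipeline (pre-emphasis, framing, Hamming window, power spectrum, triangular mel filterbank, log, DCT-II). It is not taken from the paper: the function names and all parameter values (16 kHz sampling, 25 ms frames, 10 ms hop, 26 filters, 13 coefficients, 0.97 pre-emphasis) are assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (one of several conventions in use).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising slope
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def dct_matrix(n_out, n_in):
    # Unnormalized DCT-II basis, one row per output coefficient.
    n = np.arange(n_in)
    k = np.arange(n_out)[:, None]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    # Assumed defaults; the signal must contain at least frame_len samples.
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis boosts high frequencies before analysis.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each zero-padded frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(np.maximum(energies, 1e-10))  # floor avoids log(0)
    return log_energies @ dct_matrix(n_ceps, n_filters).T
```

Calling `mfcc(x)` on a one-dimensional array `x` returns one row of `n_ceps` coefficients per frame. Production systems typically add liftering, energy terms, and delta features, none of which are shown here.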


Robust speech/non-speech detection using LDA applied to MFCC for continuous speech recognition

2001

Arnaud Martin Géraldine Damnati Laurent Mauuary

Continuous speech recognition applications need precise detection because the number of words to recognize is unknown and vocabulary words can be short. The speech/non-speech detection must be robust to the boundary precision. In this work, a new approach to evaluating detection algorithms for continuous speech recognition is presented. The speech/non-speech detection using energy parameter combin...


Optimizing feature complementarity by evolution strategy: Application to automatic speaker verification

Journal: Speech Communication 2009

Christophe Charbuillet Bruno Gas Mohamed Chetouani Jean-Luc Zarader

Conventional automatic speaker verification systems are based on cepstral features like Mel-scale Frequency Cepstrum Coefficients (MFCC) or Linear Predictive Cepstrum Coefficients (LPCC). Recently published work has shown that the use of complementary features can significantly improve system performance. In this paper, we propose to use an evolution strategy to optimize the complementarity of ...


In Search of a Perceptual Metric for Timbre: Dissimilarity Judgments among Synthetic Sounds with MFCC-Derived Spectral Envelopes

2012

Hiroko Terasawa Jonathan Berger Shoji Makino

This paper presents a quantitative metric to describe the multidimensionality of spectral envelope perception, that is, the perception specifically related to the spectral element of timbre. Mel-cepstrum (Mel-frequency cepstral coefficients or MFCCs) is chosen as a hypothetical metric for spectral envelope perception due to its desirable properties of linearity, orthogonality, and multidimensio...


Text-dependent speaker verification using feature selection with recognition related criterion

2004

Yaniv Zigel Arnon Cohen

Speaker verification and identification systems most often employ HMMs and GMMs as recognition engines. This paper describes an algorithm for the optimal selection of the feature space, suitable for these engines. In verification systems, each speaker (target) is assigned an “individual” optimal feature space in which he/she is best discriminated against impostors. Several feature selection pro...


Combination of temporal trajectory filtering and projection measure for robust speaker identification

2000

Kuo-Hwei Yuo Tai-Hwei Hwang Hsiao-Chuan Wang

This paper presents a method that combines the techniques of temporal trajectory filtering and projection measure for robust speaker identification. The proposed robust feature, called Relative Autocorrelation Sequence Mel-scale Frequency Cepstral Coefficients (RAS-MFCC), is derived based on filtering the temporal trajectories of short-time one-sided autocorrelation sequences. This filtering pr...
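
To make the idea of filtering temporal trajectories of one-sided autocorrelation sequences more concrete, here is a small NumPy sketch. It is only an illustration under stated assumptions: the truncated abstract does not specify the filter, so a regression-style difference filter across frames is assumed, and the function names `one_sided_autocorr` and `filter_trajectories` and the `half_width` parameter are hypothetical.

```python
import numpy as np

def one_sided_autocorr(frames, max_lag):
    # frames: array of shape (n_frames, frame_len).
    # Returns r[m, k] = sum_n x_m[n] * x_m[n + k] for lags k = 0..max_lag.
    n = frames.shape[1]
    return np.stack(
        [np.sum(frames[:, : n - k] * frames[:, k:], axis=1)
         for k in range(max_lag + 1)],
        axis=1,
    )

def filter_trajectories(acs, half_width=2):
    # Assumed filter: a regression-style difference applied along the frame
    # axis, independently for each lag. Because its taps sum to zero, any
    # component that is constant over time is removed.
    taps = np.arange(-half_width, half_width + 1)
    padded = np.pad(acs, ((half_width, half_width), (0, 0)), mode="edge")
    out = np.zeros_like(acs, dtype=float)
    for t in taps:
        start = half_width + t
        out += t * padded[start : start + acs.shape[0]]
    return out / np.sum(taps ** 2)
```

With this kind of filter, the autocorrelation contribution of a stationary additive noise, which is roughly constant from frame to frame, is suppressed, while time-varying speech components pass through. How the filtered sequences are then converted to cepstral coefficients is not described in the visible part of the abstract and is not sketched here.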


Blind source mobile device identification based on recorded call

Journal: Eng. Appl. of AI 2014

Mehdi Jahanirad Ainuddin Wahid Abdul Wahab Nor Badrul Anuar Mohd Yamani Idna Idris Mohamad Nizam Ayub

Mel-frequency cepstrum coefficients (MFCCs) extracted from speech recordings have been proven to be the most effective feature set to capture the frequency spectra produced by a recording device. This paper claims that audio evidence such as a recorded call contains intrinsic artifacts at both transmitting and receiving ends. These artifacts allow recognition of the source mobile device on the o...
