mel frequency cepstral coefficient mfcc

Search results for: mel frequency cepstral coefficient mfcc

Number of results: 644930

Speaker Recognition System Based On MFCC and DCT

2013

Garima Vyas Barkha Kumari

This paper examines and presents an approach to the recognition of speech signals using frequency spectral information on the Mel frequency scale. It is a dominant feature for speech recognition. Mel-frequency cepstral coefficients (MFCCs) are the coefficients that collectively represent the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear m...
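The definition in this abstract (a cosine transform of a log power spectrum on a mel scale) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Hamming window, the 2595/700 mel formula, the triangular filters, and the defaults of 20 filters and 13 coefficients are all assumptions chosen for illustration, and a naive DFT stands in for an FFT to keep the example dependency-free.

```python
import cmath
import math


def hz_to_mel(f_hz):
    # One common mel-scale convention (assumed here, not taken from the paper).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    # Naive O(N^2) DFT over the non-negative frequency bins of one frame.
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(abs(acc) ** 2 / n)
    return spec


def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are equally spaced on the mel scale.
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for i in range(1, n_filters + 1):
        lo, mid, hi = edges[i - 1], edges[i], edges[i + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    n = len(frame)
    # Hamming window to reduce spectral leakage.
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * t / (n - 1)))
                for t, x in enumerate(frame)]
    spec = power_spectrum(windowed)
    bank = mel_filterbank(n_filters, n, sample_rate)
    # Log of the energy in each mel band (small floor avoids log(0)).
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
    # Unnormalized DCT-II of the log mel energies gives the cepstral coefficients.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]


# Usage: one 256-sample frame of a 440 Hz sine sampled at 8 kHz.
frame = [math.sin(2.0 * math.pi * 440.0 * t / 8000.0) for t in range(256)]
coeffs = mfcc_frame(frame, 8000)
```

In practice a library routine with an FFT, pre-emphasis, and liftering would normally replace this sketch; the point here is only the order of operations: window, power spectrum, mel filter bank, logarithm, cosine transform.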

Modified Mel Filter Bank to Compute MFCC of Subsampled Speech

Journal: :CoRR 2014

Kiran Kumar Bhuvanagiri Sunil Kumar Kopparapu

Mel Frequency Cepstral Coefficients (MFCCs) are the most popularly used speech features in most speech and speaker recognition applications. In this work, we propose a modified Mel filter bank to extract MFCCs from subsampled speech. We also propose a stronger metric which effectively captures the correlation between MFCCs of original speech and MFCC of resampled speech. It is found that the pr...

Selective gammatone filterbank feature for robust sound event recognition

2010

Yi Ren Leng Tran Huy Dat Norihide Kitaoka Haizhou Li

This paper introduces a novel feature based on the raw output of the gammatone filterbank. Channel selection is used to enhance robustness over a range of signal-to-noise ratios (SNR) of additive noise. The recognition accuracy of the proposed feature is tested on a sound event database using a Hidden Markov Model (HMM) recogniser. A comparison with a series of similar features and the conventi...

Auditory image model features for automatic speech recognition

2005

Mario E. Munich Qiguang Lin

Conventional speech recognition engines extract Mel Frequency Cepstral Coefficients (MFCC) features from incoming speech. This paper presents a novel approach for feature extraction in which speech is processed according to the Auditory Image Model, a model of human psychoacoustics. We first describe the proposed frontend, then we present recognition results obtained with the TIMIT database. Com...

Improving the noise-robustness of mel-frequency cepstral coefficients for speech processing

2006

Sourabh Ravindran David V. Anderson Malcolm Slaney

In this paper we study the noise-robustness of mel-frequency cepstral coefficients (MFCCs) and explore ways to improve their performance in noisy conditions. Improvements based on a more accurate model of the early auditory system are suggested to make the MFCC features more robust to noise while preserving their class discrimination ability. Speech versus non-speech classification and speech r...

Maximum Likelihood and Maximum Mutual Information Training in Gender and Age Recognition System

2007

Valiantsina Hubeika Igor Szöke Lukás Burget Jan Cernocký

Gender and age estimation based on Gaussian Mixture Models (GMM) is introduced. Telephone recordings from the Czech SpeechDatEast database are used as training and test data sets. Mel-Frequency Cepstral Coefficients (MFCC) are extracted from the speech recordings. To estimate the GMMs’ parameters, Maximum Likelihood (ML) training is applied. Consequently these estimations are used as the baseline...

Determining the Euclidean Distance Between Two Steady State Sounds

2006

Hiroko Terasawa Malcolm Slaney Jonathan Berger

We describe a perceptual space for timbre, define an objective metric that takes into account perceptual orthogonality, and measure the quality of timbre interpolation. We discuss two timbre representations and, using these two representations, measure perceived relationships between pairs of sounds on an equivalent range of timbre variety. We determine that a timbre space based on Mel-frequency c...

Improved Text-Independent Speaker Identification using Fused MFCC and IMFCC Feature Sets based on Gaussian Filter

2007

Sandipan Chakroborty Goutam Saha

A state-of-the-art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC) modeled on the human auditory system have been used as a standard acoustic feature set for speech-related applications. On a recent contribution by authors...

Forward masking for increased robustness in automatic speech recognition

2001

Sascha Wendt Gernot A. Fink Franz Kummert

In automatic speech recognition mel-frequency cepstral coefficients (MFCC) or linear predictive cepstral coefficients (LPCC) are features commonly used today. However, their calculation considers only a few features of the auditory system. On the assumption that the human representation of speech is an optimal representation, considering more features of the auditory system might lead to a bett...

GMM Classifier for Identification of Neurological Disordered Voices Using MFCC Features

2015

K. Uma Rani Mallikarjun S Holi

Automatic detection of neurologically disordered subjects' voices mostly relies on parameters extracted from time-domain processing. The calculation of these parameters often requires prior pitch period estimation, which in turn depends heavily on the robustness of the pitch detection algorithm. In the present work, a cepstral-domain processing technique which does not require pitch estimation has been ado...
