frequency cepstral coefficient

Pitch Mean Based Frequency Warping

2006

Jian Liu Thomas Fang Zheng Wenhu Wu

In this paper, a novel pitch mean based frequency warping (PMFW) method is proposed to reduce the pitch variability in speech signals at the frontend of speech recognition. The warp factors used in this process are calculated based on the average pitch of a speech segment. Two functions to describe the relations between the frequency warping factor and the pitch mean are defined and compared. W...


Optimizing feature complementarity by evolution strategy: Application to automatic speaker verification

Journal: Speech Communication 2009

Christophe Charbuillet Bruno Gas Mohamed Chetouani Jean-Luc Zarader

Conventional automatic speaker verification systems are based on cepstral features such as the Mel-scale Frequency Cepstrum Coefficient (MFCC) or the Linear Predictive Cepstrum Coefficient (LPCC). Recently published work has shown that using complementary features can significantly improve system performance. In this paper, we propose to use an evolution strategy to optimize the complementarity of ...


Robust Voiced/unvoiced Classification Using Novel Features and Gaussian Mixture Model

2003

Jashmin K. Shah Ananth N. Iyer Brett Y. Smolenski Robert E. Yantorno

The need to decide whether a given frame of a speech waveform should be classified as voiced or unvoiced speech arises in many speech analysis systems. Several approaches to making this decision have been described in the literature. In this paper, we present two novel approaches that use acoustical features and pattern recognition. The first method is based on Mel frequency cepst...


Formant frequency prediction from MFCC vectors in noisy environments

2005

Jonathan Darch Ben P. Milner Saeed Vaseghi

This paper proposes a method of predicting the formant frequencies of a frame of speech from its mel-frequency cepstral coefficient (MFCC) representation. Prediction is achieved through the creation of a Gaussian mixture model (GMM) which models the joint density of formant frequencies and MFCCs. Using this GMM and an input MFCC vector, a maximum a posteriori (MAP) prediction of the formant fre...
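The prediction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional stand-in for the MFCC vector, a hypothetical two-component joint GMM with made-up parameters, and one possible reading of "MAP prediction" (pick the component with the highest posterior given the input, then return that component's conditional mean).

```python
import math

# Hypothetical joint GMM over (x, y): x is a 1-D stand-in for an MFCC feature,
# y is a formant frequency in Hz. All parameter values are invented.
COMPONENTS = [
    {"w": 0.5, "mx": -1.0, "my": 500.0, "vx": 1.0, "vy": 2500.0, "cxy": 30.0},
    {"w": 0.5, "mx": 2.0, "my": 1500.0, "vx": 0.5, "vy": 10000.0, "cxy": -20.0},
]

def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def posteriors(x, comps):
    """Posterior probability of each component given the observed x."""
    joint = [c["w"] * gauss_pdf(x, c["mx"], c["vx"]) for c in comps]
    total = sum(joint)
    return [j / total for j in joint]

def conditional_mean(x, c):
    """E[y | x] under a single joint Gaussian component."""
    return c["my"] + c["cxy"] / c["vx"] * (x - c["mx"])

def predict_formant(x, comps=COMPONENTS):
    """Use the most probable component for x and return its conditional mean."""
    post = posteriors(x, comps)
    best = max(range(len(comps)), key=lambda k: post[k])
    return conditional_mean(x, comps[best])
```

With real MFCC vectors, the scalar variances and covariances would become matrices and the division would become a matrix solve; the structure of the calculation stays the same.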


Text Dependent Speaker Recognition using MFCC features and BPANN

2013

Tessamma Thomas Tomi H. Kinnunen S. B. Davis

Mel-frequency cepstral coefficients (MFCCs) are spectral features that are widely used for speaker recognition, and text-dependent speaker recognition systems are the most accurate among voice-based authentication systems. In this paper, a text-dependent speaker recognition method is developed. MFCCs are computed for a selected sentence. The first 13 MFCCs are considered for each frame of duration 26 ms an...
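As a rough illustration of the framing and truncation mentioned above, the sketch below splits a signal into 26 ms frames and keeps the first 13 coefficients of a DCT-II applied to log mel filterbank energies. The sample rate, the non-overlapping frames, the unnormalized DCT, and all function names are assumptions made for the example; the abstract is truncated before it states the frame shift, and the mel filterbank itself is not shown.

```python
import math

def split_into_frames(samples, sample_rate_hz, frame_ms=26.0):
    """Cut a signal into consecutive, non-overlapping frames (assumed layout)."""
    frame_len = int(round(sample_rate_hz * frame_ms / 1000.0))
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]

def dct2(values):
    """Unnormalized DCT-II of a sequence of floats."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i, v in enumerate(values))
        for k in range(n)
    ]

def first_cepstral_coeffs(log_mel_energies, n_keep=13):
    """Keep the first n_keep DCT coefficients of the log mel energies."""
    return dct2(log_mel_energies)[:n_keep]

# Usage: one second of silence at an assumed 8 kHz sample rate.
frames = split_into_frames([0.0] * 8000, 8000)
coeffs = first_cepstral_coeffs([1.0] * 26)
```

At 8 kHz a 26 ms frame holds 208 samples, so one second of audio yields 38 complete frames and the trailing 96 samples are dropped in this sketch.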


Detecting sound events in basketball video archive

2001

Dongqing Zhang

The report proposes a method for detecting sound events in a basketball game, focusing on cheering sounds. MFCC (Mel-frequency cepstral coefficient) features are used to distinguish cheering from speech and other confusable sounds. The MFCC features are fed into a neural network and classified into three classes (cheering, speech, and others). To improve the MFCC-NN pe...


Sub-band based text-dependent speaker verification

Journal: Speech Communication 2003

Perasiriyan Sivakumaran Aladdin M. Ariyaeeinia Martin Loomes

c_S(s,p): p-th cepstral coefficient of the s-th sub-band {c_1(1,p) = c(p) is the p-th full-band cepstral parameter}; S: number of sub-bands; Y(k): k-th log spectral magnitude; K: number of log spectral magnitudes; Y''(k): k-th log-energy output of the mel-scale filterbank; K'': number of log-energy outputs of the mel-scale filterbank; h_t: weight associated with the t-th segment; U: number of compe...


Likelihood Ratio Calculation in Acoustic-Phonetic Forensic Voice Comparison: Comparison of Three Statistical Modelling Approaches

2016

Ewald Enzinger

This study compares three statistical models used to calculate likelihood ratios in acoustic-phonetic forensic-voice-comparison systems: multivariate kernel density, principal component analysis kernel density, and a multivariate normal model. The data were coefficient values obtained from discrete cosine transforms fitted to human-supervised formant-trajectory measurements of tokens of /iau/ fr...


Autocorrelation for a class of polynomials with coefficients defined on T

Journal: Iranian Journal of Science and Technology (Sciences) 2008

M. Taghavi

In this work we deal with the coefficients of |a(e^{it})|^2, where a is in a class of polynomials having unimodular coefficients. We first present a technique that calculates lower bounds for particular autocorrelations, and then, in a more general case, we present an upper bound for their maximal order.
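To make the objects in this abstract concrete, here is a small sketch. It assumes the garbled expression in the abstract is the squared modulus |a(e^{it})|^2 and uses the standard identity |a(e^{it})|^2 = c_0 + 2 Re Σ_{k≥1} c_k e^{ikt}, where c_k = Σ_j a_{j+k} conj(a_j) are the aperiodic autocorrelations of the coefficient sequence. The length-13 Barker sequence is used only as a familiar example of unimodular (±1) coefficients; it is not taken from the paper, and the function names are illustrative.

```python
import cmath

def autocorrelations(coeffs):
    """Aperiodic autocorrelations c_k = sum_j a_{j+k} * conj(a_j), k = 0..n-1."""
    a = [complex(x) for x in coeffs]
    n = len(a)
    return [sum(a[j + k] * a[j].conjugate() for j in range(n - k)) for k in range(n)]

def evaluate(coeffs, t):
    """Value of a(e^{it}) = sum_j a_j e^{ijt}."""
    return sum(complex(x) * cmath.exp(1j * j * t) for j, x in enumerate(coeffs))

def squared_modulus_from_autocorr(c, t):
    """Rebuild |a(e^{it})|^2 from the autocorrelations c_0, ..., c_{n-1}."""
    return c[0].real + 2.0 * sum(
        (c[k] * cmath.exp(1j * k * t)).real for k in range(1, len(c))
    )

# Example with unimodular coefficients: the length-13 Barker sequence.
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
c = autocorrelations(BARKER_13)
```

For any sequence of n unimodular coefficients, c_0 equals n, so the interesting quantities are the off-peak values c_k for k ≥ 1, which is where bounds of the kind the abstract mentions apply.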


Experimental evaluation of features for robust speaker identification

Journal: IEEE Trans. Speech and Audio Processing 1994

Douglas A. Reynolds

This paper presents an experimental evaluation of different features and channel compensation techniques for robust speaker identification. The goal is to keep all processing and classification steps constant and to vary only the features and compensations used to allow a controlled comparison. A general, maximum-likelihood classifier based on Gaussian mixture densities is used as the classifie...
