cepstral coefficients

SVM-based Voice Activity Detection for Distributed Speech Recognition System

2015

Azzedine Touazi, Mohamed Debyeche

Voice Activity Detection (VAD) algorithms based on machine learning techniques have shown competitive results in the area of automatic speech recognition. This paper describes a new approach to VAD based on Support Vector Machines (SVM) for Distributed Speech Recognition (DSR) systems. In the proposed scheme, the speech and the non-speech frames are detected from the compressed Mel Frequency Cep...


Emotion Recognition and Evaluation of Mandarin Speech Using Weighted D-KNN Classification

2005

Tsang-Long Pao, Yu-Te Chen, Jun-Heng Yeh, Yuan-Hao Chang

In this paper, we propose a weighted discrete K-nearest neighbor (weighted D-KNN) classification algorithm for detecting and evaluating emotion from Mandarin speech. In the emotion recognition experiments, the Mandarin emotional speech database used contains five basic emotions (anger, happiness, sadness, boredom and neutral), and the extracted acoustic features are Mel-Frequency C...


Unsupervised Representation Learning Using Convolutional Restricted Boltzmann Machine for Spoof Speech Detection

2017

Hardik B. Sailor, Madhu R. Kamble, Hemant A. Patil

Speech Synthesis (SS) and Voice Conversion (VC) present a genuine risk of attacks for Automatic Speaker Verification (ASV) technology. In this paper, we use our recently proposed unsupervised filterbank learning technique using Convolutional Restricted Boltzmann Machine (ConvRBM) as a frontend feature representation. ConvRBM is trained on the training subset of the ASV spoof 2015 challenge database. A...


Improving Speaker Identification Performance by Combining Vocal Tract Features

2012

S. Selva Nidhyananthan, Selva Kumari

This paper proposes fusion and addition techniques for vocal tract features such as Mel Frequency Cepstral Coefficients (MFCC) and Dynamic Mel Frequency Cepstral Coefficients (DMFCC) in speaker identification. Feature extraction plays an important role as a front-end processing block in the Speaker Identification (SI) process. Mel frequency features are used to extract the spectral characteristics o...
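
As a rough illustration of how static and dynamic cepstral features can be combined, the sketch below computes dynamic coefficients with the common regression (delta) formula and fuses them with the static frames by concatenation. Both choices are assumptions made for illustration: the truncated abstract does not state how DMFCC is computed or how the fusion is done, and the names `delta`, `fuse`, and `window` are hypothetical.

```python
def delta(frames, window=2):
    """Regression-based dynamic coefficients over +/- `window` frames.

    Assumed formula (common in speech front ends, not taken from the paper):
    d_t = sum_n n * (c_{t+n} - c_{t-n}) / (2 * sum_n n^2), with edge frames repeated.
    """
    num_frames = len(frames)
    dim = len(frames[0])
    denom = 2 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(num_frames):
        row = []
        for k in range(dim):
            acc = 0.0
            for n in range(1, window + 1):
                nxt = frames[min(t + n, num_frames - 1)][k]  # repeat last frame at the end
                prv = frames[max(t - n, 0)][k]               # repeat first frame at the start
                acc += n * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def fuse(static, dynamic):
    """Fuse by concatenating each static frame with its dynamic counterpart."""
    return [s + d for s, d in zip(static, dynamic)]


# Made-up one-dimensional "cepstral" frames forming a linear ramp.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
fused = fuse(ramp, delta(ramp))
print(fused[2])
```

Concatenation doubles the feature dimension; an additive combination would instead keep the dimension fixed, at the cost of mixing static and dynamic information in each coefficient.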


On the use of filter-bank energies as features for robust speech recognition

1999

Kuldip K. Paliwal

have been very successful in speech recognition, they have the following two problems: 1) They do not have any physical interpretation, and 2) Liftering of cepstral coefficients, found to be highly useful in the earlier dynamic warping-based speech recognition systems, has no effect in the recognition process when used with continuous observation Gaussian density hidden Markov models. In this...
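
One way to see why liftering can have no effect with Gaussian densities: liftering multiplies each cepstral coefficient by a fixed weight, and a diagonal-covariance Gaussian re-estimated on the liftered data rescales its mean and variance by the same weights, so the variance-normalized distance between a frame and the model is unchanged. The sketch below checks this numerically. It is a minimal illustration under stated assumptions, not the paper's analysis: the raised-sine lifter, the value `lifter_len=22`, the diagonal covariance, and the made-up frames are all hypothetical.

```python
import math


def sinusoidal_lifter(num_coeffs, lifter_len=22):
    """Raised-sine lifter weights; lifter_len=22 is an assumed, commonly used value."""
    return [1.0 + (lifter_len / 2.0) * math.sin(math.pi * n / lifter_len)
            for n in range(num_coeffs)]


def apply_lifter(frame, weights):
    """Scale each cepstral coefficient by its lifter weight."""
    return [w * c for w, c in zip(weights, frame)]


def fit_diag_gaussian(frames):
    """Per-dimension mean and (population) variance of the frames."""
    count = len(frames)
    dims = range(len(frames[0]))
    mean = [sum(f[k] for f in frames) / count for k in dims]
    var = [sum((f[k] - mean[k]) ** 2 for f in frames) / count for k in dims]
    return mean, var


def normalized_distance(frame, mean, var):
    """Variance-normalized squared distance (exponent term of a diagonal Gaussian)."""
    return sum((x - m) ** 2 / v for x, m, v in zip(frame, mean, var))


# Made-up cepstral frames, for illustration only.
train = [[1.0, 0.5, -0.2], [0.8, 0.1, 0.3], [1.3, -0.4, 0.0], [0.9, 0.2, -0.5]]
probe = [1.1, 0.0, 0.2]

weights = sinusoidal_lifter(3)
raw = normalized_distance(probe, *fit_diag_gaussian(train))
liftered = normalized_distance(
    apply_lifter(probe, weights),
    *fit_diag_gaussian([apply_lifter(f, weights) for f in train]),
)
print(raw, liftered)  # the two values agree up to floating-point rounding
```

A plain Euclidean distance, as used in template matching with dynamic warping, has no variance normalization, so there the lifter weights do change the result.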


Analysis and design of Wavelet-Packet Cepstral coefficients for automatic speech recognition

Journal: Speech Communication 2012

Eduardo Pavez, Jorge F. Silva

This work proposes using Wavelet-Packet Cepstral coefficients (WPPCs) as an alternative way to do filter-bank energy-based feature extraction (FE) for automatic speech recognition (ASR). The rich coverage of time-frequency properties of Wavelet Packets (WPs) is used to obtain new sets of acoustic features, in which competitive and better performances are obtained with respect to the widely adop...


Performance Evaluation of Bangla Word Recognition Using Different Acoustic Features

2010

Nusrat Jahan Lisa, Qamrun Nahar Eity, Ghulam Muhammad, Mohammad Nurul Huda, Chowdhury Mofizur Rahman

This paper describes the preparation of a medium-size Bangla speech corpus and compares the performance of different acoustic features for Bangla word recognition. Most Bangla automatic speech recognition (ASR) systems use a small number of speakers, but 40 speakers selected from a wide area of Bangladesh, where Bangla is used as a native language, are involved here. In the exp...


Using Blob Detection in Missing Feature Linear-Frequency Cepstral Coefficients for Robust Sound Event Recognition

2012

Yi Ren Leng, Tran Huy Dat

The proposed Missing Feature Linear-Frequency Cepstral Coefficients (MF-LFCC) is a noise robust cepstral feature that transforms both clean and noisy signals into a similar representation. Unlike conventional Missing Feature Techniques, the MF-LFCC does not require the substitution of spectrogram elements (imputation) or classifier modification (marginalization). To improve the noise mask used ...


Improving the filter bank of a classic speech feature extraction algorithm

2003

Mark D. Skowronski, John G. Harris

The most popular speech feature extractor used in automatic speech recognition (ASR) systems today is the mel frequency cepstral coefficient (mfcc) algorithm. Introduced in 1980, the filter bank-based algorithm eventually replaced linear prediction cepstral coefficients (lpcc) as the premier front end, primarily because of mfcc’s superior robustness to additive noise. However, mfcc does not app...
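
For readers unfamiliar with the filter bank at the core of MFCC, the sketch below places triangular filters at equal spacing on the mel scale. It shows a conventional layout only, not the improved filter bank this paper proposes (the abstract is truncated before describing it). The mel formula `2595 * log10(1 + f / 700)` is one widely used variant, and the function names and the 10-filter, 0 to 4000 Hz example are assumptions for illustration.

```python
import math


def hz_to_mel(hz):
    """One common mel-scale formula (an assumed variant; others exist)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def filter_edges_hz(num_filters, low_hz, high_hz):
    """Return num_filters + 2 edge frequencies, equally spaced on the mel scale.

    Filter i spans edges[i] .. edges[i + 2] and peaks at edges[i + 1].
    """
    low, high = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high - low) / (num_filters + 1)
    return [mel_to_hz(low + i * step) for i in range(num_filters + 2)]


def triangular_weight(freq_hz, left, center, right):
    """Weight of a triangular filter at freq_hz: 0 at the edges, 1 at the center."""
    if left < freq_hz <= center:
        return (freq_hz - left) / (center - left)
    if center < freq_hz < right:
        return (right - freq_hz) / (right - center)
    return 0.0


edges = filter_edges_hz(10, 0.0, 4000.0)
for i in range(10):
    left, center, right = edges[i], edges[i + 1], edges[i + 2]
    print(f"filter {i}: {left:7.1f} {center:7.1f} {right:7.1f} Hz")
```

Because the spacing is uniform in mel rather than in hertz, the filters are narrow at low frequencies and progressively wider at high frequencies.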


Joint Cohort Normalization in a Multi-Feature Speaker Verification System

2001

Conrad Sanderson, Kuldip K. Paliwal

In this paper we propose a new fusion technique, termed Joint Cohort Normalization Fusion, where the information fusion is done prior to the likelihood ratio test in a speaker verification system. The performance of the technique is compared against two popular types of fusion: feature vector concatenation and expert opinion fusion, for fusion of Mel Frequency Cepstral Coefficients (MFCC), MFCC...
