cepstral coefficients

Quantization of cepstral parameters for speech recognition over the World Wide Web

1998

Vassilios Digalakis Leonardo Neumeyer Manolis Perakakis

We examine alternative architectures for a client-server model of speech-enabled applications over the World Wide Web. We compare a server-only processing model, where the client encodes and transmits the speech signal to the server, to a model where the recognition front end runs locally at the client and encodes and transmits the cepstral coefficients to the recognition server over the Intern...
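As a rough illustration of what encoding cepstral coefficients at the client can involve, the sketch below applies uniform scalar quantization to a coefficient vector. It is a minimal example, not the scheme used in the paper, which the truncated abstract does not specify. The function names, the range bounds `lo`/`hi`, and the bit width `bits` are all assumptions made for illustration.

```python
import numpy as np

def quantize(coeffs, lo, hi, bits):
    """Map each coefficient to an integer index on a uniform grid over [lo, hi].

    lo, hi and bits are illustrative parameters, not values from the paper.
    """
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    # Values outside the assumed range are clipped to the nearest bound.
    clipped = np.clip(np.asarray(coeffs, dtype=float), lo, hi)
    return np.rint((clipped - lo) / step).astype(int)

def dequantize(indices, lo, hi, bits):
    """Recover approximate coefficient values from the transmitted indices."""
    step = (hi - lo) / (2 ** bits - 1)
    return lo + np.asarray(indices) * step
```

With this layout the client would send only the integer indices (`bits` bits per coefficient), and the server would call `dequantize` with the same range and bit width before recognition. For in-range values the reconstruction error is at most half a quantization step.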

Detection of Replay Attacks Using Single Frequency Filtering Cepstral Coefficients

2017

K. N. R. K. Raju Alluri Sivanand Achanta Sudarsana Reddy Kadiri Suryakanth V. Gangashetty Anil Kumar Vuppala

Automatic speaker verification systems are vulnerable to spoofing attacks. Recently, various countermeasures have been developed for detecting high technology attacks such as speech synthesis and voice conversion. However, there is a wide gap in dealing with replay attacks. In this paper, we propose a new feature for replay attack detection based on single frequency filtering (SFF), which provi...

Robust Feature Vector Set Using Higher Order Autocorrelation Coefficients

Journal: IJCINI 2010

Poonam Bansal Amita Dev Shail Bala Jain

In this paper, a feature extraction method that is robust to additive background noise is proposed for automatic speech recognition. Since the background noise corrupts the autocorrelation coefficients of the speech signal mostly at the lower orders, while the higher-order autocorrelation coefficients are least affected, this method discards the lower order autocorrelation coefficients and uses...
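The discarding step described above can be sketched as follows: estimate the autocorrelation sequence of a speech frame, then keep only the lags at or above a cutoff. This is a minimal illustration under stated assumptions. The biased estimator, the function names, and the `min_lag`/`max_lag` parameters are choices made here, and the abstract does not say how the retained coefficients are processed afterwards.

```python
import numpy as np

def autocorrelation(frame, max_lag):
    """Biased autocorrelation estimate r[k] for k = 0..max_lag."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    return np.array(
        [np.dot(frame[: n - k], frame[k:]) / n for k in range(max_lag + 1)]
    )

def higher_order_autocorrelation(frame, min_lag, max_lag):
    """Drop lags below min_lag (hypothetical cutoff) and keep min_lag..max_lag."""
    r = autocorrelation(frame, max_lag)
    return r[min_lag:]
```

The cutoff `min_lag` would have to be tuned to the noise conditions. The value that suits a given task is not stated in the excerpt.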

Mel, linear, and antimel frequency cepstral coefficients in broad phonetic regions for telephone speaker recognition

2009

Howard Lei Eduardo López Gonzalo

We’ve examined the speaker discriminative power of mel-, antimel- and linear-frequency cepstral coefficients (MFCCs, aMFCCs and LFCCs) in the nasal, vowel, and non-nasal consonant speech regions. Our inspiration came from the work of Lu and Dang in 2007, who showed that filterbank energies at some frequencies mainly outside the telephone bandwidth possess more speaker discriminative power due to ...

Robust Emotion Recognition using Pitch Synchronous and Sub-syllabic Spectral Features

2018

This chapter discusses the use of vocal tract information for recognizing emotions. Linear prediction cepstral coefficients (LPCC) and mel frequency cepstral coefficients (MFCC) are used as the correlates of vocal tract information. In addition to LPCCs and MFCCs, formant-related features are also explored in this work for recognizing emotions from speech. Extraction of the above-mentioned ...

Novel Cochlear Filter Based Cepstral Coefficients for Classification of Unvoiced Fricatives

2014

Namrata Singh Nikhil Bhendawade Hemant A. Patil

In this paper, the use of new auditory-based features derived from cochlear filters has been proposed for classification of unvoiced fricatives. Classification attempts have been made to distinguish sibilants (i.e., /s/, /sh/) from non-sibilants (i.e., /f/, /th/) as well as fricatives within each sub-category (i.e., intra-sibilants and intra-non-sibilants). Our experimental results indicate th...

Hybrid Feature and Decision Fusion Based Audio-Visual Speaker Identification in Challenging Environment

2010

Md. Rabiul Islam Md. Fayzur Rahman D. G. Stork M. E. Hennecke C. C. Chibelushi F. Deravi

The contribution of this paper is a novel approach to evaluating the performance of a noise-robust audio-visual speaker identification system in a challenging environment. Though the traditional HMM-based audio-visual speaker identification system is very sensitive to speech parameter variation, the proposed hybrid feature and decision fusion based audio-visual speaker identificati...

Canonical Correlation Analysis between Time Series and Static Outcomes, with Application to the Spectral Analysis of Heart Rate Variability.

Journal: The Annals of Applied Statistics 2013

Robert T Krafty Martica Hall

Although many studies collect biomedical time series signals from multiple subjects, there is a dearth of models and methods for assessing the association between frequency domain properties of time series and other study outcomes. This article introduces the random Cramér representation as a joint model for collections of time series and static outcomes where power spectra are random functions...

A very low bit rate speech coder using HMM-based speech recognition/synthesis techniques

1998

Keiichi Tokuda Takashi Masuko Jun Hiroi Takao Kobayashi Tadashi Kitamura

This paper presents a very low bit rate speech coder based on HMM (Hidden Markov Model). The encoder carries out phoneme recognition, and transmits phoneme indexes, state durations and pitch information to the decoder. In the decoder, phoneme HMMs are concatenated according to the phoneme indexes, and a sequence of mel-cepstral coefficient vectors is generated from the concatenated HMM by using...

Spectro-temporal modulation energy based mask for robust speaker identification.

Journal: The Journal of the Acoustical Society of America 2012

Tai-Shih Chi Ting-Han Lin Chung-Chien Hsu

Spectro-temporal modulations of speech encode speech structures and speaker characteristics. An algorithm which distinguishes speech from non-speech based on spectro-temporal modulation energies is proposed and evaluated in robust text-independent closed-set speaker identification simulations using the TIMIT and GRID corpora. Simulation results show the proposed method produces much higher spea...
