cepstral coefficients

Spoofing Detection on the ASVspoof2015 Challenge Corpus Employing Deep Neural Networks

2016

Md. Jahangir Alam Patrick Kenny Vishwa Gupta Themos Stafylakis

This paper describes the application of deep neural networks (DNN), trained to discriminate between human and spoofed speech signals, to improve the performance of spoofing detection. In this work we use amplitude, phase, linear prediction residual, and combined amplitude phase-based acoustic level features. First we train a DNN on the spoofing challenge training data to discriminate between hu...


Smoothed Nonlinear Energy Operator-Based Amplitude Modulation Features for Robust Speech Recognition

2013

Md. Jahangir Alam Patrick Kenny Douglas D. O'Shaughnessy

In this paper we present a robust feature extractor that uses smoothed nonlinear energy operator (SNEO)-based amplitude modulation features for a large vocabulary continuous speech recognition (LVCSR) task. SNEO estimates the energy required to produce the AM-FM signal, and then the estimated energy is separated into its amplitude and frequency components using an energy separa...
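
As a rough illustration of the energy-estimation step mentioned in this abstract, the sketch below applies the standard discrete Teager-Kaiser energy operator, Ψ[x(n)] = x(n)² − x(n−1)·x(n+1), and then smooths the result. The moving-average smoother, its width, and the function names are assumptions made for illustration only; the truncated abstract does not specify the smoothing used in the paper.

```python
import math

def teager_energy(x):
    """Discrete Teager-Kaiser energy: x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def smooth(values, width=3):
    """Centered moving average (assumed smoother; window shrinks at the edges)."""
    half = width // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

# For a unit-amplitude sinusoid cos(omega * n), the operator yields sin(omega)^2.
omega = math.pi / 2
signal = [math.cos(omega * n) for n in range(16)]
energy = smooth(teager_energy(signal))
```

For a pure tone the operator output is constant, so smoothing leaves it unchanged; on real speech the smoothing would mainly suppress sample-level fluctuations in the energy estimate.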


Whether MFCC or GFCC Is Better for Recognizing Emotion from Speech? A Study

2014

Minu Babu

A major challenge for automatic speech recognition (ASR) relates to significant performance reduction in noisy environments. Recently, the study of the emotional content of speech signals has gained importance, and hence many systems have been proposed to identify the emotional content of a spoken utterance. The important aspects of the design of a speech emotion recognition system are pre-proces...


Recognition of Tamil Syllables Using Vowel Onset Points with Production, Perception Based Features

2016

S. Karpagavalli E. Chandra

The Tamil language is one of the ancient Dravidian languages spoken in south India. Most Indian languages are syllabic in nature, and syllables take the form of Consonant-Vowel (CV) units. In the Tamil language, the CV pattern occurs at the beginning, middle and end of a word. In this work, CV units formed with Stop Consonant – Short Vowel (SCSV) were considered for the classification task. The work ca...


Amplitude modulation features for emotion recognition from speech

2013

Md. Jahangir Alam Yazid Attabi Pierre Dumouchel Patrick Kenny Douglas D. O'Shaughnessy

The goal of speech emotion recognition (SER) is to identify the emotional or physical state of a human being from his or her voice. One of the most important things in a SER task is to extract and select relevant speech features with which most emotions could be recognized. In this paper, we present a smoothed nonlinear energy operator (SNEO)-based amplitude modulation cepstral coefficients (AM...


Detection of landmines and underground utilities from acoustic and GPR images with a cepstral approach

Journal: J. Visual Communication and Image Representation, 2010

Umar S. Khan Waleed Al-Nuaimy Fathi E. Abd El-Samie

This paper introduces a cepstral approach for the automatic detection of landmines and underground utilities from acoustic and ground penetrating radar (GPR) images. This approach is based on treating the problem as a pattern recognition problem. Cepstral features are extracted from a group of images, which are transformed first to 1-D signals by lexicographic ordering. Mel-frequency cepstral c...


Robust feature extraction based on an asymmetric level-dependent auditory filterbank and a subband spectrum enhancement technique

Journal: Digital Signal Processing, 2014

Md. Jahangir Alam Patrick Kenny Douglas D. O'Shaughnessy

In this paper we introduce a robust feature extractor, dubbed robust compressive gammachirp filterbank cepstral coefficients (RCGCC), based on an asymmetric and level-dependent compressive gammachirp filterbank and a sigmoid shape weighting rule for the enhancement of speech spectra in the auditory domain. The goal of this work is to improve the robustness of speech recognition systems in ad...


GMM-based Classifiers for the Automatic Detection of Obstructive Sleep Apnea

2013

Jorge Andrés Gómez García José Luis Blanco Murillo Juan Ignacio Godino-Llorente Luis A. Hernández Gómez Germán Castellanos-Domínguez

The aim of automatic pathological voice detection systems is to serve as tools for medical specialists, enabling a more objective, less invasive and improved diagnosis of diseases. In this respect, the gold standard for those systems includes the usage of an optimized representation of the spectral envelope, either based on cepstral coefficients from the mel-scaled Fourier spectral envelope (Mel-Freq...


Cosine distance features for robust speaker verification

2015

Kuruvachan K. George C. Santhosh Kumar K. I. Ramachandran Ashish Panda

We use similarities with people we know already as a means to enhance the speaker verification accuracy. Motivated by this, we use cosine distance similarities with a set of reference speakers, cosine distance features (CDF), to improve the performance of speaker verification systems for clean and additive noise test conditions. We used mel frequency cepstral coefficients, power normalized ceps...
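
The idea of cosine distance features described in this abstract can be sketched as follows: an utterance-level vector is compared against a fixed set of reference-speaker vectors, and the resulting list of cosine similarities serves as the new feature vector. The vector contents, the reference set, and the function names below are hypothetical; how the paper derives the underlying vectors is not stated in the truncated abstract.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def cosine_distance_features(utterance_vec, reference_vecs):
    """One similarity score per reference speaker; the list is the feature vector."""
    return [cosine_similarity(utterance_vec, ref) for ref in reference_vecs]

# Toy example with three hypothetical reference speakers in a 2-D space.
references = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cdf = cosine_distance_features([2.0, 0.0], references)
```

The feature dimension equals the number of reference speakers, so the choice of reference set directly controls the size of the representation.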


Tandem Features for Text-Dependent Speaker Verification on the RedDots Corpus

2016

Md. Jahangir Alam Patrick Kenny Vishwa Gupta

We use tandem features and a fusion of four systems for text-dependent speaker verification on the RedDots corpus. In the tandem system, a senone-discriminant neural network provides a low-dimensional bottleneck feature at each frame, which is concatenated with a standard Mel-frequency cepstral coefficients (MFCC) feature representation. The concatenated features are propagated to a conventional...
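
The frame-level concatenation described in this abstract can be sketched as below. This is a minimal illustration that assumes both feature streams are already computed and frame-aligned; the function name and the toy dimensions are hypothetical and are not taken from the paper.

```python
def tandem_features(mfcc_frames, bottleneck_frames):
    """Concatenate MFCC and bottleneck vectors frame by frame."""
    if len(mfcc_frames) != len(bottleneck_frames):
        raise ValueError("both streams must have the same number of frames")
    return [list(m) + list(b) for m, b in zip(mfcc_frames, bottleneck_frames)]

# Two frames, with toy 3-D MFCC vectors and 2-D bottleneck vectors.
mfcc = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
bottleneck = [[1.0, 2.0], [3.0, 4.0]]
tandem = tandem_features(mfcc, bottleneck)
```

Each output frame has the combined dimension of the two inputs, here 3 + 2 = 5.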
