cepstral coefficients

A model of dynamic auditory perception and its application to robust word recognition

Journal: IEEE Trans. Speech and Audio Processing, 1997

Brian Strope, Abeer Alwan

This paper describes two mechanisms that augment the common automatic speech recognition (ASR) front end and provide adaptation and isolation of local spectral peaks. A dynamic model consisting of a linear filterbank with a novel additive logarithmic adaptation stage after each filter output is proposed. An extensive series of perceptual forward masking experiments, together with previously rep...

DWT features performance analysis for automatic speech recognition of Urdu

2014

Hazrat Ali, Nasir Ahmad, Xianwei Zhou, Khalid Iqbal, Sahibzada Muhammad Ali

This paper presents work on automatic speech recognition of the Urdu language, comparing features based on the Discrete Wavelet Transform (DWT) with Mel Frequency Cepstral Coefficients (MFCC). These features were extracted for one hundred isolated Urdu words, each uttered by ten different speakers. The words have been selected from the most frequently used words of ...

Perceptual Analysis of Speech Signals from People with Parkinson's Disease

2013

Juan R. Orozco-Arroyave, Julián D. Arias-Londoño, Jesus Francisco Vargas Bonilla, Elmar Nöth

Parkinson’s disease (PD) is a neurodegenerative disorder of the central nervous system that affects limb motor control and the communication skills of patients. The disease can progress to the point of affecting the intelligibility of the patient’s speech. Treatments for PD mainly focus on improving limb symptoms, and their impact on speech production is still...

Forward masking for increased robustness in automatic speech recognition

2001

Sascha Wendt, Gernot A. Fink, Franz Kummert

In automatic speech recognition, mel-frequency cepstral coefficients (MFCC) or linear predictive cepstral coefficients (LPCC) are the features commonly used today. However, their calculation considers only a few features of the auditory system. On the assumption that the human representation of speech is an optimal representation, considering more features of the auditory system might lead to a bett...

A Novel Method for Feature Extraction in Vocal Fold Pathology Diagnosis

2012

Vahid Majidnezhad, Igor Kheidorov

Acoustic analysis is a suitable method for vocal fold pathology diagnosis: it can complement, and in some cases replace, invasive methods based on direct vocal fold observation. There are different approaches to vocal fold pathology diagnosis. These algorithms usually have two stages: feature extraction and classification. While the second stage implies a choice of a var...

A bio-inspired feature extraction for robust speech recognition

2014

Youssef Zouhir, Kaïs Ouni

In this paper, a feature extraction method for robust speech recognition in noisy environments is proposed. The proposed method is motivated by a biologically inspired auditory model which simulates the outer/middle ear filtering by a low-pass filter and the spectral behaviour of the cochlea by the Gammachirp auditory filterbank (GcFB). The speech recognition performance of our method is tested...

Assessment of Dysarthric Speech Using MFCC

2017

Speech is an effective form of communication between humans and their environment. Dysarthria is a motor speech disorder in which the person lacks control over the articulators used for speech production. Speech accuracy is the outcome of well-timed and coordinated activity of the articulators and other related neuromuscular features. In this paper, a speech utterance is converted into a phone se...

Listening Level Changes Music Similarity

2012

Michael Terrell, György Fazekas, Andrew C. Simpson, Jordan B. L. Smith, Simon Dixon

We examine the effect of listening level, i.e. the absolute sound pressure level at which sounds are reproduced, on music similarity, and in particular, on playlist generation. Current methods commonly use similarity metrics based on Mel-frequency cepstral coefficients (MFCCs), which are derived from the objective frequency spectrum of a sound. We follow this approach, but use the level-depende...

DNN-Based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification

2016

Zeyan Oo, Yuta Kawakami, Longbiao Wang, Seiichi Nakagawa, Xiong Xiao, Masahiro Iwahashi

The importance of phase information in speech signals is attracting attention. Many studies indicate that combining systems based on amplitude and phase features is effective for improving speaker recognition performance in noisy environments. On the other hand, a speech enhancement approach is usually taken to reduce the influence of noise. However, this approach only enhances the amplitude s...

Feature extraction from analytic phase of speech signals for speaker verification

2014

Karthika Vijayan, Vinay Kumar, K. Sri Rama Murty

The objective of this work is to study the speaker-specific nature of the analytic phase of speech signals. Since computation of the analytic phase suffers from the phase wrapping problem, we have used its derivative, the instantaneous frequency, for feature extraction. The cepstral coefficients extracted from smoothed subband instantaneous frequencies (IFCC) are used as features for speaker verification. The...
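
The idea of sidestepping phase wrapping by working with the instantaneous frequency can be illustrated with a minimal sketch. This is not the authors' IFCC pipeline (the subband decomposition, smoothing, and cepstral steps are not shown); it only assumes that the instantaneous frequency is taken as the sample-to-sample phase increment of the analytic signal. The function names `analytic_signal` and `instantaneous_frequency` are hypothetical, and the naive O(N^2) DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence via a naive O(N^2) DFT.

    Negative-frequency bins are zeroed and positive-frequency bins doubled,
    which is the usual frequency-domain construction of the Hilbert pair.
    """
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            weight = 1.0  # DC and Nyquist bins are kept once
        elif k < (n + 1) // 2:
            weight = 2.0  # positive frequencies are doubled
        else:
            weight = 0.0  # negative frequencies are removed
        spectrum[k] *= weight
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def instantaneous_frequency(x, sample_rate):
    """Instantaneous frequency in Hz, one value per pair of adjacent samples.

    The phase increment is read from the angle of z[t+1] * conj(z[t]), which
    lies in (-pi, pi], so the absolute phase never needs to be unwrapped.
    """
    z = analytic_signal(x)
    return [
        cmath.phase(z[t + 1] * z[t].conjugate()) * sample_rate / (2 * math.pi)
        for t in range(len(z) - 1)
    ]


# Usage: a 1000 Hz cosine sampled at 8 kHz (exactly 8 cycles in 64 samples).
fs = 8000
tone = [math.cos(2 * math.pi * 1000 * t / fs) for t in range(64)]
freqs = instantaneous_frequency(tone, fs)
print(round(freqs[0], 3))  # approximately 1000 Hz
```

Taking the angle of the product of adjacent samples, rather than differentiating an unwrapped phase, is one common way to avoid the wrapping issue; whether the paper computes it this way is not stated in the excerpt above.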
