mel frequency cepstral coefficient

Feature Learning with Gaussian Restricted Boltzmann Machine for Robust Speech Recognition

Journal: CoRR, 2013

Xin Zheng, Zhiyong Wu, Helen M. Meng, Weifeng Li, Lianhong Cai

In this paper, we first present a new variant of Gaussian restricted Boltzmann machine (GRBM) called multivariate Gaussian restricted Boltzmann machine (MGRBM), with its definition and learning algorithm. Then we propose using a learned GRBM or MGRBM to extract better features for robust speech recognition. Our experiments on Aurora2 show that both GRBM-extracted and MGRBM-extracted feature per...

Feature extraction using Mel frequency cepstral coefficients for hyperspectral image classification

2010

Delian Liu, Xiaorui Wang, Jianqi Zhang, Xi Huang

The Mel frequency cepstral coefficient (MFCC) model, which is widely used in speech detection and recognition, is introduced to extract features from hyperspectral image data. The similarities and differences between speech signals and spectral image data are compared and analyzed. The standard MFCC model is then improved to suit the characteristics of spectral image data by reintroducing the d...
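
For orientation, the standard speech-style MFCC computation that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration, not the authors' modified model for spectral image data: the function names, the 26-filter/13-coefficient defaults, and the 2595·log10(1 + f/700) mel formula are all assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(f):
    # One common analytic form of the mel scale (assumed here).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, centre, right = bins[i], bins[i + 1], bins[i + 2]
        for b in range(left, centre):      # rising slope
            fbank[i, b] = (b - left) / (centre - left)
        for b in range(centre, right):     # falling slope
            fbank[i, b] = (right - b) / (right - centre)
    return fbank

def mfcc(frame, sample_rate, n_filters=26, n_coeffs=13):
    """MFCCs of one frame: window -> power spectrum -> mel energies -> log -> DCT-II."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2 / n_fft
    energies = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    log_energies = np.log(energies + 1e-10)  # small offset avoids log(0)
    n = np.arange(n_filters)
    k = np.arange(n_coeffs)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))  # unnormalised DCT-II
    return dct_basis @ log_energies

# Example: a 512-sample frame of a 440 Hz tone sampled at 16 kHz.
t = np.arange(512) / 16000.0
coeffs = mfcc(np.sin(2 * np.pi * 440.0 * t), sample_rate=16000)
```

The final DCT step is the part the abstract says it revisits for spectral image data; the sketch above keeps the conventional speech form throughout.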

Comparison of Emotion Detection Accuracy in Speech Using Multilayer Perceptron, Random Forest, Decision Tree and K-NN

Journal: Insyst, 2023

This study aims to compare the accuracy of speech emotion recognition across several types of classifiers. Four basic emotions are recognized: happy, sad, neutral and angry. The research methodology begins with obtaining a dataset from the RAVDESS database, which comprises 24 actors with 60 recordings per actor. However, only 28 are selected for each actor, so that a total of 672 are used in...

[Comparison of cepstral coefficients to other voice evaluation parameters in patients with occupational dysphonia].

Journal: Medycyna pracy, 2013

Ewa Niebudek-Bogusz, Paweł Strumiłło, Justyna Wiktorowicz, Mariola Sliwińska-Kowalska

BACKGROUND: Special consideration has recently been given to cepstral analysis with mel-frequency cepstral coefficients (MFCCs). The aim of this study was to assess the applicability of MFCCs in acoustic analysis for diagnosing occupational dysphonia in comparison to subjective and objective parameters of voice evaluation. MATERIALS AND METHODS: The study comprised 2 groups, one of 5...

Frequency-warping in speech

1996

Srinivasan Umesh, Leon Cohen, Nenad Marinovic, Douglas J. Nelson

In this paper we present results indicating that the formant frequencies of different speakers scale differently at different frequencies. Based on our experiments on speech data, we then numerically compute a universal frequency-warping function that makes the scale factor independent of frequency in the warped domain. The proposed warping function is found to be similar to the mel-scale, wh...
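
For reference, the mel scale that the abstract compares against is often written as m = 2595·log10(1 + f/700). The abstract does not say which variant of the mel scale the authors use, so this particular formula and the helper names below are assumptions for illustration only.

```python
import math

def hz_to_mel(f_hz: float) -> float:
    """Warp a frequency in Hz onto the mel scale (one common analytic form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m: float) -> float:
    """Inverse warping: mel value back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# Equal steps in the warped (mel) domain correspond to widening steps in Hz,
# so the warping is close to linear at low frequencies and close to
# logarithmic at high frequencies.
edges_hz = [mel_to_hz(m) for m in range(0, 2501, 500)]
widths_hz = [hi - lo for lo, hi in zip(edges_hz, edges_hz[1:])]
```

With this form, 1000 Hz maps to approximately 1000 mel, which is the usual anchoring point of the scale.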

A Study of Low-variance Multi-taper Features for Distributed Speech Recognition

2011

Md. Jahangir Alam, Patrick Kenny, Douglas D. O'Shaughnessy

In this paper we study low-variance multi-taper spectrum estimation methods to compute the mel-frequency cepstral coefficient (MFCC) features for robust speech recognition. In speech recognition, MFCC features are usually computed from a Hamming-windowed DFT spectrum. Although windowing helps reduce the bias of the spectrum estimate, its variance remains high. Multitaper spectrum estimation methods...
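
The variance argument can be illustrated with a small numerical sketch. It compares a single Hamming-windowed periodogram with an average over several orthogonal tapers on white-noise frames. The sine tapers, uniform weights, taper count of 6, and all function names are assumptions made for this example; the abstract does not state which tapers or weights the paper evaluates.

```python
import numpy as np

def hamming_spectrum(frame):
    """Single-taper estimate: Hamming-windowed periodogram."""
    w = np.hamming(len(frame))
    return np.abs(np.fft.rfft(frame * w)) ** 2 / np.sum(w ** 2)

def sine_tapers(n, k):
    """k orthonormal sine tapers of length n, returned with shape (k, n)."""
    idx = np.arange(1, n + 1)
    orders = np.arange(1, k + 1)[:, None]
    return np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * orders * idx / (n + 1))

def multitaper_spectrum(frame, k=6):
    """Average of k tapered periodograms with uniform weights."""
    tapers = sine_tapers(len(frame), k)
    spectra = np.abs(np.fft.rfft(tapers * frame, axis=1)) ** 2
    return spectra.mean(axis=0)

def relative_std(estimator, frames):
    """Mean over frequency bins of (std across frames) / (mean across frames)."""
    est = np.array([estimator(f) for f in frames])
    return float(np.mean(est.std(axis=0) / est.mean(axis=0)))

rng = np.random.default_rng(0)
frames = rng.standard_normal((200, 256))   # 200 white-noise frames
rel_hamming = relative_std(hamming_spectrum, frames)
rel_multitaper = relative_std(multitaper_spectrum, frames)
```

On white noise the multitaper estimate should fluctuate noticeably less from frame to frame than the Hamming-windowed one, since averaging over roughly uncorrelated tapered spectra reduces variance. Either spectrum can then be passed to the usual mel filterbank, log, and DCT steps to obtain MFCCs.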

Puff Noise Detection and Cancellation for Robust Speech Recognition

2012

Sangjun Park, Jungpyo Hong, Byung-Ok Kang, Yun-keun Lee, Minsoo Hahn

In this paper, an algorithm is proposed for detecting and attenuating the puff noises frequently generated in mobile environments. As a baseline, a puff detection system is designed based on a Gaussian Mixture Model (GMM), with 39th-order Mel Frequency Cepstral Coefficients (MFCCs) extracted as feature parameters. To improve the detection performance, effective acoustic features for puff detecti...

Application of Mel Cepstral Representation of Voice Recordings for Diagnosing Vocal Disorders

2012

Jacek Grygiel, Paweł Strumiłło, Ewa Niebudek-Bogusz

The aim of this study was to assess the applicability of Mel Frequency Cepstral Coefficients (MFCC) of voice samples in diagnosing vocal nodules and polyps. Patients’ voice samples were analysed acoustically with the measurement of MFCC and values of the first three formants. Classification of mel coefficients was performed by applying the Sammon Mapping and Support Vector Machines. For the tes...

A Novel Hybrid Method for Vocal Fold Pathology Diagnosis Based on Russian Language

Journal: Journal of AI and Data Mining, 2014

Vahid Majidnezhad

In this paper, an initial feature vector for vocal fold pathology diagnosis is first proposed. Then, a genetic algorithm is proposed for optimizing the initial feature vector. Experiments are carried out to evaluate and compare the classification accuracies obtained with different classifiers (ensemble of decision trees, discriminant analysis and k-nearest neig...

On an Advanced Strategy of Gammachirp Wavelet Transform for Isolated Words Recognition using HMM

2014

Khaireddine Salhi, Zied Hajaiej, Noureddine Ellouze

The gammachirp filter bank was used to model human cochlear filtering. Recently, we proposed an auditory feature based on the gammachirp wavelet transform. We found that this feature extractor gives a more significant improvement in robust speaker recognition than conventional acoustic features. The Gammachirp Wavelet Transform Frequency Cepstral Coefficients (GWTFCC) proposed the use of th...
