Search results for: mel frequency cel cepstrum mfcc

Number of results: 490625

2010
GEORGE TZANETAKIS LUIS GUSTAVO MARTINS RANDY JONES

Recording engineers, mixers, and producers play important yet often overlooked roles in defining the sound of a particular record, artist, or group. The placement of different sound sources in space using stereo panning is an important component of their work. Stereo panning information typically is not utilized in music information retrieval (MIR) tasks such as genre and artist classification....

Journal: CoRR 2014
Pedro Girão Antunes David Martins de Matos Ricardo Ribeiro Isabel Trancoso

In late 2011, Fado was elevated to the oral and intangible heritage of humanity by UNESCO. This study aims to develop a tool for automatic detection of Fado music based on the audio signal. To do this, frequency spectrum-related characteristics were captured from the audio signal: in addition to the Mel Frequency Cepstral Coefficients (MFCCs) and the energy of the signal, the signal was further...

2002
O. Farooq

In this paper we propose a filter bank structure derived by using admissible wavelet packet transform. These filters have Mel scale spacing and have an advantage of easy implementation with higher resolution in time-frequency domain because of wavelet transform. The features are obtained by first calculating the energy in each filter band and then applying the Discrete Cosine Transform (DCT) to...
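The last step named in this abstract (band energies followed by a DCT) can be sketched in a few lines. This is a minimal illustration, not the paper's method: the wavelet-packet filter bank itself is not reproduced, the band energies are taken as given input, and both the log compression and the mel formula below are conventional MFCC choices assumed here rather than details stated in the abstract. The names `hz_to_mel`, `dct_ii`, and `cepstral_features` are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Map frequency in Hz to mels using a common formula (an assumption here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def dct_ii(values):
    """Unnormalised DCT-II of a list of floats."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def cepstral_features(band_energies, num_coeffs):
    """Log-compress per-band energies (assumed step), then keep the first DCT terms."""
    log_energies = [math.log(e) for e in band_energies]
    return dct_ii(log_energies)[:num_coeffs]


# Hypothetical energies for four mel-spaced bands of one frame.
features = cepstral_features([2.0, 2.0, 2.0, 2.0], num_coeffs=3)
```

With a flat set of band energies, every coefficient after the first is zero, which reflects how the DCT separates overall level from spectral shape.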

2000
Dan Chazan Ron Hoory Gilad Cohen Meir Tzur

This paper presents a novel low complexity, frequency domain algorithm for reconstruction of speech from the mel-frequency cepstral coefficients (MFCC), commonly used by speech recognition systems, and the pitch frequency values. The reconstruction technique is based on the sinusoidal speech representation. A set of sine-wave frequencies is derived using the pitch frequency and voicing decisions, ...

2006
Nengheng Zheng Ning Wang Tan Lee Pak-Chung Ching

This paper describes a speaker verification system which uses two complementary acoustic features: Mel-frequency cepstral coefficients (MFCC) and wavelet octave coefficients of residues (WOCOR). While MFCC characterizes mainly the spectral envelope, or the formant structure of the vocal tract system, WOCOR aims at representing the spectro-temporal characteristics of the vocal source excitation....

1998
Stefan Slomka Sridha Sridharan Vinod Chandran

Input level fusion and output level fusion methods are compared for fusing Mel-frequency Cepstral Coefficients with their corresponding delta coefficients. A 49 speaker subset of the King database is used under wideband and telephone conditions. The best input level fusion system is more computationally complex than the output level fusion system. Both input and output fusion systems were able ...
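Input level fusion of static and delta coefficients is usually understood as concatenating the two into one feature vector per frame. The sketch below assumes that reading, and it assumes the widely used regression formula for deltas with edge frames repeated at the boundaries; the abstract specifies neither. The names `delta` and `fuse_input_level` and the `width` parameter are hypothetical.

```python
def delta(frames, width=2):
    """Regression-based delta coefficients; edge frames are repeated (assumed)."""
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for d in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][d]
                behind = frames[max(t - n, 0)][d]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def fuse_input_level(frames, width=2):
    """Concatenate each static vector with its delta vector (input level fusion)."""
    return [static + dyn for static, dyn in zip(frames, delta(frames, width))]


# Hypothetical one-dimensional "MFCC" track rising by 1 per frame.
track = [[0.0], [1.0], [2.0], [3.0], [4.0]]
fused = fuse_input_level(track)
```

Output level fusion would instead train separate models on the static and delta streams and combine their scores, which is why the abstract can compare the computational cost of the two approaches.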

2011
Marius Vasile Ghiurcau Corneliu Rusu Jaakko Astola

The goal of this paper is to assess the effect of emotional state of a speaker when text-independent speaker identification is performed. Mel-frequency cepstral coefficients are the features of the speech signal used for speaker recognition. For training the speaker models and testing the system, Support Vector Machines are employed. Berlin emotional speech database, which contains 10 different...

2005
Jonathan Darch Ben P. Milner Xu Shao Saeed Vaseghi Qin Yan

This work proposes a novel method of predicting formant frequencies from a stream of mel-frequency cepstral coefficients (MFCC) feature vectors. Prediction is based on modelling the joint density of MFCCs and formant frequencies using a Gaussian mixture model (GMM). Using this GMM and an input MFCC vector, two maximum a posteriori (MAP) prediction methods are developed. The first method predict...

2005
Yi Yu Chiemi Watanabe Kazuki Joe

In this paper we present a fast and efficient match algorithm, which consists of two key techniques: Spectral Correlation Based Feature Merge (SCBFM) and Two-Step Retrieval (TSR). SCBFM can remove the redundant information. In consequence, the resulting feature sequence has a smaller size, requiring less storage and computation. In addition, most of the tempo variation is removed; thus a much sim...

2006
Hiroko Terasawa Malcolm Slaney Jonathan Berger

We describe a perceptual space for timbre, define an objective metric that takes into account perceptual orthogonality and measure the quality of timbre interpolation. We discuss two timbre representations and measure perceptual judgments on an equivalent range of timbre variety. We determine that a timbre space based on Mel-frequency cepstral coefficients (MFCC) is a good model for a perceptua...
