Robust Speech Detection with Heteroscedastic Discriminant Analysis Applied to the Time-frequency Energy

Authors

  • Ye Tian
  • Zuoying Wang
  • Dajin Lu
Abstract

In this paper, we propose a robust speech detection algorithm that applies Heteroscedastic Discriminant Analysis (HDA) to the Time-Frequency Energy (TFE). The TFE consists of the log energy in the time domain, the log energy in the fixed band 250-3500 Hz, and the log energies of the Mel-scale frequency bands. A bottom-up algorithm with automatic threshold adjustment is used for accurate word boundary detection. Compared with algorithms based on the time-domain energy [1], the ATF parameter [2], and the energy and the LDA-MFCC parameter [3], the proposed algorithm performs better under different types of noise.
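The three TFE components named in the abstract can be sketched for a single frame as follows. This is only an illustration, not the authors' implementation: the 8 kHz sample rate, the Hamming window, the number of Mel bands (`n_mel`), the triangular filter shape, and the helper names `mel_filterbank` and `tfe_features` are all assumptions, since the abstract does not specify them. The HDA projection and the threshold-adjusting boundary detector are not shown.

```python
import numpy as np

def hz_to_mel(f):
    # Common Mel-scale formula (an assumption; the abstract gives none).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_bands, n_fft, sample_rate):
    """Triangular filters spaced evenly on the Mel scale (assumed design)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_bands + 2))
    bank = np.zeros((n_bands, freqs.size))
    for i in range(n_bands):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def tfe_features(frame, sample_rate=8000, n_mel=8, eps=1e-10):
    """Return [log time energy, log 250-3500 Hz energy, log Mel band energies]."""
    frame = np.asarray(frame, dtype=float)
    n_fft = frame.size
    # Power spectrum of the windowed frame.
    power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)

    log_e_time = np.log(np.sum(frame ** 2) + eps)        # time-domain log energy
    band = (freqs >= 250.0) & (freqs <= 3500.0)
    log_e_band = np.log(np.sum(power[band]) + eps)       # fixed-band log energy
    log_e_mel = np.log(mel_filterbank(n_mel, n_fft, sample_rate) @ power + eps)
    return np.concatenate(([log_e_time, log_e_band], log_e_mel))

# Usage: one 32 ms frame of a 1 kHz tone sampled at 8 kHz.
t = np.arange(256) / 8000.0
features = tfe_features(np.sin(2 * np.pi * 1000 * t))
```

With these assumed settings, each frame yields a vector of `2 + n_mel` values, which is the kind of input a discriminant transform such as HDA would then project to a lower-dimensional feature.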

Similar resources

Robust speech/non-speech detection using LDA applied to MFCC

In speech recognition, speech/non-speech detection must be robust to noise. In this work, a new method for speech/non-speech detection using Linear Discriminant Analysis (LDA) applied to Mel Frequency Cepstrum Coefficients (MFCC) is presented. The energy is the most discriminant parameter between noise and speech. But with this single parameter, the speech/non-speech detection system detects...

Robust speech/non-speech detection using LDA applied to MFCC for continuous speech recognition

Continuous speech recognition applications need precise detection because the number of words to recognize is unknown and vocabulary words can be short. The speech/non-speech detection must be robust in its boundary precision. In this work, a new approach to evaluating detection algorithms for continuous speech recognition is presented. The speech/non-speech detection using energy parameter combin...

Acoustic and Data-driven Features for Robust Speech Activity Detection

In this paper we evaluate different features for speech activity detection (SAD). Several signal processing techniques are used to derive acoustic features that capture attributes of speech useful in differentiating speech segments in noise. The acoustic features include short-term spectral features, long-term modulation features both derived using Frequency Domain Linear Prediction (FDLP), and...

Long Span Features and Minimum Phoneme Error Heteroscedastic Linear Discriminant Analysis

In this paper we explore the effect of long-span features, resulting from concatenating multiple speech frames and projecting the resulting vector onto a subspace using Linear Discriminant Analysis (LDA) techniques. We show that LDA is not always effective in selecting the optimal combination of long-span features, and introduce a discriminative feature analysis method that seeks to minimize ph...

Audio classification using dominant spatial patterns in time-frequency space

This paper presents a novel audio discrimination algorithm using spatial features in time-frequency (TF) space. Three types of audio signals – speech, music without vocal and music with background vocal are taken into consideration for classification. The audio segment is transformed into TF domain yielding the spatial illustration of energy. Nonnegative matrix factorization (NMF) is applied to...

Publication date: 2002