A data-driven speech enhancement method based on modeled long-range temporal dynamics

Authors

  • Yue Hao
  • Changchun Bao
  • Feng Bao
  • Feng Deng
Abstract

This paper proposes a data-driven speech enhancement method based on modeled long-range temporal dynamics (LRTDs). First, given speech and noise corpora, Gaussian mixture models (GMMs) of the speech and the noise are trained separately using the expectation-maximization (EM) algorithm. Then, the LRTDs are obtained from the GMMs. Next, based on the LRTDs, a noise-robust longest segment searching (NRLSS) method, combined with the vector Taylor series (VTS) approximation algorithm, is used to search the speech and noise corpora for the longest matching speech and noise segments (LMSNS). Finally, the speech spectrum is estimated from the obtained LMSNS, and a modified Wiener filter is constructed to further suppress residual noise. Test results show that the proposed method outperforms state-of-the-art speech enhancement methods.
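
The final filtering stage can be illustrated with a minimal sketch. The abstract does not say how the Wiener filter is modified, so the code below shows only the textbook Wiener gain G = S / (S + N) per frequency bin, with a lower gain floor added as an assumption. The function names `wiener_gain` and `enhance_frame`, the `gain_floor` parameter, and its default value are hypothetical and are not taken from the paper.

```python
import numpy as np


def wiener_gain(speech_psd, noise_psd, gain_floor=0.1):
    """Textbook Wiener gain per frequency bin: G = S / (S + N).

    speech_psd and noise_psd are estimated power spectra of the clean
    speech and the noise for one frame. gain_floor is an assumed lower
    bound on the gain, not a detail from the paper.
    """
    speech_psd = np.asarray(speech_psd, dtype=float)
    noise_psd = np.asarray(noise_psd, dtype=float)
    eps = 1e-12  # guards against division by zero in empty bins
    gain = speech_psd / (speech_psd + noise_psd + eps)
    return np.maximum(gain, gain_floor)


def enhance_frame(noisy_spectrum, speech_psd, noise_psd, gain_floor=0.1):
    """Scale one frame of a noisy (possibly complex) spectrum by the gain."""
    gain = wiener_gain(speech_psd, noise_psd, gain_floor)
    return gain * np.asarray(noisy_spectrum)
```

In a full system, the speech and noise power spectra passed to `wiener_gain` would come from an upstream estimator, such as the matched segments described in the abstract. Here they are plain inputs.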


Similar resources

Speech Enhancement using Adaptive Data-Based Dictionary Learning

In this paper, a speech enhancement method based on sparse representation of data frames is presented. Speech enhancement is one of the most widely applied areas of signal processing. The objective of a speech enhancement system is to improve either the intelligibility or the quality of speech signals. This process is carried out using the speech signal processing techniques ...


A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

The quality of a speech signal is significantly reduced in the presence of environmental noise, which degrades the performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel enhancement of speech corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...


Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...


Speech enhancement based on hidden Markov model using sparse code shrinkage

This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian combination (for clean speech and noise, respectively) in the HMM ...


Inter-frame modeling of DFT trajectories of speech and noise for speech enhancement using Kalman filters

In this paper, a time-frequency estimator for enhancement of noisy speech signals in the DFT domain is introduced. This estimator is based on modeling the time-varying correlation of the temporal trajectories of the short-time (ST) DFT components of the noisy speech signal using autoregressive (AR) models. The time-varying trajectory of the DFT components of speech in each channel is modeled by a...




Publication date: 2015