Single-Microphone Speech Dereverberation: Modulation Domain Processing and Quality Assessment
Abstract
In a reverberant enclosure, acoustic speech signals are degraded by reflections from walls, ceilings, and objects. Restoring speech quality and intelligibility from reverberated speech has received increasing interest over the past few years. Although multichannel dereverberation methods provide some improvement in speech quality and intelligibility, single-channel dereverberation remains an open challenge. Two types of advanced single-channel dereverberation methods, namely acoustic domain spectral subtraction and modulation domain filtering, provide only small improvements in speech quality and intelligibility. In this thesis, we study single-channel dereverberation algorithms. Firstly, an upper bound on the performance of time-frequency masking (TFM) for dereverberation is obtained using ideal time-frequency masking (ITFM). ITFM has access to both the clean and the reverberated speech signals when estimating the binary-mask matrix. ITFM implements binary masking in the short-time Fourier transform (STFT) domain, preserving only those spectral components that are less corrupted by reverberation. The experimental results show that single-channel ITFM outperforms four existing multichannel dereverberation methods and suggest that large potential improvements could be obtained using TFM for speech dereverberation. Secondly, a novel modulation domain spectral subtraction method is proposed for dereverberation. This method estimates the modulation domain long reverberation spectral variance (LRSV) from the time domain LRSV using a statistical room impulse response (RIR) model and implements spectral subtraction in the modulation domain. On one hand, unlike acoustic domain spectral subtraction, our method implements spectral subtraction in the modulation domain, which has been shown to play an important role in speech perception. On the other hand, unlike modulation domain filtering, which uses a time-invariant filter, our method takes into account the changes of the reverberated speech spectral variance over time and implements spectral subtraction adaptively. Objective and informal subjective tests show that our proposed method outperforms two existing state-of-the-art single-channel dereverberation algorithms.
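The ideal binary masking step described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the mask criterion (keep a bin when the clean-to-reverberation magnitude ratio exceeds `threshold_db`), the frame length, the hop size, the synthetic room impulse response, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def stft(x, frame_len=512, hop=256):
    """Hann-windowed short-time Fourier transform; one frame per row."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def ideal_binary_mask(clean, reverberant, threshold_db=0.0, eps=1e-12):
    """Return (mask, X): a 0/1 mask and the reverberant STFT.

    Assumed criterion: a time-frequency bin is kept when the clean
    magnitude exceeds the magnitude of the reverberation component
    (reverberant minus clean) by more than threshold_db.
    """
    S = stft(clean)
    X = stft(reverberant)
    R = X - S  # reverberation component; assumes time-aligned signals
    ratio_db = 20.0 * np.log10((np.abs(S) + eps) / (np.abs(R) + eps))
    mask = (ratio_db > threshold_db).astype(float)
    return mask, X

# Synthetic demo: white noise stands in for clean speech, and an
# exponentially decaying noise tail stands in for a room impulse response.
rng = np.random.default_rng(0)
fs = 16000
clean = rng.standard_normal(fs)
decay = np.exp(-np.arange(4000) / 800.0)
rir = 0.3 * rng.standard_normal(4000) * decay
rir[0] = 1.0  # direct path
reverberant = np.convolve(clean, rir)[:len(clean)]

mask, X = ideal_binary_mask(clean, reverberant)
masked = mask * X  # bins judged heavily reverberated are zeroed

# Sanity case: with no reverberation, every bin should be kept.
mask_same, _ = ideal_binary_mask(clean, clean)
```

Because the mask needs the clean signal, it cannot be used in practice; it only indicates how much a time-frequency masking approach could gain at best. Resynthesis of the masked STFT by overlap-add is omitted here for brevity.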
Similar resources
Single-Microphone Speech Dereverberation based on Multiple-Step Linear Predictive Inverse Filtering and Spectral Subtraction
Single-channel speech dereverberation is a challenging problem of deconvolution of reverberation, produced by the room impulse response, from the speech signal, when only one observation of the reverberant signal (one microphone) is available. Although reverberation in mild...
One Microphone Blind Dereverberation Based on Quasi-periodicity of Speech Signals
Speech dereverberation is desirable with a view to achieving, for example, robust speech recognition in the real world. However, it is still a challenging problem, especially when using a single microphone. Although blind equalization techniques have been exploited, they cannot deal with speech signals appropriately because their assumptions are not satisfied by speech signals. We propose a new...
A General Framework for Incorporating Time-Frequency Domain Sparsity in Multichannel Speech Dereverberation
Blind multichannel speech dereverberation methods based on multichannel linear prediction (MCLP) estimate the dereverberated speech component without any knowledge of the room acoustics by estimating and subtracting the undesired reverberant component from the reference microphone signal. In this paper we present a general framework for incorporating sparsity in the time-frequency domain into M...
Combined Frequency-domain Dereverberation and Noise Reduction Technique for Multi-microphone Speech Enhancement
In this paper, a frequency-domain technique is described for estimating the acoustic transfer functions when reverberated speech signals are corrupted by spatially coloured noise. This technique is an extension of the frequency-domain procedure of [1], which is only optimal in the case of spatially white noise. Using the estimated acoustic transfer functions, dereverberation can be performed wit...
Title Placeholder
A speech signal captured by a distant microphone is generally contaminated by reverberation and background noise, which severely degrade the automatic speech recognition (ASR) performance. In this paper, we first extend a previously proposed single channel dereverberation algorithm to a multi-channel scenario. The method estimates late reflections using multichannel multi-step linear prediction...
Publication date: 2011