نتایج جستجو برای: ideal binary mask
تعداد نتایج: 224329 فیلتر نتایج به سال:
This paper proposes a framework for separating several speech sources in non-ideal, reverberant environments. A movable human dummy head residing in a normal office room is used to model the conditions humans experience when listening to complex auditory scenes. Before the source separation takes place the human dummy head explores the auditory scene and extracts characteristics the same way as...
Monaural speech segregation is an important problem in robust speech processing and has been formulated as a supervised learning problem. In supervised learning methods, the ideal binary mask (IBM) is usually used as the target because of its simplicity and large speech intelligibility gains. Recently, the ideal ratio mask (IRM) has been found to improve the speech quality over the IBM. However...
In a natural environment, speech signals are degraded by both reverberation and concurrent noise sources. While human listening is robust under these conditions using only two ears, current two-microphone algorithms perform poorly. The psychological process of figure-ground segregation suggests that the target signal is perceived as a foreground while the remaining stimuli are perceived as a ba...
To date, the most commonly used outcome measure for assessing ideal binary mask estimation algorithms is based on the difference between the hit rate and the false alarm rate (H-FA). Recently, the error distribution has been shown to substantially affect intelligibility. However, H-FA treats each mask unit independently and does not take into account how errors are distributed. Alternatively, a...
In this paper, we present a real-time implementation of the ideal binary-mask algorithm, which is a promising approach for enhancing speech intelligibility. Our implementation is hardware efficient, making it suitable for embedded biomedical devices such as hearing aids and cochlear implants. We tested our algorithm implementation on an FPGA platform, and produced results that verify that it ef...
Processing noisy signals using the ideal binary mask has been shown to improve automatic speech recognition (ASR) performance. In this paper, we present the first study that investigates the role of mask patterns in ASR under varying signalto-noise ratios (SNR), noise conditions and mask definitions. Binary masks are typically computed either by comparing the local SNR within a time-frequency u...
A time-varying Wiener filter extracts a speech signal from a mixture using the a priori signal-to-noise ratio in a local time-frequency unit. We estimate this ratio using a binaural processor and derive a ratio time-frequency mask. This mask is used to extract the speech, which is then fed to a conventional speech recognizer operating in the cepstral domain. We compare the performance of this s...
Identification and extraction of singing voice from within musical mixtures is a key challenge in source separation and machine audition. Recently, deep neural networks (DNN) have been used to estimate 'ideal' binary masks for carefully controlled cocktail party speech separation problems. However, it is not yet known whether these methods are capable of generalizing to the discrimination of vo...
A time-varying Weiner filter extracts the speech signal from a noisy mixture using the a priori signal-to-noise ratio in a local time-frequency unit. We estimate this ratio using a binaural processor and derive a ratio time-frequency mask. This mask is used to extract the speech signal, which is then fed to a conventional speech recognizer operating in the cepstral domain. We compare the perfor...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید