Towards single-channel unsupervised source separation of speech mixtures: the layered harmonics/formants separation-tracking model
Abstract
Speaker models for blind source separation are typically based on HMMs consisting of vast numbers of states to capture source spectral variation, and are trained on large amounts of isolated speech. Since observations can be similar between sources, inference relies on sequential constraints from the state transition matrix, which are, however, quite weak. To avoid these problems, we propose a strategy of capturing local deformations of the time-frequency energy distribution. Since consecutive spectral frames are highly correlated, each frame can be accurately described as a nonuniform deformation of its predecessor. A smooth pattern of deformations is indicative of a single speaker, while cliffs in the deformation fields may indicate a speaker switch. Further, the log-spectrum of speech can be decomposed into two additive layers, separately describing the harmonics and the formant structure. We model smooth deformations as hidden transformation variables in both layers, using MRFs whose observations are overlapping subwindows of the log-spectrum, assumed to be a noisy sum of the two layers. Loopy belief propagation provides efficient inference. Without any pre-trained speech or speaker models, this approach can be used to fill in missing time-frequency observations, and the local entropy of the deformation fields indicates source boundaries for separation.
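As a rough, non-authoritative illustration of the deformation-field idea, the sketch below matches small overlapping frequency subwindows of each log-spectrum frame against shifted versions of its predecessor and measures the entropy of the resulting shift distribution. Exhaustive local matching with a softmax stands in for the paper's two-layer MRF and loopy belief propagation, and all names and parameters (`deformation_entropy`, the STFT sizes, `max_shift`) are assumptions of this sketch, not the authors' implementation.

```python
# A minimal sketch of the deformation-field idea (not the paper's full
# MRF / loopy-BP model): each log-spectrum frame is matched against its
# predecessor by locally shifting small frequency subwindows, and the
# entropy of the resulting shift "posterior" is a cue for source boundaries.
import numpy as np
from scipy.signal import stft

def deformation_entropy(x, fs, win=64, hop=16, max_shift=4):
    """Per-frame mean entropy of local frequency-shift distributions."""
    _, _, Z = stft(x, fs=fs, nperseg=512, noverlap=384)
    S = np.log(np.abs(Z) + 1e-8)            # log-magnitude spectrogram (F x T)
    F, T = S.shape
    starts = range(0, F - win, hop)          # overlapping subwindows in frequency
    ent = np.zeros(T)
    for t in range(1, T):
        h = []
        for f0 in starts:
            ref = S[f0:f0 + win, t]
            # score each candidate shift of the previous frame's subwindow
            scores = np.array([
                -np.sum((ref - S[f0 + d:f0 + d + win, t - 1]) ** 2)
                for d in range(-max_shift, max_shift + 1)
                if 0 <= f0 + d and f0 + d + win <= F
            ])
            p = np.exp(scores - scores.max())
            p /= p.sum()                     # soft "posterior" over shifts
            h.append(-np.sum(p * np.log(p + 1e-12)))
        ent[t] = np.mean(h)                  # smooth fields -> low entropy
    return ent
```

Frames where this mean entropy spikes correspond to "cliffs" in the deformation field and are candidate source boundaries; smooth, low-entropy stretches suggest a single speaker.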
Similar resources
Reconstructing individual monophonic instruments from musical mixtures using scene completion
Monaural sound source separation is the process of separating sound sources from a single-channel mixture. In mixtures of pitched musical instruments, the problem of overlapping harmonics poses a significant challenge to source separation and reconstruction. One standard method to resolve overlapped harmonics is based on the assumption that harmonics of the same source have correlated amplitude...
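As a minimal sketch of the correlated-amplitude assumption mentioned above (not the cited paper's scene-completion method), the snippet below greedily groups harmonic amplitude envelopes by pairwise correlation; the function name and the 0.9 threshold are illustrative assumptions.

```python
# Group harmonics whose amplitude envelopes rise and fall together,
# treating each group as belonging to one source.
import numpy as np

def group_by_envelope_correlation(envelopes, thresh=0.9):
    """Greedily cluster harmonic amplitude envelopes (H x T) by correlation."""
    H = len(envelopes)
    C = np.corrcoef(envelopes)               # pairwise envelope correlations
    groups, assigned = [], set()
    for h in range(H):
        if h in assigned:
            continue
        group = [h] + [k for k in range(h + 1, H)
                       if k not in assigned and C[h, k] > thresh]
        assigned.update(group)
        groups.append(group)
    return groups
```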
Denoising through source separation and minimum tracking
In this paper, we develop a multi-channel noise reduction algorithm based on blind source separation (BSS). In contrast to general BSS algorithms that attempt to recover all the signals, we explicitly estimate only the speech signal. By tracking the minimum of the spectral density of the microphone signals, noise-only segments are identified. The coefficients of the unmixing matrix that are nec...
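A minimal sketch of minimum tracking in the spirit of minimum statistics, not the cited paper's exact algorithm: a running minimum of the recursively smoothed power spectrum serves as a noise-floor estimate, and frames whose bins sit near that floor are flagged as noise-only. The smoothing constant, window length, and thresholds are assumptions of this sketch.

```python
# Flag noise-only frames by tracking the minimum of the smoothed
# spectral density in each frequency bin.
import numpy as np

def noise_only_frames(P, win=100, alpha=0.9, margin=2.0):
    """P: power spectrogram (F x T). Returns a boolean mask over frames."""
    F, T = P.shape
    smooth = np.empty_like(P)
    smooth[:, 0] = P[:, 0]
    for t in range(1, T):                     # recursive smoothing per bin
        smooth[:, t] = alpha * smooth[:, t - 1] + (1 - alpha) * P[:, t]
    floor = np.empty_like(P)
    for t in range(T):                        # sliding-window minimum
        floor[:, t] = smooth[:, max(0, t - win + 1):t + 1].min(axis=1)
    # a frame is "noise-only" if most bins sit near the tracked minimum
    near_floor = smooth < margin * floor
    return near_floor.mean(axis=0) > 0.8
```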
Self-adaption in single-channel source separation
Single-channel source separation (SCSS) usually uses pre-trained source-specific models to separate the sources. These models capture the characteristics of each source and perform well when they match the test conditions. In this paper, we extend the applicability of SCSS. We develop an EM-like iterative adaptation algorithm that is capable of adapting the pre-trained models to the changed char...
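As a generic, hedged illustration of EM-style model adaptation (the abstract above is truncated, so this is not the cited paper's actual algorithm): a few EM iterations re-estimate the means of a pre-trained diagonal-covariance GMM on test data while keeping variances and weights fixed, nudging the model toward changed conditions. All shapes and the iteration count are assumptions.

```python
# Adapt only the means of a pre-trained diagonal GMM to new data.
import numpy as np

def adapt_gmm_means(X, means, variances, weights, n_iter=5):
    """X: (N, D) frames; means/variances: (K, D); weights: (K,)."""
    for _ in range(n_iter):
        # E-step: responsibilities under diagonal Gaussians
        logp = -0.5 * (((X[:, None, :] - means) ** 2) / variances
                       + np.log(2 * np.pi * variances)).sum(-1)
        logp += np.log(weights)
        logp -= logp.max(axis=1, keepdims=True)
        R = np.exp(logp)
        R /= R.sum(axis=1, keepdims=True)     # (N, K)
        # M-step: update only the means, keeping variances/weights fixed
        means = (R.T @ X) / R.sum(axis=0)[:, None]
    return means
```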
Multi-channel Source Separation by Beamforming Trained with Factorial HMMs
Speaker separation has conventionally been treated as a problem of Blind Source Separation (BSS). This approach does not utilize any knowledge of the statistical characteristics of the signals to be separated, relying mainly on the independence between the various signals to separate them. Maximum-likelihood techniques, on the other hand, utilize knowledge of the a priori probability distributi...
Source-Filter-Based Single-Channel Speech Separation Using Pitch Information
In this paper, we investigate the source–filter-based approach for single-channel speech separation. We incorporate source-driven aspects into the model-driven method via multi-pitch estimation. For multi-pitch estimation, the factorial HMM is utilized. For modeling the vocal tract filters, either vector quantization (VQ) or non-negative matrix factorization (NMF) is considered. For both methods, the fi...
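Since NMF is one of the two vocal-tract modeling options named above, here is a minimal NMF sketch using the classic multiplicative updates for the Euclidean objective; the rank, iteration count, and seeding are illustrative assumptions, and this is not the cited paper's full source-filter system.

```python
# Factor a nonnegative spectrogram into spectral templates and activations.
import numpy as np

def nmf(V, rank=20, n_iter=200, eps=1e-9):
    """Factor V (F x T) as W (F x rank) @ H (rank x T), all nonnegative."""
    rng = np.random.default_rng(0)
    F, T = V.shape
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, T)) + eps
    for _ in range(n_iter):                   # Euclidean-distance updates
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
    return W, H
```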
Publication date: 2004