Non-Intrusive Binaural Speech Intelligibility Prediction from Discrete Latent Representations

Abstract

Non-intrusive speech intelligibility (SI) prediction from binaural signals is useful in many applications. However, most existing signal-based measures are designed to be applied to single-channel signals. Measures that specifically take into account the binaural properties of the signal are often intrusive, characterised by requiring access to a clean speech signal, and typically rely on combining both channels into a single-channel signal before making predictions. This paper proposes a non-intrusive SI measure that computes features from a binaural input signal using a combination of vector quantization (VQ) and contrastive predictive coding (CPC) methods. VQ-CPC feature extraction does not rely on any model of the auditory system and is instead trained to maximise the mutual information between the input signal and the output features. The computed VQ-CPC features are input to a predicting function parameterized by a neural network. Two predicting functions are considered in this paper. Both the feature extractor and the predicting functions are trained on simulated binaural signals with isotropic noise. They are tested on simulated signals with isotropic and real noise. For all signals, the ground truth scores are the (intrusive) deterministic binaural STOI. Results are presented in terms of correlations and MSE and demonstrate that VQ-CPC features are able to capture information relevant to modelling SI and outperform all the considered benchmarks, even when evaluating on data comprising different noise field types.
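As a rough illustration of the vector quantization step named in the abstract, the sketch below maps each continuous feature frame to its nearest entry in a codebook, which is what makes the resulting representation discrete. It is not the authors' implementation: the function name `quantize`, the Euclidean distance, the fixed codebook and the toy values are all assumptions made for this example, and codebook learning and the CPC training objective are left out entirely.

```python
from typing import List, Sequence, Tuple


def quantize(
    frames: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Map each frame to the index and value of its nearest codebook vector."""
    indices: List[int] = []
    quantized: List[List[float]] = []
    for frame in frames:
        # Squared Euclidean distance from this frame to every codebook entry.
        distances = [
            sum((x - c) ** 2 for x, c in zip(frame, code)) for code in codebook
        ]
        # Index of the closest codebook entry (first one wins on ties).
        best = min(range(len(codebook)), key=distances.__getitem__)
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


# Hypothetical toy values, chosen only for illustration.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
frames = [[0.9, 1.2], [0.1, -0.2], [-0.8, 0.7]]
indices, quantized = quantize(frames, codebook)
print(indices)
```

In a trained system the codebook would be learned jointly with the encoder rather than fixed by hand, and the sequence of indices (or the quantized vectors) would be what a downstream predicting function consumes.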

Similar resources

Blind Non-Intrusive Speech Intelligibility Prediction Using Twin-HMMs

Automatic prediction of speech intelligibility is highly desirable in the speech research community, since listening tests are time-consuming and cannot be used online. Most of the available objective speech intelligibility measures are intrusive methods, as they require a clean reference signal in addition to the corresponding noisy/processed signal at hand. In order to overcome the problem of...

Prediction of binaural speech intelligibility against noise in rooms.

In the presence of competing speech or noise, reverberation degrades speech intelligibility not only by its direct effect on the target but also by affecting the interferer. Two experiments were designed to validate a method for predicting the loss of intelligibility associated with this latter effect. Speech reception thresholds were measured under headphones, using spatially separated target ...

Binaural prediction of speech intelligibility in reverberant rooms with multiple noise sources.

When speech is in competition with interfering sources in rooms, monaural indicators of intelligibility fail to take account of the listener's abilities to separate target speech from interfering sounds using the binaural system. In order to incorporate these segregation abilities and their susceptibility to reverberation, Lavandier and Culling [J. Acoust. Soc. Am. 127, 387-399 (2010)] proposed...

Role of binaural hearing in speech intelligibility and spatial release from masking using vocoded speech.

A cochlear implant vocoder was used to evaluate relative contributions of spectral and binaural temporal fine-structure cues to speech intelligibility. In Study I, stimuli were vocoded, and then convolved through head-related transfer functions (HRTFs) to remove speech temporal fine structure but preserve the binaural temporal fine-structure cues. In Study II, the order of processing was revers...

The Influence of Dynamic Binaural Cues on Speech Intelligibility

Binaural cues help in locating and segregating sound sources and can have a significant influence on speech intelligibility in complex acoustic conditions. Spatial separation of speech and noise maskers improves the speech reception thresholds (SRT). In binaurally ambiguous static listening conditions, however, the binaural intelligibility level difference (BILD) might become very small. Dynamic b...

Journal

Journal title: IEEE Signal Processing Letters

Year: 2022

ISSN: 1558-2361, 1070-9908

DOI: https://doi.org/10.1109/lsp.2022.3161115