pesq

Quality control for UMTS-AMR speech channels

2005

Marc Werner, Peter Vary

In UMTS speech transmissions, the Adaptive Multi-Rate (AMR) speech codec is employed to allow for a dynamic assignment of data rates to individual users. The control of AMR modes is based on quality measurements of the transmission channel and aims at the maximization of speech quality by selecting the mode which is best suited to the current interference situation. In contrast to GSM, a reduct...

Objective Speech Quality Estimation using Gaussian Mixture Models

2005

Tiago Henrique Falk, Wai-Yip Geoffrey Chan

In this thesis, we propose the use of Gaussian mixture models (GMMs) as simple, yet effective predictors of perceived speech quality. A large pool of perceptual distortion features is extracted from speech files. Initially, statistical data mining algorithms are used to sift out the most relevant variables from the pool. We show that the five most salient feature variables are sufficient to con...

Audio Data Hiding That Is Robust with Respect to Aerial Transmission and Speech Codecs

2010

Akira Nishimura

A technique for audio data hiding by using subband amplitude modulation was evaluated by computer simulations in terms of robustness with respect to the cumulative effects of reverberations, background noise, and encoding and decoding with a speech codec. Speech signals from 22 speakers and signals from 100 pieces of music were used as the host audio data. Computer simulations revealed that spe...

An Iterative Phase Recovery Framework with Phase Mask for Spectral Mapping with an Application to Speech Enhancement

2016

Kehuang Li, Bo Wu, Chin-Hui Lee

We propose an iterative phase recovery framework to improve spectral mapping with an application to improving the performance of state-of-the-art speech enhancement systems using magnitude-based spectral mapping with deep neural networks (DNNs). We further propose to use an estimated time-frequency mask to reduce sign uncertainty in the overlap-add waveform reconstruction algorithm. In a series...

The Investigation of Frame Disturbance (FD) in Perceptual Evaluation Speech Quality (PESQ) as a Perceptual Metric

2015

Ahmad Zamani Jusoh, Roberto Togneri, Sven Nordholm, Nadzril Sulaiman, Muhamad Haziq Khairolanuar

Satisfying customers’ needs economically is one of the important aspects of the mobile communication industry. Providers should deliver a good and consistent quality of service, as customers expect. Hence, this amounts to controlling the speech quality perceived by the customers. However, to control the speech quality, a reliable measurement of the speech quality must be determined first, t...

Speech quality prediction for artificial bandwidth extension algorithms

2013

Sebastian Möller, Emilia Kelaidi, Friedemann Köster, Nicolas Côté, Patrick Bauer, Tim Fingscheidt, Thomas Schlien, Hannu Pulakka, Paavo Alku

During the transition period from narrowband to wideband speech transmission services, Artificial Bandwidth Extension (ABE) algorithms are able to reduce the perceptual degradation of narrowband-transmitted speech signals by extending the audio bandwidth. In this paper, we analyze whether the resulting speech quality can be predicted reliably with instrumental models. Estimations from the new I...

Non-Uniform Sub-Band Kalman Filtering for Speech Enhancement

2007

Phu Ngoc Le, Eliathamby Ambikairajah

In this paper, a novel method for single-channel speech enhancement based on Kalman filtering is proposed. Instead of applying the Kalman algorithm for full-band speech or uniform sub-band speech, speech enhancement is performed by applying the Kalman algorithm to non-uniform sub-band signals obtained from the decomposition of whole-band speech using gammatone filters. Simulation results indica...

CITISEN: A Deep Learning-Based Speech Signal-Processing Mobile Application

Journal: IEEE Access

2022

This study presents a deep learning-based speech signal-processing mobile application known as CITISEN. CITISEN can perform three functions: speech enhancement (SE), model adaptation (MA), and background noise conversion (BNC), which allow it to be used as a platform for utilizing and evaluating SE models and to flexibly extend the models to address various environments and users. For SE, it downloads an SE model pretrained on a cloud server, then...

Rate Distortion Analyses and Bounds on Speech Codec Performance

2011

Jerry D. Gibson, Ying-Yi Li

We develop new rate distortion bounds for narrowband and wideband speech coding based on composite source models for speech and perceptual PESQ-MOS/WPESQ distortion measures. It is shown that these new rate distortion bounds do in fact lower bound the performance of important standardized speech codecs, including G.726, G.727, AMR-NB, G.729, G.718, G.722, G.722.1, and AMR-WB. The approach is t...

Noise Reduction Using Wavelet Thresholding of Multitaper Estimators and Geometric Approach to Spectral Subtraction for Speech Coding Strategy

2012

Kai Chuan Chu, Charles T. M. Choi

OBJECTIVES: Noise reduction using wavelet thresholding of multitaper estimators (WTME) and geometric approach to spectral subtraction (GASS) can improve speech quality of noisy sound for speech coding strategy. This study used Perceptual Evaluation of Speech Quality (PESQ) to assess the performance of the WTME and GASS for speech coding strategy. METHODS: This study included 25 Mandarin sentenc...
