audio signal

Comparing audio- and a-posteriori-probability-based stream confidence measures for audio-visual speech recognition

2001

Martin Heckmann, Thorsten Wild, Frédéric Berthommier, Kristian Kroschel

During the fusion of audio and video information for speech recognition, estimating the reliability of the noise-affected audio channel is crucial for obtaining meaningful recognition results. In this paper we compare two types of reliability measures. One uses the statistics of the phoneme a-posteriori probabilities and the other analyzes the audio signal itself. We implemen...

An Audio Blind Watermarking Scheme Based on DWT-SVD

Journal: JSW 2013

Yong-mei Cai, Wen-qiang Guo, Hai-yang Ding

To protect the copyright of digital audio and video products on networks, an improved blind audio watermarking scheme based on the discrete wavelet transform (DWT) and singular value decomposition (SVD) is proposed. In the algorithm, the original audio is split into blocks and each block is decomposed by a two-level discrete wavelet transform, then first quarter audio approximate...
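The front-end steps named in this abstract (blocking, a two-level DWT, then an SVD) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' scheme: the Haar wavelet, the block length of 64, the square reshaping of the approximation coefficients, and the helper names `haar_step` and `block_singular_values` are all hypothetical choices, and the truncated embedding step is not reproduced.

```python
import numpy as np

def haar_step(x):
    """One level of an orthonormal Haar DWT (assumed wavelet): returns (approximation, detail)."""
    pairs = x.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
    return approx, detail

def block_singular_values(signal, block_len=64):
    """Split the signal into blocks, apply a two-level Haar DWT to each block,
    and return the singular values of each block's approximation coefficients."""
    signal = np.asarray(signal, dtype=float)
    n_blocks = len(signal) // block_len
    result = []
    for b in range(n_blocks):
        block = signal[b * block_len:(b + 1) * block_len]
        a1, _ = haar_step(block)      # level 1: half the block length
        a2, _ = haar_step(a1)         # level 2: a quarter of the block length
        side = int(np.sqrt(len(a2)))  # hypothetical choice: square matrix for the SVD
        matrix = a2[:side * side].reshape(side, side)
        result.append(np.linalg.svd(matrix, compute_uv=False))
    return np.array(result)

rng = np.random.default_rng(0)
audio = rng.standard_normal(256)      # stand-in for real audio samples
sv = block_singular_values(audio)     # one row of singular values per block
```

With 256 samples and 64-sample blocks, `sv` holds four rows of four singular values each. A watermarking scheme of this family would typically modify such values before inverting the transforms, but how this paper does so is not visible in the truncated abstract.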

Large Vocabulary Audio-Visual Speech Recognition Using Active Shape Models

2000

Tanveer A. Faruquie, Abhik Majumdar, Nitendra Rajput, L. Venkata Subramaniam

Orthogonal information present in the video signal associated with the audio helps in improving the accuracy of a speech recognition system. Audio-visual speech recognition involves extraction of both the audio as well as visual features from the input signal. Extraction of visual parameters is done by the recognition of speech dependent features from the video sequence. This paper uses geometr...

Speech intelligibility derived from asynchronous processing of auditory-visual information

2001

Ken W. Grant, Steven Greenberg

The current study examines the temporal parameters associated with cross-modal integration of auditory-visual information for sentential material (Harvard/IEEE sentences). The speech signal was filtered into 1/3-octave channels, all of which were discarded (in the primary experiment) save for a low-frequency (298-375 Hz) and a high-frequency (4762-6000 Hz) band. The intelligibility of this audi...

Common Acoustical Pole Estimation from Multi-Channel Musical Audio Signals

Journal: IEICE Transactions 2006

Takuya Yoshioka, Takafumi Hikichi, Masato Miyoshi, Hiroshi G. Okuno

This paper describes a method for estimating the amplitude characteristics of poles common to multiple room transfer functions from musical audio signals received by multiple microphones. Knowledge of these pole characteristics would make it easier to manipulate audio equalizers, since they correspond to the room resonance. It has been proven that an estimate of the poles can be calculated prec...

Required bit rate of 22.2 multichannel audio signal compressed by MPEG-H 3D Audio to meet broadcast quality

Journal: Acoustical Science and Technology 2018

Using Impulse Response Technology to Recreate the Sonic Characteristics of Analog Microphone Preamps and Acoustic Spaces

2013

Kevin Boettger

The study of impulse response technology allows audio engineers to attempt to recreate the tonal characteristics of a certain device. The purpose of this project was to recreate a sonic representation of analog microphone preamplifiers and acoustic spaces through the use of impulse response technology. By creating these impulses, the student will attain the sonic characteristics of professiona...

Dynamic Stream Weighting for Turbo-Decoding-Based Audiovisual ASR

2016

Sebastian Gergen, Steffen Zeiler, Ahmed Hussen Abdelaziz, Robert M. Nickel, Dorothea Kolossa

Automatic speech recognition (ASR) enables very intuitive human-machine interaction. However, signal degradations due to reverberation or noise reduce the accuracy of audio-based recognition. The introduction of a second signal stream that is not affected by degradations in the audio domain (e.g., a video stream) increases the robustness of ASR against degradations in the original domain. Here,...
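As a rough illustration of what stream weighting means in audiovisual ASR, the sketch below combines per-frame audio and video log-likelihoods log-linearly with a frame-dependent weight. It is a generic textbook-style combination under stated assumptions, not the turbo-decoding method of this paper; the function name `fuse_streams`, the example scores, and the weight values are hypothetical.

```python
import numpy as np

def fuse_streams(log_p_audio, log_p_video, weight):
    """Log-linear fusion: weight * audio + (1 - weight) * video, per frame and class.

    `weight` may be a scalar or one value per frame; it is clipped to [0, 1].
    """
    log_p_audio = np.asarray(log_p_audio, dtype=float)
    log_p_video = np.asarray(log_p_video, dtype=float)
    weight = np.clip(np.asarray(weight, dtype=float), 0.0, 1.0)
    if weight.ndim == 1:
        weight = weight[:, None]  # broadcast one weight across the classes of a frame
    return weight * log_p_audio + (1.0 - weight) * log_p_video

# Two frames, three classes (made-up scores for illustration only).
audio = np.log([[0.7, 0.2, 0.1], [0.4, 0.3, 0.3]])
video = np.log([[0.2, 0.6, 0.2], [0.1, 0.8, 0.1]])
weights = np.array([1.0, 0.0])    # frame 0 trusts audio fully, frame 1 trusts video fully
fused = fuse_streams(audio, video, weights)
decisions = fused.argmax(axis=1)  # best class per frame
```

In a dynamic scheme the weights would come from a per-frame reliability estimate of the audio stream, so that noisy frames lean on the video stream instead.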

Audio Steganography for Covert Data Transmission by Imperceptible Tone Insertion

2004

Kaliappan Gopalan, Stanley Wenndt

This paper presents a technique for embedding data in an audio signal by inserting low-power tones, and examines its robustness to noise and to cropping of the embedded speech samples. Experiments on the embedding procedure applied to cover audio utterances from the noise-free TIMIT database and a noisy database demonstrate the feasibility of the technique in terms of imperceptible embedding, high data rate and ac...

Mixed Watermarking-Fingerprinting Approach for Integrity Verification of Audio Recordings

2002

Emilia Gómez, Pedro Cano, Leandro de C. T. Gomes, Eloi Batlle, Madeleine Bonnet

We introduce a method for audio-integrity verification based on a combination of watermarking and fingerprinting. An audio fingerprint is a perceptual digest that holds content information of a recording and allows one to identify it from other recordings. Integrity verification is performed by embedding the fingerprint into the audio signal itself by means of a watermark. The original fingerpr...
