audio input flooding

Narrowband perceptual audio coding: enhancements for speech

2001

Hossein Najaf-Zadeh Peter Kabal

This paper presents a bi-modal coding paradigm to compress narrowband audio signals at 8 kbit/s. In the general mode, the Enhanced Narrowband Audio Coder (ENPAC) exploits the characteristics of the human hearing system to adaptively code the perceptually important spectral components of the input audio. The other mode is employed to handle audio inputs with a strong harmonic structure. In that ...


Face Synthesis Driven by Audio Speech Input Based on Hmms

2002

Ling SUN Wei LAI Ren-Hua WANG

In this paper, an HMM-based visual speech system driven by audio speech input is designed to render a face model while synchronous audio is played. Our approach differs considerably from the methods adopted by many other researchers. We first train the models for every final and initial in Mandarin. In this process, a large quantity of audio training data under differe...


Robust Identification of Time-Scaled Audio

2004

Rolf Bardeli Frank Kurth

Automatic identification of audio titles on radio broadcasts is a first step towards automatic annotation of radio programmes. Systems designed for the purpose of identification have to deal with a variety of postprocessing potentially imposed on audio material at the radio stations. One of the more difficult techniques to be handled is time-scaling, i.e., the variation of playback speed. In th...


A Single Core Hardware Approach of MPEG Audio Decoder for Real-Time Transmission

2012

Mohd Marufuzzaman

Decoding the voice audio bit stream is a challenge for real-time transmission of high-quality voice audio over the Internet. A stand-alone chip that performs the decoding is a better solution than a software approach. MPEG audio compression provides high compression with minimal loss. This study describes a VHDL model of an MPEG audio layer 1 decoder that performs concurrent processing while...


Don't Look at Me, I'm Talking to You: Investigating Input and Output Modalities for In-Vehicle Systems

2011

Lars Holm Christiansen Nikolaj Yde Frederiksen Brit Susan Jensen Alex Ranch Mikael B. Skov Nissan Thiruravichandran

With a growing number of in-vehicle systems integrated in contemporary cars, the risk of driver distraction and lack of attention to the primary task of driving is increasing. One major research area concerns eyes-off-the-road and mind-off-the-road that are manifested in different ways for input and output techniques. In this paper, we investigate in-vehicle systems input and output techniques t...


Bi-directional AES/EBU Digital Audio and Remote Power over a single Cable

1999

Adrian Freed

Although the AES/EBU digital audio standard has already been adapted to optical, twisted pair [1] and coaxial cables [2], this paper explores cabling options for new and emerging applications of digital audio communications. Enabling features include remotely powering devices over the audio cable and bi-directional communications. Remote power benefits both ends of the audio reproduction chain:...


Wireless Audio Effects Processor

2007

Lohith Kini Spyros Zoumpoulis Rahul Shroff

Our final project is to design and implement an audio system in which the audio signal is input, wirelessly transmitted, filtered and equalized. The audio is input through a microphone and digitized using an analog-to-digital converter with the LM4550 AC’97 codec. The audio signal is then compressed using the Discrete Cosine Transform (DCT) and is wirelessly transmitted and ...


A Multimodal Listener Behaviour Driven by Audio Input

2010

Etienne de Sevin Elisabetta Bevacqua Sathish Pammi Catherine Pelachaud Marc Schröder Björn Schuller

Our aim is to build a platform allowing a user to chat with a virtual agent. The agent displays audio-visual backchannels as a response to the user’s verbal and nonverbal behaviours. Our system takes as inputs the audio-visual signals of the user and synchronously outputs the audio-visual behaviours of the agent. In this paper, we describe the SEMAINE architecture and the data flow that goes from...


Exploring Cognitivist and Emotivist Positions of Musical Emotion Using Neural Network Models

2013

Naresh N. Vempala Frank A. Russo

There are two positions in the classic debate regarding musical emotion: the cognitivist position and the emotivist position. According to the cognitivist position, music expresses emotion but does not induce it in listeners. So, listeners may recognize emotion in music without feeling it, unlike real, everyday emotion. According to the emotivist position, listeners not only recognize emotion b...


Multimodal voice conversion based on non-negative matrix factorization

Journal: EURASIP J. Audio, Speech and Music Processing 2015

Kenta Masaka Ryo Aihara Tetsuya Takiguchi Yasuo Ariki

A multimodal voice conversion (VC) method for noisy environments is proposed. In our previous non-negative matrix factorization (NMF)-based VC method, source and target exemplars are extracted from parallel training data, in which the same texts are uttered by the source and target speakers. The input source signal is then decomposed into source exemplars, noise exemplars, and their weights. Th...
