Visual speech synthesis for speech perception experiments
Authors
Abstract
Similar resources
Audio-visual speech perception without speech cues
A series of experiments was conducted in which listeners were presented with audio-visual sentences in a transcription task. The visual components of the stimuli consisted of a male talker’s face. The acoustic components consisted of: (1) natural speech, (2) envelope-shaped noise, which preserved the duration and amplitude of the original speech waveform, and (3) various types of sinewave speech ...
Visual Speech Synthesis With Concatenative Speech
Today synthetic speech is often based on concatenation of natural speech, i.e. units such as diphones or polyphones are taken from natural speech and are then put together to form any word or sentence [5]. So far there have mainly been two ways of adding a visual modality to such a synthesis: Morphing between single images or concatenating video sequences. In this study, however, a new method i...
Close Copy Speech Synthesis for Speech Perception Testing
The present study is concerned with developing a speech synthesis subcomponent for perception testing in the context of evaluating cochlear implants in children. We provide a detailed requirements analysis, and develop a strategy for maximally high quality speech synthesis using Close Copy Speech synthesis techniques with a diphone based speech synthesiser, MBROLA. The close copy concept used i...
Laterality in Visual Speech Perception
The lateralization of visual speech perception was examined in 3 experiments. Participants were presented with a realistic computer-animated face articulating 1 of 4 consonant-vowel syllables without sound. The face appeared at 1 of 5 locations in the visual field. The participants' task was to identify each test syllable. To prevent eye movement during the presentation of the face, participant...
Translingual Visual Speech Synthesis
Audio-driven facial animation is an interesting and evolving technique for human-computer interaction. Based on an incoming audio stream, a face image is animated with full lip synchronization. This requires a speech recognition system in the language in which audio is provided to get the time alignment for the phonetic sequence of the audio signal. However, building a speech recognition system...
Journal
Journal title: The Journal of the Acoustical Society of America
Year: 1982
ISSN: 0001-4966
DOI: 10.1121/1.2019553