utterance

Analysis and Assessment of Controllability of an Expressive Deep Learning-Based TTS System

Journal: Informatics (Basel) 2021

In this paper, we study the controllability of an Expressive TTS system trained on a dataset for continuous control. The dataset is Blizzard 2013, based on audiobooks read by a female speaker and containing great variability in styles and expressiveness. Controllability is evaluated with both an objective and a subjective experiment. The objective assessment is based on a measure of correlation between acoustic features and the dimensions of the latent space representi...

Towards Context-adaptive Utterance Interpretation

2002

Robert Porzel Iryna Gurevych

Examination of the Locus of Positional Effects on Children's Production of Plural -s: Considerations From Local and Global Speech Planning.

Journal: Journal of Speech, Language, and Hearing Research (JSLHR) 2015

Rachel M Theodore Katherine Demuth Stefanie Shattuck-Hufnagel

PURPOSE: Prosodic and articulatory factors influence children's production of inflectional morphemes. For example, plural -s is produced more reliably in utterance-final compared to utterance-medial position (i.e., the positional effect), which has been attributed to the increased planning time in utterance-final position. In previous investigations of plural -s, utterance-medial plurals were fo...

Modified re-synthesis of initial voiceless plosives by concatenation of speech from different speakers

2009

Sofia Strömbergsson

This paper describes a method of resynthesising utterance-initial voiceless plosives, given an original utterance by one speaker and a speech database of utterances by many other speakers. The system removes an initial voiceless plosive from an utterance and replaces it with another voiceless plosive selected from the speech database. (For example, if the original utterance was /tat/, the resyn...

Automatic Reconstruction of Utterance Boundaries Time Marks in Speech Database Re-grabbed from DAT Recorder

2005

Hynek Bořil

In this paper, an algorithm for automatically reconstructing the time marks of utterance boundaries in a speech database re-grabbed from a DAT recorder is presented. Originally, the database was grabbed from DAT and, after down-sampling, processed at 16 kHz. Utterance boundaries were found manually, each utterance was stored in a separate file, and orthographic and phonetic transcriptions were perfor...

Utterance independent bimodal emotion recognition in spontaneous communication

Journal: EURASIP J. Adv. Sig. Proc. 2011

Jianhua Tao Shifeng Pan Minghao Yang Ya Li Kaihui Mu Jianfeng Che

Emotional expressions are sometimes mixed with utterance expressions in spontaneous face-to-face communication, which makes emotion recognition difficult. This article introduces methods for reducing the influence of the utterance on the visual parameters used in audio-visual emotion recognition. The audio and visual channels are first combined under a Multistream Hidden Markov Model (MH...

Automatic Utterance Segmentation in Spontaneous Speech

2014

Norimasa Yoshida Peter Gorniak

As applications incorporating speech recognition technology become widely used, it is desirable to have such systems interact naturally with their users. For such natural interaction to occur, recognition systems must be able to accurately detect when a speaker has finished speaking. This research presents an analysis combining lower and higher level cues to perform the utterance endpointing tas...

Software architectures for incremental understanding of human speech

2006

Gregory Aist James F. Allen Ellen Campana Lucian Galescu Carlos Gómez Gallo Scott C. Stoness Mary D. Swift Michael K. Tanenhaus

The prevalent state of the art in spoken language understanding by spoken dialog systems is both modular and whole-utterance. It is modular in that incoming utterances are processed by independent components that handle different aspects, such as acoustics, syntax, semantics, and intention / goal recognition. It is whole-utterance in that each component completes its work for an entire utteranc...

Effects of utterance position on English speech timing.

Journal: Phonetica 1982

J E Flege W S Brown

Eight speakers of American English produced utterances consisting of one to five disyllables ([bábe] or [pápe]). Vowel and stop closure intervals were defined by variations in supraglottal pressure, sensed through a thin tube inserted in the mouth. Closure was always longer for /p/ than /b/ in utterance-medial positions. In utterance-initial position, however, /b/ lengthened more than /p/ so that n...
