Search results for: 1 linguistic behavior 2 paralinguistic information 3 prosodic features 4 acoustic correlates

Number of results: 6474078

2011
Erin Cvejic Jeesun Kim Chris Davis

This study examined the perception of linguistic prosody from augmented point-light displays that were derived from motion tracking six talkers producing different prosodic contrasts. In Experiment 1, we determined perceivers’ ability to use these abstract visual displays to match prosody across modalities (audio to video), when the non-matching visual display was segmentally identical and diff...

2004
Shinya Fujie Tetsunori Kobayashi Daizo Yagi Hideaki Kikuchi

In this paper, prosody-based attitude recognition and its application to a spoken dialog system are proposed. Paralinguistic information plays an important role in human communication. We aimed to recognize the user’s attitude by prosody, and apply it to a spoken dialog system as paralinguistic information. In order to find important features to recognize the attitude from automatically ext...

2011
John K. Pate Sharon Goldwater

Learning to group words into phrases without supervision is a hard task for NLP systems, but infants routinely accomplish it. We hypothesize that infants use acoustic cues to prosody, which NLP systems typically ignore. To evaluate the utility of prosodic information for phrase discovery, we present an HMM-based unsupervised chunker that learns from only transcribed words and raw acoustic correl...

2004

The purpose of this study was to explore the notion of prominence in spoken language. It concentrated on finding an operational definition of prominence, on giving a description of the linguistic and acoustical correlates of prominence, and on analyzing these correlates in terms of their contribution to prominence distinctions. Furthermore, this study was concerned with feature extraction, and ...

2005
Björn W. Schuller Ronald Müller Manfred K. Lang Gerhard Rigoll

Herein we present a comparison of novel concepts for a robust fusion of prosodic and verbal cues in speech emotion recognition. Thereby, 276 acoustic features are extracted from a spoken phrase. For linguistic content analysis we use the Bag-of-Words text representation. This allows for integration of acoustic and linguistic features within one vector prior to a final classification. Extensive...

2007
Yasuhisa Fujii Norihide Kitaoka Seiichi Nakagawa

We automatically extract the summaries of spoken class lectures. This paper presents a novel method for sentence extraction-based automatic speech summarization. We propose a technique that extracts “cue phrases for important sentences (CPs)” that often appear in important sentences. We formulate CP extraction as a labeling problem of word sequences and use Conditional Random Fields (CRF) [1] f...

2006
Björn W. Schuller Niels Köhler Ronald Müller Gerhard Rigoll

Recognition of interest of a speaker within a human dialog bears great potential in many commercial applications. Within this work we therefore introduce an approach that analyses acoustic and linguistic cues of a spoken utterance. A systematic generation of more than 5k high-level features based on prosodic and spectral feature contours by means of descriptive statistical analysis and subsequen...

2014
Olli Vuolteenaho Sinikka Eskelinen Eero Väyrynen Tapio Seppänen Klára Vicsi Raimo Ahonen

Emotion recognition, a key step of affective computing, is the process of decoding an embedded emotional message from human communication signals, e.g. visual, audio, and/or other physiological cues. It is well-known that speech is the main channel for human communication and thus vital in the signalling of emotion and semantic cues for the correct interpretation of contexts. In the verbal chan...

2007
Anton Batliner Christian Hacker Moritz Kaiser Hannes Mögele Elmar Nöth

In the German SmartWeb project, the user is interacting with the web via a PDA in order to get information on, for example, points of interest. To overcome the tedious use of devices such as push-to-talk, but still to be able to tell whether the user is addressing the system or talking to herself or to a third person, we developed a module that monitors speech and video in parallel. Our databas...
