Search results for: 1 linguistic behavior 2 paralinguistic information 3 prosodic features 4 acoustic correlates

Number of results: 6474078

2011
Florian Eyben Stavros Petridis Björn Schuller Maja Pantic

In this study, we investigate an audiovisual approach for classification of vocal outbursts (non-linguistic vocalisations) in noisy conditions using Long Short-Term Memory (LSTM) Recurrent Neural Networks and Support Vector Machines. Fusion of geometric shape features and acoustic low-level descriptors is performed on the feature level. Three different types of acoustic noise are considered: ba...

Journal: Journal of Speech Sciences 2021

This paper describes a framework that extends automatic speech transcripts in order to accommodate relevant information coming from manual transcripts, the signal itself, and other resources, like lexica. The proposed framework automatically collects, relates, computes, and stores all of this information together in a self-contained data source, making it possible to easily provide a wide range of interconnected information suitable for analysis, tr...

2001
Heidi Christensen Yoshihiko Gotoh Steve Renals

This paper is about the development of statistical models of prosodic features to generate linguistic meta-data for spoken language. In particular, we are concerned with automatically punctuating the output of a broadcast news speech recogniser. We present a statistical finite state model that combines prosodic, linguistic and punctuation class features. Experimental results are presented using...

Journal: Ear and Hearing 2005
Krista L Johnson Trent G Nicol Nina Kraus

The auditory brain stem response to speech mimics the acoustic characteristics of the speech signal with remarkable fidelity. This makes it possible to derive from it considerable theoretical and clinically applicable information relevant to auditory processing of complex stimuli. Years of research have led to the current characterization of these neural events with respect to the underlying ac...

2012
Michelle Hewlett Sanchez Aaron Lawson Dimitra Vergyri Harry Bratt

As automatic speech processing has matured, research attention has expanded to paralinguistic speech problems that aim to detect beyond-the-words information. This paper focuses on the identification of seven speaker trait categories from the Interspeech Speaker Trait Challenge: likeability, intelligibility, openness, conscientiousness, extraversion, agreeableness, and neuroticism. Our approach...

2010
Raul Fernandez Bhuvana Ramabhadran

Many applications of spoken-language systems can benefit from having access to annotations of prosodic events. Unfortunately, obtaining human annotations of these events, even sensible amounts to train a supervised system, can become a laborious and costly effort. In this paper we explore applying conditional random fields to automatically label major and minor break indices and pitch accents f...

2016
Na Zhi Daniel Hirst Pier Marco Bertinetto Aijun Li Yuan Jia

In the present paper, an analysis-by-synthesis study of Mandarin speech prosody is carried out. The Mandarin prosodic features are discussed from two salient perspectives: the function of prosody and the form of prosody. The symbolic representation of prosodic form with the INTSINT (INternational Transcription System for INTonation) system [1] reduces the surface complexity of a pr...

2000
Albert Rilliard Véronique Aubergé

A set of perception experiments, using reiterant speech, were designed to carry out a diagnostic of the segmentation / hierarchisation linguistic function of prosody. The prosodic parameters of F0, syllabic duration and intensity of the stimuli used during this experiment were extracted. Several dissimilarity measures (Correlation, root-mean-square distance and mutual information) were used to ...

Journal: CoRR 2017
Trang Tran Shubham Toshniwal Mohit Bansal Kevin Gimpel Karen Livescu Mari Ostendorf

In conversational speech, the acoustic signal provides cues that help listeners disambiguate difficult parses. For automatically parsing a spoken utterance, we introduce a model that integrates transcribed text and acoustic-prosodic features using a convolutional neural network over energy and pitch trajectories coupled with an attention-based recurrent neural network that accepts text and word...
