The TYPALOC Corpus: A Collection of Various Dysarthric Speech Recordings in Read and Spontaneous Styles
Abstract
This paper presents the TYPALOC corpus of French dysarthric and healthy speech and the rationale underlying its constitution. The objective is to compare phonetic variation in the speech of dysarthric vs. healthy speakers in different speech conditions (read and unprepared speech). More precisely, we aim to compare the extent, types and location of phonetic variation within these different populations and speech conditions. The TYPALOC corpus comprises a selection of 28 dysarthric patients (three different pathologies) and 12 healthy control speakers, recorded while reading the same text and in a more natural continuous speech condition. Each audio signal was segmented into Inter-Pausal Units. The corpus was then manually transcribed and automatically aligned, and the alignment was corrected by an expert phonetician. In addition, the corpus includes an automatic syllabification and an Automatic Detection of Acoustic Phone-Based Anomalies. Finally, in order to interpret phonetic variations due to pathologies, a perceptual evaluation of each patient was conducted. Quantitative data are provided at the end of the paper.
Similar resources
Automatic Anomaly Detection for Dysarthria across Two Speech Styles: Read vs Spontaneous Speech
Perceptual evaluation of speech disorders is still the standard method in clinical practice for diagnosing patients and following the progression of their condition. Such methods include different tasks such as read speech, spontaneous speech, isolated words, sustained vowels, etc. In this context, automatic speech processing tools have proven pertinent in speech quality evaluation and as...
Common and Language Dependent Phonetic Differences Between Read and Spontaneous Speech in Russian, Finnish and Dutch
This preliminary study aims to reveal both common and language-specific phonetic differences between read and spontaneous speech in three typologically unrelated languages – Russian, Finnish, and Dutch. These languages differ in prosody, sound systems, speech styles, and means for conveying intonational meaning. Spontaneous speech was recorded from 5 to 8 speakers in each language. Transliterat...
A Framework for Collecting Realistic Recordings of Dysarthric Speech - the homeService Corpus
This paper introduces a new British English speech database, named the homeService corpus, which has been gathered as part of the homeService project. This project aims to help users with speech and motor disabilities to operate their home appliances using voice commands. The audio recorded during such interactions consists of realistic data of speakers with severe dysarthria. The majority of t...
RUNDKAST: an Annotated Norwegian Broadcast News Speech Corpus
This paper describes the Norwegian broadcast news speech corpus RUNDKAST. The corpus contains recordings of approximately 77 hours of broadcast news shows from the Norwegian broadcasting company NRK. The corpus covers both read and spontaneous speech as well as spontaneous dialogues and multipart discussions, including frequent occurrences of non-speech material (e.g. music, jingles). The recor...
Creating a speech corpus with semi-spontaneous, parallel conversational and clear speech Tech Report: CSLU-11-003
Our goal is to collect a speech corpus for the purpose of studying intelligibility and acoustic differences between the conversational and clear speech styles. The ideal corpus has the following properties: (1) speech has been produced spontaneously as part of a communicative interaction, as opposed to having been read to an imagined interlocutor; (2) entire identical utterances, or large parts...
Publication date: 2016