On the sufficiency of automatic phonetic transcriptions for pronunciation variation research

Authors

  • Christophe Van Bael
  • Hans van Halteren
Abstract

We investigated whether automatic phonetic transcriptions (APTs) can replace manually verified phonetic transcriptions (MPTs) in a large corpus-based study on pronunciation variation. To this end, we compared the performance of both transcription types in a classification experiment aimed at establishing the direct influence of a particular situational setting on pronunciation variation. We trained classifiers on the speech processes extracted from the alignments of an APT and an MPT with a canonical transcription. We tested whether the classifiers were equally good at verifying whether unknown transcriptions represent read speech or telephone dialogues, and whether the same speech processes were identified to distinguish between transcriptions of the two situational settings. Our results not only show that similar distinguishing speech processes were identified; our APT-based classifier yielded better classification accuracy than the MPT-based classifier whilst using fewer classification features.
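The abstract does not say how speech processes were extracted from the alignments. As a rough illustration only, the sketch below aligns a canonical phone sequence with a realized one and lists the deletions, insertions and substitutions that separate them. It assumes a generic sequence alignment from Python's standard `difflib` module, not the authors' alignment procedure, and the function name `speech_processes` and the example phone symbols are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher

# Map difflib opcode tags to speech-process labels (labels are illustrative).
_LABELS = {"delete": "deletion", "insert": "insertion", "replace": "substitution"}


def speech_processes(canonical, realized):
    """Return (process, canonical_phones, realized_phones) tuples.

    Hypothetical helper: uses difflib's generic alignment, which is an
    assumption and not the alignment method used in the paper.
    """
    matcher = SequenceMatcher(a=canonical, b=realized, autojunk=False)
    processes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # phones realized as in the canonical form
        processes.append(
            (_LABELS[tag], tuple(canonical[i1:i2]), tuple(realized[j1:j2]))
        )
    return processes


# Made-up example: canonical "h E t" realized as "@ t".
canonical = ["h", "E", "t"]
realized = ["@", "t"]

# Counts of processes per transcription could serve as classifier features.
features = Counter(speech_processes(canonical, realized))
print(features)
```

Counting the extracted processes per transcription, as in the last lines, is one plausible way to turn alignments into classification features; the feature set actually used in the study is not given here.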


Similar resources

Automatic phonetic transcription of large speech corpora

This study is aimed at investigating whether automatic phonetic transcription procedures can approximate manual transcriptions typically delivered with contemporary large speech corpora. To this end, ten automatic procedures were used to generate a broad phonetic transcription of well-prepared speech (read-aloud texts) and spontaneous speech (telephone dialogues) from the Spoken Dutch Corpus. T...

How to Improve Human and Machine Transcriptions of Spontaneous Speech

This paper reports on an experiment aimed at measuring the quality of automatic and human phonetic transcriptions of different speech styles that were produced within the framework of a large speech corpus project for Dutch, the Spoken Dutch Corpus (Corpus Gesproken Nederlands, CGN). The results indicate that the procedure adopted in the CGN to improve the quality of phonetic transcriptions...

On automatic phonetic transcription quality: lower word error rates do not guarantee better transcriptions

The first goal of this study was to investigate the effect of changing several properties of a continuous speech recognizer (CSR) on the automatic phonetic transcriptions generated by the same CSR. Our results show that the quality of the automatic transcriptions can be improved by using short hidden Markov models (HMMs) and by reducing the amount of contamination in the HMMs. The amount of con...

Making a difference: On automatic transcription and modeling of Dutch pronunciation variation for automatic speech recognition

The first goal of this study is to investigate the effect of several properties of a continuous speech recognizer (CSR) on automatic phonetic transcription. Our results show that changing certain properties of the CSR affects the resulting automatic transcriptions. The quality of the automatic transcriptions can be improved by using 'short' HMMs and by reducing the amount of contami...

Analysis of phonetic transcriptions for Danish automatic speech recognition

Automatic speech recognition (ASR) relies on three resources: audio, orthographic transcriptions and a pronunciation dictionary. The dictionary or lexicon maps orthographic words to sequences of phones or phonemes that represent the pronunciation of the corresponding word. The quality of a speech recognition system depends heavily on the dictionary and the transcriptions therein. This paper pre...

Publication date: 2006