Automatic acoustic segmentation for speech recognition on broadcast recordings
Abstract
This paper investigates automatic segmentation of speech recordings for broadcast news (BN) and broadcast conversation (BC) speech recognition. Our previous segmentation algorithm often produced many deletion errors, in which speech segments were misclassified as non-speech and therefore never passed on to the recognizer. Whereas our previous segmentation models distinguished only between speech and non-speech segments, the new approach applies phonetic knowledge and represents speech with multiple models for different types of speech segments. In addition, the “pronunciation” of the speech segment has been modified to loosen the minimum duration constraint. The method makes use of language-specific knowledge while keeping the number of models low, so that segmentation remains fast. Experimental results show that the new segmenter significantly outperforms our previous segmenter, particularly in reducing deletion errors.
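To illustrate why a minimum duration constraint can cause deletion errors, here is a minimal sketch of a Viterbi segmenter in which each class is expanded into a left-to-right chain of states, one common way to enforce a minimum segment length. This is not the authors' implementation: the function `segment`, the parameters `min_dur` and `switch_penalty`, the two-class setup, and the toy log-likelihood values are all assumptions made for this example, and the paper's multiple phonetically motivated speech models are not reproduced.

```python
import math

def segment(loglik, min_dur, switch_penalty=0.0):
    """Label each frame with a class using Viterbi decoding.

    loglik[t][c] is the (hypothetical) log-likelihood of frame t under class c.
    min_dur[c] is the minimum number of frames a segment of class c must last;
    it is enforced by giving class c a chain of min_dur[c] states.
    The last segment may be cut short by the end of the recording.
    """
    n_classes = len(min_dur)
    states = [(c, k) for c in range(n_classes) for k in range(min_dur[c])]
    index = {s: i for i, s in enumerate(states)}
    neg_inf = -math.inf

    def predecessors(c, k):
        last = min_dur[c] - 1
        preds = []
        if k > 0:
            preds.append((index[(c, k - 1)], 0.0))       # advance in chain
        else:
            for o in range(n_classes):                   # enter from another class
                if o != c:
                    preds.append((index[(o, min_dur[o] - 1)], -switch_penalty))
        if k == last:
            preds.append((index[(c, k)], 0.0))           # self-loop on final state
        return preds

    # A segment may only start in the first state of a chain.
    score = [loglik[0][c] if k == 0 else neg_inf for (c, k) in states]
    back = []
    for frame in loglik[1:]:
        new_score, ptr = [], []
        for (c, k) in states:
            best_prev, best = -1, neg_inf
            for p, w in predecessors(c, k):
                if score[p] + w > best:
                    best_prev, best = p, score[p] + w
            new_score.append(best + frame[c] if best_prev >= 0 else neg_inf)
            ptr.append(best_prev)
        score = new_score
        back.append(ptr)

    # Backtrace from the best-scoring final state.
    s = max(range(len(states)), key=score.__getitem__)
    path = [s]
    for ptr in reversed(back):
        s = ptr[s]
        path.append(s)
    path.reverse()
    return [states[s][0] for s in path]

# Toy data: class 0 = non-speech, class 1 = speech.
# Three non-speech frames, a two-frame speech burst, three non-speech frames.
NON = [0.0, -5.0]
SPE = [-2.0, 0.0]
frames = [NON, NON, NON, SPE, SPE, NON, NON, NON]

strict = segment(frames, min_dur=[1, 3])  # speech must last >= 3 frames
loose = segment(frames, min_dur=[1, 2])   # constraint loosened to 2 frames
print(strict)  # [0, 0, 0, 0, 0, 0, 0, 0]  (the burst is deleted)
print(loose)   # [0, 0, 0, 1, 1, 0, 0, 0]  (the burst is recovered)
```

With the stricter constraint, the short speech burst cannot form a legal speech segment without absorbing a poorly matching neighbouring frame, so the decoder labels it non-speech. Shortening the chain lets the same burst through, which is the kind of effect the abstract attributes to loosening the minimum duration constraint.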
Related resources
Advances in automatic transcription of Italian broadcast news
This paper presents some recent improvements in automatic transcription of Italian broadcast news obtained at ITC-irst. A first preliminary activity was carried out in order to develop a suitable speech corpus for the Italian language. The resulting corpus, formed by recordings covering 30 hours of radio news, was exploited for developing a baseline system for transcription of broadcast news. Th...
On Building and Evaluating a Broadcast-News Audio Segmentation System
Audio segmentation is useful in diverse applications like audio indexing and retrieval, subtitling, monitoring of acoustic scenes, etc. Also, an initial audio segmentation stage may help to improve the robustness of speech technologies like automatic speech recognition and speaker diarization. In this paper, firstly, the Albayzín-2010 audio segmentation evaluation is reported, including some co...
Audio segmentation of broadcast news in the Albayzin-2010 evaluation: overview, results, and discussion
Recently, audio segmentation has attracted research interest because of its usefulness in several applications like audio indexing and retrieval, subtitling, monitoring of acoustic scenes, etc. Moreover, a previous audio segmentation stage may be useful to improve the robustness of speech technologies like automatic speech recognition and speaker diarization. In this article, we present the eva...
The EPAC Corpus: Manual and Automatic Annotations of Conversational Speech in French Broadcast News
This paper presents the EPAC corpus, which is composed of a set of 100 hours of manually transcribed conversational speech and of the outputs of automatic tools (automatic segmentation, transcription, POS tagging, etc.) applied to the entire French ESTER 1 audio corpus: this concerns about 1700 hours of audio recordings from radio shows. This corpus was built during the EPAC project funded...
An Analysis of Sentence Segmentation Features for Broadcast News, Broadcast Conversations, and Meetings
Information retrieval techniques for speech are based on those developed for text, and thus expect structured data as input. An essential task is to add sentence boundary information to the otherwise unannotated stream of words output by automatic speech recognition systems. We analyze sentence segmentation performance as a function of feature types and transcription (manual versus automatic) f...
Publication date: 2007