Speech-based Annotation of Heterogeneous Multimedia Content Using Automatic Speech Recognition
Authors
Abstract
This paper reports on the setup and evaluation of robust speech recognition system parts, geared towards transcript generation for heterogeneous, real-life media collections. The system is deployed for generating speech transcripts for the NIST/TRECVID-2007 test collection, part of a Dutch real-life archive of news-related genres. Performance figures for this type of content are compared to figures for broadcast news test data.
Similar resources
Annotation of Heterogeneous Multimedia Content Using Automatic Speech Recognition
This paper reports on the setup and evaluation of robust speech recognition system parts, geared towards transcript generation for heterogeneous, real-life media collections. The system is deployed for generating speech transcripts for the NIST/TRECVID-2007 test collection, part of a Dutch real-life archive of news-related genres. Performance figures for this type of content are compared to fig...
Multifeature Audio Segmentation for Browsing and Annotation
Indexing and content-based retrieval are necessary to handle the large amounts of audio and multimedia data that are becoming available on the web and elsewhere. Since manual indexing using existing audio editors is extremely time-consuming, a number of automatic content analysis systems have been proposed. Most of these systems rely on speech recognition techniques to create text indices. On the...
An Online System for Automatic Annotation of Audio Documents
This article presents a system for automatic transcription of audio documents. The system includes online implementations of recent algorithms for audio segmentation, speech/nonspeech classification, and speaker clustering, and integrates them with large vocabulary speech recognition systems for both English and French. We also propose a segment-based speech confidence score, and demonstrate th...
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction more natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
The Role of Automated Speech and Audio Analysis in Semantic Multimedia Annotation
This paper overviews the various ways in which automatic speech and audio analysis can be deployed to enhance the semantic annotation of multimedia content, and as a consequence to improve the effectiveness of conceptual access tools. A number of techniques will be presented, including the alignment of text resources, large vocabulary speech recognition, key word spotting and speaker classifica...
Publication date: 2007