Joint signal and transcription analysis for named speaker identification
Abstract
For some years, processing large collections of multimedia documents has been a crucial issue for applications such as indexing and information retrieval. Among the information targeted, speaker identity can be very useful for such applications. A huge collection of documents cannot be processed manually at a reasonable cost: only automatic systems offer a practical solution. In this paper, we consider the extraction of speaker identity (first name and last name) from audio recordings of broadcast news. Using a rich transcription system, we present a method that extracts speaker identities from automatic transcripts and assigns them to speaker turns. Experiments are carried out on French broadcast news recordings from the ESTER 1 phase II evaluation campaign.
Keywords: named speaker identification, speaker recognition, rich transcription.
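The abstract does not spell out how names found in a transcript are attached to speaker turns. As a rough illustration only, the sketch below assumes a simple rule-based scheme: a few hypothetical lexical cues decide whether a detected full name refers to the previous, current, or next turn, and each anonymous speaker label receives the name it collects most often. The cue patterns, the `Turn` type, the `assign_names` function, and the English sample text are all assumptions made for this example, not the paper's method.

```python
import re
from collections import Counter, defaultdict
from dataclasses import dataclass

# Naive full-name detector: two capitalized words (an assumption for the sketch).
NAME = r"([A-Z][a-z]+ [A-Z][a-z]+)"

# Hypothetical cues: (pattern, offset of the turn the name refers to).
CUES = [
    (re.compile(r"\b[Tt]hank you,? " + NAME), -1),  # name of the previous speaker
    (re.compile(r"\b[Tt]his is " + NAME), 0),       # name of the current speaker
    (re.compile(r"\b[Oo]ver to " + NAME), 1),       # name of the next speaker
]


@dataclass
class Turn:
    speaker: str  # anonymous label from speaker segmentation, e.g. "spk1"
    text: str     # automatic transcript of the turn


def assign_names(turns):
    """Return {speaker label: most frequently attributed full name}."""
    votes = defaultdict(Counter)
    for i, turn in enumerate(turns):
        for pattern, offset in CUES:
            for match in pattern.finditer(turn.text):
                j = i + offset
                if 0 <= j < len(turns):  # ignore cues pointing outside the show
                    votes[turns[j].speaker][match.group(1)] += 1
    return {spk: names.most_common(1)[0][0] for spk, names in votes.items()}


turns = [
    Turn("spk1", "Good evening, this is Anne Martin. Over to Paul Durand."),
    Turn("spk2", "Thank you, Anne Martin. The talks ended today."),
    Turn("spk1", "Thank you, Paul Durand."),
]
print(assign_names(turns))
```

Voting per speaker label rather than per turn lets a name spoken once carry over to every turn with the same label. A real system would also have to cope with transcription errors in the names themselves, which this sketch ignores.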
Similar resources
Convolutional neural network with adaptive windows for speech recognition
Although speech recognition systems are widely used and their accuracy is continuously increasing, there is a considerable performance gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Real-time rich-content transcription of Chinese broadcast news
This paper describes the recent development of an Audio Indexing System for Chinese (Mandarin) broadcast news. Key issues of the three major components: automatic speech recognition, speaker identification and named entity extraction are addressed. The Chinese-language-specific challenges are discussed and our solutions are described. The recognition accuracy of the final system is comparable t...
Person Instance Graphs for Named Speaker Identification in TV Broadcast
We address the problem of named speaker identification in TV broadcast which consists in answering the question “who speaks when?” with the real identity of speakers, using person names automatically obtained from speech transcripts. While existing approaches rely on a first speaker diarization step followed by a local name propagation step to speaker clusters, we propose a unified framework ca...
Speaker Identification From Youtube Obtained Data
An efficient and intuitive algorithm is presented for identifying speakers in long recordings (such as long YouTube discussions or cocktail-party audio or video). The goal of automatic speaker identification is to identify the number of different speakers and prepare a model for that speaker by extraction, characterization and speaker-specific information contained in the speech s...
Analysis of i-vector framework for speaker identification in TV-shows
Inspired by Joint Factor Analysis, i-vector-based analysis has become the most popular, state-of-the-art framework for the speaker verification task. It has mainly been applied within the NIST/SRE evaluation campaigns, and many studies have been proposed to further improve the performance of speaker verification systems. Nevertheless, while the i-vector framework has been used in other speech pr...
Journal title: TAL
Volume: 50
Publication date: 2009