Joint signal and transcription analysis for named speaker identification

Authors

  • Vincent Jousse
  • Sylvain Meignier
  • Christine Jacquin
  • Simon Petit-Renaud
  • Yannick Estève
  • Béatrice Daille
Abstract

For some years, processing masses of multimedia documents has been a crucial issue for applications such as indexing or information retrieval. Among the information of interest, speaker identity can be very useful for such applications. A huge collection of documents cannot be processed manually at a reasonable cost: only automatic systems offer a relevant solution. In this paper, we consider the extraction of speaker identity (first name and last name) from audio recordings of broadcast news. Using a rich transcription system, we present a method that extracts speaker identities from automatic transcripts and assigns them to speaker turns. Experiments are carried out on French broadcast news recordings from the ESTER 1 phase II evaluation campaign.

KEYWORDS: named speaker identification, speaker recognition, rich transcription.

Similar resources

Convolutional neural network with adaptable windows for speech recognition

Although speech recognition systems are widely used and their accuracy is continuously improving, there is a considerable gap between their accuracy and human recognition ability. This is partly due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

Real-time rich-content transcription of Chinese broadcast news

This paper describes the recent development of an Audio Indexing System for Chinese (Mandarin) broadcast news. Key issues of the three major components: automatic speech recognition, speaker identification and named entity extraction are addressed. The Chinese-language-specific challenges are discussed and our solutions are described. The recognition accuracy of the final system is comparable t...

Person Instance Graphs for Named Speaker Identification in TV Broadcast

We address the problem of named speaker identification in TV broadcast which consists in answering the question “who speaks when?” with the real identity of speakers, using person names automatically obtained from speech transcripts. While existing approaches rely on a first speaker diarization step followed by a local name propagation step to speaker clusters, we propose a unified framework ca...

Speaker Identification From Youtube Obtained Data

An efficient and intuitive algorithm is presented for the identification of speakers from a long dataset (such as a long YouTube discussion or cocktail-party audio or video recordings). The goal of automatic speaker identification is to identify the number of different speakers and prepare a model for that speaker by extraction, characterization and speaker-specific information contained in the speech s...

Analysis of i-vector framework for speaker identification in TV-shows

Inspired by Joint Factor Analysis, i-vector-based analysis has become the most popular, state-of-the-art framework for the speaker verification task. It has mainly been applied within the NIST/SRE evaluation campaigns, and many studies have sought to further improve the performance of speaker verification systems. Nevertheless, while the i-vector framework has been used in other speech pr...

Journal:
  • TAL

Volume 50

Publication date: 2009