New bilingual speech databases for audio diarization

Authors

  • David Tavarez
  • Eva Navas
  • Daniel Erro
  • Ibon Saratxaga
  • Inma Hernáez

Abstract

This paper describes the collection and recording of two new bilingual speech databases in Spanish and Basque. They are designed primarily for speaker diarization in two application domains: broadcast news audio and recorded meetings. Both databases were first manually segmented, and several diarization experiments were then carried out to evaluate them. Our baseline speaker diarization system was applied to both databases, obtaining a diarization error rate (DER) of around 30% for broadcast news audio and around 40% for recorded meetings. The behavior of the system when the same speaker uses different languages was also tested.
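
For readers unfamiliar with the metric, DER is conventionally defined as the sum of missed speech time, false-alarm speech time, and speaker-confusion time, divided by the total reference speech time. The sketch below illustrates only that arithmetic; the function name and the durations are hypothetical and are not taken from the paper or from its scoring setup.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Return DER as a fraction of the reference speech time.

    All arguments are durations in seconds. This is the conventional
    definition of the metric, not the authors' scoring tool.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


# Hypothetical durations chosen for illustration only.
der = diarization_error_rate(
    missed=30.0,         # reference speech the system labelled as non-speech
    false_alarm=20.0,    # non-speech the system labelled as speech
    confusion=40.0,      # speech attributed to the wrong speaker
    total_speech=300.0,  # total reference speech time
)
print(f"DER = {der:.1%}")  # prints "DER = 30.0%"
```

Because the numerator includes false alarms, DER can exceed 100% when a system outputs far more speech than the reference contains.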

Similar resources

Robust methods for content analysis of auditory scenes

Ongoing progress in audio analysis methods opens up possibilities for new applications. At the same time, recent improvements bring the established approaches ever closer to their performance limits, which are defined by disturbing factors such as overlapping speech, noise, and reverberation. This thesis presents progress in new possibilities and addressing dist...

Speaker Diarization - “Who Spoke When”

Speaker diarization is the process of annotating an input audio with information that attributes temporal regions of the audio signal to their respective sources, which may include both speech and non-speech events. For speech regions, the diarization system also specifies the locations of speaker boundaries and assigns relative speaker labels to each homogeneous segment of speech. I...

On the Applicability of Speaker Diarization to Audio Indexing of Non-Speech and Mixed Non-Speech/Speech Video Soundtracks

A video's soundtrack is usually highly correlated with its content. Hence, audio-based techniques have recently emerged as a means of video concept detection that complements visual analysis. Most state-of-the-art approaches rely on manually defined sound concepts such as "engine sounds" or "outdoor/indoor sounds". These approaches come with three major drawbacks: manual definitions...

Towards a complete binary key system for the speaker diarization task

Speaker diarization is the task of partitioning an audio stream into homogeneous segments according to speaker identity. Today, state-of-the-art speaker diarization systems achieve very competitive performance. However, any small improvement in Diarization Error Rate (DER) usually comes at the cost of very long processing times (real-time factor above one), which makes systems not suitable for s...

Robust Unsupervised Speaker Segmentation for Audio Diarization

Audio diarization (Reynolds & Carrasquillo, 2005) is the process of partitioning an input audio stream into homogeneous regions according to their specific audio sources. These sources can include audio type (speech, music, background noise, etc.), speaker identity, and channel characteristics. With the continually increasing number of large volumes of spoken documents including broadcasts, voic...

Publication date: 2014