Vector-based spoken language recognition using output coding

Authors

  • Haizhou Li
  • Bin Ma
  • Rong Tong
Abstract

The vector-based spoken language recognition approach converts a spoken utterance into a high-dimensional vector, also known as a bag-of-sounds vector, that consists of n-gram statistics of acoustic units. Dimensionality reduction would better prepare the bag-of-sounds vectors for classifier design. We propose projecting the bag-of-sounds vectors onto a low-dimensional SVM output coding space, where each dimension represents a decision hyperplane between a pair of spoken languages. We also compare the performance of the output coding approach with that of the traditional low-rank approximation approach using latent semantic indexing (LSI) on the NIST 1996, 2003 and 2005 Language Recognition Evaluation (LRE) databases. The experiments show that the output coding approach consistently outperforms LSI and delivers competitive results.
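The two ideas in the abstract, a bag-of-sounds vector built from n-gram statistics and a projection onto pairwise decision hyperplanes, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the acoustic units, the n-gram order, or the SVM training setup. The three-unit inventory, the language labels, the toy utterances, and the simple hinge-loss trainer standing in for an SVM are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations, product


def bag_of_sounds(units, inventory, n=2):
    """Normalized n-gram counts over a fixed acoustic-unit inventory."""
    grams = list(product(inventory, repeat=n))  # fixes the dimension order
    counts = Counter(zip(*(units[i:] for i in range(n))))
    total = sum(counts.values()) or 1  # avoid dividing by zero on short input
    return [counts[g] / total for g in grams]


def score(w, b, x):
    """Signed score of x against the hyperplane (w, b)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(xs, ys, epochs=100, lr=0.1, lam=0.01):
    """Stand-in trainer (assumption): subgradient descent on the
    L2-regularized hinge loss, with labels ys in {+1, -1}."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            violated = y * score(w, b, x) < 1
            w = [wi * (1 - lr * lam) for wi in w]  # regularization shrinkage
            if violated:  # hinge subgradient step only inside the margin
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def train_output_coder(vectors, labels):
    """One hyperplane per language pair: K languages give K*(K-1)/2 dimensions."""
    coder = []
    for a, c in combinations(sorted(set(labels)), 2):
        pair = [(v, 1 if l == a else -1)
                for v, l in zip(vectors, labels) if l in (a, c)]
        xs, ys = zip(*pair)
        coder.append(train_linear_svm(xs, ys))
    return coder


def encode(vector, coder):
    """Project a bag-of-sounds vector onto the pairwise output coding space."""
    return [score(w, b, vector) for w, b in coder]


# Hypothetical toy data: 3 acoustic units, 3 languages, 2 utterances each.
inventory = ["a", "b", "c"]
utterances = {
    "lang1": [list("ababab"), list("abab")],
    "lang2": [list("bcbcbc"), list("cbcb")],
    "lang3": [list("cacaca"), list("acac")],
}
vectors = [bag_of_sounds(u, inventory) for us in utterances.values() for u in us]
labels = [lang for lang, us in utterances.items() for _ in us]

coder = train_output_coder(vectors, labels)
code = encode(bag_of_sounds(list("abab"), inventory), coder)
print(len(vectors[0]), len(code))  # 9 3
```

With 3 units and bigrams the bag-of-sounds vector has 9 dimensions, while the output code has only 3, one per language pair. In a realistic setting the n-gram space grows with the size of the unit inventory raised to the n-gram order, whereas the code length depends only on the number of languages, which is where the dimensionality reduction comes from.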

Similar resources

Comparison of spectral methods for spoken language identification

Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. A statistical model of these features is then obtained for each language. The ...

Detection of Non-Native Named Entities Using Prosodic Features for Improved Speech Recognition and Translation

In this work, we describe the use of acoustic-prosodic features to detect and localize non-native named entities spoken by a native speaker in the target language (English) for the purpose of improved speech recognition and translation. The exaggerated variation in accent and duration introduced by the speaker for non-native names is exploited in the detection process through the use of prosodi...

Automated content scoring of spoken responses containing multiple parts with factual information

This paper presents approaches to automated content scoring of spoken language test responses from non-native speakers of English which contain multiple parts addressing factual information that the test taker has previously heard via auditory stimulus materials. While previous work relating to content scoring of spontaneous, unpredictable speech has focused only on entire responses and on gene...

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...

Spoken Term Detection Using SVM-Based Classifier Trained with Pre-Indexed Keywords

This study presents a two-stage spoken term detection (STD) method that uses the same STD engine twice and a support vector machine (SVM)-based classifier to verify detected terms from the STD engine’s output. In a front-end process, the STD engine is used to preindex target spoken documents from a keyword list built from an automatic speech recognition result. The STD result includes a set of ...

Publication date: 2006