Acoustic, phonetic, and discriminative approaches to automatic language identification
Abstract
Formal evaluations conducted by NIST in 1996 demonstrated that systems that used parallel banks of tokenizer-dependent language models produced the best language identification performance. Since that time, other approaches to language identification have been developed that match or surpass the performance of phone-based systems. This paper describes and evaluates three techniques that have been applied to the language identification problem: phone recognition, Gaussian mixture modeling, and support vector machine classification. A recognizer that fuses the scores of three systems that employ these techniques produces a 2.7% equal error rate (EER) on the 1996 NIST evaluation set and a 2.8% EER on the NIST 2003 primary condition evaluation set. An approach to dealing with the problem of out-of-set data is also discussed.
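The equal error rate quoted above is the operating point at which the miss rate and the false-alarm rate coincide. The following is a minimal Python sketch of that computation, together with a simple weighted-sum fusion of per-system scores. The function names `equal_error_rate` and `fuse`, the threshold sweep, and the linear fusion rule are illustrative assumptions; the abstract does not say how the three systems' scores were actually combined.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss: a true-language trial scored below the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a wrong-language trial scored at or above the threshold.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (miss + fa) / 2
    return eer


def fuse(score_lists, weights):
    """Weighted sum of per-system scores for each trial (assumed fusion rule)."""
    return [
        sum(w * s for w, s in zip(weights, trial))
        for trial in zip(*score_lists)
    ]
```

Here `score_lists` holds one list of trial scores per system, all in the same trial order, so `fuse` returns one combined score per trial that can then be passed to `equal_error_rate`.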
Similar resources
Comparison of spectral methods for spoken language identification
Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector, and a statistical model of these features is then obtained for each language. The ...
A back-off discriminative acoustic model for automatic speech recognition
In this paper we propose a back-off discriminative acoustic model for Automatic Speech Recognition (ASR). We use a set of broad phonetic classes to divide the classification problem originating from context-dependent modeling into a set of subproblems. By appropriately combining the scores from classifiers designed for the sub-problems, we can guarantee that the back-off acoustic score for diff...
Automatic detection of speaker state: Lexical, prosodic, and phonetic approaches to level-of-interest and intoxication classification
Traditional studies of speaker state focus primarily upon one-stage classification techniques using standard acoustic features. In this article, we investigate multiple novel features and approaches to two recent tasks in speaker state detection: level-of-interest (LOI) detection and intoxication detection. In the task of LOI prediction, we propose a novel Discriminative TFIDF feature to captur...
Improving phonotactic language recognition with acoustic adaptation
In recent evaluations of automatic language recognition systems, phonotactic approaches have proven highly effective [1][2]. However, as most of these systems rely on underlying ASR techniques to derive a phonetic tokenization, these techniques are potentially susceptible to acoustic variability from non-language sources (e.g., gender, speaker, channel). In this paper we apply techniques f...
A New Phono-Articulatory Feature Representation for Language Identification in a Discriminative Framework
State-of-the-art language identification methods are based on acoustic or phonetic features. Recently, phono-articulatory features have been included as a new speech characteristic that conveys language information. The authors propose a new phono-articulatory representation of speech in a discriminative framework to identify languages. This simple representation shows good results discriminating b...
Publication date: 2003