Subsegmental, Segmental and Suprasegmental Features for Speaker Recognition Using Gaussian Mixture Model
Abstract
In the feature extraction stage, features representing speaker information are extracted from the speech signal. In the present study, the linear prediction (LP) residual derived from the speech data is used for training and testing, and it is processed in the time domain at the subsegmental, segmental and suprasegmental levels. In the training phase, Gaussian mixture models (GMMs) are built, one per speaker, from that speaker's training data. In the testing phase, the models are evaluated on the test data, and the identity of the speaker is decided from the results.
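The train-then-test procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes NumPy, uses diagonal-covariance GMMs fitted with plain EM, and substitutes synthetic random vectors for features derived from the LP residual. The speaker labels (spk_a, spk_b), the function names, the number of mixture components and the iteration count are all hypothetical choices that the abstract does not specify.

```python
import numpy as np

def log_components(X, weights, means, variances):
    """Per-frame, per-component log(weight * Gaussian density), diagonal covariance."""
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.log(2 * np.pi * variances)[None] +
                        diff ** 2 / variances[None]).sum(axis=2)
    return np.log(weights)[None, :] + log_gauss

def fit_diag_gmm(X, n_components=2, n_iter=50, seed=0, floor=1e-3):
    """Fit a diagonal-covariance GMM with plain EM (illustrative settings)."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Initialise means from random frames, variances from the global variance.
    means = X[rng.choice(n, n_components, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + floor, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each frame.
        log_p = log_components(X, weights, means, variances)
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        resp = np.exp(log_p - log_norm)
        # M-step: re-estimate weights, means and (floored) variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        variances = (resp.T @ X ** 2) / nk[:, None] - means ** 2 + floor
    return weights, means, variances

def avg_log_likelihood(X, model):
    """Average per-frame log-likelihood of the frames X under one speaker model."""
    return float(np.logaddexp.reduce(log_components(X, *model), axis=1).mean())

def identify(X_test, models):
    """Pick the speaker whose GMM gives the test frames the highest score."""
    scores = {spk: avg_log_likelihood(X_test, m) for spk, m in models.items()}
    return max(scores, key=scores.get), scores

# Synthetic stand-ins for per-speaker feature vectors (assumption: 4-dim features).
rng = np.random.default_rng(1)
train = {
    "spk_a": rng.normal(0.0, 1.0, size=(400, 4)),
    "spk_b": rng.normal(3.0, 1.0, size=(400, 4)),
}

# Training phase: one GMM per speaker, built from that speaker's data only.
models = {spk: fit_diag_gmm(X) for spk, X in train.items()}

# Testing phase: score an unseen utterance against every model and decide.
test_utt = rng.normal(3.0, 1.0, size=(100, 4))
best, scores = identify(test_utt, models)
print(best, scores)
```

The decision rule here is the simplest one consistent with the abstract: the test data are scored against every speaker model and the highest average log-likelihood wins. A system with separate subsegmental, segmental and suprasegmental evidence would need some way to combine the per-level scores, which this sketch does not attempt.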
Similar resources
Subsegmental, Segmental and Suprasegmental Features for Speaker Recognition Using Ergodic Hidden Markov Model
Self Determining Speaker Recognition by Three Level Segmental Processing Of Linear Prediction Residual
This paper proposes using speaker-specific source information at different levels. The speaker recognition system exploits the source information (LP residual) present at three levels, namely subsegmental, segmental and suprasegmental. The subsegmental analysis considers the LP residual in blocks of 5 ms with a shift of 2.5 ms to extract speaker information. The segmental analysis extracts speaker informa...
Combining Gaussian Mixture Models and Segmental Feature Models for Speaker Recognition
In most speaker recognition systems, speech utterances are not constrained in content or language. In a text-dependent speaker recognition system, the lexical content of the speech and the language are known in advance. The goal of this paper is to show that this information can be used by a segmental features (SF) approach to improve a standard Gaussian mixture model with MFCC features (GMM-MFCC). Speech fe...
Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....
Speaker Information using Subsegmental and Segmental Analysis of LP Residual
The Linear Prediction (LP) residual mostly contains the excitation source information. This work analyzes the LP residual once using a frame size of 5 ms (subsegmental) and once using a frame size of 20 ms (segmental), each with a shift of 2.5 ms. The residual frames are then subjected to nonparametric Vector Quantization (VQ) to store the unique excitation sequences for each speaker. The testi...
Publication date: 2014