GMM method

Two-class signal segmentation for speech/music detection in audio tracks

1999

Mouhamadou Seck Frédéric Bimbot Didier Zugaj Bernard Delyon

We present a technique for the segmentation of a sound track into two classes of segments. Each frame of signal is preprocessed by extracting cepstral coefficients and their first-order derivatives. For each class, the distribution of the frame parameter vectors is modeled by a Gaussian Mixture Model (GMM). GMM order is selected using two criteria: the Minimum Description Length (MDL) criterion ...

Comparison of MLP and GMM Classifiers for Face Verification on XM2VTS

2003

Fabien Cardinaux Conrad Sanderson Sébastien Marcel

We compare two classifier approaches, namely classifiers based on Multi Layer Perceptrons (MLPs) and Gaussian Mixture Models (GMMs), for use in a face verification system. The comparison is carried out in terms of performance, robustness and practicability. Apart from structural differences, the two approaches use different training criteria; the MLP approach uses a discriminative criterion, wh...

Speaker indexing in large audio databases using anchor models

2001

Douglas E. Sturim Douglas A. Reynolds Elliot Singer Joseph P. Campbell

This paper introduces the technique of anchor modeling in the applications of speaker detection and speaker indexing. The anchor modeling algorithm is refined by pruning the number of models needed. The system is applied to the speaker detection problem where its performance is shown to fall short of the state-of-the-art Gaussian Mixture Model with Universal Background Model (GMM-UBM) system. H...

Natural-emotion GMM transformation algorithm for emotional speaker recognition

2007

Zhenyu Shan Yingchun Yang Ruizhi Ye

One of the largest challenges in speaker recognition is dealing with the speaker-emotion variability problem. Currently, compensation techniques are the main solutions to this problem. These methods require eliciting every kind of emotional speech from each speaker, which is not user-friendly in practice. Therefore the basic problem is how to get the distribution of speakers’ emotion speech and ho...

Exploring robustness of DNN/RNN for extracting speaker Baum-Welch statistics in mismatched conditions

2015

Hao Zheng Shanshan Zhang Wenju Liu

This work explores the use of DNN/RNN for extracting Baum-Welch sufficient statistics in place of the conventional GMM-UBM in speaker recognition. In this framework, the DNN/RNN is trained for automatic speech recognition (ASR) and each output unit corresponds to a component of the GMM-UBM. Then the outputs of the network are combined with acoustic features to calculate sufficient statistics for...

Geometrically-constrained balloon fitting for multiple connected ellipses

Journal: Pattern Recognition 2015

Michael Kemp Richard Y. D. Xu

This paper presents a framework to fit data to a model consisting of multiple connected ellipses. For each iteration of the fitting algorithm, the representation of the multiple ellipses is mapped to a Gaussian mixture model (GMM) and the connections are mapped to geometric constraints for the GMM. The fitting is a modified constrained expectation maximisation (EM) method on the GMM (maximising...

Large Margin GMM for discriminative speaker verification

2011

Reda Jourani Khalid Daoudi Driss Aboutajdine

Gaussian mixture models (GMM), trained using the generative criterion of maximum likelihood estimation, have been the most popular approach in speaker recognition during the last decades. This approach is also widely used in many other classification tasks and applications. Generative learning is not, however, the optimal way to address classification problems. In this paper we first present a ne...

Dialect recognition using a phone-GMM-supervector-based SVM kernel

2010

Fadi Biadsy Julia Hirschberg Michael Collins

In this paper, we introduce a new approach to dialect recognition which relies on the hypothesis that certain phones are realized differently across dialects. Given a speaker’s utterance, we first obtain the most likely phone sequence using a phone recognizer. We then extract GMM Supervectors for each phone instance. Using these vectors, we design a kernel function that computes the similaritie...

Supervised/Unsupervised Voice Activity Detectors for Text-dependent Speaker Recognition on the RSR2015 Corpus

2014

Md Jahangir Alam Patrick Kenny Pierre Ouellet Themos Stafylakis Pierre Dumouchel

Voice activity detection, i.e., discrimination of the speech/nonspeech segments in a speech signal, is an important enabling technology for a variety of speech-based applications, including speaker recognition. In this work we provide a performance evaluation of the following supervised and unsupervised VAD algorithms in the context of text-dependent speaker recognition on the RSR2015 (Robus...

GMM Estimation and Uniform Subvector Inference with Possible Identification Failure

2014

Donald W.K. Andrews Xu Cheng

This paper determines the properties of standard generalized method of moments (GMM) estimators, tests, and confidence sets (CSs) in moment condition models in which some parameters are unidentified or weakly identified in part of the parameter space. The asymptotic distributions of GMM estimators are established under a full range of drifting sequences of true parameters and distributions. The...
