switchboard

Dialogue act classification using a Bayesian approach

2004

Sergio Grau

In this work, we make a contribution to dialogue act detection in natural speech. We focus on dialogue act classification using a Bayesian approach. Our classifier is tested on two corpora, the Switchboard and the Basurde tasks. A combination of a naive Bayes classifier and n-grams is used. The impact of different smoothing methods (Laplace and Witten-Bell) and n-grams in classif...
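
As a rough illustration of the kind of classifier this abstract describes, the following is a minimal sketch of a naive Bayes classifier with Laplace (add-one) smoothing. It is not the authors' implementation: it uses word unigrams only rather than the n-gram combination mentioned above, and the function names, the `TOY` utterances, and the two labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Collect the counts needed for prediction.

    examples: list of (tokens, label) pairs.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        for tok in tokens:
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(tokens, model):
    """Return the label with the highest smoothed log posterior."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior
        # Laplace smoothing: add one to every count, so the denominator
        # grows by the vocabulary size.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from Switchboard or Basurde.
TOY = [
    ("what time is it".split(), "question"),
    ("where do you live".split(), "question"),
    ("i live in texas".split(), "statement"),
    ("it is late".split(), "statement"),
]
MODEL = train_naive_bayes(TOY)
```

Calling `predict("what do you do".split(), MODEL)` selects the label whose smoothed word probabilities best explain the utterance. Witten-Bell smoothing, the other method named in the abstract, would replace the add-one estimate inside `predict`.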

Large Vocabulary Speech Recognition Using Deep Tensor Neural Networks

2012

Dong Yu, Li Deng, Frank Seide

Recently, we proposed and developed the context-dependent deep neural network hidden Markov models (CD-DNN-HMMs) for large vocabulary speech recognition and achieved highly promising recognition results including over one third fewer word errors than the discriminatively trained, conventional HMM-based systems on the 300hr Switchboard benchmark task. In this paper, we extend DNNs to deep tensor...

Experiments in speaker verification using factor analysis likelihood ratios

2004

Patrick Kenny, Pierre Dumouchel

We report the results of some speaker verification experiments on the NIST 1999 and NIST 2000 test sets using factor analysis likelihood ratio statistics. For the experiments on the 1999 test set we had to use a mismatched training set, namely Phases 1 and 2 of the Switchboard II corpus, to train the factor analysis model. Our results on this test set are comparable to (but not better than)...

Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization

2012

Brian Kingsbury, Tara N. Sainath, Hagen Soltau

Training neural network acoustic models with sequence-discriminative criteria, such as state-level minimum Bayes risk (sMBR), has been shown to produce large improvements in performance over cross-entropy. However, because they entail the processing of lattices, sequence criteria are much more computationally intensive than cross-entropy. We describe a distributed neural network training algorithm, ...

Discriminative Syntactic Language Modeling for Speech Recognition

2005

Michael Collins, Brian Roark, Murat Saraclar

We describe a method for discriminative training of a language model that makes use of syntactic features. We follow a reranking approach, where a baseline recogniser is used to produce 1000-best output for each acoustic input, and a second “reranking” model is then used to choose an utterance from these 1000-best lists. The reranking model makes use of syntactic features together with a parame...
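
The reranking step described above can be sketched as follows. This is only an illustration under stated assumptions: the linear scoring form, the `rerank` function, the `has_verb` feature, and the toy n-best list are all hypothetical, and the truncated abstract does not say how the actual model combines its features.

```python
def rerank(nbest, weights):
    """Pick the best hypothesis from an n-best list.

    nbest: list of (hypothesis, baseline_log_score, features) tuples,
           where features maps a feature name to a numeric value.
    weights: mapping from feature name to weight (assumed linear model).
    """
    def score(entry):
        _, baseline, feats = entry
        # Baseline recogniser score plus a weighted sum of extra features.
        return baseline + sum(weights.get(name, 0.0) * value
                              for name, value in feats.items())

    return max(nbest, key=score)[0]


# Hypothetical two-entry "n-best" list with one made-up syntactic feature.
NBEST = [
    ("i scream for you", -10.0, {"has_verb": 1.0}),
    ("ice cream for you", -9.5, {"has_verb": 0.0}),
]
WEIGHTS = {"has_verb": 1.0}
```

With `WEIGHTS` the first hypothesis scores -9.0 against -9.5 and is selected; with an empty weight dictionary the baseline ranking is kept and the second hypothesis wins.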

Estimating the Number of Segments of a Turn in Dialogue Systems

2009

Vicent Tamarit, Carlos D. Martínez-Hinarejos

An important part of a dialogue system is the correct labelling of turns with dialogue-related meaning. This meaning is usually represented by dialogue acts, which give the system semantic information about user intentions. This labelling is usually done in two steps, dividing the turn into segments, and classifying them into DAs. Some works have shown that the segmentation step can be improved...

Smoothing issues in the structured language model

2001

Woosung Kim, Sanjeev Khudanpur, Jun Wu

The Structured Language Model (SLM) recently introduced by Chelba and Jelinek is a powerful general formalism for exploiting syntactic dependencies in a left-to-right language model for applications such as speech and handwriting recognition, spelling correction, machine translation, etc. Unlike traditional N-gram models, optimal smoothing techniques – discounting methods and hierarchical struc...

Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition

1997

Michael Finke, Alexander H. Waibel

In spontaneous conversational speech there is a large amount of variability due to accents, speaking styles and speaking rates (also known as the speaking mode) [3]. Because current recognition systems usually use only a relatively small number of pronunciation variants for the words in their dictionaries, the amount of variability that can be modeled is limited. Increasing the number of varian...

Discriminative MAP for acoustic model adaptation

2003

Daniel Povey, Philip C. Woodland, Mark J. F. Gales

In this paper we show how a discriminative objective function such as Maximum Mutual Information (MMI) can be combined with a prior distribution over the HMM parameters to give a discriminative Maximum A Posteriori (MAP) estimate for HMM training. The prior distribution can be based around the Maximum Likelihood (ML) parameter estimates, leading to a technique previously referred to as I-smooth...

Many Uses, Many Annotations for Large Speech Corpora: Switchboard and TDT as Case Studies

2006

Steven Bird

This paper discusses the challenges that arise when large speech corpora receive an ever-broadening range of diverse and distinct annotations. Two case studies of this process are presented: the Switchboard Corpus of telephone conversations and the TDT2 corpus of broadcast news. Switchboard has undergone two independent transcriptions and various types of additional annotation, all carried out ...