Search results for: switchboard
Number of results: 728
We describe the latest improvements to the IBM English conversational telephone speech recognition system. Some of the techniques that were found beneficial are: maxout networks with annealed dropout rates; networks with a very large number of outputs trained on 2000 hours of data; joint modeling of partially unfolded recurrent neural networks and convolutional nets by combining the bottleneck ...
We present a framework for the integrated analysis of the textual and prosodic characteristics of information structure in the Switchboard corpus of conversational English. Information structure describes the availability, organisation and salience of entities in a discourse model. We present standards for the annotation of information status (old, mediated and new), and give guidelines for ann...
Parametric trajectory models explicitly represent the temporal evolution of the speech features as a Gaussian process with time-varying parameters. HMMs are a special case of such models, one in which the trajectory constraints in the speech segment are ignored by the assumption of conditional independence across frames within the segment. In this paper, we investigate in detail some extensions...
This paper describes a newly realized high-performance speaker recognition system and examines methods for its improvement. Innovative experiments early this year showed that phone strings are exceptional features for speaker recognition. The original system produced equal error rates less than 11.5% on Switchboard-I audio files. Subsequent research indicates that the equal error rate can be nea...
This report discusses the use of multi-layered tagsets for dialogue acts, in the context of dialogue understanding for multi-party meeting recording and retrieval applications. We discuss some desiderata for such tagsets and critically examine some previous proposals. We then define MALTUS, a new tagset based on the ICSI-MR and Switchboard tagsets, which satisfies these requirements. We present...
The purpose of this paper is to unify several of the state-of-the-art score normalization techniques applied to text-independent speaker verification systems. We propose a new Bayesian framework for this purpose. The two well-known Z- and T-normalization techniques can be easily interpreted in this framework as different ways to estimate score distributions. This is useful as it helps to understa...
In this paper, we present a novel hybrid keyword spotting system that combines supervised and semi-supervised competitive learning algorithms. The first stage is an S-SOM (Semi-supervised Self-Organizing Map) module which is specifically designed for discrimination between keywords (KWs) and non-keywords (NKWs). The second stage is an FDVQ (Fuzzy Dynamic Vector Quantization) module which consists of...
A new language model for speech recognition inspired by linguistic analysis is presented. The model develops hidden hierarchical structure incrementally and uses it to extract meaningful information from the word history — thus enabling the use of extended distance dependencies — in an attempt to complement the locality of currently used trigram models. The structured language model, its probab...