switchboard

Topicalization and Left-Dislocation: A Functional Opposition Revisited

1999

Michelle L. Gregory Laura A. Michaelis

In this case study, we use conversational data from the Switchboard corpus to investigate the functional opposition between two pragmatically specialized constructions of English: Topicalization and Left Dislocation. Specifically, we use distributional trends in the Switchboard corpus to revise several conclusions reached by Prince (1981a, 1981b, 1997) concerning the function of Left Dislocatio...


Deep Learning-Based Telephony Speech Recognition in the Wild

2017

Kyu J. Han Seongjun Hahm Byung-Hak Kim Jungsuk Kim Ian R. Lane

In this paper, we explore the effectiveness of a variety of Deep Learning-based acoustic models for conversational telephony speech, specifically TDNN, bLSTM and CNN-bLSTM models. We evaluated these models on both research test sets, such as Switchboard and CallHome, as well as recordings from a real-world call-center application. Our best single system, consisting of a single CNN-bLSTM acoustic ...


Using Conditional Random Fields to Predict Pitch Accents in Conversational Speech

2004

Michelle L. Gregory Yasemin Altun

The detection of prosodic characteristics is an important aspect of both speech synthesis and speech recognition. Correct placement of pitch accents aids in more natural-sounding speech, while automatic detection of accents can contribute to better word-level recognition and better textual understanding. In this paper we investigate probabilistic, contextual, and phonological factors that influe...


Robust Vowel Landmark Detection Using Epoch-Based Features

2016

Sri Harsha Dumpala Bhanu Teja Nellore Raghu Ram Nevali Suryakanth V. Gangashetty Bayya Yegnanarayana

Automatic detection of vowel landmarks is useful in many applications such as automatic speech recognition (ASR), audio search, syllabification of speech and expressive speech processing. In this paper, acoustic features extracted around epochs are proposed for detection of vowel landmarks in continuous speech. These features are based on zero frequency filtering (ZFF) and single frequency filt...


Multi-Speaker Language Modeling

2004

Gang Ji Jeff A. Bilmes

In conventional language modeling, the words from only one speaker are represented at a time, even for conversational tasks such as meetings and telephone calls. In a conversational or meeting setting, however, different speakers can influence each other. In order to recover this missing inter-speaker information, in this work we present a novel approach for conversational language modeling tha...


Optimization of dynamic regimes in a statistical hidden dynamic model for conversational speech recognition

1999

Jeff Z. Ma Li Deng

This paper reports our ongoing work aiming to improve the performance of a novel speech recognizer based on an underlying statistical hidden dynamic model of phonetic reduction in the production of conversational speech. We have developed a path-stack search algorithm which efficiently computes the likelihood of any observation utterance while optimizing the dynamic regimes in the speech model. ...


A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamic model of speech

Journal: Computer Speech & Language 2000

Jeff Z. Ma Li Deng

In this paper we report our recent research whose goal is to improve the performance of a novel speech recognizer based on an underlying statistical hidden dynamic model of phonetic reduction in the production of conversational speech. We have developed a path-stack search algorithm which efficiently computes the likelihood of any observation utterance while optimizing the dynamic regimes in th...


Achieving Human Parity in Conversational Speech Recognition

Journal: CoRR 2016

Wayne Xiong Jasha Droppo Xuedong Huang Frank Seide Mike Seltzer Andreas Stolcke Dong Yu Geoffrey Zweig

Conversational speech recognition has served as a flagship speech recognition task since the release of the DARPA Switchboard corpus in the 1990s. In this paper, we measure the human error rate on the widely used NIST 2000 test set, and find that our latest automated system has reached human parity. The error rate of professional transcriptionists is 5.9% for the Switchboard portion of the data...


Flexible Transcription Alignment

1997

Michael Finke Alex Waibel

In this paper we present a set of techniques we employed in our Janus Recognition Toolkit (JRTk) Switchboard and CallHome recognizer in order to deal with imperfections in the transcriptions: inconsistent transcription of pronunciations and contractions as well as errors in utterance segmentations. These techniques consist of a dynamic, speaking mode dependent pronunciation model and a flexible u...


HIPK2, a Versatile Switchboard Regulating the Transcription Machinery and Cell Death

Journal: Cell Cycle 2007
