Sage: The New BBN Speech Processing Platform

Authors

  • Roger Hsiao
  • Ralf Meermeier
  • Tim Ng
  • Zhongqiang Huang
  • Maxwell Jordan
  • Enoch Kan
  • Tanel Alumäe
  • Jan Silovský
  • William Hartmann
  • Francis Keith
  • Omer Lang
  • Man-Hung Siu
  • Owen Kimball
Abstract

To capitalize on the rapid development of Speech-to-Text (STT) technologies and the proliferation of open source machine learning toolkits, BBN has developed Sage, a new speech processing platform that integrates technologies from multiple sources, each of which has particular strengths. In this paper, we describe the design of Sage, which allows the easy interchange of STT components from different sources. We also describe our approach for fast prototyping with new machine learning toolkits, and a framework for sharing STT components across different applications. Finally, we report Sage’s state-of-the-art performance on different STT tasks.

Similar resources

Applications of the BBN Sage Speech Processing Platform

As a follow-up to our paper at Interspeech 2016 [1], we propose to showcase various applications that now all use BBN’s Sage Speech Processing Platform, demonstrating the platform’s versatility and ease of integration. In particular, we will showcase 1) BBN TransTalk: A turn-based speech-to-speech translation program running entirely on an Android smartphone, alongside a custom 3D-printed periph...

West Point, SAAVB, and BBN/AUB Arabic Speech Corpora: A Comparative Survey

The aim of this paper is to evaluate three public Arabic speech corpora, namely the West Point (WP), Saudi Accented Arabic Voice Bank (SAAVB) and the BBN Technologies/American University at Beirut (BBN/AUB) corpus, using the TIMIT English speech corpus as a benchmark. Weaknesses, strengths, and discrepancies of these Arabic corpora regarding their design and content are covered in this pape...

Porting to New Domains Using the Learner

Acquiring syntactic and semantic information about a new application domain for a natural language processing system is often a time-consuming task. To address this problem, various researchers have developed acquisition tools to speed the process. While such tools are very useful, they are typically tied to particular systems and so their benefits cannot be shared by other researchers. In this...

Auditory Cortical Temporal Processing Abilities in Young Adults

Purpose: To evaluate whether cortical encoding of temporal processing ability, using the N1 peak of the cortical auditory evoked potential, could be measured in normally hearing young adults using three paradigms: voice-onset time, speech-in-noise and amplitude-modulated broadband noise. Research design: Cortical auditory evoked potentials (CAEPs) were elicited using: (1) naturally produced stop...

A Speech Processing Research Platform for Android Based Smart Phones and Tablets

This paper presents a new research and education platform for speech processing. The platform is called Speech Enhancement for Android (SEA) and incorporates past and present speech enhancement techniques applied to recorded speech corrupted by real world noise sources. Researchers, students and teaching staff can use this platform to perform speech enhancement and observe its effects on the li...

Publication date: 2016