audio input flooding

RtAudio: A Cross-Platform C++ Class for Realtime Audio Input/Output

2002

Gary P. Scavone

This paper presents a cross-platform C++ class for realtime audio input and output streaming. RtAudio provides a flexible, easy-to-use application programming interface (API) that allows complete audio system control, including device capability querying, multiple concurrent streams, and blocking and callback functionality. RtAudio is currently supported on Windows platforms using the DirectSound ...

Some Experiments in Evaluating ASR Systems Applied to Multimedia Retrieval

2009

Julián Moreno Schneider Marta Garrote Salazar Paloma Martínez José Luis Martínez-Fernández

This paper describes some tests performed on different types of voice/audio input applying three commercial speech recognition tools. Three multimedia retrieval scenarios are considered: a question answering system, an automatic transcription of audio from video files and a real-time captioning system used in the classroom for deaf students. A software tool, RET (Recognition Evaluation Tool), h...

Beat Tracking with a Two State Model

2005

M. E. P. Davies M. D. Plumbley

In this paper we apply a two state switching model to the problem of audio based beat tracking. Our analysis is based around the generation and application of adaptively weighted comb filterbank structures to extract beat timing information from the midlevel representation of an input audio signal known as the onset detection function [1]. We evaluate our system using a previously published dat...
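
As a rough illustration of how a comb filterbank can pull a beat period out of an onset detection function, the sketch below scores each candidate period by averaging the function at evenly spaced positions and keeps the best one. It is a minimal, unweighted version written for this listing, not the adaptively weighted filterbank or the two state model of the paper; the function name `estimate_beat_period` and its parameters are hypothetical.

```python
def estimate_beat_period(odf, min_period, max_period):
    """Return (period, phase) in frames maximizing the mean comb response.

    odf is a list of non-negative onset-strength values, one per frame.
    Assumption: plain unweighted combs; ties go to the smallest period,
    which avoids picking an integer multiple of the true beat period.
    """
    best_score = -1.0
    best = (min_period, 0)
    for period in range(min_period, max_period + 1):
        for phase in range(period):
            # Sample the ODF at phase, phase + period, phase + 2 * period, ...
            taps = odf[phase::period]
            if not taps:
                continue
            score = sum(taps) / len(taps)
            # Strict comparison keeps the first (smallest) period on ties.
            if score > best_score:
                best_score = score
                best = (period, phase)
    return best


# Synthetic onset detection function with a pulse every 10 frames.
synthetic_odf = [1.0 if i % 10 == 0 else 0.0 for i in range(100)]
period, phase = estimate_beat_period(synthetic_odf, 4, 30)
```

Averaging rather than summing the taps keeps short periods from winning simply because they sample more frames.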

Violence Content Classification Using Audio Features

2006

Theodoros Giannakopoulos Dimitrios I. Kosmopoulos Andreas Aristidou Sergios Theodoridis

This work studies the problem of violence detection in audio data, which can be used for automated content rating. We employ some popular frame-level audio features both from the time and frequency domain. Afterwards, several statistics of the calculated feature sequences are fed as input to a Support Vector Machine classifier, which decides about the segment content with respect to violence. T...
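
The pipeline outlined in this abstract (frame-level features, then statistics of the feature sequences as classifier input) can be sketched as follows. The two features used here, short-time energy and zero-crossing rate, are common time-domain choices assumed for illustration; the abstract does not list the exact features, frame sizes, or statistics, so every name and default below is hypothetical. The classifier itself is omitted.

```python
import math
from statistics import mean, pstdev


def split_frames(signal, size, hop):
    """Cut the signal into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def segment_statistics(signal, size=256, hop=128):
    """Summarize a segment as [mean, std] of each frame-level feature.

    The resulting fixed-length vector is what would be passed to a
    classifier such as an SVM.
    """
    frame_list = split_frames(signal, size, hop)
    energy = [short_time_energy(f) for f in frame_list]
    zcr = [zero_crossing_rate(f) for f in frame_list]
    return [mean(energy), pstdev(energy), mean(zcr), pstdev(zcr)]


# Example segment: a 440 Hz sine sampled at 8000 Hz (assumed values).
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(1024)]
features = segment_statistics(tone)
```

Summarizing each feature sequence by its mean and standard deviation gives every segment the same vector length regardless of its duration, which is what a fixed-input classifier needs.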

Automatic Composition from Non-musical Inspiration Sources

2012

Robert Smith Aaron W. Dennis Dan Ventura

In this paper, we describe a system which creates novel musical compositions inspired by non-musical audio signals. The system processes input audio signals using onset detection and pitch estimation algorithms. Additional musical voices are added to the resulting melody by models of note relationships that are built using machine learning trained with different pieces of music. The system crea...

Criteria for Evaluating and Producing Audio Books from the Producers' Perspective: A Qualitative Content Analysis

Journal: Research on Information Science and Public Libraries 2015

Fatemeh Fahimnia Nader Naghshineh Maryam Chehreghani

Purpose: Audio books hold a special place in the publishing industry. Publishers around the world produce audio books according to different criteria and standards. This study aimed to identify and introduce the most important criteria for the evaluation and production of audio books from the producers' point of view. Methodology: This study was performed with qualitative content analysis of interview...

Onset Detection Exploiting Adaptive Linear Prediction Filtering in DWT Domain with Bidirectional Long Short-term Memory Neural Networks

2013

G. Ferroni E. Marchi F. Eyben L. Gabrielli S. Squartini B. Schuller

This short paper presents an experimental algorithm for onset detection that feeds features extracted in the wavelet domain, together with auditory spectral features, to Bidirectional Long Short-Term Memory (BLSTM) recurrent neural networks for decision-making. The presented algorithm exploits multi-resolution time-frequency features via the discrete wavelet transform to decompose the input...

Synthesis by Rule of Disordered Voices

2013

Jean Schoentgen Jorge C. Lucero

The synthesis of disordered voices designates the use of numerical methods to simulate the vocal timbre of speakers suffering from laryngeal pathologies or dysfunctions to investigate the link between perceived timbre and speech signal properties. The simulation is based on a mapping of the amplitude of a narrow-band input signal onto the amplitude of a desired output signal, while the cycle le...

TOWARDS MULTIMODAL CONTENT REPRESENTATION Discussion paper

2008

Harry Bunt Laurent Romary

Multimodal interfaces, combining the use of speech, graphics, gestures, and facial expressions in input and output, promise to provide new possibilities to deal with information in more effective and efficient ways, supporting for instance: the understanding of possibly imprecise, partial or ambiguous multimodal input; the generation of coordinated, cohesive, and coherent multimodal presentatio...

Musical Hit Detection

2008

Eleanor Crane Sarah Houts Kiran Murthy

Musical visualizers are programs that process audio input in order to provide aesthetically pleasing audio-synchronized graphics. In popular music, musical instrumentation changes known as hits are an important indicator of changes in the music’s mood. Ideally, a visualizer should respond to a hit by also changing the mood of the displayed graphics to match the music. This project will focus on ...