Robust Automatic Video-Conferencing with Multiple Cameras and Microphones
Authors
Abstract
An automatic video-conferencing system is proposed that employs acoustic source localization, video face tracking and pose estimation, and multi-channel speech enhancement. The video portion of the system tracks talkers using source motion, contour geometry, color data, and simple facial features. The choice of camera is based on an estimate of the head's gaze angle. Head pose is estimated using a very general head model that employs hairline features and a learned network classification procedure. Finally, a wavelet microphone array technique creates an enhanced speech waveform to accompany the recorded video signal. The system presented in this paper is robust to both visual clutter (e.g., ovals in the scene of interest that are not faces) and audible noise (e.g., reverberation and background noise).
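As a rough illustration of the camera-selection step only, the sketch below picks the camera a talker is most nearly facing, given an estimated gaze angle. It is not the authors' method: the function names, the use of degrees, and the idea of representing each camera by a single bearing from the talker are all assumptions made for this example.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def select_camera(gaze_deg, camera_bearings_deg):
    """Return the index of the camera whose bearing is closest to the gaze angle.

    gaze_deg: estimated head gaze direction (hypothetical input, room coordinates).
    camera_bearings_deg: bearing from the talker toward each camera (hypothetical).
    """
    if not camera_bearings_deg:
        raise ValueError("at least one camera bearing is required")
    return min(
        range(len(camera_bearings_deg)),
        key=lambda i: angular_difference(gaze_deg, camera_bearings_deg[i]),
    )
```

The wrap-around handling in `angular_difference` matters because a gaze estimate of 350 degrees should be treated as close to a camera at 0 degrees, not far from it.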
Similar resources
Audiovisual Head Orientation Estimation with Particle Filtering in Multisensor Scenarios
This article presents a multimodal approach to head pose estimation of individuals in environments equipped with multiple cameras and microphones, such as SmartRooms or automatic video conferencing. Determining the individual's head orientation is the basis for many forms of more sophisticated interactions between humans and technical devices and can also be used for automatic sensor selection (...
Automatic camera control using unobtrusive vision and audio tracking
While video can be useful for remotely attending and archiving meetings, the video itself is often dull and difficult to watch. One key reason for this is that, except in very high-end systems, little attention has been paid to the production quality of the video being captured. The video stream from a meeting often lacks detail and camera shots rarely change unless a person is tasked with oper...
Collaboration Support Using Environment Images and Videos
This paper summarizes our environment-image/video-supported collaboration technologies developed over the past several years. These technologies use environment images and videos as active interfaces and use visual cues in these images and videos to orient device controls, annotations and other information access. By using visual cues in various interfaces, we expect to make the control interface ...
Developmentally Appropriate Technology in Early Childhood: ‘Video Conferencing’ – a limit case?
This paper originates from our desire to identify a limit case of appropriate educational applications of information and communications technology (ICT). Numerous claims have been made about the potential of technology to change the traditionally accepted developmental limits on children’s learning. Claims had also been made in the UK regarding the successful application of video conferencing ...
Multimodal 3-D Tracking and Event Detection via the Particle Filter
Determining the occurrence of an event is fundamental to developing systems that can observe and react to them. Often, this determination is based on collecting video and/or audio data and determining the state or location of a tracked object. We use Bayesian inference and the particle filter for tracking moving objects, using both video data obtained from multiple cameras and audio data obtain...
Journal title:
Volume, issue:
Pages: -
Publication date: 2000