Speech-Driven Facial Reenactment Using Conditional Generative Adversarial Networks
Authors
Abstract
We present a novel approach to generating photo-realistic images of a face with accurate lip sync, given an audio input. Using a recurrent neural network, we predict mouth landmarks from audio features. We then exploit the power of conditional generative adversarial networks to produce highly realistic faces conditioned on a set of landmarks. Together, these two networks are capable of producing a sequence of natural faces in sync with an input audio track.
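The two-stage data flow described in the abstract can be sketched roughly as follows. This is only an illustrative, shape-level sketch under stated assumptions, not the authors' implementation: the weights are random and untrained, the recurrent network is a plain Elman RNN, the "generator" is a single dense layer standing in for a convolutional cGAN generator, and the discriminator and adversarial training are omitted entirely. All function names and dimensions (13 audio features, 20 landmarks, 8x8 images) are hypothetical.

```python
import numpy as np

def rnn_audio_to_landmarks(audio_feats, W_in, W_rec, W_out):
    """Stage 1 (assumed form): a vanilla RNN emits one flattened
    mouth-landmark vector per audio frame."""
    hidden = np.zeros(W_rec.shape[0])
    landmarks = []
    for frame in audio_feats:  # iterate over time steps
        hidden = np.tanh(W_in @ frame + W_rec @ hidden)
        landmarks.append(W_out @ hidden)  # flattened (x, y) pairs
    return np.stack(landmarks)

def conditional_generator(landmarks, noise, W_gen, image_shape):
    """Stage 2 (stand-in): a generator whose input is the condition
    (landmarks) concatenated with a noise vector."""
    z = np.concatenate([landmarks, noise])
    pixels = 1.0 / (1.0 + np.exp(-(W_gen @ z)))  # sigmoid keeps pixels in [0, 1]
    return pixels.reshape(image_shape)

rng = np.random.default_rng(0)
T, F, H, K, Z = 5, 13, 16, 20, 8  # frames, audio dim, hidden dim, landmarks, noise dim
image_shape = (8, 8)              # hypothetical tiny grayscale output

# Random, untrained weights: this only demonstrates how data flows.
W_in = 0.1 * rng.standard_normal((H, F))
W_rec = 0.1 * rng.standard_normal((H, H))
W_out = 0.1 * rng.standard_normal((2 * K, H))
W_gen = 0.1 * rng.standard_normal((image_shape[0] * image_shape[1], 2 * K + Z))

audio_feats = rng.standard_normal((T, F))  # stand-in for per-frame audio features
landmarks = rnn_audio_to_landmarks(audio_feats, W_in, W_rec, W_out)
frames = np.stack([
    conditional_generator(lm, rng.standard_normal(Z), W_gen, image_shape)
    for lm in landmarks
])
print(landmarks.shape, frames.shape)
```

The point of the sketch is the interface between the stages: the recurrent network yields one landmark vector per audio frame, and each landmark vector conditions the generation of one face image, so the output sequence stays aligned with the audio.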
Similar resources
High-Quality Face Image SR Using Conditional Generative Adversarial Networks
We propose a novel single face image super-resolution method, named Face Conditional Generative Adversarial Network (FCGAN), based on boundary equilibrium generative adversarial networks. Without using any facial prior information, our method can generate a high-resolution face image from a low-resolution one. Compared with existing studies, both our training and testing phases are end-toe...
Conditional generative adversarial nets for convolutional face generation
We apply an extension of generative adversarial networks (GANs) [8] to a conditional setting. In the GAN framework, a “generator” network is tasked with fooling a “discriminator” network into believing that its own samples are real data. We add the capability for each network to condition on some arbitrary external data which describes the image being generated or discriminated. By varying the ...
Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification
Improving speech system performance in noisy environments remains a challenging task, and speech enhancement (SE) is one of the effective techniques to solve the problem. Motivated by the promising results of generative adversarial networks (GANs) in a variety of image processing tasks, we explore the potential of conditional GANs (cGANs) for SE, and in particular, we make use of the image proc...
Improvement of generative adversarial networks for automatic text-to-image generation
This research concerns the use of deep learning tools and image processing technology for the automatic generation of images from text. Previous studies have used a single sentence to produce images. In this research, a memory-based hierarchical model is presented that uses three different descriptions, each given as a sentence, to produce and improve the image. The proposed ...
Generative Adversarial Network-Based Glottal Waveform Model for Statistical Parametric Speech Synthesis
Recent studies have shown that text-to-speech synthesis quality can be improved by using glottal vocoding. This refers to vocoders that parameterize speech into two parts, the glottal excitation and vocal tract, that occur in the human speech production apparatus. Current glottal vocoders generate the glottal excitation waveform by using deep neural networks (DNNs). However, the squared error-b...
Publication date: 2018