Wav2Pix: Speech-conditioned Face Generation using Generative Adversarial Networks
March 25, 2019 ยท Entered Twilight ยท ๐ IEEE International Conference on Acoustics, Speech, and Signal Processing
"No code URL or promise found in abstract"
"Derived repo from GitHub Pages (backfill)"
Evidence collected by the PWNC Scanner
Repo contents: LICENSE, Readme.md, assets, config.yaml, dataset, models, runtime.py, scripts, wav2pix-2019-icassp.pdf
Authors
Amanda Duarte, Francisco Roldan, Miquel Tubau, Janna Escur, Santiago Pascual, Amaia Salvador, Eva Mohedano, Kevin McGuinness, Jordi Torres, Xavier Giro-i-Nieto
arXiv ID
1903.10195
Category
cs.MM: Multimedia
Cross-listed
cs.CV
Citations
84
Venue
IEEE International Conference on Acoustics, Speech, and Signal Processing
Repository
https://github.com/imatge-upc/wav2pix
โญ 56
Last Checked
7 days ago
Abstract
Speech is a rich biometric signal that contains information about the identity, gender and emotional state of the speaker. In this work, we explore its potential to generate face images of a speaker by conditioning a Generative Adversarial Network (GAN) with raw speech input. We propose a deep neural network that is trained from scratch in an end-to-end fashion, generating a face directly from the raw speech waveform without any additional identity information (e.g reference image or one-hot encoding). Our model is trained in a self-supervised approach by exploiting the audio and visual signals naturally aligned in videos. With the purpose of training from video data, we present a novel dataset collected for this work, with high-quality videos of youtubers with notable expressiveness in both the speech and visual signals.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Multimedia
R.I.P.
๐ป
Ghosted
๐
๐
Old Age
Quality Assessment of In-the-Wild Videos
R.I.P.
๐ป
Ghosted
Viewport-Adaptive Navigable 360-Degree Video Delivery
R.I.P.
๐ป
Ghosted
A Comprehensive Survey on Cross-modal Retrieval
R.I.P.
๐ป
Ghosted
An Overview of Cross-media Retrieval: Concepts, Methodologies, Benchmarks and Challenges
R.I.P.
๐ป
Ghosted