Variational Autoencoders for Learning Latent Representations of Speech Emotion: A Preliminary Study
December 23, 2017 ยท Declared Dead ยท ๐ Interspeech
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Siddique Latif, Rajib Rana, Junaid Qadir, Julien Epps
arXiv ID
1712.08708
Category
cs.SD: Sound
Cross-listed
cs.LG,
eess.AS,
stat.ML
Citations
104
Venue
Interspeech
Last Checked
3 months ago
Abstract
Learning the latent representation of data in unsupervised fashion is a very interesting process that provides relevant features for enhancing the performance of a classifier. For speech emotion recognition tasks, generating effective features is crucial. Currently, handcrafted features are mostly used for speech emotion recognition, however, features learned automatically using deep learning have shown strong success in many problems, especially in image processing. In particular, deep generative models such as Variational Autoencoders (VAEs) have gained enormous success for generating features for natural images. Inspired by this, we propose VAEs for deriving the latent representation of speech signals and use this representation to classify emotions. To the best of our knowledge, we are the first to propose VAEs for speech emotion classification. Evaluations on the IEMOCAP dataset demonstrate that features learned by VAEs can produce state-of-the-art results for speech emotion classification.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Sound
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
CNN Architectures for Large-Scale Audio Classification
R.I.P.
๐ป
Ghosted
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification
R.I.P.
๐ป
Ghosted
WaveGlow: A Flow-based Generative Network for Speech Synthesis
R.I.P.
๐ป
Ghosted
Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted