DEPA: Self-Supervised Audio Embedding for Depression Detection

October 29, 2019 ยท Declared Dead ยท ๐Ÿ› ACM Multimedia

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Pingyue Zhang, Mengyue Wu, Heinrich Dinkel, Kai Yu arXiv ID 1910.13028 Category cs.HC: Human-Computer Interaction Cross-listed cs.SD, eess.AS Citations 74 Venue ACM Multimedia Last Checked 3 months ago
Abstract
Depression detection research has increased over the last few decades, one major bottleneck of which is the limited data availability and representation learning. Recently, self-supervised learning has seen success in pretraining text embeddings and has been applied broadly on related tasks with sparse data, while pretrained audio embeddings based on self-supervised learning are rarely investigated. This paper proposes DEPA, a self-supervised, pretrained depression audio embedding method for depression detection. An encoder-decoder network is used to extract DEPA on in-domain depressed datasets (DAIC and MDD) and out-domain (Switchboard, Alzheimer's) datasets. With DEPA as the audio embedding extracted at response-level, a significant performance gain is achieved on downstream tasks, evaluated on both sparse datasets like DAIC and large major depression disorder dataset (MDD). This paper not only exhibits itself as a novel embedding extracting method capturing response-level representation for depression detection but more significantly, is an exploration of self-supervised learning in a specific task within audio processing.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Human-Computer Interaction

Died the same way โ€” ๐Ÿ‘ป Ghosted