Old Age
Hear The Flow: Optical Flow-Based Self-Supervised Visual Sound Source Localization
November 06, 2022 · Entered Twilight · IEEE Workshop/Winter Conference on Applications of Computer Vision
Repo contents: .gitignore, LICENSE, README.md, datamodule.py, datasets_flow.py, figures, metadata, model.py, preprocess, requirements.txt, test.py, train.py
Authors
Dennis Fedorishin, Deen Dayal Mohan, Bhavin Jawade, Srirangaraj Setlur, Venu Govindaraju
arXiv ID
2211.03019
Category
cs.CV: Computer Vision
Citations
14
Venue
IEEE Workshop/Winter Conference on Applications of Computer Vision
Repository
https://github.com/denfed/heartheflow
⭐ 12
Last Checked
1 month ago
Abstract
Learning to localize the sound source in videos without explicit annotations is a novel area of audio-visual research. Existing work in this area focuses on creating attention maps to capture the correlation between the two modalities to localize the source of the sound. In a video, oftentimes, the objects exhibiting movement are the ones generating the sound. In this work, we capture this characteristic by modeling the optical flow in a video as a prior to better aid in localizing the sound source. We further demonstrate that the addition of flow-based attention substantially improves visual sound source localization. Finally, we benchmark our method on standard sound source localization datasets and achieve state-of-the-art performance on the Soundnet Flickr and VGG Sound Source datasets. Code: https://github.com/denfed/heartheflow.
Similar Papers
In the same crypt: Computer Vision
Old Age
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
R.I.P.
👻
Ghosted
You Only Look Once: Unified, Real-Time Object Detection
Old Age
SSD: Single Shot MultiBox Detector
Old Age
Squeeze-and-Excitation Networks
R.I.P.
👻
Ghosted