Unsupervised Extraction of Video Highlights Via Robust Recurrent Auto-encoders
October 06, 2015 Β· Declared Dead Β· π IEEE International Conference on Computer Vision
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Huan Yang, Baoyuan Wang, Stephen Lin, David Wipf, Minyi Guo, Baining Guo
arXiv ID
1510.01442
Category
cs.CV: Computer Vision
Citations
183
Venue
IEEE International Conference on Computer Vision
Last Checked
4 months ago
Abstract
With the growing popularity of short-form video sharing platforms such as \em{Instagram} and \em{Vine}, there has been an increasing need for techniques that automatically extract highlights from video. Whereas prior works have approached this problem with heuristic rules or supervised learning, we present an unsupervised learning approach that takes advantage of the abundance of user-edited videos on social media websites such as YouTube. Based on the idea that the most significant sub-events within a video class are commonly present among edited videos while less interesting ones appear less frequently, we identify the significant sub-events via a robust recurrent auto-encoder trained on a collection of user-edited videos queried for each particular class of interest. The auto-encoder is trained using a proposed shrinking exponential loss function that makes it robust to noise in the web-crawled training data, and is configured with bidirectional long short term memory (LSTM)~\cite{LSTM:97} cells to better model the temporal structure of highlight segments. Different from supervised techniques, our method can infer highlights using only a set of downloaded edited videos, without also needing their pre-edited counterparts which are rarely available online. Extensive experiments indicate the promise of our proposed solution in this challenging unsupervised settin
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Computer Vision
π
π
Old Age
π
π
Old Age
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
π
π
Old Age
SSD: Single Shot MultiBox Detector
π
π
Old Age
Squeeze-and-Excitation Networks
π
π
Old Age
Fast R-CNN
π
π
Old Age
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted