๐
๐
Old Age
Video Representation Learning and Latent Concept Mining for Large-scale Multi-label Video Classification
July 05, 2017 ยท Declared Dead ยท ๐ arXiv.org
Authors
Po-Yao Huang, Ye Yuan, Zhenzhong Lan, Lu Jiang, Alexander G. Hauptmann
arXiv ID
1707.01408
Category
cs.CV: Computer Vision
Citations
8
Venue
arXiv.org
Repository
https://github.com/Martini09/informedia-yt8m-release
Last Checked
1 month ago
Abstract
We report on CMU Informedia Lab's system used in Google's YouTube 8 Million Video Understanding Challenge. In this multi-label video classification task, our pipeline achieved 84.675% and 84.662% GAP on our evaluation split and the official test set. We attribute the good performance to three components: 1) Refined video representation learning with residual links and hypercolumns 2) Latent concept mining which captures interactions among concepts. 3) Learning with temporal segments and weighted multi-model ensemble. We conduct experiments to validate and analyze the contribution of our models. We also share some unsuccessful trials leveraging conventional approaches such as recurrent neural networks for video representation learning for this large-scale video dataset. All the codes to reproduce our results are publicly available at https://github.com/Martini09/informedia-yt8m-release.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Computer Vision
๐
๐
Old Age
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
R.I.P.
๐ป
Ghosted
You Only Look Once: Unified, Real-Time Object Detection
๐
๐
Old Age
SSD: Single Shot MultiBox Detector
๐
๐
Old Age
Squeeze-and-Excitation Networks
R.I.P.
๐ป
Ghosted
Rethinking the Inception Architecture for Computer Vision
Died the same way โ ๐ 404 Not Found
R.I.P.
๐
404 Not Found
Deep High-Resolution Representation Learning for Visual Recognition
R.I.P.
๐
404 Not Found
HuggingFace's Transformers: State-of-the-art Natural Language Processing
R.I.P.
๐
404 Not Found
CCNet: Criss-Cross Attention for Semantic Segmentation
R.I.P.
๐
404 Not Found