Temporal Action Detection with Structured Segment Networks

April 20, 2017 · Entered Twilight · 🏛 International Journal of Computer Vision

🌅 TWILIGHT: Old Age
Predates the code-sharing era, a pioneer of its time

"No code URL or promise found in abstract"
"Code repo scraped from project page (backfill)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, .gitmodules, LICENSE, README.md, anet_toolkit, binary_model.py, binary_test.py, binary_train.py, data, eval_detection_results.py, gen_bottom_up_proposals.py, gen_proposal_list.py, gen_sliding_window_proposals.py, load_binary_score.py, model_zoo, ops, requirements.txt, ssn_dataset.py, ssn_models.py, ssn_opts.py, ssn_test.py, ssn_train.py, transforms.py

Authors: Yue Zhao, Yuanjun Xiong, Limin Wang, Zhirong Wu, Xiaoou Tang, Dahua Lin
arXiv ID: 1704.06228
Category: cs.CV (Computer Vision)
Citations: 966
Venue: International Journal of Computer Vision
Repository: https://github.com/yjxiong/action-detection (⭐ 646)
Last Checked: 6 days ago
Abstract
Detecting actions in untrimmed videos is an important yet challenging task. In this paper, we present the structured segment network (SSN), a novel framework which models the temporal structure of each action instance via a structured temporal pyramid. On top of the pyramid, we further introduce a decomposed discriminative model comprising two classifiers, respectively for classifying actions and determining completeness. This allows the framework to effectively distinguish positive proposals from background or incomplete ones, thus leading to both accurate recognition and localization. These components are integrated into a unified network that can be efficiently trained in an end-to-end fashion. Additionally, a simple yet effective temporal action proposal scheme, dubbed temporal actionness grouping (TAG), is devised to generate high-quality action proposals. On two challenging benchmarks, THUMOS14 and ActivityNet, our method remarkably outperforms previous state-of-the-art methods, demonstrating superior accuracy and strong adaptivity in handling actions with various temporal structures.
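
The decomposed model in the abstract pairs an action classifier with a completeness classifier so that background and incomplete proposals can both be rejected. The sketch below is one hypothetical way to combine the two outputs for a single proposal; it is not taken from the repository. The function name `score_proposal`, the layout of the logits (background at index 0), and the choice to multiply the two probabilities are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def score_proposal(activity_logits, completeness_logits):
    """Combine the two classifier outputs for one proposal (illustrative).

    activity_logits: K + 1 values; index 0 is background (assumed layout).
    completeness_logits: K values, one per action class (assumed layout).
    Returns K scores: P(class k) * P(proposal is complete for class k).
    Multiplying the two terms is an assumption of this sketch.
    """
    if len(activity_logits) != len(completeness_logits) + 1:
        raise ValueError("expected K + 1 activity logits for K completeness logits")
    class_probs = softmax(activity_logits)
    return [
        class_probs[k + 1] * sigmoid(completeness_logits[k])
        for k in range(len(completeness_logits))
    ]
```

Under these assumptions, a proposal that looks strongly like class 1 but has a very low completeness logit for that class receives a small combined score, which matches the stated goal of separating positive proposals from incomplete ones.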

📜 Similar Papers

In the same crypt: Computer Vision