Egocentric Object Manipulation Graphs
June 05, 2020 · Declared Dead · 🏛 arXiv.org
"Paper promises code 'coming soon'"
Evidence collected by the PWNC Scanner
Authors
Eadom Dessalene, Michael Maynord, Chinmaya Devaraj, Cornelia Fermuller, Yiannis Aloimonos
arXiv ID
2006.03201
Category
cs.CV: Computer Vision
Cross-listed
cs.AI,
cs.RO
Citations
20
Venue
arXiv.org
Last Checked
1 month ago
Abstract
We introduce Egocentric Object Manipulation Graphs (Ego-OMG) - a novel representation for activity modeling and anticipation of near future actions integrating three components: 1) semantic temporal structure of activities, 2) short-term dynamics, and 3) representations for appearance. Semantic temporal structure is modeled through a graph, embedded through a Graph Convolutional Network, whose states model characteristics of and relations between hands and objects. These state representations derive from all three levels of abstraction, and span segments delimited by the making and breaking of hand-object contact. Short-term dynamics are modeled in two ways: A) through 3D convolutions, and B) through anticipating the spatiotemporal end points of hand trajectories, where hands come into contact with objects. Appearance is modeled through deep spatiotemporal features produced through existing methods. We note that in Ego-OMG it is simple to swap these appearance features, and thus Ego-OMG is complementary to most existing action anticipation methods. We evaluate Ego-OMG on the EPIC Kitchens Action Anticipation Challenge. The consistency of the egocentric perspective of EPIC Kitchens allows for the utilization of the hand-centric cues upon which Ego-OMG relies. We demonstrate state-of-the-art performance, outranking all other previous published methods by large margins and ranking first on the unseen test set and second on the seen test set of the EPIC Kitchens Action Anticipation Challenge. We attribute the success of Ego-OMG to the modeling of semantic structure captured over long timespans. We evaluate the design choices made through several ablation studies. Code will be released upon acceptance
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
📜 Similar Papers
In the same crypt — Computer Vision
🌅
🌅
Old Age
🌅
🌅
Old Age
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
R.I.P.
👻
Ghosted
You Only Look Once: Unified, Real-Time Object Detection
🌅
🌅
Old Age
SSD: Single Shot MultiBox Detector
🌅
🌅
Old Age
Squeeze-and-Excitation Networks
R.I.P.
👻
Ghosted
Rethinking the Inception Architecture for Computer Vision
Died the same way — ⏳ Coming Soon™
R.I.P.
⏳
Coming Soon™
Exploring Simple Siamese Representation Learning
R.I.P.
⏳
Coming Soon™
An Analysis of Scale Invariance in Object Detection - SNIP
R.I.P.
⏳
Coming Soon™
Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection
R.I.P.
⏳
Coming Soon™