Efficiently Creating 3D Training Data for Fine Hand Pose Estimation

May 11, 2016 · Entered Twilight · 🏛 Computer Vision and Pattern Recognition

"Last commit was 8.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, LICENSE, README.md, data, src, user_manual.pdf

Authors: Markus Oberweger, Gernot Riegler, Paul Wohlhart, Vincent Lepetit
arXiv ID: 1605.03389
Category: cs.CV (Computer Vision)
Cross-listed: cs.HC
Citations: 99
Venue: Computer Vision and Pattern Recognition
Repository: https://github.com/moberweger/semi-auto-anno (⭐ 24)
Last Checked: 1 month ago

Abstract

While many recent hand pose estimation methods critically rely on a training set of labeled frames, the creation of such a dataset is a challenging task that has been overlooked so far. As a result, existing datasets are limited to a few sequences and individuals, with limited accuracy, which prevents these methods from delivering their full potential. We propose a semi-automated method for efficiently and accurately labeling each frame of a hand depth video with the corresponding 3D locations of the joints: the user is asked to provide only an estimate of the 2D reprojections of the visible joints in some reference frames, which are automatically selected to minimize the labeling work by efficiently optimizing a submodular loss function. We then exploit spatial, temporal, and appearance constraints to retrieve the full 3D poses of the hand over the complete sequence. We show that this data can be used to train a recent state-of-the-art hand pose estimation method, leading to increased accuracy. The code and dataset can be found on our website: https://cvarlab.icg.tugraz.at/projects/hand_detection/
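
The abstract does not spell out the loss function, so the following is only a rough illustration of how reference frames might be chosen with a submodular coverage objective. It is a minimal sketch, assuming each frame has some appearance descriptor and that a frame counts as "covered" when it lies within a fixed distance of a selected reference frame. It uses the standard greedy set-cover heuristic, not necessarily the optimization the authors use. The names `select_reference_frames`, `descriptors`, and `radius` are hypothetical and do not come from the paper or its repository.

```python
from math import dist


def select_reference_frames(descriptors, radius):
    """Greedily pick frames until every frame lies within `radius` of a pick.

    Assumption: `descriptors` is a list of equal-length numeric tuples, one
    per frame, and `radius` is a non-negative distance threshold.
    """
    n = len(descriptors)
    # covers[i] = set of frames whose descriptor is within `radius` of frame i
    covers = [
        {j for j in range(n) if dist(descriptors[i], descriptors[j]) <= radius}
        for i in range(n)
    ]
    uncovered = set(range(n))
    selected = []
    while uncovered:
        # Pick the frame that covers the most still-uncovered frames.
        best = max(range(n), key=lambda i: len(covers[i] & uncovered))
        selected.append(best)
        uncovered -= covers[best]
    return selected


if __name__ == "__main__":
    # Toy 1D "descriptors": two tight clusters and one isolated frame.
    frames = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (9.0,)]
    print(select_reference_frames(frames, radius=0.25))
```

Because the number of covered frames is a monotone submodular function of the selected set, the greedy choice comes with the usual approximation guarantee for set cover, which is why it serves as a plausible stand-in here. A smaller `radius` yields more reference frames to annotate by hand.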