3D Hand Pose Estimation using Simulation and Partial-Supervision with a Shared Latent Space

July 14, 2018 · Entered Twilight · 🏛 British Machine Vision Conference

"Last commit was 7.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .DS_Store, LICENSE, README.md, Youtube.png, exps, img, src

Authors: Masoud Abdi, Ehsan Abbasnejad, Chee Peng Lim, Saeid Nahavandi
arXiv ID: 1807.05380
Category: cs.CV (Computer Vision)
Citations: 15
Venue: British Machine Vision Conference
Repository: https://github.com/masabdi/LSPS (⭐ 61)
Last Checked: 1 month ago

Abstract

Tremendous amounts of expensive annotated data are a vital ingredient for state-of-the-art 3D hand pose estimation. Synthetic data has therefore become popular, since its annotations are available automatically. However, models trained only with synthetic samples do not generalize to real data, mainly due to the gap between the distributions of synthetic and real data. In this paper, we propose a novel method that seeks to predict the 3D position of the hand using both synthetic and partially-labeled real data. Accordingly, we form a shared latent space between three modalities: synthetic depth image, real depth image, and pose. We demonstrate that by carefully learning the shared latent space, we can find a regression model that is able to generalize to real data. As such, we show that our method produces accurate predictions in both semi-supervised and unsupervised settings. Additionally, the proposed model is capable of generating novel, meaningful, and consistent samples from all three domains. We evaluate our method qualitatively and quantitatively on two highly competitive benchmarks (i.e., NYU and ICVL) and demonstrate its superiority over the state-of-the-art methods. The source code will be made available at https://github.com/masabdi/LSPS.
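
The shared-latent-space idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation and is not taken from the LSPS repository: the linear encoders and decoders, the dimensions, the loss terms, and every name (`SharedLatentModel`, `LATENT_DIM`, and so on) are assumptions chosen for illustration. It only shows how three modalities might map into one latent space, and how a labeled sample could contribute more loss terms than an unlabeled one.

```python
import random

# All sizes are hypothetical toy values, not taken from the paper.
LATENT_DIM = 4   # size of the shared latent code
DEPTH_DIM = 16   # a flattened toy "depth image"
POSE_DIM = 6     # e.g. 2 joints x 3 coordinates


def make_linear(in_dim, out_dim, rng):
    """Return a random out_dim x in_dim weight matrix as nested lists."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(out_dim)]


def apply_linear(weights, x):
    """Multiply a weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def squared_error(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


class SharedLatentModel:
    """Toy model: one encoder per modality, all mapping into one latent space."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.encoders = {
            "synthetic": make_linear(DEPTH_DIM, LATENT_DIM, rng),
            "real": make_linear(DEPTH_DIM, LATENT_DIM, rng),
            "pose": make_linear(POSE_DIM, LATENT_DIM, rng),
        }
        self.depth_decoder = make_linear(LATENT_DIM, DEPTH_DIM, rng)
        self.pose_decoder = make_linear(LATENT_DIM, POSE_DIM, rng)

    def encode(self, modality, x):
        return apply_linear(self.encoders[modality], x)

    def predict_pose(self, modality, depth):
        """Regress a pose from a depth image via the shared latent code."""
        return apply_linear(self.pose_decoder, self.encode(modality, depth))

    def loss(self, depth, modality, pose=None):
        """Reconstruction loss always; pose and alignment terms only if labeled."""
        z = self.encode(modality, depth)
        total = squared_error(apply_linear(self.depth_decoder, z), depth)
        if pose is not None:
            # Supervised term: the latent code should decode to the true pose.
            total += squared_error(apply_linear(self.pose_decoder, z), pose)
            # Alignment term: depth and pose should share the same latent code.
            total += squared_error(z, self.encode("pose", pose))
        return total


model = SharedLatentModel(seed=0)
toy_depth = [0.5] * DEPTH_DIM
toy_pose = [0.1] * POSE_DIM
unlabeled_loss = model.loss(toy_depth, "real")
labeled_loss = model.loss(toy_depth, "real", toy_pose)
```

In this sketch an unlabeled real image still yields a training signal through reconstruction, while a labeled one adds pose regression and latent alignment. A faithful version would replace the linear maps with the networks and objectives described in the paper.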