Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition

September 05, 2018 · Entered Twilight · 🏛 European Conference on Computer Vision

"No code URL or promise found in abstract"
"Code repo scraped from project page (backfill)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, LICENSE, README.md, TODO.txt, experiments, faiss_kmeans.py, main.py, multi_api.py, preprocess.py, run_baselines.sh, single_api.py, source, test_api.py, test_multi.py, tools

Authors: Xiaohang Zhan, Ziwei Liu, Junjie Yan, Dahua Lin, Chen Change Loy
arXiv ID: 1809.01407
Category: cs.CV (Computer Vision)
Cross-listed: cs.LG
Citations: 79
Venue: European Conference on Computer Vision
Repository: https://github.com/XiaohangZhan/cdp (⭐ 452)
Last Checked: 5 days ago

Abstract

Face recognition has witnessed great progress in recent years, mainly attributed to the high-capacity models designed and the abundant labeled data collected. However, it becomes more and more prohibitive to scale up the current million-level identity annotations. In this work, we show that unlabeled face data can be as effective as the labeled ones. Here, we consider a setting closely mimicking the real-world scenario, where the unlabeled data are collected from unconstrained environments and their identities are exclusive from the labeled ones. Our main insight is that although the class information is not available, we can still faithfully approximate these semantic relationships by constructing a relational graph in a bottom-up manner. We propose Consensus-Driven Propagation (CDP) to tackle this challenging problem with two modules, the "committee" and the "mediator", which select positive face pairs robustly by carefully aggregating multi-view information. Extensive experiments validate the effectiveness of both modules to discard outliers and mine hard positives. With CDP, we achieve a compelling accuracy of 78.18% on the MegaFace identification challenge by using only 9% of the labels, compared to 61.78% when no unlabeled data are used and 78.52% when all labels are employed.
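
The abstract does not say how the "committee" and "mediator" work internally, so the following is only a loose illustration of selecting positive pairs by multi-view agreement, not the authors' implementation and not the API of the linked repository. It assumes each "view" is a list of feature vectors from a different model, builds a k-nearest-neighbour graph per view, and keeps a candidate pair from the base view only if enough committee views also link it. A plain vote threshold stands in for the mediator here; every name (`knn_pairs`, `consensus_pairs`, `min_votes`) and all toy data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def knn_pairs(features, k):
    """Unordered pairs (i, j), i < j, linked by a k-nearest-neighbour edge."""
    pairs = set()
    for i, fi in enumerate(features):
        scored = [(cosine(fi, fj), j) for j, fj in enumerate(features) if j != i]
        scored.sort(reverse=True)  # most similar first
        for _, j in scored[:k]:
            pairs.add((min(i, j), max(i, j)))
    return pairs


def consensus_pairs(base_view, committee_views, k, min_votes):
    """Keep base-view candidate pairs that at least `min_votes` committee views also link.

    Assumption: a simple vote count replaces the paper's "mediator".
    """
    candidates = knn_pairs(base_view, k)
    committee_graphs = [knn_pairs(view, k) for view in committee_views]
    selected = set()
    for pair in candidates:
        votes = sum(pair in graph for graph in committee_graphs)
        if votes >= min_votes:
            selected.add(pair)
    return selected


# Toy data (made up): five samples, three views of 2-D features.
# Sample 4 looks close to sample 1 in the base view only.
base = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9), (0.8, 0.6)]
view_a = [(1.0, 0.1), (1.0, 0.0), (0.1, 1.0), (0.0, 1.0), (0.6, 0.8)]
view_b = [(0.9, 0.2), (1.0, 0.1), (0.2, 0.9), (0.1, 1.0), (0.5, 0.9)]

result = consensus_pairs(base, [view_a, view_b], k=1, min_votes=2)
print(sorted(result))  # [(0, 1), (2, 3)]
```

In this toy run the base view proposes the pair (1, 4), but neither committee view links those two samples, so the pair is dropped while (0, 1) and (2, 3) survive. This mirrors, in a very reduced form, the idea of discarding outliers through agreement across views.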