StolenEncoder: Stealing Pre-trained Encoders in Self-supervised Learning
January 15, 2022 ยท Declared Dead ยท ๐ Conference on Computer and Communications Security
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Yupei Liu, Jinyuan Jia, Hongbin Liu, Neil Zhenqiang Gong
arXiv ID
2201.05889
Category
cs.CR: Cryptography & Security
Cross-listed
cs.LG
Citations
34
Venue
Conference on Computer and Communications Security
Last Checked
3 months ago
Abstract
Pre-trained encoders are general-purpose feature extractors that can be used for many downstream tasks. Recent progress in self-supervised learning can pre-train highly effective encoders using a large volume of unlabeled data, leading to the emerging encoder as a service (EaaS). A pre-trained encoder may be deemed confidential because its training requires lots of data and computation resources as well as its public release may facilitate misuse of AI, e.g., for deepfakes generation. In this paper, we propose the first attack called StolenEncoder to steal pre-trained image encoders. We evaluate StolenEncoder on multiple target encoders pre-trained by ourselves and three real-world target encoders including the ImageNet encoder pre-trained by Google, CLIP encoder pre-trained by OpenAI, and Clarifai's General Embedding encoder deployed as a paid EaaS. Our results show that our stolen encoders have similar functionality with the target encoders. In particular, the downstream classifiers built upon a target encoder and a stolen one have similar accuracy. Moreover, stealing a target encoder using StolenEncoder requires much less data and computation resources than pre-training it from scratch. We also explore three defenses that perturb feature vectors produced by a target encoder. Our results show these defenses are not enough to mitigate StolenEncoder.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Cryptography & Security
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
Membership Inference Attacks against Machine Learning Models
R.I.P.
๐ป
Ghosted
The Limitations of Deep Learning in Adversarial Settings
R.I.P.
๐ป
Ghosted
Practical Black-Box Attacks against Machine Learning
R.I.P.
๐ป
Ghosted
Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks
R.I.P.
๐ป
Ghosted
Extracting Training Data from Large Language Models
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted