Privacy Side Channels in Machine Learning Systems
September 11, 2023 Β· Declared Dead Β· π USENIX Security Symposium
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Edoardo Debenedetti, Giorgio Severi, Nicholas Carlini, Christopher A. Choquette-Choo, Matthew Jagielski, Milad Nasr, Eric Wallace, Florian Tramèr
arXiv ID
2309.05610
Category
cs.CR: Cryptography & Security
Cross-listed
cs.LG
Citations
50
Venue
USENIX Security Symposium
Last Checked
3 months ago
Abstract
Most current approaches for protecting privacy in machine learning (ML) assume that models exist in a vacuum. Yet, in reality, these models are part of larger systems that include components for training data filtering, output monitoring, and more. In this work, we introduce privacy side channels: attacks that exploit these system-level components to extract private information at far higher rates than is otherwise possible for standalone models. We propose four categories of side channels that span the entire ML lifecycle (training data filtering, input preprocessing, output post-processing, and query filtering) and allow for enhanced membership inference, data extraction, and even novel threats such as extraction of users' test queries. For example, we show that deduplicating training data before applying differentially-private training creates a side-channel that completely invalidates any provable privacy guarantees. We further show that systems which block language models from regenerating training data can be exploited to exfiltrate private keys contained in the training set--even if the model did not memorize these keys. Taken together, our results demonstrate the need for a holistic, end-to-end privacy analysis of machine learning systems.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Cryptography & Security
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Membership Inference Attacks against Machine Learning Models
R.I.P.
π»
Ghosted
The Limitations of Deep Learning in Adversarial Settings
R.I.P.
π»
Ghosted
Practical Black-Box Attacks against Machine Learning
R.I.P.
π»
Ghosted
Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks
R.I.P.
π»
Ghosted
Extracting Training Data from Large Language Models
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Language Models are Few-Shot Learners
R.I.P.
π»
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
π»
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
π»
Ghosted