RobustNLP: A Technique to Defend NLP Models Against Backdoor Attacks

February 18, 2023 · Declared Dead · 🏛 arXiv.org

🦴 CAUSE OF DEATH: Skeleton Repo
Boilerplate only, no real code

Repo contents: Backdoor Learning resources for NLP.docx, activation_clustering_defence-main.zip

Authors: Marwan Omar
arXiv ID: 2302.09420
Category: cs.CR: Cryptography & Security
Citations: 0
Venue: arXiv.org
Repository: https://github.com/marwanomar1/Backdoor-Learning-for-NLP
Last Checked: 1 month ago
Abstract
As machine learning (ML) systems are being increasingly employed in the real world to handle sensitive tasks and make decisions in various fields, the security and privacy of those models have also become increasingly critical. In particular, Deep Neural Networks (DNNs) have been shown to be vulnerable to backdoor attacks, whereby adversaries have access to the training data and the opportunity to manipulate it by inserting carefully crafted samples into the training dataset. Although the NLP community has produced several studies on generating backdoor attacks, proving the vulnerable state of language models, to the best of our knowledge there does not exist any work to combat such attacks. To bridge this gap, we present RobustEncoder: a novel clustering-based technique for detecting and removing backdoor attacks in the text domain. Extensive empirical results demonstrate the effectiveness of our technique in detecting and removing backdoor triggers. Our code is available at https://github.com/marwanomar1/Backdoor-Learning-for-NLP
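The repository itself ships only a .docx and a zipped copy of a third-party activation-clustering defence, so the clustering approach the abstract describes has to be inferred. Below is a minimal sketch of a generic activation-clustering backdoor defence for a text classifier, assuming per-example encoder activations (e.g. [CLS] embeddings) are already extracted; the function name, PCA dimensionality, and 0.35 size threshold are illustrative assumptions, not the authors' settings.

# Sketch of a generic activation-clustering backdoor defence for text
# classifiers, in the spirit of the clustering-based detection the abstract
# describes. The paper's own code is unavailable; names and thresholds here
# are illustrative assumptions, not the authors' implementation.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans


def flag_suspicious_samples(activations: np.ndarray,
                            n_components: int = 10,
                            size_threshold: float = 0.35) -> np.ndarray:
    """Return a boolean mask over samples of ONE class, marking likely poison.

    activations: (n_samples, hidden_dim) array of encoder activations,
                 e.g. the [CLS] embedding of each training sentence.
    size_threshold: if the smaller of the two clusters holds less than this
                    fraction of the class, treat it as the poisoned cluster.
    """
    # Reduce dimensionality so k-means operates on the dominant directions.
    reduced = PCA(n_components=min(n_components, activations.shape[1])) \
        .fit_transform(activations)

    # Split the class into two clusters; poisoned samples tend to form their
    # own tight cluster because the trigger dominates the activation pattern.
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)

    sizes = np.bincount(labels, minlength=2)
    minority = int(np.argmin(sizes))

    # Only flag the minority cluster when it is abnormally small; a clean
    # class usually splits into two clusters of comparable size.
    if sizes[minority] / len(labels) < size_threshold:
        return labels == minority
    return np.zeros(len(labels), dtype=bool)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(0.0, 1.0, size=(900, 64))
    poisoned = rng.normal(4.0, 0.3, size=(100, 64))  # trigger shifts activations
    mask = flag_suspicious_samples(np.vstack([clean, poisoned]))
    print(f"flagged {mask.sum()} of {len(mask)} samples as suspicious")

The intuition is that examples carrying the same trigger collapse into one small, tight cluster inside their target class, so an unusually small minority cluster is a red flag; flagged samples can then be removed and the model retrained on the cleaned data.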
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

📜 Similar Papers

In the same crypt – Cryptography & Security

Died the same way – 🦴 Skeleton Repo

R.I.P. 🦴 Skeleton Repo

Neural Style Transfer: A Review

Yongcheng Jing, Yezhou Yang, ... (+4 more)

cs.CV ๐Ÿ› IEEE TVCG ๐Ÿ“š 828 cites 8 years ago