Detecting out-of-distribution text using topological features of transformer-based language models
November 22, 2023 · Declared Dead · AISafety@IJCAI
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Andres Pollano, Anupam Chaudhuri, Anj Simmons
arXiv ID
2311.13102
Category
cs.CL: Computation & Language
Cross-listed
cs.LG, math.AT
Citations
2
Venue
AISafety@IJCAI
Last Checked
3 months ago
Abstract
To safeguard machine learning systems that operate on textual data against out-of-distribution (OOD) inputs that could cause unpredictable behaviour, we explore the use of topological features of self-attention maps from transformer-based language models to detect when input text is out of distribution. Self-attention forms the core of transformer-based language models, dynamically assigning vectors to words based on context; in theory, our methodology is therefore applicable to any transformer-based language model with multi-head self-attention. We evaluate our approach on BERT and compare it to a traditional OOD approach using CLS embeddings. Our results show that our approach outperforms CLS embeddings in distinguishing in-distribution samples from far-out-of-domain samples, but struggles with near or same-domain datasets.
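The abstract does not say which topological features are extracted, so the following is only an illustrative sketch of one common choice, not the authors' pipeline. It assumes a single attention map is available as a square NumPy array, treats tokens as graph vertices, converts attention weights to distances, and computes the 0-dimensional persistence death times (equivalently, the minimum-spanning-tree edge weights). The function names `h0_death_times` and `topological_features`, the symmetrisation by element-wise maximum, and the three summary statistics are all assumptions made for illustration.

```python
import numpy as np


def h0_death_times(attention: np.ndarray) -> np.ndarray:
    """Death times of 0-dimensional persistence for one attention map.

    Assumption: attention is a square (tokens x tokens) array with
    entries in [0, 1]. Strong attention is mapped to a short distance.
    """
    # Symmetrise (illustrative choice) and convert weights to distances.
    sym = np.maximum(attention, attention.T)
    dist = 1.0 - sym
    n = dist.shape[0]

    # All undirected edges, processed by increasing distance (Kruskal).
    rows, cols = np.triu_indices(n, k=1)
    order = np.argsort(dist[rows, cols], kind="stable")

    parent = list(range(n))

    def find(x: int) -> int:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    deaths = []
    for e in order:
        i, j = int(rows[e]), int(cols[e])
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge: one of them "dies" at this distance.
            parent[root_i] = root_j
            deaths.append(dist[i, j])
    return np.array(deaths)


def topological_features(attention: np.ndarray) -> np.ndarray:
    """Summarise the H0 barcode as a small fixed-length feature vector."""
    deaths = h0_death_times(attention)
    return np.array([deaths.sum(), deaths.mean(), deaths.max()])
```

In a setup like the one the abstract describes, one such vector per attention head could be concatenated and passed to any standard outlier detector; that downstream step is likewise an assumption here, since the abstract does not name the detector.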
Similar Papers
In the same crypt: Computation & Language
Old Age
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Old Age
XLNet: Generalized Autoregressive Pretraining for Language Understanding
The Ethereal
Effective Approaches to Attention-based Neural Machine Translation
Old Age
A large annotated corpus for learning natural language inference
Old Age
HellaSwag: Can a Machine Really Finish Your Sentence?
Died the same way: Ghosted
R.I.P.
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
Ghosted