iBERT: Interpretable Embeddings via Sense Decomposition
October 10, 2025 ยท Declared Dead ยท ๐ the Main Proceedings of EACL 2026
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Vishal Anand, Milad Alshomary, Kathleen McKeown
arXiv ID
2510.09882
Category
cs.CL: Computation & Language
Citations
0
Venue
the Main Proceedings of EACL 2026
Last Checked
3 months ago
Abstract
We present iBERT (interpretable-BERT), an encoder to produce inherently interpretable and controllable embeddings - designed to modularize and expose the discriminative cues present in language, such as semantic or stylistic structure. Each input token is represented as a sparse, non-negative mixture over k context-independent sense vectors, which can be pooled into sentence embeddings or used directly at the token level. This enables modular control over representation, before any decoding or downstream use. To demonstrate our model's interpretability, we evaluate it on a suite of style-focused tasks. On the STEL benchmark, it improves style representation effectiveness by ~8 points over SBERT-style baselines, while maintaining competitive performance on authorship verification. Because each embedding is a structured composition of interpretable senses, we highlight how specific style attributes get assigned to specific sense vectors. While our experiments center on style, iBERT is not limited to stylistic modeling. Its structural modularity is designed to interpretably decompose whichever discriminative signals are present in the data - enabling generalization even when supervision blends semantic or stylistic factors.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Computation & Language
๐
๐
Old Age
๐
๐
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
RoBERTa: A Robustly Optimized BERT Pretraining Approach
R.I.P.
๐ป
Ghosted
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
R.I.P.
๐ป
Ghosted
Deep contextualized word representations
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
R.I.P.
๐ป
Ghosted