The Mystery of In-Context Learning: A Comprehensive Survey on Interpretation and Analysis
November 01, 2023 ยท Declared Dead ยท ๐ Conference on Empirical Methods in Natural Language Processing
Repo contents: README.md
Authors
Yuxiang Zhou, Jiazheng Li, Yanzheng Xiang, Hanqi Yan, Lin Gui, Yulan He
arXiv ID
2311.00237
Category
cs.CL: Computation & Language
Citations
36
Venue
Conference on Empirical Methods in Natural Language Processing
Repository
https://github.com/zyxnlp/ICL-Interpretation-Analysis-Resources
โญ 15
Last Checked
1 month ago
Abstract
Understanding in-context learning (ICL) capability that enables large language models (LLMs) to excel in proficiency through demonstration examples is of utmost importance. This importance stems not only from the better utilization of this capability across various tasks, but also from the proactive identification and mitigation of potential risks, including concerns regarding truthfulness, bias, and toxicity, that may arise alongside the capability. In this paper, we present a thorough survey on the interpretation and analysis of in-context learning. First, we provide a concise introduction to the background and definition of in-context learning. Then, we give an overview of advancements from two perspectives: 1) a theoretical perspective, emphasizing studies on mechanistic interpretability and delving into the mathematical foundations behind ICL; and 2) an empirical perspective, concerning studies that empirically analyze factors associated with ICL. We conclude by highlighting the challenges encountered and suggesting potential avenues for future research. We believe that our work establishes the basis for further exploration into the interpretation of in-context learning. Additionally, we have created a repository containing the resources referenced in our survey.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Computation & Language
๐
๐
Old Age
๐
๐
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
RoBERTa: A Robustly Optimized BERT Pretraining Approach
R.I.P.
๐ป
Ghosted
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
R.I.P.
๐ป
Ghosted
Deep contextualized word representations
Died the same way โ ๐ Death by README
R.I.P.
๐
Death by README
Momentum Contrast for Unsupervised Visual Representation Learning
R.I.P.
๐
Death by README
LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model
R.I.P.
๐
Death by README
Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach
R.I.P.
๐
Death by README