WaterBench: Towards Holistic Evaluation of Watermarks for Large Language Models
November 13, 2023 · Entered Twilight · Annual Meeting of the Association for Computational Linguistics
Repo contents: .gitignore, LICENSE, README.md, config, convert_llama_weights_to_hf.py, data, detect.py, detect_human.py, eval.py, generate.py, llama_flash_attn_monkey_patch.py, metrics.py, pictures, plot_image_and_tables.ipynb, pred.py, process, requirements.txt, shell, watermark
Authors
Shangqing Tu, Yuliang Sun, Yushi Bai, Jifan Yu, Lei Hou, Juanzi Li
arXiv ID
2311.07138
Category
cs.CL: Computation & Language
Cross-listed
cs.AI
Citations
22
Venue
Annual Meeting of the Association for Computational Linguistics
Repository
https://github.com/THU-KEG/WaterBench
⭐ 30
Last Checked
1 month ago
Abstract
To mitigate the potential misuse of large language models (LLMs), recent research has developed watermarking algorithms, which restrict the generation process to leave an invisible trace for watermark detection. Due to the two-stage nature of the task, most studies evaluate the generation and detection separately, thereby presenting a challenge in unbiased, thorough, and applicable evaluations. In this paper, we introduce WaterBench, the first comprehensive benchmark for LLM watermarks, in which we design three crucial factors: (1) For benchmarking procedure, to ensure an apples-to-apples comparison, we first adjust each watermarking method's hyper-parameter to reach the same watermarking strength, then jointly evaluate their generation and detection performance. (2) For task selection, we diversify the input and output length to form a five-category taxonomy, covering $9$ tasks. (3) For evaluation metric, we adopt the GPT4-Judge for automatically evaluating the decline of instruction-following abilities after watermarking. We evaluate $4$ open-source watermarks on $2$ LLMs under $2$ watermarking strengths and observe the common struggles for current methods on maintaining the generation quality. The code and data are available at https://github.com/THU-KEG/WaterBench.
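The benchmarking procedure described in the abstract has two steps: each watermark's hyper-parameter is first tuned until detection reaches a shared target strength, and detection plus generation quality are then scored jointly on the same watermarked outputs. A minimal sketch of that flow is below; all names (generate, detect, judge_quality, the delta grid) are hypothetical placeholders for illustration, not the repository's actual API.

```python
# Sketch of the two-step WaterBench idea: (1) calibrate each watermark's
# hyper-parameter until its detection true-positive rate reaches a shared
# target "watermarking strength", (2) evaluate detection and generation
# quality on the same outputs. All callables here are hypothetical stand-ins.

from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class EvalResult:
    delta: float                # calibrated hyper-parameter (e.g. a logit bias)
    true_positive_rate: float   # share of watermarked outputs flagged by the detector
    quality_score: float        # e.g. a GPT4-Judge-style score in [0, 1]


def calibrate_strength(
    generate: Callable[[str, float], str],  # prompt, delta -> watermarked text
    detect: Callable[[str], bool],          # text -> "watermark detected?"
    prompts: Sequence[str],
    target_tpr: float = 0.95,
    deltas: Sequence[float] = (0.5, 1.0, 2.0, 4.0, 8.0),
) -> float:
    """Return the smallest delta whose detection TPR reaches the target strength."""
    for delta in deltas:
        outputs = [generate(p, delta) for p in prompts]
        tpr = sum(detect(o) for o in outputs) / len(outputs)
        if tpr >= target_tpr:
            return delta
    return deltas[-1]  # fall back to the strongest setting tried


def joint_evaluate(
    generate: Callable[[str, float], str],
    detect: Callable[[str], bool],
    judge_quality: Callable[[str, str], float],  # prompt, output -> quality score
    prompts: Sequence[str],
    delta: float,
) -> EvalResult:
    """Score detection and generation quality on the same watermarked outputs."""
    outputs = [generate(p, delta) for p in prompts]
    tpr = sum(detect(o) for o in outputs) / len(outputs)
    quality = sum(judge_quality(p, o) for p, o in zip(prompts, outputs)) / len(outputs)
    return EvalResult(delta=delta, true_positive_rate=tpr, quality_score=quality)
```

With concrete generation, detection, and judging functions plugged in for a given watermark and LLM, calling calibrate_strength and then joint_evaluate mirrors the "equal strength first, joint evaluation second" comparison the benchmark is built around.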
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
Similar Papers
In the same crypt · Computation & Language
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding · R.I.P. · 👻 Ghosted
Language Models are Few-Shot Learners · R.I.P. · 👻 Ghosted
RoBERTa: A Robustly Optimized BERT Pretraining Approach · R.I.P. · 👻 Ghosted
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension · R.I.P. · 👻 Ghosted