Data-Efficient Autoregressive Document Retrieval for Fact Verification

November 17, 2022 ยท Declared Dead ยท ๐Ÿ› SUSTAINLP

๐Ÿ’€ CAUSE OF DEATH: 404 Not Found
Code link is broken/dead
Authors James Thorne arXiv ID 2211.09388 Category cs.CL: Computation & Language Cross-listed cs.AI, cs.IR, cs.LG Citations 7 Venue SUSTAINLP Repository https://github.com/j6mes/sustainlp2022-deardr Last Checked 1 month ago
Abstract
Document retrieval is a core component of many knowledge-intensive natural language processing task formulations such as fact verification and question answering. Sources of textual knowledge, such as Wikipedia articles, condition the generation of answers from the models. Recent advances in retrieval use sequence-to-sequence models to incrementally predict the title of the appropriate Wikipedia page given a query. However, this method requires supervision in the form of human annotation to label which Wikipedia pages contain appropriate context. This paper introduces a distant-supervision method that does not require any annotation to train autoregressive retrievers that attain competitive R-Precision and Recall in a zero-shot setting. Furthermore we show that with task-specific supervised fine-tuning, autoregressive retrieval performance for two Wikipedia-based fact verification tasks can approach or even exceed full supervision using less than $1/4$ of the annotated data indicating possible directions for data-efficient autoregressive retrieval.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Computation & Language

๐ŸŒ… ๐ŸŒ… Old Age

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, ... (+6 more)

cs.CL ๐Ÿ› NeurIPS ๐Ÿ“š 166.0K cites 8 years ago

Died the same way โ€” ๐Ÿ’€ 404 Not Found