Third-Party Aligner for Neural Word Alignments

November 08, 2022 · Conference on Empirical Methods in Natural Language Processing

Repo contents: 8w, LICENSE, README.md, scripts, src

Authors: Jinpeng Zhang, Chuanqi Dong, Xiangyu Duan, Yuqi Zhang, Min Zhang
arXiv ID: 2211.04198
Category: cs.CL (Computation and Language)
Citations: 0
Venue: Conference on Empirical Methods in Natural Language Processing
Repository: https://github.com/sdongchuanqi/Third-Party-Supervised-Aligner (6 stars)

Abstract

Word alignment is the task of finding translationally equivalent words between source and target sentences. Previous work has demonstrated that self-training can achieve competitive word alignment results. In this paper, we propose using word alignments generated by a third-party word aligner to supervise neural word alignment training. Specifically, the source word and target word of each word pair aligned by the third-party aligner are trained to be close neighbors in the contextualized embedding space when fine-tuning a pre-trained cross-lingual language model. Experiments on benchmarks covering various language pairs show that our approach can, surprisingly, self-correct the third-party supervision by finding more accurate word alignments and deleting wrong ones, leading to better performance than various third-party word aligners, including the currently best one. When we integrate the supervision from all of these third-party aligners, we achieve state-of-the-art word alignment performance, with alignment error rates that are on average more than two points lower than those of the best third-party aligner. Our code is released at https://github.com/sdongchuanqi/Third-Party-Supervised-Aligner.
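
The abstract reports its gains in alignment error rate (AER). As a reading aid, the sketch below computes AER under its commonly used definition, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the predicted link set, S the sure gold links, and P the possible gold links (with S contained in P). The function name `alignment_error_rate` and the toy link sets are hypothetical illustrations; they are not taken from the authors' repository or data.

```python
def alignment_error_rate(predicted, sure, possible=None):
    """Return AER = 1 - (|A & S| + |A & P|) / (|A| + |S|).

    Each link is a (source_index, target_index) pair.
    Sure links are always treated as a subset of the possible links.
    """
    predicted = set(predicted)
    sure = set(sure)
    possible = sure | set(possible or ())
    if not predicted and not sure:
        # Nothing to predict and nothing predicted: no error by convention here.
        return 0.0
    matched = len(predicted & sure) + len(predicted & possible)
    return 1.0 - matched / (len(predicted) + len(sure))


# Hypothetical toy sentence pair (assumed data, for illustration only).
gold_sure = {(0, 0), (1, 1), (2, 2)}
gold_possible = {(3, 2)}

# A noisy third-party alignment: one wrong link (1, 2), one missing link (1, 1).
third_party = {(0, 0), (1, 2), (2, 2)}

# A corrected alignment: the wrong link is deleted and the missing one is found.
refined = {(0, 0), (1, 1), (2, 2)}

aer_third_party = alignment_error_rate(third_party, gold_sure, gold_possible)
aer_refined = alignment_error_rate(refined, gold_sure, gold_possible)
print(f"third-party AER: {aer_third_party:.3f}")
print(f"refined AER:     {aer_refined:.3f}")
```

In this toy case, deleting the wrong link and recovering the missing one lowers AER, which is the kind of self-correction the abstract describes. AER is usually reported as a percentage, so a "two point" improvement corresponds to a difference of 0.02 on the scale used here.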