Localization of Fake News Detection via Multitask Transfer Learning

October 21, 2019 · Declared Dead · 🏛 the LREC 2020 Proceedings

Repo contents: LICENSE, README.md

Authors: Jan Christian Blaise Cruz, Julianne Agatha Tan, Charibeth Cheng
arXiv ID: 1910.09295
Category: cs.CL (Computation & Language)
Citations: 0
Venue: the LREC 2020 Proceedings
Repository: https://github.com/jcblaisecruz02/Tagalog-fake-news (⭐ 17)
Last Checked: 1 month ago

Abstract

The use of the internet as a fast medium for spreading fake news reinforces the need for computational tools that combat it. Techniques that train fake news classifiers exist, but they all assume an abundance of resources, including large labeled datasets and expert-curated corpora, which low-resource languages may not have. In this work, we make two main contributions. First, we alleviate resource scarcity by constructing the first expertly curated benchmark dataset for fake news detection in Filipino, which we call "Fake News Filipino." Second, we benchmark Transfer Learning (TL) techniques and show that they can be used to train robust fake news classifiers from little data, achieving 91% accuracy on our fake news dataset and reducing the error by 14% compared to established few-shot baselines. Furthermore, lifting ideas from multitask learning, we show that augmenting transformer-based transfer techniques with auxiliary language modeling losses improves their performance by adapting to writing style. With this approach, we improve TL performance by 4-6%, achieving an accuracy of 96% on our best model. Lastly, we show that our method generalizes well to different types of news articles, including political news, entertainment news, and opinion articles.
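
The multitask idea in the abstract, a classification objective augmented with an auxiliary language modeling loss, can be illustrated with a small sketch. The abstract does not say how the two losses are combined, so the weighted sum below is an assumption. The function names and the `lm_weight` coefficient are hypothetical and are not taken from the paper or its repository.

```python
import math


def softmax_cross_entropy(logits, target):
    """Cross-entropy of a softmax over `logits` against an integer target index."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target]


def multitask_loss(cls_logits, cls_label, lm_logits, lm_targets, lm_weight=0.5):
    """Classification loss plus a weighted auxiliary language modeling loss.

    Assumption: the two terms are combined as a weighted sum. `lm_weight`
    is a hypothetical mixing coefficient, not a value reported in the paper.
    """
    if not lm_targets:
        raise ValueError("lm_targets must contain at least one token index")
    cls_loss = softmax_cross_entropy(cls_logits, cls_label)
    # Average the per-token language modeling loss over the sequence.
    lm_loss = sum(
        softmax_cross_entropy(token_logits, token_target)
        for token_logits, token_target in zip(lm_logits, lm_targets)
    ) / len(lm_targets)
    return cls_loss + lm_weight * lm_loss


# Toy usage: two classes (real vs. fake) and a four-word vocabulary.
loss = multitask_loss(
    cls_logits=[0.0, 0.0],
    cls_label=0,
    lm_logits=[[0.0, 0.0, 0.0, 0.0]],
    lm_targets=[1],
)
```

In an actual fine-tuning setup, both sets of logits would come from the same transformer, so the auxiliary term pushes the shared weights toward the writing style of the target corpus while the classifier is trained.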