NewB: 200,000+ Sentences for Political Bias Detection

June 04, 2020 · Declared Dead · 🏛 arXiv.org

Repo contents: README.md, conservative.txt, liberal.txt, test.txt, train_orig.txt

Authors: Jerry Wei
arXiv ID: 2006.03051
Category: cs.CL (Computation & Language)
Citations: 5
Venue: arXiv.org
Repository: https://github.com/JerryWeiAI/NewB (⭐ 17)
Last Checked: 1 month ago

Abstract

We present the Newspaper Bias Dataset (NewB), a text corpus of more than 200,000 sentences from eleven news sources regarding Donald Trump. While previous datasets have labeled sentences as either liberal or conservative, NewB covers the political views of eleven popular media sources, capturing more nuanced political viewpoints than a traditional binary classification system does. We train two state-of-the-art deep learning models to predict the news source of a given sentence from eleven newspapers and find that a recurrent neural network achieved top-1, top-3, and top-5 accuracies of 33.3%, 61.4%, and 77.6%, respectively, significantly outperforming a baseline logistic regression model's accuracies of 18.3%, 42.6%, and 60.8%. Using the news source label of sentences, we analyze the top n-grams with our model to gain meaningful insight into the portrayal of Trump by media sources. We hope that the public release of our dataset will encourage further research in using natural language processing to analyze more complex political biases. Our dataset is posted at https://github.com/JerryWeiAI/NewB.
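The top-1, top-3, and top-5 figures in the abstract are top-k accuracies: a prediction counts as correct when the true news source is among the k highest-scoring classes. The sketch below illustrates that metric only. It is not the author's evaluation code; the function name `top_k_accuracy` and the three-class toy scores are hypothetical, chosen for illustration (the paper itself uses eleven sources).

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical toy data: 3 sentences, 3 candidate sources (not taken from NewB).
toy_scores = [
    [0.6, 0.3, 0.1],  # true label 0: already correct at k=1
    [0.2, 0.5, 0.3],  # true label 2: correct only from k=2
    [0.1, 0.2, 0.7],  # true label 0: correct only at k=3
]
toy_labels = [0, 2, 0]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(toy_scores, toy_labels, k):.3f}")
```

For context, if the eleven sources were equally represented (an assumption, not something the abstract states), uniform random guessing would score about 9.1%, 27.3%, and 45.5% for top-1, top-3, and top-5.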

📜 Similar Papers

In the same crypt — Computation & Language

🌅 🌅 Old Age

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, ... (+6 more)

cs.CL 🏛 NeurIPS 📚 166.0K cites 8 years ago

🌅 🌅 Old Age

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, Ming-Wei Chang, ... (+2 more)

cs.CL 🏛 NAACL 📚 110.2K cites 7 years ago

R.I.P. 👻 Ghosted

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, ... (+29 more)

cs.CL 🏛 NeurIPS 📚 54.2K cites 5 years ago

R.I.P. 👻 Ghosted

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Yinhan Liu, Myle Ott, ... (+8 more)

cs.CL 🏛 arXiv 📚 28.4K cites 6 years ago

R.I.P. 👻 Ghosted

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

Mike Lewis, Yinhan Liu, ... (+6 more)

cs.CL 🏛 ACL 📚 12.3K cites 6 years ago

R.I.P. 👻 Ghosted

Deep contextualized word representations

Matthew E. Peters, Mark Neumann, ... (+5 more)

cs.CL 🏛 NAACL 📚 12.0K cites 8 years ago

Died the same way — 🦴 Skeleton Repo

R.I.P. 🦴 Skeleton Repo

EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification

Patrick Helber, Benjamin Bischke, ... (+2 more)

cs.CV 🏛 J.STAEORS 📚 2.4K cites 8 years ago

R.I.P. 🦴 Skeleton Repo

Deep Learning for 3D Point Clouds: A Survey

Yulan Guo, Hanyun Wang, ... (+4 more)

cs.CV 🏛 IEEE TPAMI 📚 2.1K cites 6 years ago

R.I.P. 🦴 Skeleton Repo

Adversarial Examples: Attacks and Defenses for Deep Learning

Xiaoyong Yuan, Pan He, ... (+2 more)

cs.LG 🏛 IEEE TNNLS 📚 1.8K cites 8 years ago

R.I.P. 🦴 Skeleton Repo

Neural Style Transfer: A Review

Yongcheng Jing, Yezhou Yang, ... (+4 more)

cs.CV 🏛 IEEE TVCG 📚 828 cites 8 years ago