When accurate prediction models yield harmful self-fulfilling prophecies

December 02, 2023 · Declared Dead · 🏛 Patterns

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Wouter A. C. van Amsterdam, Nan van Geloven, Jesse H. Krijthe, Rajesh Ranganath, Giovanni Ciná
arXiv ID: 2312.01210
Category: stat.ME
Cross-listed: cs.LG, stat.ML
Citations: 18
Venue: Patterns
Last Checked: 1 month ago

Abstract

Prediction models are popular in medical research and practice. By predicting an outcome of interest for specific patients, these models may help inform difficult treatment decisions, and are often hailed as the poster children for personalized, data-driven healthcare. We show, however, that using prediction models for decision making can lead to harmful decisions, even when the predictions exhibit good discrimination after deployment. These models are harmful self-fulfilling prophecies: their deployment harms a group of patients, but the worse outcomes of these patients do not invalidate the predictive power of the model. Our main result is a formal characterization of a set of such prediction models. Next, we show that models that are well calibrated before and after deployment are useless for decision making, as they make no change in the data distribution. These results point to the need to revise standard practices for validation, deployment and evaluation of prediction models that are used in medical decisions.
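The mechanism described in the abstract can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two patient groups, the survival probabilities in `SURVIVAL`, the group shares in `WEIGHT`, and the rule that withholds treatment when predicted survival falls below `THRESHOLD`. The AUC is computed exactly from the group probabilities rather than from sampled patients.

```python
from itertools import product

# Hypothetical toy numbers (assumptions for illustration, not from the paper).
# x: (P(survive | treated), P(survive | untreated))
SURVIVAL = {0: (0.9, 0.8), 1: (0.4, 0.2)}
WEIGHT = {0: 0.5, 1: 0.5}   # share of patients in each group
THRESHOLD = 0.5             # assumed policy: treat only if predicted survival >= 0.5


def auc(groups):
    """Exact AUC for groups given as (weight, score, event_probability).

    AUC = P(score of a survivor > score of a non-survivor) + 0.5 * P(tie).
    """
    pos = [(w * p, s) for w, s, p in groups]          # mass of survivors per group
    neg = [(w * (1.0 - p), s) for w, s, p in groups]  # mass of non-survivors per group
    total = 0.0
    for (mass_pos, score_pos), (mass_neg, score_neg) in product(pos, neg):
        if score_pos > score_neg:
            total += mass_pos * mass_neg
        elif score_pos == score_neg:
            total += 0.5 * mass_pos * mass_neg
    return total / (sum(m for m, _ in pos) * sum(m for m, _ in neg))


# Before deployment everyone is treated, so a model fit to historical data
# learns P(survive | x, treated) and is calibrated on that data.
predicted = {x: SURVIVAL[x][0] for x in SURVIVAL}
survival_before = dict(predicted)

# After deployment, the assumed policy withholds treatment from low-score patients.
treated_after = {x: predicted[x] >= THRESHOLD for x in SURVIVAL}
survival_after = {
    x: SURVIVAL[x][0] if treated_after[x] else SURVIVAL[x][1] for x in SURVIVAL
}

auc_before = auc([(WEIGHT[x], predicted[x], survival_before[x]) for x in SURVIVAL])
auc_after = auc([(WEIGHT[x], predicted[x], survival_after[x]) for x in SURVIVAL])

# Predicted minus observed survival per group after deployment.
calibration_gap_after = {x: predicted[x] - survival_after[x] for x in SURVIVAL}

print(f"AUC before deployment: {auc_before:.3f}")
print(f"AUC after deployment:  {auc_after:.3f}")
print(f"Survival in group x=1: {survival_before[1]:.2f} -> {survival_after[1]:.2f}")
print(f"Calibration gap after deployment: {calibration_gap_after}")
```

Under these assumed numbers, discrimination improves after deployment even though the group with `x = 1` now fares worse, which is the self-fulfilling pattern the abstract warns about. The nonzero calibration gap for that group also reflects the abstract's second point: a model that stayed calibrated both before and after deployment would imply that deployment changed nothing in the outcome distribution.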

📜 Similar Papers

In the same crypt — stat.ME

R.I.P. 👻 Ghosted

Causal inference using invariant prediction: identification and confidence intervals

Jonas Peters, Peter Bühlmann, Nicolai Meinshausen

stat.ME 🏛 J.RSSSB 📚 1.1K cites 11 years ago

R.I.P. 👻 Ghosted

Performance Metrics (Error Measures) in Machine Learning Regression, Forecasting and Prognostics: Properties and Typology

Alexei Botchkarev

stat.ME 🏛 Interdisciplinary Journal of Information, Knowledge, and Management 📚 671 cites 7 years ago

R.I.P. 👻 Ghosted

External Validity: From Do-Calculus to Transportability Across Populations

Judea Pearl, Elias Bareinboim

stat.ME 🏛 Probabilistic and Causal Inference 📚 366 cites 11 years ago

R.I.P. 👻 Ghosted

Least Ambiguous Set-Valued Classifiers with Bounded Error Levels

Mauricio Sadinle, Jing Lei, Larry Wasserman

stat.ME 🏛 J.ASA 📚 318 cites 9 years ago

R.I.P. 👻 Ghosted

Doubly Robust Policy Evaluation and Optimization

Miroslav Dudík, Dumitru Erhan, ... (+2 more)

stat.ME 🏛 arXiv 📚 308 cites 11 years ago

R.I.P. 👻 Ghosted

Comparison of Bayesian predictive methods for model selection

Juho Piironen, Aki Vehtari

stat.ME 🏛 Statistics and computing 📚 304 cites 11 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, ... (+29 more)

cs.CL 🏛 NeurIPS 📚 54.2K cites 5 years ago

R.I.P. 👻 Ghosted

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, Sam Gross, ... (+19 more)

cs.LG 🏛 NeurIPS 📚 49.7K cites 6 years ago

R.I.P. 👻 Ghosted

XGBoost: A Scalable Tree Boosting System

Tianqi Chen, Carlos Guestrin

cs.LG 🏛 KDD 📚 49.2K cites 10 years ago

R.I.P. 👻 Ghosted

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, Christian Szegedy

cs.LG 🏛 ICML 📚 46.0K cites 11 years ago