Inference through innovation processes tested in the authorship attribution task

June 08, 2023 · Declared Dead · 🏛 Communications Physics

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Giulio Tani Raffaelli, Margherita Lalli, Francesca Tria
arXiv ID: 2306.05186
Category: stat.ME
Cross-listed: cs.IT, physics.app-ph, physics.data-an
Citations: 3
Venue: Communications Physics
Last Checked: 2 months ago

Abstract

Urn models for innovation capture fundamental empirical laws shared by several real-world processes. The so-called urn model with triggering includes, as particular cases, the urn representation of the two-parameter Poisson-Dirichlet process and the Dirichlet process, seminal in Bayesian non-parametric inference. In this work, we leverage this connection to introduce a general approach for quantifying closeness between symbolic sequences and test it within the framework of the authorship attribution problem. The method demonstrates high accuracy when compared to other related methods in different scenarios, featuring a substantial gain in computational efficiency and theoretical transparency. Beyond the practical convenience, this work demonstrates how the recently established connection between urn models and non-parametric Bayesian inference can pave the way for designing more efficient inference methods. In particular, the hybrid approach that we propose allows us to relax the exchangeability hypothesis, which can be particularly relevant for systems exhibiting complex correlation patterns and non-stationary dynamics.
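As an illustration of the urn representation the abstract refers to, the sketch below scores a symbolic sequence with the standard predictive rule of the two-parameter Poisson-Dirichlet (Pitman-Yor) process. It is not the authors' method or code: the function name `py_log_prob` and the default values of `alpha` and `theta` are assumptions chosen for illustration, and the sketch covers only the exchangeable case, scoring the pattern of novelties and repetitions while ignoring the base distribution over which new symbol appears. Setting `alpha = 0` gives the Dirichlet process case.

```python
import math
from collections import Counter


def py_log_prob(sequence, alpha=0.5, theta=1.0):
    """Log-probability of the novelty/repetition pattern of `sequence`
    under the two-parameter Poisson-Dirichlet (Pitman-Yor) urn rule.

    Hypothetical helper for illustration only.
    After n draws with K distinct symbols:
      P(new symbol)          = (theta + alpha * K) / (theta + n)
      P(repeat of symbol k)  = (n_k - alpha) / (theta + n)
    """
    if not (0.0 <= alpha < 1.0) or theta <= -alpha:
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")

    counts = Counter()  # n_k: occurrences of each symbol seen so far
    n = 0               # total number of draws so far
    log_p = 0.0
    for symbol in sequence:
        if n == 0:
            pass  # the first draw is always a novelty, with probability 1
        elif symbol in counts:
            log_p += math.log((counts[symbol] - alpha) / (theta + n))
        else:
            log_p += math.log((theta + alpha * len(counts)) / (theta + n))
        counts[symbol] += 1
        n += 1
    return log_p


# Exchangeability: reordering the same symbols leaves the score unchanged.
same = math.isclose(py_log_prob("aab"), py_log_prob("aba"))
```

Because this rule is exchangeable, any reordering of a sequence receives the same score; the hybrid approach described in the abstract is motivated by relaxing exactly that hypothesis.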



📜 Similar Papers

In the same crypt — stat.ME

R.I.P. 👻 Ghosted

Causal inference using invariant prediction: identification and confidence intervals

Jonas Peters, Peter Bühlmann, Nicolai Meinshausen

stat.ME 🏛 J.RSSSB 📚 1.1K cites 11 years ago

R.I.P. 👻 Ghosted

Performance Metrics (Error Measures) in Machine Learning Regression, Forecasting and Prognostics: Properties and Typology

Alexei Botchkarev

stat.ME 🏛 Interdisciplinary Journal of Information, Knowledge, and Management 📚 671 cites 7 years ago

R.I.P. 👻 Ghosted

External Validity: From Do-Calculus to Transportability Across Populations

Judea Pearl, Elias Bareinboim

stat.ME 🏛 Probabilistic and Causal Inference 📚 366 cites 11 years ago

R.I.P. 👻 Ghosted

Least Ambiguous Set-Valued Classifiers with Bounded Error Levels

Mauricio Sadinle, Jing Lei, Larry Wasserman

stat.ME 🏛 J.ASA 📚 318 cites 9 years ago

R.I.P. 👻 Ghosted

Doubly Robust Policy Evaluation and Optimization

Miroslav Dudík, Dumitru Erhan, ... (+2 more)

stat.ME 🏛 arXiv 📚 308 cites 11 years ago

R.I.P. 👻 Ghosted

Comparison of Bayesian predictive methods for model selection

Juho Piironen, Aki Vehtari

stat.ME 🏛 Statistics and computing 📚 304 cites 11 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, ... (+29 more)

cs.CL 🏛 NeurIPS 📚 54.2K cites 5 years ago

R.I.P. 👻 Ghosted

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, Sam Gross, ... (+19 more)

cs.LG 🏛 NeurIPS 📚 49.7K cites 6 years ago

R.I.P. 👻 Ghosted

XGBoost: A Scalable Tree Boosting System

Tianqi Chen, Carlos Guestrin

cs.LG 🏛 KDD 📚 49.2K cites 10 years ago

R.I.P. 👻 Ghosted

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, Christian Szegedy

cs.LG 🏛 ICML 📚 46.0K cites 11 years ago