False Discoveries Occur Early on the Lasso Path
November 05, 2015 · Declared Dead · arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Weijie Su, Malgorzata Bogdan, Emmanuel Candes
arXiv ID
1511.01957
Category
math.ST
Cross-listed
cs.IT, stat.ML
Citations
199
Venue
arXiv.org
Last Checked
1 month ago
Abstract
In regression settings where explanatory variables have very low correlations and there are relatively few effects, each of large magnitude, we expect the Lasso to find the important variables with few errors, if any. This paper shows that in a regime of linear sparsity (meaning that the fraction of variables with a non-vanishing effect tends to a constant, however small) this cannot really be the case, even when the design variables are stochastically independent. We demonstrate that true features and null features are always interspersed on the Lasso path, and that this phenomenon occurs no matter how strong the effect sizes are. We derive a sharp asymptotic trade-off between false and true positive rates or, equivalently, between measures of type I and type II errors along the Lasso path. This trade-off states that if we ever want to achieve a type II error (false negative rate) under a critical value, then anywhere on the Lasso path the type I error (false positive rate) will need to exceed a given threshold so that we can never have both errors at a low level at the same time. Our analysis uses tools from approximate message passing (AMP) theory as well as novel elements to deal with a possibly adaptive selection of the Lasso regularizing parameter.
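The trade-off described in the abstract can be explored with a small simulation. The sketch below is illustrative only and is not the authors' code: the function names `lasso_cd` and `simulate_path`, the problem sizes, the effect size, and the regularization grid are all assumptions chosen for a quick run. It draws an independent Gaussian design, plants `k` equal effects, fits the Lasso on a decreasing grid of penalties by plain coordinate descent, and records the true positive proportion (true selections divided by `k`) and the false discovery proportion (false selections divided by the number of selections) at each grid point.

```python
import numpy as np


def lasso_cd(X, y, lam, beta, n_sweeps=100, tol=1e-6):
    """Cyclic coordinate descent for 0.5*||y - X b||^2 + lam*||b||_1.

    `beta` is updated in place, so passing the previous solution
    warm-starts the solver along a decreasing grid of lam values.
    """
    col_sq = (X ** 2).sum(axis=0)
    resid = y - X @ beta
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(X.shape[1]):
            old = beta[j]
            # Correlation of column j with the partial residual (j excluded).
            rho = X[:, j] @ resid + col_sq[j] * old
            # Soft-thresholding update for coordinate j.
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            if new != old:
                resid -= X[:, j] * (new - old)
                beta[j] = new
                max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break
    return beta


def simulate_path(n=200, p=200, k=40, effect=4.0, sigma=1.0, n_lams=25, seed=0):
    """Return (lams, tpp, fdp) along a Lasso path on an independent Gaussian design.

    All default values are assumptions for illustration, not settings
    taken from the paper.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(scale=1.0 / np.sqrt(n), size=(n, p))  # i.i.d. N(0, 1/n) entries
    beta_true = np.zeros(p)
    beta_true[:k] = effect  # k true effects, the remaining p - k are null
    y = X @ beta_true + sigma * rng.normal(size=n)

    lam_max = np.max(np.abs(X.T @ y))  # at or above this penalty the model is empty
    lams = lam_max * np.geomspace(1.0, 0.05, n_lams)
    is_true = beta_true != 0

    beta = np.zeros(p)
    tpp, fdp = [], []
    for lam in lams:
        beta = lasso_cd(X, y, lam, beta)
        selected = beta != 0
        true_pos = np.sum(selected & is_true)
        false_pos = np.sum(selected & ~is_true)
        tpp.append(true_pos / max(k, 1))
        fdp.append(false_pos / max(selected.sum(), 1))
    return lams, np.array(tpp), np.array(fdp)


if __name__ == "__main__":
    lams, tpp, fdp = simulate_path()
    for lam, t, f in zip(lams, tpp, fdp):
        print(f"lambda={lam:8.3f}  TPP={t:5.2f}  FDP={f:5.2f}")
```

Plotting `fdp` against `tpp` for a few seeds gives a finite-sample view of how early null variables enter the path. The paper's result is asymptotic, so a small run like this is only suggestive.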
Similar Papers
In the same crypt: math.ST
An introduction to Topological Data Analysis: fundamental and practical aspects for data scientists · R.I.P. · 👻 Ghosted
Minimax Optimal Procedures for Locally Private Estimation · R.I.P. · 👻 Ghosted
Optimal Best Arm Identification with Fixed Confidence · R.I.P. · 👻 Ghosted
Fast low-rank estimation by projected gradient descent: General statistical and algorithmic guarantees · R.I.P. · 👻 Ghosted
User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient · R.I.P. · 👻 Ghosted
Died the same way: 👻 Ghosted
Language Models are Few-Shot Learners · R.I.P. · 👻 Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library · R.I.P. · 👻 Ghosted
XGBoost: A Scalable Tree Boosting System · R.I.P. · 👻 Ghosted