Orthogonal Matching Pursuit for Text Classification

July 12, 2018 · Entered Twilight · 🏛 NUT@EMNLP

"Last commit was 6.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .DS_Store, .gitignore, GOMP.m, L1General, OMP.m, README.md, data, demo_gomp.m, demo_omp.m, l1Obj.m, lassoObj.m, logreg_L1.m, logreg_L2.m, logreg_regularized.m, minFunc_2012, myBinomTest.m, penalizedL2.m, predict.m

Authors: Konstantinos Skianis, Nikolaos Tziortziotis, Michalis Vazirgiannis
arXiv ID: 1807.04715
Category: cs.LG (Machine Learning)
Cross-listed: cs.CL, stat.ML
Citations: 5
Venue: NUT@EMNLP
Repository: https://github.com/y3nk0/OMP-for-Text-Classification
Last Checked: 1 month ago

Abstract

In text classification, overfitting arises from the high dimensionality of the feature space, making regularization essential. Although classic regularizers provide sparsity, they fail to return highly accurate models. In contrast, state-of-the-art group-lasso regularizers provide better results at the cost of reduced sparsity. In this paper, we apply a greedy variable selection algorithm, called Orthogonal Matching Pursuit (OMP), to the text classification task. We also extend standard group OMP (GOMP) by introducing overlapping GOMP, which handles overlapping groups of features. Empirical analysis verifies that both OMP and overlapping GOMP constitute powerful regularizers, able to produce effective and very sparse models. Code and data are available online: https://github.com/y3nk0/OMP-for-Text-Classification
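To illustrate the greedy selection idea the abstract refers to, here is a minimal sketch of textbook OMP with a squared-error loss, written in plain Python. It is not taken from the authors' MATLAB repository, and the loss used in the paper may differ. The names `omp`, `solve_least_squares`, and `budget` are hypothetical and chosen only for this illustration. At each step the sketch picks the unused feature most correlated with the current residual, then refits all selected features jointly.

```python
import math


def solve_least_squares(columns, y):
    """Least-squares weights for the given columns, via the normal equations."""
    k = len(columns)
    # Augmented system [G | b] with G = A^T A and b = A^T y.
    aug = [
        [sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(k)]
        + [sum(a * t for a, t in zip(columns[i], y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(k):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [x - f * z for x, z in zip(aug[r], aug[c])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def omp(X, y, budget, tol=1e-12):
    """Illustrative OMP. X is a list of rows, y the targets,
    budget the maximum number of features to select (an assumed parameter)."""
    n_features = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_features)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    selected, weights = [], []
    residual = list(y)
    for _ in range(budget):
        # Score each unused, non-zero feature by its normalized
        # absolute correlation with the current residual.
        scores = {
            j: abs(sum(a * r for a, r in zip(cols[j], residual))) / norms[j]
            for j in range(n_features)
            if j not in selected and norms[j] > 0
        }
        if not scores:
            break
        best = max(scores, key=scores.get)
        if scores[best] < tol:
            break  # nothing left to explain
        selected.append(best)
        # Orthogonal step: refit all selected features jointly.
        weights = solve_least_squares([cols[j] for j in selected], y)
        residual = [
            t - sum(w * cols[j][i] for w, j in zip(weights, selected))
            for i, t in enumerate(y)
        ]
    return dict(zip(selected, weights))


# Toy data: y depends only on features 0 and 2.
X = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
y = [2 * r[0] - 3 * r[2] for r in X]
model = omp(X, y, budget=3)
```

On this toy data the loop stops once the residual vanishes, so the returned dictionary contains only the two informative features. A group variant would score and select whole sets of features at once rather than single columns.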