Faster Population Counts Using AVX2 Instructions
November 23, 2016 ยท Entered Twilight ยท ๐ Computer/law journal
"Last commit was 7.0 years ago (โฅ5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: .gitignore, LICENSE, Makefile, README.md, benchmarks, compare_compiler.sh, format.py, include, src, test_build.sh, tests
Authors
Wojciech Muลa, Nathan Kurz, Daniel Lemire
arXiv ID
1611.07612
Category
cs.DS: Data Structures & Algorithms
Citations
54
Venue
Computer/law journal
Repository
https://github.com/CountOnes/hamming_weight
โญ 51
Last Checked
1 month ago
Abstract
Counting the number of ones in a binary stream is a common operation in database, information-retrieval, cryptographic and machine-learning applications. Most processors have dedicated instructions to count the number of ones in a word (e.g., popcnt on x64 processors). Maybe surprisingly, we show that a vectorized approach using SIMD instructions can be twice as fast as using the dedicated instructions on recent Intel processors. The benefits can be even greater for applications such as similarity measures (e.g., the Jaccard index) that require additional Boolean operations. Our approach has been adopted by LLVM: it is used by its popular C compiler (clang).
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Data Structures & Algorithms
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
Relief-Based Feature Selection: Introduction and Review
R.I.P.
๐ป
Ghosted
Route Planning in Transportation Networks
R.I.P.
๐ป
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
๐ป
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
๐ป
Ghosted