Faster Population Counts Using AVX2 Instructions

November 23, 2016 ยท Entered Twilight ยท ๐Ÿ› Computer/law journal

๐ŸŒ… TWILIGHT: Old Age
Predates the code-sharing era โ€” a pioneer of its time

"Last commit was 7.0 years ago (โ‰ฅ5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, LICENSE, Makefile, README.md, benchmarks, compare_compiler.sh, format.py, include, src, test_build.sh, tests

Authors Wojciech Muล‚a, Nathan Kurz, Daniel Lemire arXiv ID 1611.07612 Category cs.DS: Data Structures & Algorithms Citations 54 Venue Computer/law journal Repository https://github.com/CountOnes/hamming_weight โญ 51 Last Checked 1 month ago
Abstract
Counting the number of ones in a binary stream is a common operation in database, information-retrieval, cryptographic and machine-learning applications. Most processors have dedicated instructions to count the number of ones in a word (e.g., popcnt on x64 processors). Maybe surprisingly, we show that a vectorized approach using SIMD instructions can be twice as fast as using the dedicated instructions on recent Intel processors. The benefits can be even greater for applications such as similarity measures (e.g., the Jaccard index) that require additional Boolean operations. Our approach has been adopted by LLVM: it is used by its popular C compiler (clang).
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Data Structures & Algorithms