High-Performance Tensor Contraction without Transposition
July 01, 2016 · Declared Dead · SIAM Journal on Scientific Computing
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Devin A. Matthews
arXiv ID
1607.00291
Category
cs.MS: Mathematical Software
Cross-listed
cs.DC, cs.PF
Citations
81
Venue
SIAM Journal on Scientific Computing
Last Checked
1 month ago
Abstract
Tensor computations--in particular tensor contraction (TC)--are important kernels in many scientific computing applications. Due to the fundamental similarity of TC to matrix multiplication (MM) and to the availability of optimized implementations such as the BLAS, tensor operations have traditionally been implemented in terms of BLAS operations, incurring both a performance and a storage overhead. Instead, we implement TC using the flexible BLIS framework, which allows for transposition (reshaping) of the tensor to be fused with internal partitioning and packing operations, requiring no explicit transposition operations or additional workspace. This implementation, TBLIS, achieves performance approaching that of MM, and in some cases considerably higher than that of traditional TC. Our implementation supports multithreading using an approach identical to that used for MM in BLIS, with similar performance characteristics. The complexity of managing tensor-to-matrix transformations is also handled automatically in our approach, greatly simplifying its use in scientific applications.
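To make the abstract's distinction concrete, here is a minimal NumPy sketch (illustrative only, not TBLIS code) of the traditional BLAS-based approach it describes: fold the tensor indices into matrix dimensions and call a single GEMM. All array names and shapes below are invented for the example.

```python
import numpy as np

# Contract C[a,b] = sum_{c,d} A[a,c,d] * B[c,d,b].
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5, 6))
B = rng.standard_normal((5, 6, 7))

# Traditional approach: reshape (and, in general, explicitly transpose)
# the tensors into matrices, then perform one matrix multiplication.
# The explicit transposition step is the storage and performance
# overhead the abstract refers to.
A_mat = A.reshape(4, 5 * 6)   # fold the contracted indices (c, d) together
B_mat = B.reshape(5 * 6, 7)   # fold (c, d) the same way
C_gemm = A_mat @ B_mat        # a single GEMM call

# Reference result computed directly on the tensors.
C_ref = np.einsum('acd,cdb->ab', A, B)
assert np.allclose(C_gemm, C_ref)
```

In this simple case the reshape is free because the contracted indices are already adjacent and contiguous; for general index orders a costly explicit transposition (and extra workspace) is required first. TBLIS's contribution, per the abstract, is fusing that reshaping into BLIS's internal partitioning and packing so no explicit transposition or extra workspace is needed.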
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
Similar Papers
In the same crypt · Mathematical Software
CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication
R.I.P.
👻
Ghosted
Mathematical Foundations of the GraphBLAS
R.I.P.
👻
Ghosted
The DUNE Framework: Basic Concepts and Recent Developments
R.I.P.
👻
Ghosted
Format Abstraction for Sparse Tensor Algebra Compilers
R.I.P.
👻
Ghosted
AMReX: Block-Structured Adaptive Mesh Refinement for Multiphysics Applications
R.I.P.
👻
Ghosted
Died the same way · 👻 Ghosted
Language Models are Few-Shot Learners
R.I.P.
👻
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
👻
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
👻
Ghosted