ReLU-KAN: New Kolmogorov-Arnold Networks that Only Need Matrix Addition, Dot Multiplication, and ReLU

June 04, 2024 · Entered Twilight · 🏛 arXiv.org

Repo contents: .idea, README.md, __pycache__, catastrophic_forgetting.py, data, exp_fitting.py, exp_speed.py, fitting_example.py, img, main.py, plt_comp.py, relukan.pdf, requirements.txt, torch_relu_kan.py

Authors: Qi Qiu, Tao Zhu, Helin Gong, Liming Chen, Huansheng Ning
arXiv ID: 2406.02075
Category: cs.LG (Machine Learning)
Cross-listed: cs.NE
Citations: 39
Venue: arXiv.org
Repository: https://github.com/quiqi/relu_kan (⭐ 98)

Abstract

Limited by the complexity of basis function (B-spline) calculations, Kolmogorov-Arnold Networks (KAN) suffer from restricted parallel computing capability on GPUs. This paper proposes a novel ReLU-KAN implementation that inherits the core idea of KAN. By adopting ReLU (Rectified Linear Unit) and point-wise multiplication, we simplify the design of KAN's basis function and optimize the computation process for efficient CUDA computing. The proposed ReLU-KAN architecture can be readily implemented on existing deep learning frameworks (e.g., PyTorch) for both inference and training. Experimental results demonstrate that ReLU-KAN achieves a 20x speedup compared to traditional KAN with 4-layer networks. Furthermore, ReLU-KAN exhibits a more stable training process with superior fitting ability while preserving the "catastrophic forgetting avoidance" property of KAN. The code is available at https://github.com/quiqi/relu_kan
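
To make the idea of a basis function built from ReLU and point-wise multiplication concrete, here is a minimal pure-Python sketch. The abstract does not spell out the formula, so the squared-product form and the normalisation constant below are assumptions chosen so that each bump peaks at 1; the function names (`relu_bump`, `basis_row`, `relu_kan_unit`) are hypothetical, and the formulation in the paper and in `torch_relu_kan.py` may differ.

```python
def relu(v):
    """Rectified linear unit for a scalar."""
    return v if v > 0.0 else 0.0


def relu_bump(x, start, end):
    """Bump-shaped basis on [start, end] using only ReLU and multiplication.

    ASSUMPTION: the squared product and the 16 / (end - start)**4 factor
    are illustrative choices; they make the peak at the midpoint equal 1.
    """
    width = end - start
    product = relu(end - x) * relu(x - start)  # zero outside (start, end)
    return (product * product) * 16.0 / width ** 4


def basis_row(x, starts, ends):
    """Evaluate every basis function at one input value."""
    return [relu_bump(x, s, e) for s, e in zip(starts, ends)]


def relu_kan_unit(x, starts, ends, weights):
    """Weighted sum of basis values (a dot product) for one input.

    ASSUMPTION: a single-input, single-output unit; a full layer would
    stack these evaluations into matrices.
    """
    return sum(w * b for w, b in zip(weights, basis_row(x, starts, ends)))


if __name__ == "__main__":
    starts = [0.0, 0.25]
    ends = [1.0, 1.25]
    print(basis_row(0.5, starts, ends))
    print(relu_kan_unit(0.5, starts, ends, weights=[2.0, 0.0]))
```

In a tensor framework each step here would map to an element-wise subtraction, a ReLU, an element-wise product, and a matrix product over batched inputs, which is the kind of computation the abstract describes as GPU-friendly.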