FedRDMA: Communication-Efficient Cross-Silo Federated LLM via Chunked RDMA Transmission

March 01, 2024 ยท Declared Dead ยท ๐Ÿ› EuroMLSys@EuroSys

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Zeling Zhang, Dongqi Cai, Yiran Zhang, Mengwei Xu, Shangguang Wang, Ao Zhou arXiv ID 2403.00881 Category cs.LG: Machine Learning Cross-listed cs.DC, cs.NI Citations 11 Venue EuroMLSys@EuroSys Last Checked 3 months ago
Abstract
Communication overhead is a significant bottleneck in federated learning (FL), which has been exaggerated with the increasing size of AI models. In this paper, we propose FedRDMA, a communication-efficient cross-silo FL system that integrates RDMA into the FL communication protocol. To overcome the limitations of RDMA in wide-area networks (WANs), FedRDMA divides the updated model into chunks and designs a series of optimization techniques to improve the efficiency and robustness of RDMA-based communication. We implement FedRDMA atop the industrial federated learning framework and evaluate it on a real-world cross-silo FL scenario. The experimental results show that \sys can achieve up to 3.8$\times$ speedup in communication efficiency compared to traditional TCP/IP-based FL systems.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Machine Learning

Died the same way โ€” ๐Ÿ‘ป Ghosted