R.I.P.
π»
Ghosted
Large Multimodal Model Compression via Efficient Pruning and Distillation at AntGroup
December 10, 2023 Β· Declared Dead Β· π The Web Conference
Authors
Maolin Wang, Yao Zhao, Jiajia Liu, Jingdong Chen, Chenyi Zhuang, Jinjie Gu, Ruocheng Guo, Xiangyu Zhao
arXiv ID
2312.05795
Category
cs.AI: Artificial Intelligence
Citations
10
Venue
The Web Conference
Repository
https://github.com/MorinW/AntGMM$\_$Pruning}
Last Checked
1 month ago
Abstract
The deployment of Large Multimodal Models (LMMs) within AntGroup has significantly advanced multimodal tasks in payment, security, and advertising, notably enhancing advertisement audition tasks in Alipay. However, the deployment of such sizable models introduces challenges, particularly in increased latency and carbon emissions, which are antithetical to the ideals of Green AI. This paper introduces a novel multi-stage compression strategy for our proprietary LLM, AntGMM. Our methodology pivots on three main aspects: employing small training sample sizes, addressing multi-level redundancy through multi-stage pruning, and introducing an advanced distillation loss design. In our research, we constructed a dataset, the Multimodal Advertisement Audition Dataset (MAAD), from real-world scenarios within Alipay, and conducted experiments to validate the reliability of our proposed strategy. Furthermore, the effectiveness of our strategy is evident in its operational success in Alipay's real-world multimodal advertisement audition for three months from September 2023. Notably, our approach achieved a substantial reduction in latency, decreasing it from 700ms to 90ms, while maintaining online performance with only a slight performance decrease. Moreover, our compressed model is estimated to reduce electricity consumption by approximately 75 million kWh annually compared to the direct deployment of AntGMM, demonstrating our commitment to green AI initiatives. We will publicly release our code and the MAAD dataset after some reviews\footnote{https://github.com/MorinW/AntGMM$\_$Pruning}.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Artificial Intelligence
R.I.P.
π»
Ghosted
Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI
R.I.P.
π»
Ghosted
Addressing Function Approximation Error in Actor-Critic Methods
R.I.P.
π»
Ghosted
Explanation in Artificial Intelligence: Insights from the Social Sciences
R.I.P.
π»
Ghosted
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
R.I.P.
π»
Ghosted
Complex Embeddings for Simple Link Prediction
Died the same way β π 404 Not Found
R.I.P.
π
404 Not Found
Deep High-Resolution Representation Learning for Visual Recognition
R.I.P.
π
404 Not Found
HuggingFace's Transformers: State-of-the-art Natural Language Processing
R.I.P.
π
404 Not Found
CCNet: Criss-Cross Attention for Semantic Segmentation
R.I.P.
π
404 Not Found