R.I.P.
๐ป
Ghosted
Modality-Balanced Learning for Multimedia Recommendation
July 26, 2024 ยท Declared Dead ยท ๐ ACM Multimedia
Authors
Jinghao Zhang, Guofan Liu, Qiang Liu, Shu Wu, Liang Wang
arXiv ID
2408.06360
Category
cs.IR: Information Retrieval
Cross-listed
cs.CV
Citations
17
Venue
ACM Multimedia
Repository
https://github.com/CRIPAC-DIG/Balanced-Multimodal-Rec}
Last Checked
1 month ago
Abstract
Many recommender models have been proposed to investigate how to incorporate multimodal content information into traditional collaborative filtering framework effectively. The use of multimodal information is expected to provide more comprehensive information and lead to superior performance. However, the integration of multiple modalities often encounters the modal imbalance problem: since the information in different modalities is unbalanced, optimizing the same objective across all modalities leads to the under-optimization problem of the weak modalities with a slower convergence rate or lower performance. Even worse, we find that in multimodal recommendation models, all modalities suffer from the problem of insufficient optimization. To address these issues, we propose a Counterfactual Knowledge Distillation method that could solve the imbalance problem and make the best use of all modalities. Through modality-specific knowledge distillation, it could guide the multimodal model to learn modality-specific knowledge from uni-modal teachers. We also design a novel generic-and-specific distillation loss to guide the multimodal student to learn wider-and-deeper knowledge from teachers. Additionally, to adaptively recalibrate the focus of the multimodal model towards weaker modalities during training, we estimate the causal effect of each modality on the training objective using counterfactual inference techniques, through which we could determine the weak modalities, quantify the imbalance degree and re-weight the distillation loss accordingly. Our method could serve as a plug-and-play module for both late-fusion and early-fusion backbones. Extensive experiments on six backbones show that our proposed method can improve the performance by a large margin. The source code will be released at \url{https://github.com/CRIPAC-DIG/Balanced-Multimodal-Rec}
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Information Retrieval
R.I.P.
๐ป
Ghosted
LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation
R.I.P.
๐ป
Ghosted
Graph Convolutional Neural Networks for Web-Scale Recommender Systems
๐
๐
Old Age
Neural Graph Collaborative Filtering
R.I.P.
๐ป
Ghosted
Self-Attentive Sequential Recommendation
R.I.P.
๐ป
Ghosted
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
Died the same way โ ๐ 404 Not Found
R.I.P.
๐
404 Not Found
Deep High-Resolution Representation Learning for Visual Recognition
R.I.P.
๐
404 Not Found
HuggingFace's Transformers: State-of-the-art Natural Language Processing
R.I.P.
๐
404 Not Found
CCNet: Criss-Cross Attention for Semantic Segmentation
R.I.P.
๐
404 Not Found