| 651 | Leveraging Multimodal Data and Side Users for Diffusion Cross-Domain Recommendation<br>Fan Zhang, Jinpeng Chen, ... (+7 more) | 👻 Ghosted | cs.IR | 0 | 11 months ago |
| 652 | Multi-Modal Semantic Parsing for the Interpretation of Tombstone Inscriptions<br>Xiao Zhang, Johan Bos | 👻 Ghosted | cs.CV | 0 | 11 months ago |
| 653 | Multimedia Verification Through Multi-Agent Deep Research Multimodal Large Language Models<br>Huy Hoan Le, Van Sy Thinh Nguyen, ... (+4 more) | 👻 Ghosted | cs.CV | 0 | 11 months ago |
| 654 | Dual-Granularity Cross-Modal Identity Association for Weakly-Supervised Text-to-Person Image Matching<br>Yafei Zhang, Yongle Shang, Huafeng Li | 👻 Ghosted | cs.CV | 0 | 11 months ago |
| 655 | Querying Autonomous Vehicle Point Clouds: Enhanced by 3D Object Counting with CounterNet<br>Xiaoyu Zhang, Zhifeng Bao, ... (+3 more) | 👻 Ghosted | cs.CV | 0 | 10 months ago |
| 656 | Anchoring Trends: Mitigating Social Media Popularity Prediction Drift via Feature Clustering and Expansion<br>Chia-Ming Lee, Bo-Cheng Qiu, ... (+6 more) | 👻 Ghosted | cs.MM | 0 | 10 months ago |
| 657 | Hot-Swap MarkBoard: An Efficient Black-box Watermarking Approach for Large-scale Model Distribution<br>Zhicheng Zhang, Peizhuo Lv, ... (+8 more) | 👻 Ghosted | cs.CR | 0 | 10 months ago |
| 658 | FedBAP: Backdoor Defense via Benign Adversarial Perturbation in Federated Learning<br>Xinhai Yan, Libing Wu, ... (+4 more) | 👻 Ghosted | cs.CR | 0 | 10 months ago |
| 659 | Towards Blind Bitstream-corrupted Video Recovery via a Visual Foundation Model-driven Framework<br>Tianyi Liu, Kejun Wu, ... (+4 more) | 👻 Ghosted | eess.IV | 0 | 10 months ago |
| 660 | ExplorAR: Assisting Older Adults to Learn Smartphone Apps through AR-powered Trial-and-Error with Interactive Guidance<br>Jiawei Li, Linjie Qiu, ... (+4 more) | 👻 Ghosted | cs.HC | 0 | 10 months ago |
| 661 | Why Generate When You Can Transform? Unleashing Generative Attention for Dynamic Recommendation<br>Yuli Liu, Wenjun Kong, ... (+2 more) | 👻 Ghosted | cs.IR | 0 | 10 months ago |
| 662 | Learning Partially-Decorrelated Common Spaces for Ad-hoc Video Search<br>Fan Hu, Zijie Xin, Xirong Li | 👻 Ghosted | cs.CV | 0 | 10 months ago |
| 663 | VideoForest: Person-Anchored Hierarchical Reasoning for Cross-Video Question Answering<br>Yiran Meng, Junhong Ye, ... (+5 more) | 👻 Ghosted | cs.CV | 0 | 10 months ago |
| 664 | DepthGait: Multi-Scale Cross-Level Feature Fusion of RGB-Derived Depth and Silhouette Sequences for Robust Gait Recognition<br>Xinzhu Li, Juepeng Zheng, ... (+8 more) | 👻 Ghosted | cs.CV | 0 | 10 months ago |
| 665 | VisAug: Facilitating Speech-Rich Web Video Navigation and Engagement with Auto-Generated Visual Augmentations<br>Baoquan Zhao, Xiaofan Ma, ... (+4 more) | 👻 Ghosted | cs.MM | 0 | 10 months ago |
| 666 | Dual Prompt Learning for Adapting Vision-Language Models to Downstream Image-Text Retrieval<br>Yifan Wang, Tao Wang, ... (+6 more) | 👻 Ghosted | cs.CV | 0 | 10 months ago |
| 667 | MSC: A Marine Wildlife Video Dataset with Grounded Segmentation and Clip-Level Captioning<br>Quang-Trung Truong, Yuk-Kwan Wong, ... (+4 more) | 👻 Ghosted | cs.CV | 0 | 10 months ago |
| 668 | MetAdv: A Unified and Interactive Adversarial Testing Platform for Autonomous Driving<br>Aishan Liu, Jiakai Wang, ... (+7 more) | 👻 Ghosted | cs.RO | 0 | 10 months ago |
| 669 | PP-Motion: Physical-Perceptual Fidelity Evaluation for Human Motion Generation<br>Sihan Zhao, Zixuan Wang, ... (+6 more) | 👻 Ghosted | cs.CV | 0 | 10 months ago |
| 670 | Exploring Palette based Color Guidance in Diffusion Models<br>Qianru Qiu, Jiafeng Mao, Xueting Wang | 👻 Ghosted | cs.GR | 0 | 10 months ago |
| 671 | MAGNeT: Multimodal Adaptive Gaussian Networks for Intent Inference in Moving Target Selection across Complex Scenarios<br>Xiangxian Li, Yawen Zheng, ... (+7 more) | 👻 Ghosted | cs.MM | 0 | 10 months ago |
| 672 | Cross-Modal Prototype Augmentation and Dual-Grained Prompt Learning for Social Media Popularity Prediction<br>Ao Zhou, Mingsheng Tu, ... (+6 more) | 👻 Ghosted | cs.IR | 0 | 10 months ago |
| 673 | Generative AI for Multimedia Communication: Recent Advances, An Information-Theoretic Framework, and Future Opportunities<br>Yili Jin, Xue Liu, Jiangchuan Liu | 👻 Ghosted | cs.MM | 0 | 9 months ago |
| 674 | Generative Flow Networks for Personalized Multimedia Systems: A Case Study on Short Video Feeds<br>Yili Jin, Ling Pan, ... (+3 more) | 👻 Ghosted | cs.MM | 0 | 9 months ago |
| 675 | OnlineHOI: Towards Online Human-Object Interaction Generation and Perception<br>Yihong Ji, Yunze Liu, ... (+5 more) | 👻 Ghosted | cs.CV | 0 | 9 months ago |
| 676 | LEAF-Mamba: Local Emphatic and Adaptive Fusion State Space Model for RGB-D Salient Object Detection<br>Lanhu Wu, Zilin Gao, ... (+3 more) | 👻 Ghosted | cs.CV | 0 | 8 months ago |
| 677 | Teaching AI to Feel: A Collaborative, Full-Body Exploration of Emotive Communication<br>Esen K. Tütüncü, Lissette Lemus, ... (+3 more) | 👻 Ghosted | cs.HC | 0 | 8 months ago |
| 678 | NeuroSwift: A Lightweight Cross-Subject Framework for fMRI Visual Reconstruction of Complex Scenes<br>Shiyi Zhang, Dong Liang, Yihang Zhou | 👻 Ghosted | cs.CV | 0 | 8 months ago |
| 679 | CHORD: Customizing Hybrid-precision On-device Model for Sequential Recommendation with Device-cloud Collaboration<br>Tianqi Liu, Kairui Fu, ... (+6 more) | 👻 Ghosted | cs.LG | 0 | 8 months ago |
| 680 | Personality-Enhanced Multimodal Depression Detection in the Elderly<br>Honghong Wang, Jing Deng, Rong Zheng | 👻 Ghosted | cs.SD | 0 | 8 months ago |
| 681 | SeeingSounds: Learning Audio-to-Visual Alignment via Text<br>Simone Carnemolla, Matteo Pennisi, ... (+4 more) | 👻 Ghosted | cs.SD | 0 | 8 months ago |
| 682 | Generative Multi-Sensory Meditation: Exploring Immersive Depth and Activation in Virtual Reality<br>Yuyang Jiang, Binzhu Xie, ... (+5 more) | 👻 Ghosted | cs.HC | 0 | 8 months ago |
| 683 | DeLoad: Demand-Driven Short-Video Preloading with Scalable Watch-Time Estimation<br>Tong Liu, Zhiwei Fan, ... (+6 more) | 👻 Ghosted | cs.MM | 0 | 8 months ago |
| 684 | PIRA: Pan-CDN Intra-video Resource Adaptation for Short Video Streaming<br>Chunyu Qiao, Tong Liu, ... (+5 more) | 👻 Ghosted | cs.MM | 0 | 8 months ago |
| 685 | A Matter of Time: Revealing the Structure of Time in Vision-Language Models<br>Nidham Tekaya, Manuela Waldner, Matthias Zeppelzauer | 👻 Ghosted | cs.CV | 0 | 7 months ago |
| 686 | DMC$^3$: Dual-Modal Counterfactual Contrastive Construction for Egocentric Video Question Answering<br>Jiayi Zou, Chaofan Chen, ... (+2 more) | 👻 Ghosted | cs.CV | 0 | 7 months ago |
| 687 | Mitigating Cross-modal Representation Bias for Multicultural Image-to-Recipe Retrieval<br>Qing Wang, Chong-Wah Ngo, ... (+2 more) | 👻 Ghosted | cs.CV | 0 | 7 months ago |
| 688 | STATUS Bench: A Rigorous Benchmark for Evaluating Object State Understanding in Vision-Language Models<br>Mahiro Ukai, Shuhei Kurita, Nakamasa Inoue | 👻 Ghosted | cs.CV | 0 | 7 months ago |
| 689 | MCIHN: A Hybrid Network Model Based on Multi-path Cross-modal Interaction for Multimodal Emotion Recognition<br>Haoyang Zhang, Zhou Yang, ... (+3 more) | 👻 Ghosted | cs.CV | 0 | 7 months ago |
| 690 | MORE: Multi-Organ Medical Image REconstruction Dataset<br>Shaokai Wu, Yapan Guo, ... (+7 more) | 👻 Ghosted | eess.IV | 0 | 7 months ago |