| 501 | MVP: Winning Solution to SMP Challenge 2025 Video Track<br>Liliang Ye, Yunyao Zhang, ... (+5 more) | 👻 Ghosted | cs.CV | 5 | 11 months ago |
| 502 | TinyServe: Query-Aware Cache Selection for Efficient LLM Serving<br>Dong Liu, Yanxuan Yu | 👻 Ghosted | cs.DC | 5 | 9 months ago |
| 503 | Scalable Compression of Deep Neural Networks<br>Xing Wang, Jie Liang | 👻 Ghosted | cs.CV | 4 | 9 years ago |
| 504 | Depth Structure Preserving Scene Image Generation<br>Wendong Zhang, Bingbing Ni, ... (+3 more) | 👻 Ghosted | cs.CV | 4 | 9 years ago |
| 505 | An evaluation of large-scale methods for image instance and class discovery<br>Matthijs Douze, Hervé Jégou, Jeff Johnson | 👻 Ghosted | cs.CV | 4 | 8 years ago |
| 506 | Impact of Three-Dimensional Video Scalability on Multi-View Activity Recognition using Deep Learning<br>Jun-Ho Choi, Manri Cheon, ... (+2 more) | 👻 Ghosted | cs.MM | 4 | 8 years ago |
| 507 | Temporal Cross-Media Retrieval with Soft-Smoothing<br>David Semedo, João Magalhães | 👻 Ghosted | cs.MM | 4 | 7 years ago |
| 508 | IntersectGAN: Learning Domain Intersection for Generating Images with Multiple Attributes<br>Zehui Yao, Boyan Zhang, ... (+4 more) | 👻 Ghosted | cs.CV | 4 | 6 years ago |
| 509 | BUDA.ART: A Multimodal Content-Based Analysis and Retrieval System for Buddha Statues<br>Benjamin Renoust, Matheus Oliveira Franca, ... (+7 more) | 👻 Ghosted | cs.CV | 4 | 6 years ago |
| 510 | Identity-Aware Attribute Recognition via Real-Time Distributed Inference in Mobile Edge Clouds<br>Zichuan Xu, Jiangkai Wu, ... (+4 more) | 👻 Ghosted | cs.CV | 4 | 5 years ago |
| 511 | A Simple Baseline for Pose Tracking in Videos of Crowded Scenes<br>Li Yuan, Shuning Chang, ... (+7 more) | 👻 Ghosted | cs.CV | 4 | 5 years ago |
| 512 | TeViS:Translating Text Synopses to Video Storyboards<br>Xu Gu, Yuchong Sun, ... (+6 more) | 👻 Ghosted | cs.CV | 4 | 3 years ago |
| 513 | Cascaded Cross-Modal Transformer for Request and Complaint Detection<br>Nicolae-Catalin Ristea, Radu Tudor Ionescu | 👻 Ghosted | cs.CL | 4 | 2 years ago |
| 514 | MetaCast: A Self-Driven Metaverse Announcer Architecture Based on Quality of Experience Evaluation Model<br>Zhonghao Lin, Haihan Duan, ... (+3 more) | 👻 Ghosted | cs.MM | 4 | 2 years ago |
| 515 | Parameter-Efficient Transfer Learning for Audio-Visual-Language Tasks<br>Hongye Liu, Xianhai Xie, ... (+3 more) | 👻 Ghosted | cs.MM | 4 | 2 years ago |
| 516 | Layout Sequence Prediction From Noisy Mobile Modality<br>Haichao Zhang, Yi Xu, ... (+3 more) | 👻 Ghosted | cs.CV | 4 | 2 years ago |
| 517 | CultiVerse: Towards Cross-Cultural Understanding for Paintings with Large Language Model<br>Wei Zhang, Wong Kam-Kwai, ... (+6 more) | 👻 Ghosted | cs.HC | 4 | 2 years ago |
| 518 | Achieving Resolution-Agnostic DNN-based Image Watermarking: A Novel Perspective of Implicit Neural Representation<br>Yuchen Wang, Xingyu Zhu, ... (+3 more) | 👻 Ghosted | cs.CR | 4 | 2 years ago |
| 519 | HSVLT: Hierarchical Scale-Aware Vision-Language Transformer for Multi-Label Image Classification<br>Shuyi Ouyang, Hongyi Wang, ... (+7 more) | 👻 Ghosted | cs.CV | 4 | 1 year ago |
| 520 | An Inverse Partial Optimal Transport Framework for Music-guided Movie Trailer Generation<br>Yutong Wang, Sidan Zhu, ... (+2 more) | 👻 Ghosted | cs.MM | 4 | 1 year ago |
| 521 | Regularized Contrastive Partial Multi-view Outlier Detection<br>Yijia Wang, Qianqian Xu, ... (+3 more) | 👻 Ghosted | cs.MM | 4 | 1 year ago |
| 522 | From Pixels to Tokens: Revisiting Object Hallucinations in Large Vision-Language Models<br>Yuying Shang, Xinyi Zeng, ... (+7 more) | 👻 Ghosted | cs.CL | 4 | 1 year ago |
| 523 | DGNS: Deformable Gaussian Splatting and Dynamic Neural Surface for Monocular Dynamic 3D Reconstruction<br>Xuesong Li, Jinguang Tong, ... (+3 more) | 👻 Ghosted | cs.CV | 4 | 1 year ago |
| 524 | Generating Negative Samples for Multi-Modal Recommendation<br>Yanbiao Ji, Dan Luo, ... (+7 more) | 👻 Ghosted | cs.IR | 4 | 1 year ago |
| 525 | EchoMask: Speech-Queried Attention-based Mask Modeling for Holistic Co-Speech Motion Generation<br>Xiangyue Zhang, Jianfang Li, ... (+4 more) | 👻 Ghosted | cs.GR | 4 | 1 year ago |
| 526 | FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing<br>Gaoxiang Cong, Liang Li, ... (+6 more) | 👻 Ghosted | cs.MM | 4 | 1 year ago |
| 527 | DualDub: Video-to-Soundtrack Generation via Joint Speech and Background Audio Synthesis<br>Wenjie Tian, Xinfa Zhu, ... (+7 more) | 👻 Ghosted | cs.MM | 4 | 11 months ago |
| 528 | Secure Tug-of-War (SecTOW): Iterative Defense-Attack Training with Reinforcement Learning for Multimodal Model Security<br>Muzhi Dai, Shixuan Liu, ... (+4 more) | 👻 Ghosted | cs.CR | 4 | 10 months ago |
| 529 | StarStream: Live Video Analytics over Space Networking<br>Miao Zhang, Jiaxing Li, ... (+3 more) | 👻 Ghosted | cs.NI | 4 | 10 months ago |
| 530 | Analyzing structural characteristics of object category representations from their semantic-part distributions<br>Ravi Kiran Sarvadevabhatla, Venkatesh Babu R | 👻 Ghosted | cs.CV | 3 | 10 years ago |
| 531 | Flavour Enhanced Food Recommendation<br>Nitish Nag, Aditya Bharadwaj, ... (+7 more) | 👻 Ghosted | cs.SI | 3 | 7 years ago |
| 532 | Semantics Preserving Hierarchy based Retrieval of Indian heritage monuments<br>Ronak Gupta, Prerana Mukherjee, ... (+2 more) | 👻 Ghosted | cs.MM | 3 | 5 years ago |
| 533 | Exploiting Diverse Feature for Multimodal Sentiment Analysis<br>Jia Li, Wei Qian, ... (+4 more) | 👻 Ghosted | cs.CV | 3 | 2 years ago |
| 534 | Enhancing HOI Detection with Contextual Cues from Large Vision-Language Models<br>Yu-Wei Zhan, Fan Liu, ... (+4 more) | 👻 Ghosted | cs.CV | 3 | 2 years ago |
| 535 | Adaptive Multi-Modality Prompt Learning<br>Zongqian Wu, Yujing Liu, ... (+4 more) | 👻 Ghosted | cs.LG | 3 | 2 years ago |
| 536 | Towards Fast and Stable Federated Learning: Confronting Heterogeneity via Knowledge Anchor<br>Jinqian Chen, Jihua Zhu, Qinghai Zheng | 👻 Ghosted | cs.LG | 3 | 2 years ago |
| 537 | 3D Reconstruction and New View Synthesis of Indoor Environments based on a Dual Neural Radiance Field<br>Zhenyu Bao, Guibiao Liao, ... (+4 more) | 👻 Ghosted | cs.CV | 3 | 2 years ago |
| 538 | AdaCoder: Adaptive Prompt Compression for Programmatic Visual Question Answering<br>Mahiro Ukai, Shuhei Kurita, ... (+3 more) | 👻 Ghosted | cs.AI | 3 | 1 year ago |
| 539 | MetaDragonBoat: Exploring Paddling Techniques of Virtual Dragon Boating in a Metaverse Campus<br>Wei He, Xiang Li, ... (+5 more) | 👻 Ghosted | cs.MM | 3 | 1 year ago |
| 540 | Personalized Federated Learning via Backbone Self-Distillation<br>Pengju Wang, Bochao Liu, ... (+3 more) | 👻 Ghosted | cs.LG | 3 | 1 year ago |
| 541 | LinkThief: Combining Generalized Structure Knowledge with Node Similarity for Link Stealing Attack against GNN<br>Yuxing Zhang, Siyuan Meng, ... (+4 more) | 👻 Ghosted | cs.CR | 3 | 1 year ago |
| 542 | Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models<br>Yiming Wu, Zhenghao Chen, ... (+2 more) | 👻 Ghosted | cs.CV | 3 | 1 year ago |
| 543 | CGCOD: Class-Guided Camouflaged Object Detection<br>Chenxi Zhang, Qing Zhang, ... (+2 more) | 👻 Ghosted | cs.CV | 3 | 1 year ago |
| 544 | Do Existing Testing Tools Really Uncover Gender Bias in Text-to-Image Models?<br>Yunbo Lyu, Zhou Yang, ... (+3 more) | 👻 Ghosted | cs.CV | 3 | 1 year ago |
| 545 | FLAP: Fully-controllable Audio-driven Portrait Video Generation through 3D head conditioned diffusion model<br>Lingzhou Mu, Baiji Liu, ... (+5 more) | 👻 Ghosted | cs.GR | 3 | 1 year ago |
| 546 | StePO-Rec: Towards Personalized Outfit Styling Assistant via Knowledge-Guided Multi-Step Reasoning<br>Yuxi Bi, Yunfan Gao, Haofen Wang | 👻 Ghosted | cs.IR | 3 | 1 year ago |
| 547 | MusFlow: Multimodal Music Generation via Conditional Flow Matching<br>Jiahao Song, Yuzhao Wang | 👻 Ghosted | cs.SD | 3 | 1 year ago |
| 548 | SD-VSum: A Method and Dataset for Script-Driven Video Summarization<br>Manolis Mylonas, Evlampios Apostolidis, Vasileios Mezaris | 👻 Ghosted | cs.CV | 3 | 1 year ago |
| 549 | HKD4VLM: A Progressive Hybrid Knowledge Distillation Framework for Robust Multimodal Hallucination and Factuality Detection in VLMs<br>Zijian Zhang, Xuecheng Wu, ... (+4 more) | 👻 Ghosted | cs.CV | 3 | 1 year ago |
| 550 | A Satellite-Ground Synergistic Large Vision-Language Model System for Earth Observation<br>Yuxin Zhang, Jiahao Yang, ... (+4 more) | 👻 Ghosted | cs.NI | 3 | 11 months ago |