| 551 | Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration<br>Zhenyu Zhang, Bowen Yu, ... (+7 more) | 👻 Ghosted | cs.CL | 6 | 3 years ago |
| 552 | TinyServe: Query-Aware Cache Selection for Efficient LLM Serving<br>Dong Liu, Yanxuan Yu | 👻 Ghosted | cs.DC | 5 | 9 months ago |
| 553 | Bridging the Modality Gap: Dimension Information Alignment and Sparse Spatial Constraint for Image-Text Matching<br>Xiang Ma, Xuemei Li, ... (+2 more) | 👻 Ghosted | cs.CV | 5 | 1 year ago |
| 554 | Semantic-aware Representation Learning for Homography Estimation<br>Yuhan Liu, Qianxin Huang, ... (+6 more) | 👻 Ghosted | cs.IR | 5 | 1 year ago |
| 555 | Monocular Human-Object Reconstruction in the Wild<br>Chaofan Huo, Ye Shi, Jingya Wang | 👻 Ghosted | cs.CV | 5 | 1 year ago |
| 556 | MVP: Winning Solution to SMP Challenge 2025 Video Track<br>Liliang Ye, Yunyao Zhang, ... (+5 more) | 👻 Ghosted | cs.CV | 5 | 11 months ago |
| 557 | MAGIC-TBR: Multiview Attention Fusion for Transformer-based Bodily Behavior Recognition in Group Settings<br>Surbhi Madan, Rishabh Jain, ... (+3 more) | 💤 Eternal Rest | cs.CV | 5 | 2 years ago |
| 558 | Efficient Labelling of Affective Video Datasets via Few-Shot & Multi-Task Contrastive Learning<br>Ravikiran Parameshwara, Ibrahim Radwan, ... (+4 more) | 👻 Ghosted | cs.CV | 5 | 2 years ago |
| 559 | Moby: Empowering 2D Models for Efficient Point Cloud Analytics on the Edge<br>Jingzong Li, Yik Hong Cai, ... (+4 more) | 👻 Ghosted | cs.NI | 5 | 3 years ago |
| 560 | Semantics2Hands: Transferring Hand Motion Semantics between Avatars<br>Zijie Ye, Jia Jia, Junliang Xing | 👻 Ghosted | cs.CV | 5 | 2 years ago |
| 561 | RecipeMeta: Metapath-enhanced Recipe Recommendation on Heterogeneous Recipe Network<br>Jialiang Shi, Takahiro Komamizu, ... (+3 more) | 👻 Ghosted | cs.MM | 5 | 2 years ago |
| 562 | Towards Annotation-Free Evaluation of Cross-Lingual Image Captioning<br>Aozhu Chen, Xinyi Huang, ... (+2 more) | 👻 Ghosted | cs.CV | 5 | 5 years ago |
| 563 | Diachronic Cross-modal Embeddings<br>David Semedo, João Magalhães | 👻 Ghosted | cs.MM | 5 | 6 years ago |
| 564 | Indefinite Kernel Logistic Regression with Concave-inexact-convex Procedure<br>Fanghui Liu, Xiaolin Huang, ... (+3 more) | 👻 Ghosted | cs.LG | 5 | 8 years ago |
| 565 | Novel Word Embedding and Translation-based Language Modeling for Extractive Speech Summarization<br>Kuan-Yu Chen, Shih-Hung Liu, ... (+3 more) | 👻 Ghosted | cs.CL | 5 | 9 years ago |
| 566 | Physics-Based Adversarial Attack on Near-Infrared Human Detector for Nighttime Surveillance Camera Systems<br>Muyao Niu, Zhuoxiao Li, ... (+4 more) | 💤 Eternal Rest | cs.CV | 5 | 1 year ago |
| 567 | Phys4DGen: Physics-Compliant 4D Generation with Multi-Material Composition Perception<br>Jiajing Lin, Zhenzhong Wang, ... (+4 more) | 👻 Ghosted | cs.CV | 5 | 1 year ago |
| 568 | An Entailment Tree Generation Approach for Multimodal Multi-Hop Question Answering with Mixture-of-Experts and Iterative Feedback Mechanism<br>Qing Zhang, Haocheng Lv, ... (+6 more) | 👻 Ghosted | cs.CL | 5 | 1 year ago |
| 569 | Rate-aware Compression for NeRF-based Volumetric Video<br>Zhiyu Zhang, Guo Lu, ... (+4 more) | 👻 Ghosted | cs.MM | 5 | 1 year ago |
| 570 | DanceCamAnimator: Keyframe-Based Controllable 3D Dance Camera Synthesis<br>Zixuan Wang, Jiayi Li, ... (+5 more) | 💀 404 Not Found | cs.CV | 5 | 1 year ago |
| 571 | MVPbev: Multi-view Perspective Image Generation from BEV with Test-time Controllability and Generalizability<br>Buyu Liu, Kai Wang, ... (+4 more) | 💀 404 Not Found | cs.CV | 5 | 1 year ago |
| 572 | RAVSS: Robust Audio-Visual Speech Separation in Multi-Speaker Scenarios with Missing Visual Cues<br>Tianrui Pan, Jie Liu, ... (+3 more) | 👻 Ghosted | cs.SD | 5 | 1 year ago |
| 573 | Continual Panoptic Perception: Towards Multi-modal Incremental Interpretation of Remote Sensing Images<br>Bo Yuan, Danpei Zhao, ... (+3 more) | 👻 Ghosted | cs.CV | 5 | 1 year ago |
| 574 | CIRP: Cross-Item Relational Pre-training for Multimodal Product Bundling<br>Yunshan Ma, Yingzhi He, ... (+4 more) | 👻 Ghosted | cs.IR | 5 | 2 years ago |
| 575 | Embodied Laser Attack:Leveraging Scene Priors to Achieve Agent-based Robust Non-contact Attacks<br>Yitong Sun, Yao Huang, Xingxing Wei | 👻 Ghosted | cs.CV | 5 | 2 years ago |
| 576 | TrafficMOT: A Challenging Dataset for Multi-Object Tracking in Complex Traffic Scenarios<br>Lihao Liu, Yanqi Cheng, ... (+7 more) | 👻 Ghosted | cs.CV | 5 | 2 years ago |
| 577 | CUCL: Codebook for Unsupervised Continual Learning<br>Chen Cheng, Jingkuan Song, ... (+4 more) | 👻 Ghosted | cs.CV | 5 | 2 years ago |
| 578 | In-processing User Constrained Dominant Sets for User-Oriented Fairness in Recommender Systems<br>Zhongxuan Han, Chaochao Chen, ... (+5 more) | 👻 Ghosted | cs.IR | 5 | 2 years ago |
| 579 | Tackling Instance-Dependent Label Noise with Dynamic Distribution Calibration<br>Manyi Zhang, Yuxin Ren, ... (+2 more) | 👻 Ghosted | cs.LG | 5 | 3 years ago |
| 580 | Secure Tug-of-War (SecTOW): Iterative Defense-Attack Training with Reinforcement Learning for Multimodal Model Security<br>Muzhi Dai, Shixuan Liu, ... (+4 more) | 👻 Ghosted | cs.CR | 4 | 10 months ago |
| 581 | Generating Negative Samples for Multi-Modal Recommendation<br>Yanbiao Ji, Dan Luo, ... (+7 more) | 👻 Ghosted | cs.IR | 4 | 1 year ago |
| 582 | StarStream: Live Video Analytics over Space Networking<br>Miao Zhang, Jiaxing Li, ... (+3 more) | 👻 Ghosted | cs.NI | 4 | 9 months ago |
| 583 | EchoMask: Speech-Queried Attention-based Mask Modeling for Holistic Co-Speech Motion Generation<br>Xiangyue Zhang, Jianfang Li, ... (+4 more) | 👻 Ghosted | cs.GR | 4 | 1 year ago |
| 584 | DualDub: Video-to-Soundtrack Generation via Joint Speech and Background Audio Synthesis<br>Wenjie Tian, Xinfa Zhu, ... (+7 more) | 👻 Ghosted | cs.MM | 4 | 11 months ago |
| 585 | FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing<br>Gaoxiang Cong, Liang Li, ... (+6 more) | 👻 Ghosted | cs.MM | 4 | 1 year ago |
| 586 | CultiVerse: Towards Cross-Cultural Understanding for Paintings with Large Language Model<br>Wei Zhang, Wong Kam-Kwai, ... (+6 more) | 👻 Ghosted | cs.HC | 4 | 2 years ago |
| 587 | MetaCast: A Self-Driven Metaverse Announcer Architecture Based on Quality of Experience Evaluation Model<br>Zhonghao Lin, Haihan Duan, ... (+3 more) | 👻 Ghosted | cs.MM | 4 | 2 years ago |
| 588 | Layout Sequence Prediction From Noisy Mobile Modality<br>Haichao Zhang, Yi Xu, ... (+3 more) | 👻 Ghosted | cs.CV | 4 | 2 years ago |
| 589 | Parameter-Efficient Transfer Learning for Audio-Visual-Language Tasks<br>Hongye Liu, Xianhai Xie, ... (+3 more) | 👻 Ghosted | cs.MM | 4 | 2 years ago |
| 590 | StyleEDL: Style-Guided High-order Attention Network for Image Emotion Distribution Learning<br>Peiguang Jing, Xianyi Liu, ... (+4 more) | 💤 Eternal Rest | cs.CV | 4 | 2 years ago |
| 591 | Cascaded Cross-Modal Transformer for Request and Complaint Detection<br>Nicolae-Catalin Ristea, Radu Tudor Ionescu | 👻 Ghosted | cs.CL | 4 | 2 years ago |
| 592 | A Simple Baseline for Pose Tracking in Videos of Crowded Scenes<br>Li Yuan, Shuning Chang, ... (+7 more) | 👻 Ghosted | cs.CV | 4 | 5 years ago |
| 593 | Identity-Aware Attribute Recognition via Real-Time Distributed Inference in Mobile Edge Clouds<br>Zichuan Xu, Jiangkai Wu, ... (+4 more) | 👻 Ghosted | cs.CV | 4 | 5 years ago |
| 594 | IntersectGAN: Learning Domain Intersection for Generating Images with Multiple Attributes<br>Zehui Yao, Boyan Zhang, ... (+4 more) | 👻 Ghosted | cs.CV | 4 | 6 years ago |
| 595 | BUDA.ART: A Multimodal Content-Based Analysis and Retrieval System for Buddha Statues<br>Benjamin Renoust, Matheus Oliveira Franca, ... (+7 more) | 👻 Ghosted | cs.CV | 4 | 6 years ago |
| 596 | iSPA-Net: Iterative Semantic Pose Alignment Network<br>Jogendra Nath Kundu, Aditya Ganeshan, ... (+3 more) | 🌅 Old Age | cs.CV | 4 | 7 years ago |
| 597 | Temporal Cross-Media Retrieval with Soft-Smoothing<br>David Semedo, João Magalhães | 👻 Ghosted | cs.MM | 4 | 7 years ago |
| 598 | An evaluation of large-scale methods for image instance and class discovery<br>Matthijs Douze, Hervé Jégou, Jeff Johnson | 👻 Ghosted | cs.CV | 4 | 8 years ago |
| 599 | Depth Structure Preserving Scene Image Generation<br>Wendong Zhang, Bingbing Ni, ... (+3 more) | 👻 Ghosted | cs.CV | 4 | 9 years ago |
| 600 | Impact of Three-Dimensional Video Scalability on Multi-View Activity Recognition using Deep Learning<br>Jun-Ho Choi, Manri Cheon, ... (+2 more) | 👻 Ghosted | cs.MM | 4 | 8 years ago |