| 1 | GRiT: A Generative Region-to-text Transformer for Object Understanding · Jialian Wu, Jianfeng Wang, ... (+5 more) | 💤 Eternal Rest | cs.CV | 147 | 3 years ago |
| 2 | ReNoise: Real Image Inversion Through Iterative Noising · Daniel Garibi, Or Patashnik, ... (+3 more) | 💤 Eternal Rest | cs.CV | 110 | 2 years ago |
| 3 | Efficient Image Super-Resolution using Vast-Receptive-Field Attention · Lin Zhou, Haoming Cai, ... (+6 more) | 💤 Eternal Rest | eess.IV | 87 | 3 years ago |
| 4 | FreeInit: Bridging Initialization Gap in Video Diffusion Models · Tianxing Wu, Chenyang Si, ... (+3 more) | 💤 Eternal Rest | cs.CV | 81 | 2 years ago |
| 5 | Global Spectral Filter Memory Network for Video Object Segmentation · Yong Liu, Ran Yu, ... (+5 more) | 💤 Eternal Rest | cs.CV | 44 | 3 years ago |
| 6 | Solving Motion Planning Tasks with a Scalable Generative Model · Yihan Hu, Siqi Chai, ... (+7 more) | 💤 Eternal Rest | cs.RO | 38 | 1 year ago |
| 7 | HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning · Zhecan Wang, Garrett Bingham, ... (+4 more) | 💤 Eternal Rest | cs.CV | 30 | 1 year ago |
| 8 | Do text-free diffusion models learn discriminative visual representations? · Soumik Mukhopadhyay, Matthew Gwilliam, ... (+7 more) | 💤 Eternal Rest | cs.CV | 27 | 2 years ago |
| 9 | LiDAL: Inter-frame Uncertainty Based Active Learning for 3D LiDAR Semantic Segmentation · Zeyu Hu, Xuyang Bai, ... (+5 more) | 💤 Eternal Rest | cs.CV | 23 | 3 years ago |
| 10 | The Surprisingly Straightforward Scene Text Removal Method With Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis · Hyeonsu Lee, Chankyu Choi | 💤 Eternal Rest | cs.CV | 19 | 3 years ago |
| 11 | Unlocking the Potential of Federated Learning: The Symphony of Dataset Distillation via Deep Generative Latents · Yuqi Jia, Saeed Vahidian, ... (+5 more) | 💤 Eternal Rest | cs.LG | 18 | 2 years ago |
| 12 | BAFFLE: A Baseline of Backpropagation-Free Federated Learning · Haozhe Feng, Tianyu Pang, ... (+4 more) | 💤 Eternal Rest | cs.LG | 14 | 3 years ago |
| 13 | AgentAvatar: Disentangling Planning, Driving and Rendering for Photorealistic Avatar Agents · Duomin Wang, Bin Dai, ... (+2 more) | 💤 Eternal Rest | cs.CV | 12 | 2 years ago |
| 14 | CarFormer: Self-Driving with Learned Object-Centric Representations · Shadi Hamdan, Fatma Güney | 💤 Eternal Rest | cs.CV | 12 | 1 year ago |
| 15 | BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues · Sara Sarto, Marcella Cornia, ... (+2 more) | 💤 Eternal Rest | cs.CV | 12 | 1 year ago |
| 16 | AddBiomechanics Dataset: Capturing the Physics of Human Motion at Scale · Keenon Werling, Janelle Kaneda, ... (+16 more) | 💤 Eternal Rest | cs.CV | 9 | 1 year ago |
| 17 | VTC: Improving Video-Text Retrieval with User Comments · Laura Hanu, James Thewlis, ... (+2 more) | 💤 Eternal Rest | cs.CV | 8 | 3 years ago |
| 18 | SPE-Net: Boosting Point Cloud Analysis via Rotation Robustness Enhancement · Zhaofan Qiu, Yehao Li, ... (+4 more) | 💤 Eternal Rest | cs.CV | 8 | 3 years ago |
| 19 | RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN · Huy Phan, Cong Shi, ... (+8 more) | 💤 Eternal Rest | cs.CR | 7 | 3 years ago |
| 20 | Are All Combinations Equal? Combining Textual and Visual Features with Multiple Space Learning for Text-Based Video Retrieval · Damianos Galanopoulos, Vasileios Mezaris | 💤 Eternal Rest | cs.CV | 7 | 3 years ago |
| 21 | Union-set Multi-source Model Adaptation for Semantic Segmentation · Zongyao Li, Ren Togo, ... (+2 more) | 💤 Eternal Rest | cs.CV | 5 | 3 years ago |
| 22 | PoseAugment: Generative Human Pose Data Augmentation with Physical Plausibility for IMU-based Motion Capture · Zhuojun Li, Chun Yu, ... (+2 more) | 💤 Eternal Rest | cs.CV | 4 | 1 year ago |
| 23 | HandDAGT: A Denoising Adaptive Graph Transformer for 3D Hand Pose Estimation · Wencan Cheng, Eunji Kim, Jong Hwan Ko | 💤 Eternal Rest | cs.CV | 3 | 1 year ago |
| 24 | Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning · Jianjie Luo, Jingwen Chen, ... (+5 more) | 💤 Eternal Rest | cs.CV | 2 | 1 year ago |
| 25 | DreamDrone: Text-to-Image Diffusion Models are Zero-shot Perpetual View Generators · Hanyang Kong, Dongze Lian, ... (+2 more) | 💤 Eternal Rest | cs.CV | 1 | 2 years ago |
| 26 | Flatness-aware Sequential Learning Generates Resilient Backdoors · Hoang Pham, The-Anh Ta, ... (+2 more) | 💤 Eternal Rest | cs.LG | 1 | 1 year ago |
| 27 | Differentiable Convex Polyhedra Optimization from Multi-view Images · Daxuan Ren, Haiyi Mei, ... (+4 more) | 💤 Eternal Rest | cs.GR | 1 | 1 year ago |
| 28 | MVTN: A Multiscale Video Transformer Network for Hand Gesture Recognition · Mallika Garg, Debashis Ghosh, Pyari Mohan Pradhan | 💤 Eternal Rest | cs.CV | 1 | 1 year ago |
| 29 | SpeedUpNet: A Plug-and-Play Adapter Network for Accelerating Text-to-Image Diffusion Models · Weilong Chai, DanDan Zheng, ... (+4 more) | 💤 Eternal Rest | cs.CV | 0 | 2 years ago |