DiffuSAM: Diffusion Guided Zero-Shot Object Grounding for Remote Sensing Imagery
April 20, 2026 · Grace Period · ICLR 2026 ML4RS Workshop
Authors
Geet Sethi, Panav Shah, Ashutosh Gandhe, Soumitra Darshan Nayak
arXiv ID
2604.18201
Category
cs.CV: Computer Vision
Cross-listed
cs.LG
Citations
0
Venue
ICLR 2026 ML4RS Workshop
Abstract
Diffusion models have emerged as powerful tools for a wide range of vision tasks, including text-guided image generation and editing. In this work, we explore their potential for object grounding in remote sensing imagery. We propose a hybrid pipeline that integrates diffusion-based localization cues with state-of-the-art segmentation models such as RemoteSAM and SAM3 to obtain more accurate bounding boxes. By leveraging the complementary strengths of generative diffusion models and foundational segmentation models, our approach enables robust and adaptive object localization across complex scenes. Experiments demonstrate that our pipeline significantly improves localization performance, achieving over a 14% increase in Acc@0.5 compared to existing state-of-the-art methods.
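For readers unfamiliar with the metric, Acc@0.5 in grounding tasks is commonly the fraction of predicted boxes whose intersection-over-union (IoU) with the ground-truth box reaches 0.5. The abstract does not spell out the evaluation protocol, so the following is only a minimal sketch under assumptions: one predicted box per query, boxes given as `(x_min, y_min, x_max, y_max)`, and a hit counted when IoU is at least the threshold. The names `iou` and `acc_at_iou` are hypothetical and not taken from the paper.

```python
from typing import Sequence, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def acc_at_iou(preds: Sequence[Box], gts: Sequence[Box],
               threshold: float = 0.5) -> float:
    """Fraction of predictions with IoU >= threshold (assumed convention)."""
    if len(preds) != len(gts):
        raise ValueError("preds and gts must have the same length")
    if not preds:
        return 0.0
    hits = sum(iou(p, g) >= threshold for p, g in zip(preds, gts))
    return hits / len(preds)


if __name__ == "__main__":
    # Toy boxes for illustration only; not data from the paper.
    preds = [(0, 0, 10, 10), (0, 0, 10, 10)]
    gts = [(0, 0, 10, 10), (5, 5, 15, 15)]
    print(acc_at_iou(preds, gts))
```

In the toy case the first prediction matches its ground truth exactly and the second overlaps it only slightly, so one of the two predictions counts as a hit.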
Similar Papers
In the same crypt: Computer Vision
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
SSD: Single Shot MultiBox Detector
Squeeze-and-Excitation Networks
Fast R-CNN