PanSR: An Object-Centric Mask Transformer for Panoptic Segmentation

December 13, 2024 · Declared Dead · 🏛 arXiv.org

Repo contents: README.md

Authors: Lojze Žust, Matej Kristan
arXiv ID: 2412.10589
Category: cs.CV (Computer Vision)
Citations: 1
Venue: arXiv.org
Repository: https://github.com/lojzezust/PanSR (⭐ 8)
Last Checked: 1 month ago

Abstract

Panoptic segmentation is a fundamental task in computer vision and a crucial component for perception in autonomous vehicles. Recent mask-transformer-based methods achieve impressive performance on standard benchmarks but face significant challenges with small objects, crowded scenes, and scenes exhibiting a wide range of object scales. We identify several fundamental shortcomings of the current approaches: (i) the query proposal generation process is biased towards larger objects, resulting in missed smaller objects; (ii) initially well-localized queries may drift to other objects, resulting in missed detections; (iii) spatially well-separated instances may be merged into a single mask, causing inconsistent and false scene interpretations. To address these issues, we rethink the individual components of the network and its supervision, and propose PanSR, a novel method for panoptic segmentation. PanSR effectively mitigates instance merging, enhances small-object detection, and increases performance in crowded scenes, delivering a notable +3.4 PQ improvement over the state of the art on the challenging LaRS benchmark, while reaching state-of-the-art performance on Cityscapes. The code and models will be publicly available at https://github.com/lojzezust/PanSR.
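The abstract reports its gain in PQ (panoptic quality) without defining it. As a rough illustration only, the sketch below computes PQ for a single class using the commonly used definition: the sum of IoUs over true-positive matches divided by |TP| + 0.5·|FP| + 0.5·|FN|, where a predicted and a ground-truth segment match only if their IoU exceeds 0.5. The function name, its parameters, and the example numbers are hypothetical; this is not the PanSR code or the official LaRS or Cityscapes evaluation script, and benchmark PQ is additionally averaged over classes.

```python
def panoptic_quality(candidate_ious, num_pred, num_gt, iou_threshold=0.5):
    """Sketch of per-class panoptic quality (PQ).

    candidate_ious: IoU of each candidate (prediction, ground-truth) pair,
                    at most one pair per segment (hypothetical input format).
    num_pred:       total number of predicted segments of this class.
    num_gt:         total number of ground-truth segments of this class.
    """
    # Only pairs with IoU strictly above the threshold count as true positives.
    tp_ious = [iou for iou in candidate_ious if iou > iou_threshold]
    tp = len(tp_ious)
    fp = num_pred - tp  # predicted segments left unmatched
    fn = num_gt - tp    # ground-truth segments left unmatched

    denominator = tp + 0.5 * fp + 0.5 * fn
    if denominator == 0:
        # No segments at all for this class; returning 0.0 is a convention
        # chosen for this sketch.
        return 0.0
    return sum(tp_ious) / denominator


# Illustrative numbers only: 4 predictions, 3 ground-truth segments,
# three candidate pairs of which two exceed the 0.5 IoU threshold.
pq = panoptic_quality([0.9, 0.7, 0.4], num_pred=4, num_gt=3)
print(f"PQ = {pq:.3f}")
```

Under this definition, a missed small object adds a false negative and a merged pair of instances typically adds at least one false negative while lowering the IoU of the surviving match, which is why the failure modes listed in the abstract show up directly in PQ.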

📜 Similar Papers

In the same crypt — Computer Vision

🌅 Old Age

Deep Residual Learning for Image Recognition

Kaiming He, Xiangyu Zhang, ... (+2 more)

cs.CV 🏛 CVPR 📚 220.4K cites 10 years ago

🌅 Old Age

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, Kaiming He, ... (+2 more)

cs.CV 🏛 IEEE TPAMI 📚 70.4K cites 10 years ago

R.I.P. 👻 Ghosted

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, Santosh Divvala, ... (+2 more)

cs.CV 🏛 CVPR 📚 43.4K cites 10 years ago

🌅 Old Age

SSD: Single Shot MultiBox Detector

Wei Liu, Dragomir Anguelov, ... (+5 more)

cs.CV 🏛 ECCV 📚 33.8K cites 10 years ago

🌅 Old Age

Squeeze-and-Excitation Networks

Jie Hu, Li Shen, ... (+3 more)

cs.CV 🏛 CVPR 📚 32.3K cites 8 years ago

R.I.P. 👻 Ghosted

Rethinking the Inception Architecture for Computer Vision

Christian Szegedy, Vincent Vanhoucke, ... (+3 more)

cs.CV 🏛 CVPR 📚 30.2K cites 10 years ago

Died the same way — 📜 Death by README

R.I.P. 📜 Death by README

Momentum Contrast for Unsupervised Visual Representation Learning

Kaiming He, Haoqi Fan, ... (+3 more)

cs.CV 🏛 CVPR 📚 14.3K cites 6 years ago

R.I.P. 📜 Death by README

LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model

Peng Gao, Jiaming Han, ... (+10 more)

cs.CV 🏛 arXiv 📚 716 cites 2 years ago

R.I.P. 📜 Death by README

Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach

Lei Chen, Le Wu, ... (+3 more)

cs.IR 🏛 AAAI 📚 609 cites 6 years ago

R.I.P. 📜 Death by README

Diffusion Models for Medical Image Analysis: A Comprehensive Survey

Amirhossein Kazerouni, Ehsan Khodapanah Aghdam, ... (+5 more)

eess.IV 🏛 MedIA 📚 599 cites 3 years ago