HuBo-VLM: Unified Vision-Language Model designed for HUman roBOt interaction tasks

August 24, 2023 · Declared Dead · 🏛 arXiv.org

📜 CAUSE OF DEATH: Death by README
Repo has only a README

Repo contents: README.md

Authors: Zichao Dong, Weikun Zhang, Xufeng Huang, Hang Ji, Xin Zhan, Junbo Chen
arXiv ID: 2308.12537
Category: cs.RO (Robotics)
Cross-listed: cs.CV
Citations: 7
Venue: arXiv.org
Repository: https://github.com/dzcgaara/HuBo-VLM (⭐ 7)
Last Checked: 1 month ago
Abstract
Human-robot interaction is an exciting task that aims to guide robots to follow human instructions. Because a large gap lies between human natural language and machine code, end-to-end human-robot interaction models are fairly challenging to build. Further, the visual information received from a robot's sensors is itself a hard language for the robot to perceive. In this work, HuBo-VLM is proposed to tackle perception tasks associated with human-robot interaction, including object detection and visual grounding, with a unified transformer-based vision-language model. Extensive experiments on the Talk2Car benchmark demonstrate the effectiveness of our approach. Code will be made publicly available at https://github.com/dzcgaara/HuBo-VLM.
