Generating Easy-to-Understand Referring Expressions for Target Identifications

November 29, 2018 ยท Entered Twilight ยท ๐Ÿ› arXiv.org

๐ŸŒ… TWILIGHT: Old Age
Predates the code-sharing era โ€” a pioneer of its time

"Last commit was 6.0 years ago (โ‰ฅ5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, LICENSE.md, README.md, config.py, demo, eval_comprehension.py, eval_generation.py, misc, models, prepro.py, pyutils, rerank_generated_captions.py, scripts, train.py

Authors Mikihiro Tanaka, Takayuki Itamochi, Kenichi Narioka, Ikuro Sato, Yoshitaka Ushiku, Tatsuya Harada arXiv ID 1811.12104 Category cs.CV: Computer Vision Citations 1 Venue arXiv.org Repository https://github.com/mikittt/easy-to-understand-REG โญ 18 Last Checked 2 months ago
Abstract
This paper addresses the generation of referring expressions that not only refer to objects correctly but also let humans find them quickly. As a target becomes relatively less salient, identifying referred objects itself becomes more difficult. However, the existing studies regarded all sentences that refer to objects correctly as equally good, ignoring whether they are easily understood by humans. If the target is not salient, humans utilize relationships with the salient contexts around it to help listeners to comprehend it better. To derive this information from human annotations, our model is designed to extract information from the target and from the environment. Moreover, we regard that sentences that are easily understood are those that are comprehended correctly and quickly by humans. We optimized this by using the time required to locate the referred objects by humans and their accuracies. To evaluate our system, we created a new referring expression dataset whose images were acquired from Grand Theft Auto V (GTA V), limiting targets to persons. Experimental results show the effectiveness of our approach. Our code and dataset are available at https://github.com/mikittt/easy-to-understand-REG.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Computer Vision