Unsupervised Visual Sense Disambiguation for Verbs using Multimodal Embeddings

March 30, 2016 ยท Declared Dead ยท ๐Ÿ› North American Chapter of the Association for Computational Linguistics

๐Ÿฆด CAUSE OF DEATH: Skeleton Repo
Boilerplate only, no real code

Repo contents: 3.5k_verse_gold_image_sense_annotations.csv, README.md, gold_all_final_lemma_image_sense_annotations.csv, motion_verbs.csv, non_motion_verbs.csv, sense_specific_search_engine_queries.csv, verse_visualness_labels.csv

Authors Spandana Gella, Mirella Lapata, Frank Keller arXiv ID 1603.09188 Category cs.CL: Computation & Language Cross-listed cs.CV Citations 54 Venue North American Chapter of the Association for Computational Linguistics Repository https://github.com/spandanagella/verse โญ 13 Last Checked 1 month ago
Abstract
We introduce a new task, visual sense disambiguation for verbs: given an image and a verb, assign the correct sense of the verb, i.e., the one that describes the action depicted in the image. Just as textual word sense disambiguation is useful for a wide range of NLP tasks, visual sense disambiguation can be useful for multimodal tasks such as image retrieval, image description, and text illustration. We introduce VerSe, a new dataset that augments existing multimodal datasets (COCO and TUHOI) with sense labels. We propose an unsupervised algorithm based on Lesk which performs visual sense disambiguation using textual, visual, or multimodal embeddings. We find that textual embeddings perform well when gold-standard textual annotations (object labels and image descriptions) are available, while multimodal embeddings perform well on unannotated images. We also verify our findings by using the textual and multimodal embeddings as features in a supervised setting and analyse the performance of visual sense disambiguation task. VerSe is made publicly available and can be downloaded at: https://github.com/spandanagella/verse.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Computation & Language

๐ŸŒ… ๐ŸŒ… Old Age

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, ... (+6 more)

cs.CL ๐Ÿ› NeurIPS ๐Ÿ“š 166.0K cites 8 years ago

Died the same way โ€” ๐Ÿฆด Skeleton Repo

R.I.P. ๐Ÿฆด Skeleton Repo

Neural Style Transfer: A Review

Yongcheng Jing, Yezhou Yang, ... (+4 more)

cs.CV ๐Ÿ› IEEE TVCG ๐Ÿ“š 828 cites 8 years ago