A study on the Interpretability of Neural Retrieval Models using DeepSHAP

July 15, 2019 · Declared Dead · 🏛 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Zeon Trevor Fernando, Jaspreet Singh, Avishek Anand
arXiv ID: 1907.06484
Category: cs.IR (Information Retrieval)
Cross-listed: cs.LG
Citations: 77
Venue: Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Last Checked: 3 months ago

Abstract

A recent trend in IR has been the use of neural networks to learn retrieval models for text-based ad hoc search. While various approaches and architectures have yielded significantly better performance than traditional retrieval models such as BM25, it is still difficult to understand exactly why a document is relevant to a query. In the ML community, several approaches for explaining decisions made by deep neural networks have been proposed, including DeepSHAP, which modifies the DeepLIFT algorithm to estimate the relative importance (Shapley values) of input features for a given decision by comparing the activations in the network for a given image against the activations caused by a reference input. In image classification, the reference input tends to be a plain black image. While DeepSHAP has been well studied for image classification tasks, it remains to be seen how we can adapt it to explain the output of Neural Retrieval Models (NRMs). In particular, what is a good "black" image in the context of IR? In this paper, we explored various reference input document construction techniques. Additionally, we compared the explanations generated by DeepSHAP to those of LIME (a model-agnostic approach) and found that the explanations differ considerably. Our study raises concerns regarding the robustness and accuracy of explanations produced for NRMs. With this paper, we aim to shed light on interesting problems surrounding interpretability in NRMs and highlight areas of future work.
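
To make the role of the reference input concrete, the following is a minimal sketch, not the authors' method and not DeepSHAP itself. It computes exact Shapley values by brute-force enumeration for a hypothetical toy relevance scorer, where "removing" a document token means swapping in the token at the same position of a reference document. The scorer `toy_score`, the query, the documents, and both reference documents are assumptions made up for illustration. The sketch shows only that the attribution for the same document changes with the choice of reference.

```python
from itertools import combinations
from math import factorial

# Hypothetical query, invented for this illustration.
QUERY = {"neural", "retrieval"}


def toy_score(doc_tokens, query_terms=QUERY):
    """Toy relevance scorer (an assumption, not a real NRM):
    one point per query-term match, plus 0.5 per adjacent pair of matches."""
    matches = [tok in query_terms for tok in doc_tokens]
    adjacent = sum(1 for a, b in zip(matches, matches[1:]) if a and b)
    return sum(matches) + 0.5 * adjacent


def shapley_values(score_fn, doc_tokens, reference_tokens):
    """Exact Shapley values per token position, by enumerating all coalitions.
    Positions outside a coalition take the reference document's token."""
    n = len(doc_tokens)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Standard Shapley weight for a coalition of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                present = set(subset)
                without_i = [
                    doc_tokens[j] if j in present else reference_tokens[j]
                    for j in range(n)
                ]
                with_i = list(without_i)
                with_i[i] = doc_tokens[i]
                values[i] += weight * (score_fn(with_i) - score_fn(without_i))
    return values


doc = ["neural", "retrieval", "models", "work"]

# Reference 1: an "empty" document of padding tokens (loosely, the black image).
pad_reference = ["<pad>"] * len(doc)
# Reference 2: a hypothetical document built from other collection terms.
collection_reference = ["the", "retrieval", "of", "a"]

phi_pad = shapley_values(toy_score, doc, pad_reference)
phi_collection = shapley_values(toy_score, doc, collection_reference)

print("padding reference:   ", phi_pad)
print("collection reference:", phi_collection)
```

In both cases the attributions sum to the difference between the score of the document and the score of the reference, which is the efficiency property of Shapley values. With the padding reference, the two matching tokens share the credit equally. With the collection-based reference, the token `retrieval` gets zero attribution because the reference already contains it at that position. Enumeration is exponential in document length, so this approach is only practical for very short toy inputs.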