Using Image Captions and Multitask Learning for Recommending Query Reformulations

European Conference on Information Retrieval (ECIR)

Published April 14, 2020

Gaurav Verma, Vishwa Vinay, Sahil Bansal*, Shashank Oberoi*, Makkunda Sharma*, Prakhar Gupta


Interactive search sessions often contain multiple queries, where the user submits a reformulated version of the previous query in response to the original results. We aim to enhance the query recommendation experience for a commercial image search engine. Our proposed methodology incorporates current state-of-the-art practices from the relevant literature: generation-based sequence-to-sequence models that capture session context, and a multitask architecture that simultaneously optimizes the ranking of results. We extend this setup by training such a model with the captions of clicked images as the target, instead of the subsequent query within the session. Since these captions tend to be linguistically richer, the reformulation mechanism can be seen as assisting users in constructing more descriptive queries. In addition, via the use of a pairwise loss for the secondary ranking task, we show that the generated reformulations are more diverse.
