Search for a command to run...
Neural sequence models are widely used to model time-series data. Equally\nubiquitous is the usage of beam search (BS) as an approximate inference\nalgorithm to decode output sequences from these models. BS explores the search\nspace in a greedy left-right fashion retaining only the top-B candidates -\nresulting in sequences that differ only slightly from each other. Producing\nlists of nearly identical sequences is not only computationally wasteful but\nalso typically fails to capture the inherent ambiguity of complex AI tasks. To\novercome this problem, we propose Diverse Beam Search (DBS), an alternative to\nBS that decodes a list of diverse outputs by optimizing for a\ndiversity-augmented objective. We observe that our method finds better top-1\nsolutions by controlling for the exploration and exploitation of the search\nspace - implying that DBS is a better search algorithm. Moreover, these gains\nare achieved with minimal computational or memory over- head as compared to\nbeam search. To demonstrate the broad applicability of our method, we present\nresults on image captioning, machine translation and visual question generation\nusing both standard quantitative metrics and qualitative human studies.\nFurther, we study the role of diversity for image-grounded language generation\ntasks as the complexity of the image changes. We observe that our method\nconsistently outperforms BS and previously proposed techniques for diverse\ndecoding from neural sequence models.\n