Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models

2016 · 358 citations · Preprint · Green Open Access

Authors

Ashwin K. Vijayakumar

Michael Cogswell

Ramprasath R. Selvaraju

Abstract

Neural sequence models are widely used to model time-series data. Equally ubiquitous is the usage of beam search (BS) as an approximate inference algorithm to decode output sequences from these models. BS explores the search space in a greedy left-right fashion retaining only the top-B candidates - resulting in sequences that differ only slightly from each other. Producing lists of nearly identical sequences is not only computationally wasteful but also typically fails to capture the inherent ambiguity of complex AI tasks. To overcome this problem, we propose Diverse Beam Search (DBS), an alternative to BS that decodes a list of diverse outputs by optimizing for a diversity-augmented objective. We observe that our method finds better top-1 solutions by controlling for the exploration and exploitation of the search space - implying that DBS is a better search algorithm. Moreover, these gains are achieved with minimal computational or memory overhead as compared to beam search. To demonstrate the broad applicability of our method, we present results on image captioning, machine translation and visual question generation using both standard quantitative metrics and qualitative human studies. Further, we study the role of diversity for image-grounded language generation tasks as the complexity of the image changes. We observe that our method consistently outperforms BS and previously proposed techniques for diverse decoding from neural sequence models.
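
The abstract does not spell out the diversity-augmented objective, so the following is only a minimal sketch of one way such a search could look, not the authors' implementation. It assumes the beam is split into groups that are decoded one after another at each time step, and that a later group is penalized for each token already chosen by an earlier group at that step (a Hamming-style penalty). The names `diverse_beam_search`, `step_log_probs`, `num_groups`, `diversity_strength` and `toy_model` are hypothetical and chosen for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Hypothesis = Tuple[float, Tuple[int, ...]]  # (cumulative log-prob, token ids)


def diverse_beam_search(
    step_log_probs: Callable[[Tuple[int, ...]], Sequence[float]],
    steps: int,
    beam_width: int,
    num_groups: int,
    diversity_strength: float,
) -> List[List[Hypothesis]]:
    """Sketch of a group-wise diverse beam search (assumed formulation).

    step_log_probs(prefix) is a hypothetical model interface returning one
    log-probability per vocabulary token for the next position.
    """
    if beam_width % num_groups != 0:
        raise ValueError("beam_width must be divisible by num_groups")
    group_size = beam_width // num_groups

    # Every group starts from the empty sequence with log-prob 0.
    groups: List[List[Hypothesis]] = [[(0.0, ())] for _ in range(num_groups)]

    for _ in range(steps):
        # How often each token was picked at this step by earlier groups.
        token_counts: Dict[int, int] = {}
        for g in range(num_groups):
            candidates = []
            for score, tokens in groups[g]:
                for tok, lp in enumerate(step_log_probs(tokens)):
                    raw = score + lp
                    # Assumption: Hamming-style penalty on reused tokens.
                    penalty = diversity_strength * token_counts.get(tok, 0)
                    candidates.append((raw - penalty, raw, tokens + (tok,)))
            # Rank by the penalized score; carry forward the raw log-prob
            # (a simplification made for this sketch).
            candidates.sort(key=lambda c: c[0], reverse=True)
            kept = candidates[:group_size]
            groups[g] = [(raw, toks) for _, raw, toks in kept]
            for _, _, toks in kept:
                token_counts[toks[-1]] = token_counts.get(toks[-1], 0) + 1
    return groups


def toy_model(prefix: Tuple[int, ...]) -> Sequence[float]:
    """Hypothetical 3-token model that ignores the prefix."""
    return [math.log(0.6), math.log(0.3), math.log(0.1)]


if __name__ == "__main__":
    for strength in (0.0, 5.0):
        result = diverse_beam_search(toy_model, 2, 2, 2, strength)
        print(strength, [hyp[1] for group in result for hyp in group])
```

With `diversity_strength` set to 0 every group behaves like an independent, identical beam search, which illustrates the near-duplicate outputs the abstract describes. A positive value pushes later groups toward tokens the earlier groups did not pick.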

Topics & Keywords

Multimodal Machine Learning Applications

Topic Modeling

Natural Language Processing Techniques

UN Sustainable Development Goals

Quality Education

Publication Details

Published in: arXiv (Cornell University)

DOI: 10.48550/arxiv.1610.02424