Many machine learning algorithms are vulnerable to almost imperceptible perturbations of their inputs. So far it has been unclear how much risk adversarial perturbations carry for the safety of real-world machine learning applications, because most methods used to generate such perturbations rely either on detailed model information (gradient-based attacks) or on confidence scores such as class probabilities (score-based attacks), neither of which is available in most real-world scenarios. In many such cases one currently has to fall back on transfer-based attacks, which rely on cumbersome substitute models, need access to the training data and can be defended against. Here we emphasise the importance of attacks that rely solely on the final model decision. Such decision-based attacks are (1) applicable to real-world black-box models such as autonomous cars, (2) need less knowledge and are easier to apply than transfer-based attacks, and (3) are more robust to simple defences than gradient- or score-based attacks. Previous attacks in this category were limited to simple models or simple datasets. Here we introduce the Boundary Attack, a decision-based attack that starts from a large adversarial perturbation and then seeks to reduce the perturbation while staying adversarial. The attack is conceptually simple, requires close to no hyperparameter tuning, does not rely on substitute models and is competitive with the best gradient-based attacks in standard computer vision tasks like ImageNet. We apply the attack on two black-box algorithms from Clarifai.com. The Boundary Attack in particular, and the class of decision-based attacks in general, open new avenues to study the robustness of machine learning models and raise new questions regarding the safety of deployed machine learning systems. An implementation of the attack is available as part of Foolbox at https://github.com/bethgelab/foolbox .
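The core idea described above — start from an adversarial input and repeatedly shrink its distance to the original while querying only the model's final decision — can be illustrated with a minimal sketch. Everything here is a simplification for exposition, not the reference implementation from Foolbox: the toy `predict_label` classifier, the function names, and the step sizes `delta` and `eps` are all assumptions; the real attack additionally adapts its step sizes online.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_label(x):
    # Toy stand-in for a black-box classifier: we observe ONLY the final
    # decision (hypothetical rule: label 1 iff the mean pixel exceeds 0.5).
    return int(x.mean() > 0.5)

def is_adversarial(x, original_label):
    return predict_label(x) != original_label

def boundary_attack(original, starting_point, steps=500, delta=0.1, eps=0.1):
    """Minimal Boundary Attack sketch: random walk along the decision
    boundary that shrinks the distance to the original input while every
    accepted iterate stays adversarial."""
    label = predict_label(original)
    x = starting_point.copy()
    assert is_adversarial(x, label), "starting point must be adversarial"
    for _ in range(steps):
        dist = np.linalg.norm(x - original)
        # 1. Orthogonal step: random direction, re-projected onto the sphere
        #    of the current distance around the original.
        direction = rng.normal(size=x.shape)
        direction *= delta * dist / np.linalg.norm(direction)
        diff = (x + direction) - original
        candidate = original + diff * dist / np.linalg.norm(diff)
        # 2. Small step towards the original image (reduces the distance).
        candidate = candidate + eps * (original - candidate)
        candidate = np.clip(candidate, 0.0, 1.0)
        # 3. Accept the candidate only if it is still misclassified.
        if is_adversarial(candidate, label):
            x = candidate
    return x

original = np.zeros(16)        # classified as 0 by the toy model
starting = np.ones(16)         # classified as 1, i.e. adversarial
adv = boundary_attack(original, starting)
```

With the toy decision rule the iterates converge towards the boundary where the mean pixel value equals 0.5, so the final perturbation is markedly smaller than the starting one even though no gradients or scores were ever used.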