Adversarial ML — Evasion Attacks

A perturbation too small for a human to see can flip a model from "panda, 58%" to "gibbon, 99%". Evasion attacks nudge an input across the model's decision boundary at inference time: FGSM takes a single step along the sign of the loss gradient with respect to the input, PGD repeats smaller steps while projecting back into a fixed perturbation budget, and the result often transfers to a model you never touched. Here is how taking the gradient with respect to the input, rather than the weights, works, and why robustness is an arms race.
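To make the one-step versus many-step distinction concrete, here is a minimal sketch on a toy logistic-regression classifier, using only the Python standard library. The model, its weights, the input, and the budget `eps` are all illustrative assumptions, not taken from any real system; an image classifier would compute the same input gradient with an autodiff framework instead of by hand.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def input_gradient(w, b, x, y):
    """Gradient of the cross-entropy loss with respect to the input x.

    For logistic regression this has the closed form (p - y) * w.
    """
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """One step of size eps along the sign of the input gradient."""
    g = input_gradient(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

def pgd(w, b, x, y, eps, step, iters):
    """Repeated small sign-gradient steps, each followed by a projection
    back into the L-infinity ball of radius eps around the original x."""
    x_adv = list(x)
    for _ in range(iters):
        g = input_gradient(w, b, x_adv, y)
        x_adv = [xa + step * sign(gi) for xa, gi in zip(x_adv, g)]
        x_adv = [min(max(xa, xo - eps), xo + eps) for xa, xo in zip(x_adv, x)]
    return x_adv

# Hypothetical model and input, chosen only for illustration.
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.3, -0.2, 0.1], 1

print("clean:", predict(w, b, x))
print("FGSM: ", predict(w, b, fgsm(w, b, x, y, eps=0.3)))
print("PGD:  ", predict(w, b, pgd(w, b, x, y, eps=0.3, step=0.1, iters=10)))
```

On a linear model like this one the gradient sign never changes, so PGD ends up at the same corner of the budget that FGSM reaches in one step. The iterative version tends to matter on nonlinear models, where the gradient direction shifts as the input moves.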
