Deep learning image classifiers work well when the test image is clean and similar to what the model has seen before. Adversarial attacks, however, compute a small amount of noise and add it to an image so that the model misclassifies it, while a human notices no change at all. Prior work proposes several defenses, such as adversarial training, model architecture changes, detection methods, and adversarial purification, but these still have limitations like high cost, complexity, and limited robustness.
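To make "calculate noise and add it to an image" concrete, here is a minimal sketch of a gradient-sign attack (in the style of FGSM) against a toy logistic-regression "classifier". The weight vector, image size, and epsilon are all illustrative assumptions, not anything from the paper; real attacks backpropagate through a deep network instead of using this closed-form gradient.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, eps=0.15):
    """Gradient-sign attack on a logistic-regression 'classifier'.

    For the logistic loss, the gradient w.r.t. the input is (p - y) * w,
    so the attack nudges every pixel by +/- eps in the loss-increasing
    direction, then clips back to the valid pixel range [0, 1].
    """
    p = sigmoid(w @ x)
    grad = (p - y) * w
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)

# Toy 16-pixel "image" and a fixed weight vector standing in for a trained model.
rng = np.random.default_rng(0)
w = rng.normal(size=16)
x = rng.uniform(0.2, 0.8, size=16)
y = 1.0 if w @ x > 0 else 0.0  # use the model's own output as the label

x_adv = fgsm_perturb(x, y, w)
# Every pixel moved by at most eps, yet the model's score shifts
# toward the wrong class -- the change is imperceptible per pixel.
print("max per-pixel change:", np.max(np.abs(x_adv - x)))
```

The key property is that the perturbation is bounded per pixel (here by `eps`), which is why humans do not notice it even when it is enough to flip the classifier's decision.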
Another approach to defending against adversarial attacks is proposed in the paper “STRAP-ViT: Segregated Tokens with Randomized Transformations for Defense against Adversarial Patches in ViTs.” First, it localizes the patch by identifying the tokens with the highest entropy, that is, the ones that change most compared to a clean image. Once the attacked tokens are detected, the system cleans them without removing them: it applies a randomized combination of mathematical transformations only to those tokens to neutralize the adversarial noise. The transformed, neutralized tokens are then recombined with the rest of the untouched clean tokens and sent to the vision transformer for prediction. Because only the compromised tokens are modified, the defense adds minimal cost and requires no extra training.
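The pipeline above can be sketched as follows. Everything here is a simplified illustration under stated assumptions: `token_entropy` uses a pixel-histogram entropy as a stand-in for the paper's localization score, the quantization/mean-blur choices stand in for its randomized transformations, and the function and parameter names are invented for this sketch.

```python
import numpy as np

def token_entropy(tok, bins=16):
    """Shannon entropy of a token's pixel histogram (proxy for 'noisiness')."""
    hist, _ = np.histogram(tok, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def purify_tokens(img, patch=8, k=4, rng=None):
    """STRAP-ViT-style purification sketch:
    1. split the image into non-overlapping tokens,
    2. flag the k tokens with the highest entropy as likely patched,
    3. apply a randomly chosen transformation (here: coarse quantization
       or mean blur) only to the flagged tokens,
    4. reassemble -- clean tokens pass through untouched.
    """
    rng = rng or np.random.default_rng()
    tokens = {}
    for i in range(0, img.shape[0], patch):
        for j in range(0, img.shape[1], patch):
            tokens[(i, j)] = img[i:i+patch, j:j+patch]
    ranked = sorted(tokens, key=lambda ij: token_entropy(tokens[ij]), reverse=True)
    flagged = set(ranked[:k])
    out = img.copy()
    for (i, j) in flagged:
        tok = tokens[(i, j)]
        if rng.random() < 0.5:
            tok = np.round(tok * 4) / 4           # coarse quantization
        else:
            tok = np.full_like(tok, tok.mean())   # mean blur
        out[i:i+patch, j:j+patch] = tok
    return out, flagged

# Flat gray image with a high-frequency "adversarial patch" in one corner.
rng = np.random.default_rng(1)
img = np.full((32, 32), 0.5)
img[:8, :8] = rng.uniform(0.0, 1.0, size=(8, 8))
purified, flagged = purify_tokens(img, patch=8, k=1, rng=rng)
print(flagged)  # the noisy corner ranks highest in entropy
```

The purified image would then be re-tokenized and passed to the ViT as usual; only the flagged tokens differ from the input, which is why the approach needs no retraining of the transformer itself.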