Most security thinking still guards the doors and the data. The attack that should worry you walks past both and rewrites the model itself, quietly, while every dashboard stays green.
We have spent a decade learning to defend the inputs to artificial intelligence and almost no time defending the model itself. Weight tampering is the breach you will not detect, because the system keeps answering and the answers look fine. The only durable defence is a record of what the model was and what it did, signed before it acts and verifiable without trusting the vendor.
Originally published on mickai.co.uk. This is a cross-post; the canonical version, with the full body, footnotes and references, lives on the mickai.co.uk article page.

Top comments (0)