The liar's dividend has a second payout, and devs helped build it
TL;DR: The "liar's dividend" isn't just about faking things. It's about claiming real things are fake. Detection infrastructure, the very thing we built to fight deepfakes, is now being used as cover. This is a systems design problem as much as a machine learning one.
I've been sitting with a Forbes piece on digital forensics and deepfakes for a few days, and the part that stuck wasn't the forensics. It was a phrase: "the liar's dividend's second payout."
The first payout, if you haven't heard the term, comes from Chesney and Citron's 2019 paper on deepfakes and democracy. The idea is simple and brutal: once people know synthetic media exists, a bad actor can claim any real, damaging media is fake. You don't need to make a convincing deepfake. You just need enough public doubt to muddy the water.
The second payout is what we built next. And I mean "we" literally — developers, ML engineers, product teams. We built detection tools. Classification APIs. Real-time flagging pipelines. And in doing so, we handed the liars a new prop.
How the escape hatch works
Consider the logic a bad actor now has available:
```python
def deniability_play(incriminating_media: bool,
                     deepfake_awareness_high: bool,
                     detector_confidence: float) -> list[str]:
    # No real classifier returns 100% certainty, so the last condition
    # is effectively always true. That's the escape hatch.
    if incriminating_media and deepfake_awareness_high and detector_confidence < 1.0:
        return ["claim 'this is AI-generated'",
                "point to the ambiguous classifier output as 'proof'",
                "wait for the news cycle to move on"]
    return []
```
This isn't hypothetical. In 2023, two days before Slovakia's parliamentary election, an audio clip circulated in which a candidate allegedly discussed rigging the vote. The candidate's party called it AI-generated. Analysts were split. The election happened before anyone reached consensus.
That's the second payout: the detection ecosystem itself becomes the alibi. A shrug from a classifier is now a press release.
What I actually see when I run stuff through detection
I use AI or Not when something looks off to me — it handles images, video, and audio, which covers most of what circulates on social platforms. The output is a confidence score, not a verdict. That matters.
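To make that concrete, here's roughly what consuming a score like that looks like. The client, endpoint, and field names below are hypothetical stand-ins, not AI or Not's actual API:

```python
# Hypothetical detection client -- an illustrative stand-in, not a real API.
import requests

def ai_likelihood(media_url: str) -> float:
    """Return the model's probability that the media is AI-generated."""
    resp = requests.post(
        "https://detector.example.com/v1/classify",  # placeholder endpoint
        json={"url": media_url},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["ai_probability"]             # e.g. 0.73

score = ai_likelihood("https://example.com/suspect_clip.mp4")
print(f"{score:.0%} likely AI-generated")  # a score, not a verdict
```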
A 73% "likely AI" rating on a clip is meaningful signal. It is not a court finding. The problem is that a 73% rating is also something a bad actor can screenshot and frame as "even the detectors aren't sure."
This isn't a flaw in AI or Not specifically. It's a fundamental property of probabilistic classification. Every detection system that produces a confidence score below 100% will have that score weaponized by someone. We built the weapon while trying to build the shield.
The four things I'd do differently (as a builder)
If I were shipping something in this space today, here's where I'd change my assumptions:
Design for legal weight, not just accuracy. A 92% confidence score means nothing in a courtroom without a chain of custody, a known model version, and a documented methodology. If your output might ever be used as evidence, treat it that way from day one — not as an afterthought. (A sketch of what such a record might carry follows this list.)
Log model provenance explicitly. Which version of the detector flagged this? What training data was it exposed to? These questions matter the moment someone disputes a finding in public. Most APIs I've worked with don't surface this at all.
Build in uncertainty communication by default. Instead of a single score, surface a distribution. "This result falls in a range where the model produces false positives 18% of the time under these image conditions." Harder to misquote.
Think about the adversarial UI, not just the adversarial input. We spend a lot of time thinking about adversarial examples that fool detectors. We spend almost no time thinking about how bad actors will present detector output to audiences who don't understand what it means.
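To make the first three items concrete, here's a minimal sketch of what an evidence-grade detection record might carry. Everything here, from the field names to the record shape, is a hypothetical illustration, not an existing standard or API:

```python
# Hypothetical evidence-grade detection record -- a sketch, not a standard.
from dataclasses import dataclass, field
from datetime import datetime, timezone
import hashlib

@dataclass(frozen=True)
class DetectionRecord:
    media_sha256: str        # ties the finding to exactly one file
    model_name: str          # which detector produced this
    model_version: str       # pin the exact version that ran
    training_data_id: str    # identifier for the training corpus snapshot
    score: float             # point estimate, e.g. 0.73
    fp_rate_in_band: float   # measured false-positive rate for inputs like this
    methodology_url: str     # documented procedure, published up front
    analyzed_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

def record_for(media_bytes: bytes, score: float) -> DetectionRecord:
    return DetectionRecord(
        media_sha256=hashlib.sha256(media_bytes).hexdigest(),
        model_name="example-detector",          # placeholder
        model_version="2024.06.1",              # placeholder
        training_data_id="corpus-snapshot-41",  # placeholder
        score=score,
        fp_rate_in_band=0.18,  # measured in this score band, these conditions
        methodology_url="https://example.com/methodology",  # placeholder
    )
```

The point of carrying `fp_rate_in_band` next to the score is that "73% likely AI" becomes "a result in a band where the model is wrong 18% of the time under these conditions", which is much harder to screenshot as "even the detectors aren't sure."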
The forensics paradox
Here's the thing about digital forensics being the "only sure answer" to deepfakes: it requires a trusted institution to perform it, a trusted chain of custody for the media, and a public that believes the institution. All three of those are eroding simultaneously.
A forensic finding from a university lab means less when half your audience thinks universities are politically captured. A chain of custody argument lands differently when the platform hosting the media is actively in a political fight.
I'm not saying detection tools are useless — I keep using AI or Not because the signal is real and it's gotten my antenna up on things I would have scrolled past. But I've started thinking of detection as one input into a much larger trust problem, not as a solution to it.
The liar's dividend was always about epistemics, not technology. We built better detectors and handed the epistemics problem a new set of props.
What actually changes the calculus
A few things that seem underbuilt relative to the detection side:
- Provenance standards. The C2PA spec attaches cryptographic provenance to media at capture time. If the camera signs the frame and the signature breaks on edit, that's a different kind of evidence than a classifier score. It's not widespread yet, but it's the right direction. (A minimal sketch of the signing idea follows this list.)
- Legal frameworks for false claims of AI generation. Right now there's almost no cost to wrongly claiming something is a deepfake. A few jurisdictions are looking at this; none have moved fast enough.
- Adversarial red-teaming of the human layer. We red-team models constantly. We almost never red-team how users and journalists will misread or be manipulated by model output.
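For a sense of why capture-time signatures are a different kind of evidence, here's a minimal sketch of the sign-then-verify idea using plain Ed25519 keys. This illustrates the principle only, not the actual C2PA manifest format:

```python
# Minimal sketch of capture-time signing -- the C2PA idea in miniature,
# not the real C2PA manifest format or toolchain.
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey, Ed25519PublicKey)
from cryptography.exceptions import InvalidSignature

# At capture time: the camera signs the frame bytes with its private key.
camera_key = Ed25519PrivateKey.generate()
frame = b"...raw frame bytes..."
signature = camera_key.sign(frame)

# Later: anyone holding the camera's public key can check integrity.
public_key = camera_key.public_key()

def is_untouched(media: bytes, sig: bytes, pub: Ed25519PublicKey) -> bool:
    try:
        pub.verify(sig, media)  # raises if media changed after signing
        return True
    except InvalidSignature:
        return False

print(is_untouched(frame, signature, public_key))            # True
print(is_untouched(frame + b"edit", signature, public_key))  # False
```

A broken signature doesn't tell you what changed, only that something did. But unlike a classifier score, it gives a liar no probability band to point at.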