
Authora Dev

Why on-device AI is a supply chain problem now (and how to fix it)

Last month, a team shipped an on-device support agent for field laptops. It was supposed to summarize logs and suggest fixes offline.

Instead, it became a blind spot.

A model file got swapped during an internal test. The app still ran. The UI still looked normal. The agent still had access to local files, cached tokens, and a few “temporary” admin actions nobody had removed yet. No breach headline, no movie-style hack — just a supply chain problem hiding inside an AI feature.

That’s the part people miss about on-device inference: moving AI to the edge can reduce cloud exposure, but it also creates a new trust problem.

Which model is actually running? Who gave this agent its permissions? Can you prove what happened after the fact?

If you’re shipping local copilots, desktop agents, mobile inference, or embedded AI features, identity matters just as much as quantization and latency.

The new supply chain: model + agent + tools

Most teams already think about software supply chain security:

  • signed artifacts
  • SBOMs
  • CI provenance
  • dependency scanning

But on-device AI adds a second chain of trust:

Developer/Org
    |
    v
Signed app ----> Signed model ----> Agent identity ----> Tool access
                                            |
                                            v
                                      Audit trail

If any link is fuzzy, your “local AI” can become “unverifiable code with permissions.”

A few common failure modes:

  • Model substitution: a different weight file gets loaded than the one you tested
  • Permission creep: the local agent inherits broad filesystem or API access
  • Tool spoofing: the agent connects to an MCP/tool endpoint you didn’t intend
  • No delegation trail: a human approved one thing, but the agent did another
  • No auditability: after an incident, you can’t answer who invoked what

This is why identity for AI agents can’t just be “here’s an API key” anymore.
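The first failure mode above, model substitution, is also the cheapest to guard against: pin the model artifact by digest at build time and refuse to load anything that doesn't match. A minimal sketch using Node's built-in crypto (the pinned value and file path are hypothetical; record your own at release time):

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

// Digest of the exact weight file you tested, recorded at build/release time.
// (This value is a placeholder; pin your own.)
export const PINNED_SHA256 =
  "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824";

// Refuse to load any weight file whose digest doesn't match the pin.
export function verifyModel(path: string, pinnedHex: string): boolean {
  const digest = createHash("sha256")
    .update(readFileSync(path))
    .digest("hex");
  return digest === pinnedHex;
}
```

Call this before the runtime ever touches the file, and fail closed: a hash mismatch should stop the agent, not log a warning.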

What “good” looks like

For on-device inference, I’d aim for four boring, practical controls:

  1. A cryptographic identity for the agent
  2. Short-lived delegated access instead of long-lived secrets
  3. Policy checks before sensitive tool use
  4. Audit logs that tie actions back to identity

You do not need to buy a giant platform to start doing this. If OPA fits your stack, use OPA. If signed manifests and mTLS solve 80% of your problem, start there.
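If you go the lightweight route, a policy check before sensitive tool use can start as an explicit allowlist consulted on every call. A sketch (agent and tool names here are made up for illustration):

```typescript
// Explicit per-agent tool allowlist, consulted before every tool call.
// Anything not listed is denied by default.
type Policy = Record<string, Set<string>>;

const policy: Policy = {
  // Hypothetical agent: may read logs and propose fixes, nothing else.
  "field-support-agent": new Set(["read_logs", "suggest_fix"]),
};

export function isAllowed(agentId: string, tool: string): boolean {
  return policy[agentId]?.has(tool) ?? false;
}
```

The point isn't the data structure; it's the default-deny posture. You can swap this for OPA later without changing the call sites.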

The key is to stop treating local agents as anonymous helpers.
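Giving an agent a real identity can start with a device-bound keypair: generate it once on the device, keep the private key in the OS keystore, and have the agent sign every request so the receiving side can attribute it. A minimal sketch with Node's built-in Ed25519 support (keystore integration not shown):

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Generated once per device; in practice the private key lives in the
// OS keystore and the public key is registered with whatever you trust.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// The agent signs each request payload with its device-bound key.
export function signRequest(payload: string): Buffer {
  return sign(null, Buffer.from(payload), privateKey);
}

// The receiving side verifies the signature against the known public key.
export function verifyRequest(payload: string, sig: Buffer): boolean {
  return verify(null, Buffer.from(payload), publicKey, sig);
}
```

Once requests are attributable to a keypair rather than a shared API key, revocation and auditing both get much easier.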

A simple pattern

Here’s the pattern we’ve seen work:

[Signed App Bundle]
      |
      +-- verifies model checksum/signature
      |
      +-- loads agent keypair / device-bound identity
      |
      +-- requests short-lived delegated token
      |
      +-- calls approved tools only
      |
      +-- emits signed audit events

That gives you separation between:

  • what code is running
  • which model is loaded
  • which identity is acting
  • what permissions were delegated

Even if the model runs fully offline, the identity and policy pieces still matter whenever that agent touches local files, enterprise APIs, or MCP servers.
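The "short-lived delegated token" step can be as simple as an HMAC over the agent identity, scope, and an expiry, minted by an issuer the device trusts. A sketch under those assumptions (not a production token format; a real deployment would likely use signed JWTs or similar):

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

interface Token { agentId: string; scope: string; exp: number; sig: string; }

const mac = (secret: string, body: string) =>
  createHmac("sha256", secret).update(body).digest("hex");

// Issue a token that expires after ttlMs; the signing secret never
// leaves the issuer, so the device only ever holds short-lived grants.
export function issue(secret: string, agentId: string, scope: string,
                      ttlMs: number, now = Date.now()): Token {
  const exp = now + ttlMs;
  return { agentId, scope, exp, sig: mac(secret, `${agentId}.${scope}.${exp}`) };
}

// Reject expired or tampered tokens.
export function check(secret: string, t: Token, now = Date.now()): boolean {
  if (now >= t.exp) return false;
  const expected = mac(secret, `${t.agentId}.${t.scope}.${t.exp}`);
  return expected.length === t.sig.length &&
    timingSafeEqual(Buffer.from(expected), Buffer.from(t.sig));
}
```

Because the expiry is inside the signed body, an agent can't quietly extend its own access; it has to go back to the issuer, which is exactly the choke point you want.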

One small runnable example

If you want a fast sanity check on the tool side of your AI supply chain, scanning your MCP surface is a good start.

npm install -g @authora/agent-audit
agent-audit scan https://your-mcp-server.example.com --format table

That gives you a quick read on security issues, exposure, and spec compliance. In CI, you can also gate builds with a minimum grade.

Not a full identity system, of course — but it’s a very practical way to find “why is this tool endpoint publicly reachable?” before your agent does.

What to verify on-device

If I were reviewing an on-device inference deployment this week, I’d ask:

  • Is the model artifact pinned by hash or signature?
  • Does the agent have a stable cryptographic identity?
  • Are tool permissions explicit, or inherited accidentally?
  • Are delegated permissions time-bound?
  • Can we revoke access without shipping a whole new app?
  • Do logs show which agent identity performed an action?
  • Are local secrets isolated from the model runtime?

That last one matters more than people expect. A local model process with broad disk access can become a very effective secret-harvesting layer, even without “malware” behavior.
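For the logging question on that checklist, "signed audit events" is one option; a lighter-weight way to get tamper evidence is a hash-chained log, where each entry commits to the previous one. A sketch (field names are illustrative):

```typescript
import { createHash } from "node:crypto";

interface AuditEvent { agentId: string; action: string; prev: string; hash: string; }

// Append an event whose hash covers the previous entry's hash, so
// deleting or editing any earlier event breaks everything after it.
export function append(log: AuditEvent[], agentId: string, action: string): AuditEvent[] {
  const prev = log.length ? log[log.length - 1].hash : "genesis";
  const hash = createHash("sha256")
    .update(`${prev}|${agentId}|${action}`)
    .digest("hex");
  return [...log, { agentId, action, prev, hash }];
}

// Walk the chain and recompute every hash; any mismatch means tampering.
export function verifyChain(log: AuditEvent[]): boolean {
  let prev = "genesis";
  for (const e of log) {
    const h = createHash("sha256")
      .update(`${prev}|${e.agentId}|${e.action}`)
      .digest("hex");
    if (e.prev !== prev || e.hash !== h) return false;
    prev = e.hash;
  }
  return true;
}
```

It doesn't prove who wrote the log (signing with the agent's keypair adds that), but it does mean an incident responder can trust the ordering and completeness of what's there.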

The uncomfortable truth

A lot of “AI security” discussion is still focused on prompts, jailbreaks, and output filtering.

Those matter.

But for on-device inference, the more immediate risk is often much less glamorous: identity and provenance are missing entirely.

If your local agent can edit code, read files, call tools, or trigger actions, then you have a supply chain. And supply chains need trust boundaries.

Try it yourself

If you want to pressure-test your setup, free tools like the agent-audit scanner above are a good starting point.

Tools alone don't replace a real review of your trust model, but they're a quick way to find the obvious gaps.

The good news: you don’t need perfect architecture to get safer. Start by giving your agents real identities, reducing long-lived credentials, and treating tool access like production access.

That alone will put you ahead of a surprising number of deployments.

How are you handling agent identity for on-device inference today? Drop your approach below.

-- Authora team

This post was created with AI assistance.
