
Authora Dev

Why on-device AI is a supply chain problem now (and how to fix it)

Last month, a team shipped an on-device support agent for field laptops. It was supposed to summarize logs and suggest fixes offline.

Instead, it became a blind spot.

A model file got swapped during an internal test. The app still ran. The UI still looked normal. The agent still had access to local files, cached tokens, and a few “temporary” admin actions nobody had removed yet. No breach headline, no movie-style hack — just a supply chain problem hiding inside an AI feature.

That’s the part people miss about on-device inference: moving AI to the edge can reduce cloud exposure, but it also creates a new trust problem.

Which model is actually running? Who gave this agent its permissions? Can you prove what happened after the fact?

If you’re shipping local copilots, desktop agents, mobile inference, or embedded AI features, identity matters just as much as quantization and latency.

The new supply chain: model + agent + tools

Most teams already think about software supply chain security:

  • signed artifacts
  • SBOMs
  • CI provenance
  • dependency scanning

But on-device AI adds a second chain of trust:

Developer/Org
    |
    v
Signed app ----> Signed model ----> Agent identity ----> Tool access
                                            |
                                            v
                                      Audit trail

If any link is fuzzy, your “local AI” can become “unverifiable code with permissions.”

A few common failure modes:

  • Model substitution: a different weight file gets loaded than the one you tested
  • Permission creep: the local agent inherits broad filesystem or API access
  • Tool spoofing: the agent connects to an MCP/tool endpoint you didn’t intend
  • No delegation trail: a human approved one thing, but the agent did another
  • No auditability: after an incident, you can’t answer who invoked what

This is why identity for AI agents can’t just be “here’s an API key” anymore.
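The first failure mode above, model substitution, is also the cheapest to guard against: pin the model artifact by digest at build time and refuse to load anything that doesn't match. A minimal sketch using Node's built-in crypto (the pinned value and file path are hypothetical; record your own at release time):

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

// Digest of the exact weight file you tested, recorded at build/release time.
// (This value is a placeholder; pin your own.)
export const PINNED_SHA256 =
  "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824";

// Refuse to load any weight file whose digest doesn't match the pin.
export function verifyModel(path: string, pinnedHex: string): boolean {
  const digest = createHash("sha256")
    .update(readFileSync(path))
    .digest("hex");
  return digest === pinnedHex;
}
```

Call this before the runtime ever touches the file, and fail closed: a hash mismatch should stop the agent, not log a warning.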

What “good” looks like

For on-device inference, I’d aim for four boring, practical controls:

  1. A cryptographic identity for the agent
  2. Short-lived delegated access instead of long-lived secrets
  3. Policy checks before sensitive tool use
  4. Audit logs that tie actions back to identity

You do not need to buy a giant platform to start doing this. If OPA fits your stack, use OPA. If signed manifests and mTLS solve 80% of your problem, start there.
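If you go the lightweight route, a policy check before sensitive tool use can start as an explicit allowlist consulted on every call. A sketch (agent and tool names here are made up for illustration):

```typescript
// Explicit per-agent tool allowlist, consulted before every tool call.
// Anything not listed is denied by default.
type Policy = Record<string, Set<string>>;

const policy: Policy = {
  // Hypothetical agent: may read logs and propose fixes, nothing else.
  "field-support-agent": new Set(["read_logs", "suggest_fix"]),
};

export function isAllowed(agentId: string, tool: string): boolean {
  return policy[agentId]?.has(tool) ?? false;
}
```

The point isn't the data structure; it's the default-deny posture. You can swap this for OPA later without changing the call sites.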

The key is to stop treating local agents as anonymous helpers.
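Giving an agent a real identity can start with a device-bound keypair: generate it once on the device, keep the private key in the OS keystore, and have the agent sign every request so the receiving side can attribute it. A minimal sketch with Node's built-in Ed25519 support (keystore integration not shown):

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Generated once per device; in practice the private key lives in the
// OS keystore and the public key is registered with whatever you trust.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// The agent signs each request payload with its device-bound key.
export function signRequest(payload: string): Buffer {
  return sign(null, Buffer.from(payload), privateKey);
}

// The receiving side verifies the signature against the known public key.
export function verifyRequest(payload: string, sig: Buffer): boolean {
  return verify(null, Buffer.from(payload), publicKey, sig);
}
```

Once requests are attributable to a keypair rather than a shared API key, revocation and auditing both get much easier.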

A simple pattern

Here’s the pattern we’ve seen work:

[Signed App Bundle]
      |
      +-- verifies model checksum/signature
      |
      +-- loads agent keypair / device-bound identity
      |
      +-- requests short-lived delegated token
      |
      +-- calls approved tools only
      |
      +-- emits signed audit events

That gives you separation between:

  • what code is running
  • which model is loaded
  • which identity is acting
  • what permissions were delegated

Even if the model runs fully offline, the identity and policy pieces still matter whenever that agent touches local files, enterprise APIs, or MCP servers.
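The "short-lived delegated token" step can be as simple as an HMAC over the agent identity, scope, and an expiry, minted by an issuer the device trusts. A sketch under those assumptions (not a production token format; a real deployment would likely use signed JWTs or similar):

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

interface Token { agentId: string; scope: string; exp: number; sig: string; }

const mac = (secret: string, body: string) =>
  createHmac("sha256", secret).update(body).digest("hex");

// Issue a token that expires after ttlMs; the signing secret never
// leaves the issuer, so the device only ever holds short-lived grants.
export function issue(secret: string, agentId: string, scope: string,
                      ttlMs: number, now = Date.now()): Token {
  const exp = now + ttlMs;
  return { agentId, scope, exp, sig: mac(secret, `${agentId}.${scope}.${exp}`) };
}

// Reject expired or tampered tokens.
export function check(secret: string, t: Token, now = Date.now()): boolean {
  if (now >= t.exp) return false;
  const expected = mac(secret, `${t.agentId}.${t.scope}.${t.exp}`);
  return expected.length === t.sig.length &&
    timingSafeEqual(Buffer.from(expected), Buffer.from(t.sig));
}
```

Because the expiry is inside the signed body, an agent can't quietly extend its own access; it has to go back to the issuer, which is exactly the choke point you want.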

One small runnable example

If you want a fast sanity check on the tool side of your AI supply chain, scanning your MCP surface is a good start.

npm install -g @authora/agent-audit
agent-audit scan https://your-mcp-server.example.com --format table

That gives you a quick read on security issues, exposure, and spec compliance. In CI, you can also gate builds with a minimum grade.

Not a full identity system, of course — but it’s a very practical way to find “why is this tool endpoint publicly reachable?” before your agent does.

What to verify on-device

If I were reviewing an on-device inference deployment this week, I’d ask:

  • Is the model artifact pinned by hash or signature?
  • Does the agent have a stable cryptographic identity?
  • Are tool permissions explicit, or inherited accidentally?
  • Are delegated permissions time-bound?
  • Can we revoke access without shipping a whole new app?
  • Do logs show which agent identity performed an action?
  • Are local secrets isolated from the model runtime?

That last one matters more than people expect. A local model process with broad disk access can become a very effective secret-harvesting layer, even without “malware” behavior.
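For the logging question on that checklist, "signed audit events" is one option; a lighter-weight way to get tamper evidence is a hash-chained log, where each entry commits to the previous one. A sketch (field names are illustrative):

```typescript
import { createHash } from "node:crypto";

interface AuditEvent { agentId: string; action: string; prev: string; hash: string; }

// Append an event whose hash covers the previous entry's hash, so
// deleting or editing any earlier event breaks everything after it.
export function append(log: AuditEvent[], agentId: string, action: string): AuditEvent[] {
  const prev = log.length ? log[log.length - 1].hash : "genesis";
  const hash = createHash("sha256")
    .update(`${prev}|${agentId}|${action}`)
    .digest("hex");
  return [...log, { agentId, action, prev, hash }];
}

// Walk the chain and recompute every hash; any mismatch means tampering.
export function verifyChain(log: AuditEvent[]): boolean {
  let prev = "genesis";
  for (const e of log) {
    const h = createHash("sha256")
      .update(`${prev}|${e.agentId}|${e.action}`)
      .digest("hex");
    if (e.prev !== prev || e.hash !== h) return false;
    prev = e.hash;
  }
  return true;
}
```

It doesn't prove who wrote the log (signing with the agent's keypair adds that), but it does mean an incident responder can trust the ordering and completeness of what's there.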

The uncomfortable truth

A lot of “AI security” discussion is still focused on prompts, jailbreaks, and output filtering.

Those matter.

But for on-device inference, the more immediate risk is often much less glamorous: identity and provenance are missing entirely.

If your local agent can edit code, read files, call tools, or trigger actions, then you have a supply chain. And supply chains need trust boundaries.

Try it yourself

If you want to pressure-test your setup, free tools like the agent-audit scanner above are a good starting point.

Tools alone don't replace a real review of your trust model, but they're a quick way to find the obvious gaps.

The good news: you don’t need perfect architecture to get safer. Start by giving your agents real identities, reducing long-lived credentials, and treating tool access like production access.

That alone will put you ahead of a surprising number of deployments.

How are you handling agent identity for on-device inference today? Drop your approach below.

-- Authora team

This post was created with AI assistance.
