
Kaeso

How to Design OAuth for AI Agents Without Creating a Permission Mess

Everyone is building AI agents.

They can write code, summarize documents, search the web, create tickets, update databases, send messages, and automate workflows. But the moment an agent has to interact with real services like GitHub, Slack, Google Drive, Notion, or Stripe, the same problem appears:

authorization becomes messy very fast.

Most teams treat OAuth as a solved problem because traditional SaaS apps have used it for years. But AI agents change the shape of the problem.

A normal app usually performs a limited set of actions with a relatively clear boundary. An agent platform is different. It is dynamic, multi-step, sometimes autonomous, and often sits between users, third-party services, and other applications. That demands a completely different permission model.

If you do not design that model carefully, you end up with one of two bad outcomes:

  • a terrible user experience with constant re-auth and unclear permissions
  • or a dangerous system where apps and agents get far more access than they should

I have been thinking a lot about this while designing Kaeso, and one conclusion became very clear:

OAuth for AI agents should not be treated like ordinary app OAuth.

The real problem is not just authentication

When people talk about OAuth, they often simplify it to “sign in with X” or “connect your account.”

That is not the hard part.

The hard part is everything after that:

  • what exactly is connected
  • what scopes were granted
  • which application is allowed to use that access
  • whether access is read-only or write-capable
  • whether the access is one-time or continuous
  • whether background execution is allowed
  • how the user can review and revoke all of it later

With AI agents, this becomes even more important because the actor using the access is often not just a static UI. It may be an agent runtime, a workflow engine, a background job, or a third-party app calling into your platform.

So the real design question is not:

“How do we let users connect GitHub?”

It is:

“How do we let users delegate the minimum necessary access to a system that may act on their behalf across multiple services, while keeping that understandable, revocable, and secure?”

That is a much harder problem.

Why traditional OAuth patterns break down

A lot of existing OAuth implementations assume a much simpler world.

Typical flow:

  1. User clicks “Connect GitHub”
  2. App redirects to GitHub
  3. User approves scopes
  4. App gets tokens
  5. App uses those tokens directly

That works fine for a single product with a narrow purpose.

But now imagine an AI platform where:

  • the user connects multiple services
  • third-party apps run on top of the platform
  • agents may use those services in the background
  • each app may need a different subset of permissions
  • and the user expects to understand what is happening

If you use the same simple pattern, you quickly get permission chaos.

For example:

  • one app gets a broad GitHub token even though it only needs repo metadata
  • another app reuses the same connection without the user fully realizing it
  • users do not know whether Slack access was granted to the platform, to the app, or to both
  • the system cannot cleanly explain what is already authorized and what is still missing

At that point, the product becomes hard to trust.

The three-layer model

The cleanest mental model I have found is to separate authorization into three distinct layers.

1. User to platform

This is just identity.

The user signs into your platform. In my case, that is Kaeso. This layer answers:

  • who is the user
  • what account do they have
  • what apps and services are associated with them

This is not the same thing as service authorization.

2. User to service through the platform

This is where the user connects GitHub, Slack, Google, and so on.

The platform receives provider-specific OAuth tokens and stores them securely. At this point, the user has not necessarily granted every app permission to use those services. They have only granted the platform the ability to broker access.

This is a critical distinction.

3. User to app through the platform

This is where a Kaeso app, agent, or integration asks for access to certain capabilities.

The app should not just get raw provider tokens by default. Instead, it should request platform-level permissions, and the platform should decide whether the required service connections and scopes already exist.

If they do, the user can approve the app.

If they do not, the platform should guide the user through the missing service authorization or scope upgrade.

This creates a much cleaner architecture because every permission request has a clear place in the system.
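The three layers can be sketched as plain data. This is an illustration, not Kaeso's actual schema; all type and field names here are hypothetical:

```python
from dataclasses import dataclass

# Layer 1: user to platform. Pure identity; no service access implied.
@dataclass
class User:
    user_id: str
    email: str

# Layer 2: user to service through the platform. The platform holds the
# provider token; no app is entitled to it yet.
@dataclass
class ServiceConnection:
    user_id: str
    provider: str              # e.g. "github", "slack"
    granted_scopes: set[str]   # provider-level scopes the user approved

# Layer 3: user to app through the platform. The app holds only
# platform-level scopes, which the platform resolves against layer 2.
@dataclass
class AppGrant:
    user_id: str
    app_id: str
    platform_scopes: set[str]  # e.g. "services.github.repositories.read"
```

Keeping these as three separate records is the point: connecting a service (layer 2) never implicitly creates an app grant (layer 3).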

The mistake: giving apps raw provider tokens

This is the easiest bad design to fall into.

A user connects GitHub to your platform, and then you let apps directly receive or use that GitHub token with minimal separation.

It feels convenient, but it creates several problems:

  • poor permission isolation
  • weak auditability
  • harder revocation
  • provider-specific logic leaking everywhere
  • much larger blast radius if an app is compromised

The better model is to let the platform act as a broker.

That means:

  • users connect services to the platform
  • the platform stores and refreshes provider tokens
  • apps receive only platform-issued tokens
  • apps call the platform API
  • the platform performs downstream calls to GitHub, Slack, Google, and others

This preserves a clean boundary.

The app is authorized to use platform capabilities. It is not automatically entitled to direct possession of all underlying service credentials.

That one design choice makes the whole system easier to secure, explain, and operate.
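The broker boundary can be sketched in a few lines. This is a minimal in-memory sketch with hypothetical names and a stubbed downstream call; the key property is that the app presents a platform-issued token and the provider token never leaves the vault:

```python
# Hypothetical stores: platform tokens issued to apps, and the vault of
# provider tokens that only the platform may touch.
PLATFORM_TOKENS = {"app-tok-1": {"app_id": "repo-bot", "user_id": "u1"}}
TOKEN_VAULT = {("u1", "github"): "gho_secret_provider_token"}

def broker_call(platform_token: str, provider: str, action: str) -> dict:
    """Perform a downstream action on behalf of an app.

    The app never receives the provider token, only the result.
    """
    grant = PLATFORM_TOKENS.get(platform_token)
    if grant is None:
        raise PermissionError("unknown platform token")
    provider_token = TOKEN_VAULT.get((grant["user_id"], provider))
    if provider_token is None:
        raise PermissionError(f"{provider} is not connected for this user")
    # The real downstream HTTP call to GitHub/Slack/etc. would happen
    # here, inside the platform, using provider_token. Stubbed out.
    return {"provider": provider, "action": action, "status": "ok"}
```

Because every downstream call funnels through one function like this, auditing, rate limiting, and resource narrowing all have a single enforcement point.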

The permission model should be platform-native

This is where I think most agent infrastructure will need to evolve.

Apps should request platform-native scopes, not raw provider scopes.

For example, instead of exposing a GitHub scope directly, the platform might define scopes like:

  • services.github.repositories.read
  • services.github.repositories.write
  • services.slack.channels.read
  • services.google.drive.files.read
  • agents.jobs.background
  • identity.profile.read

Then the platform maps each platform scope to whatever provider scopes and service states are actually required underneath.

This gives you several advantages:

  • a consistent permission language across providers
  • a better user experience
  • tighter product control
  • less provider-specific complexity for developers
  • clearer audit logs

Users should see permissions in the language of what the app is trying to do, not just in the language of whatever a provider happened to name its scopes.
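The mapping itself can be a simple table the platform owns. The scope names below are illustrative, not a real API; the idea is that resolving an app's platform scopes tells you exactly which provider scopes must already exist on each connection:

```python
# Illustrative mapping from platform-native scopes to the provider and
# provider-level scopes they require underneath.
SCOPE_MAP = {
    "services.github.repositories.read":  ("github", {"repo:read"}),
    "services.github.repositories.write": ("github", {"repo:read", "repo:write"}),
    "services.slack.channels.read":       ("slack",  {"channels:read", "channels:history"}),
}

def required_provider_scopes(platform_scopes: set[str]) -> dict[str, set[str]]:
    """Collapse a set of platform scopes into per-provider scope sets."""
    required: dict[str, set[str]] = {}
    for scope in platform_scopes:
        provider, provider_scopes = SCOPE_MAP[scope]
        required.setdefault(provider, set()).update(provider_scopes)
    return required
```

Comparing the output of `required_provider_scopes` against what each connection has already granted is what lets the consent engine say "GitHub connection needs an upgraded scope" instead of re-running the whole flow.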

What the user should actually see

A lot of consent screens are technically correct but terrible as a product experience.

For agent systems, the consent UI has to be extremely explicit.

A good consent screen should answer all of these questions immediately:

  • what app is asking
  • who built it
  • what actions it wants to perform
  • which connected services are involved
  • whether those permissions already exist
  • what is still missing
  • whether access is read-only or write-enabled
  • whether background execution is allowed

Something like this is much clearer than a generic “Authorize app” button:

This app wants to:

  • Read selected GitHub repositories
  • Read selected Slack channels
  • Run background sync jobs

Kaeso will access on your behalf:

  • GitHub: repository metadata and contents
  • Slack: channel list and messages in approved channels

Missing requirements:

  • Slack is not connected yet
  • GitHub connection needs an upgraded scope

Actions:

  • Connect Slack
  • Upgrade GitHub access
  • Approve app
  • Cancel

That is the kind of UX I think agent platforms need.

The goal is not just to be compliant with an OAuth flow. The goal is to make the system understandable enough that the user can make a real decision.

Incremental authorization matters a lot

One of the biggest UX mistakes is forcing users through the entire permission process again every time something changes.

A better model is incremental authorization.

If the user already connected GitHub and already approved a previous app for repository read access, and a new app requests the same capability, the system should say so clearly.

If a new app requires one additional permission, the user should only see that delta.

That keeps consent manageable and avoids training users to blindly click approve.

For AI agents, this matters even more because workflows often evolve over time. A user might start with read-only access and later enable background jobs or write access. The system should support that progression cleanly instead of treating every change as a brand-new opaque authorization event.
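Computing that delta is straightforward once grants and connections are modeled explicitly. A sketch with hypothetical names: given what an app requests, what the user already approved, and which providers are connected, return only what still needs consent:

```python
def consent_delta(requested: set[str],
                  already_granted: set[str],
                  connected_providers: set[str],
                  scope_provider: dict[str, str]) -> dict:
    """Return only what the user still needs to see and approve.

    scope_provider maps each platform scope to the provider it needs,
    e.g. {"services.slack.channels.read": "slack"}. Names illustrative.
    """
    new_scopes = requested - already_granted
    missing_connections = {
        scope_provider[s] for s in new_scopes
        if scope_provider[s] not in connected_providers
    }
    return {"new_scopes": new_scopes,
            "missing_connections": missing_connections}
```

If `new_scopes` is empty, the consent screen can say so and the user approves nothing new; if a provider shows up in `missing_connections`, the UI guides the user to connect it first.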

Resource narrowing is as important as scopes

OAuth scopes are often too broad for what users actually want.

Saying “this app can read Slack” is not enough. In practice, users often want more granular control, such as:

  • only this workspace
  • only these channels
  • only this GitHub organization
  • only these repositories
  • only this Drive account
  • only this folder

That is why product-layer restrictions matter.

The OAuth provider might only offer broad scopes, but your platform can still narrow the effective access by introducing resource-level controls. For agent platforms, this is one of the most valuable trust features you can build.

It makes authorization much closer to what users actually intend.
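The enforcement point lives in the broker: even when the provider token is broad, the platform can check a per-grant resource allowlist before every downstream call. A minimal sketch, with an illustrative allowlist shape:

```python
def check_resource_access(allowlist: dict[str, set[str]],
                          provider: str,
                          resource: str) -> bool:
    """Allow a call only if this grant names the resource explicitly.

    allowlist is stored per app-grant, e.g.
    {"github": {"acme/api", "acme/web"}, "slack": {"C012AB3CD"}}.
    The provider token may be broader; the platform narrows it here.
    """
    return resource in allowlist.get(provider, set())
```

The deny-by-default shape matters: an unlisted provider or resource fails closed, so a newly connected service grants an app nothing until the user narrows it in.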

Online access and background access should be separate permissions

This is another place where agent systems differ from normal apps.

A lot of actions happen asynchronously:

  • scheduled syncs
  • monitoring
  • notifications
  • long-running tasks
  • automatic follow-up actions

That means an app may need continued access even when the user is not actively using it.

That should never be hidden inside a generic permission grant.

Background execution is a materially different level of authority and should be presented separately.

For example:

  • use my connected services during this session
  • keep access for background automations
  • allow recurring sync
  • allow write actions without manual confirmation

Those are very different trust levels.

If users cannot distinguish them, the platform is not communicating honestly.
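These trust levels can be modeled as separately consented flags rather than one opaque grant. A sketch, with hypothetical flag names:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GrantMode:
    # Each flag is a separately consented trust level, never bundled.
    session_access: bool = False      # use connections while the user is active
    background_access: bool = False   # keep access for background automations
    recurring_sync: bool = False      # allow scheduled recurring jobs
    unattended_writes: bool = False   # write without manual confirmation

def can_run_background_job(mode: GrantMode) -> bool:
    return mode.background_access

def can_write_unattended(mode: GrantMode) -> bool:
    # Unattended writes require both background access and the explicit
    # write-without-confirmation flag; neither implies the other.
    return mode.background_access and mode.unattended_writes
```

Defaulting every flag to `False` encodes the honesty the post argues for: an app starts with session-only access and each escalation is a visible, separate consent.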

Revocation is part of the product, not an afterthought

A good authorization system is not only about granting access. It is also about removing it cleanly.

Users should be able to open one place and see:

  • connected services
  • scopes granted to the platform
  • apps with current access
  • which services each app can use
  • whether background access exists
  • when each app last used that access
  • buttons to revoke service access, app access, or both

Without this, even a technically strong OAuth system will feel unsafe.

For agent infrastructure, auditability and revocation are not “enterprise extras.” They are part of the core product.
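Revocation should also cascade cleanly: removing a service connection must invalidate every app grant that depends on it. A sketch over illustrative in-memory records:

```python
def revoke_service(provider: str,
                   connections: set[str],
                   app_grants: dict[str, set[str]]) -> dict[str, set[str]]:
    """Revoke a service connection and cascade to dependent app grants.

    connections: providers the user has connected (mutated in place).
    app_grants: app_id -> providers that app may use via the platform.
    Returns surviving grants; apps left with no providers are dropped.
    """
    connections.discard(provider)
    surviving: dict[str, set[str]] = {}
    for app_id, providers in app_grants.items():
        remaining = providers - {provider}
        if remaining:
            surviving[app_id] = remaining
    return surviving
```

Because apps only ever held platform-issued tokens, this cascade is enforceable: there is no stray provider token in an app's hands that survives the revocation.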

What I think the correct architecture looks like

This is the model I currently believe makes the most sense for a system like Kaeso:

Identity layer

The user signs into the platform.

Connection layer

The user connects external services like GitHub, Slack, or Google.

Token vault

Provider tokens are stored securely by the platform and refreshed when needed.

App authorization layer

Apps request platform-native scopes, not raw provider scopes.

Consent engine

The platform compares app requirements against existing service connections and scopes, detects missing requirements, and presents a clear consent screen.

Broker API

Apps call the platform API, and the platform performs downstream service actions on the user’s behalf.

Audit and revocation layer

Every action is attributable, reviewable, and revocable.

That architecture is more work than a basic “connect service and store token” implementation, but it solves the actual problem instead of pushing it downstream.

Why this matters more as agents get better

As models improve, infrastructure weaknesses become more visible.

When an agent is limited, weak authorization design is partly masked because the agent cannot do much anyway.

When agents become more capable, the opposite happens. The cost of unclear permissions goes up.

A system that can reason, plan, and act across multiple tools is only as safe and usable as the authorization model underneath it.

So I think one of the next important infrastructure layers in the AI ecosystem is not just better agents.

It is better authorization systems for agents.

Not more impressive demos.

Not broader access by default.

Not “just give the model a token and see what happens.”

A real permission architecture.

Closing thought

The interesting part of AI agent infrastructure is not only what the agent can do.

It is also:

  • what it is allowed to do
  • who approved that
  • through which app
  • using which service
  • with which scope
  • for how long
  • and how easily that can be reviewed or revoked

That is why I think the right model is not “every app handles OAuth however it wants.”

It is a brokered system where the platform becomes the control layer between users, apps, and external services.

That is the direction I am exploring with Kaeso.

Because if AI agents are going to become real infrastructure, their authorization layer has to become real infrastructure too.


If you are building in this area, I would love to know how you are thinking about it.

Are you giving apps direct provider access, or are you building a broker layer in between?
