I'm building a multi-tenant AI chatbot. Businesses sign up, share their documents and get a chat widget to embed on their website. The widget talks to their knowledge base and only their knowledge base. I covered the WebSocket authentication side of this in a previous post. This one is about the problem that lives on the other end — in the browser.
The widget needs to know which tenant it belongs to. That means some form of credential has to exist in the frontend code. As we all know, frontend code is not a friend who can keep secrets. The code is public. Anyone can open DevTools, inspect the network tab, and read the script tag. Whatever key you put there is fully exposed from day one.
So the question becomes: how do you authenticate something that can never be secret?
Where Do You Even Store This Stuff?
First, a quick detour. Before I could even get to the API key problem, I had a session management problem. When an anonymous user opens the chat widget, the backend creates a session and returns three things: a session_id, an anonymous_user_id, and a JWT. The widget needs to hold onto these so that if the user closes the tab and comes back twenty minutes later, they can pick up where they left off instead of starting a blank conversation.
That ruled out sessionStorage immediately — it dies when the tab closes. I needed localStorage.
But localStorage is scoped per origin, not per page. If a user visits two different websites that both embed my chatbot widget, each site's widget runs on its own origin and gets its own storage, so that case is fine. The real edge case is two tenants' widgets on the same origin: during development and testing, or if the widget is ever loaded from a shared domain. I didn't want to leave that door open.
The fix was to namespace the storage keys using a portion of the API key. The publishable key is already in the widget's script tag — it's the one identifier that's guaranteed to be unique per tenant and available before any API call happens. So the storage looks like this:
```javascript
// API key from the script tag: pk_live_a8Kx9mNq...
const prefix = apiKey.slice(8, 14); // "a8Kx9m"

localStorage.setItem(`${prefix}_session_id`, sessionId);
localStorage.setItem(`${prefix}_anonymous_user_id`, visitorId);
localStorage.setItem(`${prefix}_token`, jwt);
```
Now two different tenants' widgets on the same origin write to different keys. No collisions, no cross-contamination.
When the widget loads, it checks for existing keys under its prefix. If it finds them, it sends them to the backend to resume the session. If the session has expired, the widget creates a new one but reuses the same anonymous_user_id so the tenant can still recognize a returning visitor.
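The resume-or-create decision on the backend can be sketched as a small pure function. This is my illustration, not the project's actual code; `StoredSession` and `resume_or_create` are hypothetical names:

```python
import uuid
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class StoredSession:
    session_id: str
    anonymous_user_id: str
    expires_at: datetime

def resume_or_create(stored: Optional[StoredSession], now: datetime) -> dict:
    # A live stored session is resumed as-is.
    if stored is not None and stored.expires_at > now:
        return {
            "session_id": stored.session_id,
            "anonymous_user_id": stored.anonymous_user_id,
            "resumed": True,
        }
    # Expired or missing: mint a new session, but keep the old
    # anonymous_user_id (if any) so returning visitors stay recognizable.
    anon = stored.anonymous_user_id if stored else str(uuid.uuid4())
    return {
        "session_id": str(uuid.uuid4()),
        "anonymous_user_id": anon,
        "resumed": False,
    }
```

The key property is the middle branch: an expired session still contributes its anonymous_user_id to the replacement session.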
The Key That Does Too Much
At this point I had one API key per tenant. It authenticated the widget, identified the tenant, and in theory would also authenticate the admin dashboard I was planning: the interface where tenants manage their uploaded documents, update their system prompt, and tweak their model settings.
Hopefully the problem is obvious by now, if it wasn't earlier. The key in the widget's script tag is permanently exposed; that's not a leak, it's by design. But if that same key can also delete documents or change the AI's behavior, then anyone who reads the page source can do those things too.
One mitigation is restricting which origins may call the endpoints. But that's a band-aid, or a helper at best, not the main defense: the Origin header can be forged by any non-browser client, so it does nothing against the key being used from a tool like Postman or curl.
I needed the widget key to be able to create chat sessions and send messages. Nothing more. And I needed a separate key for administrative operations that would never appear in frontend code. One that's shown to the tenant exactly once when it's generated and never transmitted over a public channel again.
Two different security contexts. Two different keys.
Splitting Publishable and Secret Keys
So… I landed on a naming convention borrowed from online examples: pk_live_ for publishable keys and sk_live_ for secret keys.
The publishable key (pk_live_) goes in the widget. It's public by nature. Its scopes are limited to sessions:create and sessions:read. It can initialize a chat session, refresh a JWT, and resume a conversation. It cannot touch documents, settings, or anything administrative.
The secret key (sk_live_) is for the admin dashboard and backend integrations. It's shown once at creation time. Until I build the admin UI with proper login, the tenant copies it, stores it somewhere secure, and it never appears in a browser. It carries the admin scope, which grants access to document uploads, deletions, configuration changes, and later things like OAuth connections for external integrations.
Both key types live in a dedicated api_keys table:
```python
class APIKey(Base):
    __tablename__ = "api_keys"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid4)
    tenant_id = Column(
        UUID(as_uuid=True),
        ForeignKey("tenants.id", ondelete="CASCADE"),
        nullable=False,
    )
    key_hash = Column(String(64), nullable=False)   # SHA-256, never store raw
    key_type = Column(String(20), nullable=False)   # "publishable" or "secret"
    scopes = Column(JSON, nullable=False, default=list)
    is_active = Column(Boolean, default=True, nullable=False)
    created_at = Column(DateTime(timezone=True), default=lambda: datetime.now(timezone.utc))
    last_used_at = Column(DateTime(timezone=True), nullable=True)
    revoked_at = Column(DateTime(timezone=True), nullable=True)
```
I deliberately made this a separate table rather than adding key columns to the tenant table. A tenant can have multiple keys — maybe an old publishable key that's still in a cached version of someone's website, plus the current one, plus a secret key that was rotated after an employee left. Each key has its own lifecycle: creation time, last used time, revocation time. Putting that on the tenant table would mean either a mess of nullable columns or losing the ability to track key history entirely.
The raw key is only ever returned once, at generation time. What gets stored is a SHA-256 hash:
```python
async def generate_for_tenant(self, tenant_id, key_type, scopes):
    if key_type == APIKeyType.PUBLISHABLE:
        raw_key = generate_api_key(prefix="pk_live")
    else:
        raw_key = generate_api_key(prefix="sk_live")

    key_hash = hashlib.sha256(raw_key.encode()).hexdigest()
    await self.repo.create(
        tenant_id=tenant_id,
        key_hash=key_hash,
        key_type=key_type.value,
        scopes=scopes,
    )
    return raw_key  # Only time this is returned
```
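The post doesn't show `generate_api_key` itself; a plausible implementation on top of Python's standard `secrets` module might look like this (the helper's signature is an assumption):

```python
import secrets

def generate_api_key(prefix: str, entropy_bytes: int = 32) -> str:
    """Return e.g. 'pk_live_<random>'.

    token_urlsafe(32) yields ~43 URL-safe characters, which is far more
    entropy than needed to make brute-forcing a key infeasible.
    """
    return f"{prefix}_{secrets.token_urlsafe(entropy_bytes)}"
```

Using `secrets` rather than `random` matters here: the former is backed by the OS CSPRNG and is the intended tool for tokens and keys.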
When a request comes in, the provided key is hashed and looked up. If the hash matches an active, non-revoked key, the request is authenticated and the tenant is identified. If the key is compromised, you deactivate it and generate a new one. The tenant record is untouched — only the api_keys table changes.
Validation checks everything in sequence: does the hash exist, is the key active, is it unrevoked, is the tenant active, does the key carry the required scopes. For simplicity, this snippet just returns None for every failure case:
```python
async def validate_key(self, raw_key, required_scopes=None):
    key_hash = hash_api_key(raw_key)
    api_key = await self.repo.get_by_hash_with_tenant(key_hash)

    if not api_key:
        return None
    if not api_key.is_active:
        return None
    if api_key.revoked_at:
        return None
    if not api_key.tenant.is_active:
        return None
    if required_scopes and not api_key.has_any_scope(required_scopes):
        return None

    return {
        "tenant_id": str(api_key.tenant_id),
        "tenant": api_key.tenant,
        "key_type": api_key.key_type,
        "scopes": api_key.scopes,
    }
```
In production, each of those return None cases logs a different reason — expired key, revoked key, inactive tenant — so I can debug authentication failures without exposing details to the caller.
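That per-reason logging can be factored out by mirroring the checks in a small classifier that returns a loggable reason instead of a bare None. Everything below (`FakeAPIKey`, `rejection_reason`, the reason strings) is an illustrative stand-in, not the project's real models:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class FakeTenant:
    is_active: bool = True

@dataclass
class FakeAPIKey:
    is_active: bool = True
    revoked_at: Optional[str] = None
    tenant: FakeTenant = field(default_factory=FakeTenant)
    scopes: list = field(default_factory=list)

def rejection_reason(api_key, required_scopes=None) -> Optional[str]:
    """Same checks as validate_key, in the same order, but each failure
    maps to a distinct reason string for internal logs. The caller still
    only ever sees a generic 401."""
    if api_key is None:
        return "unknown_key"
    if not api_key.is_active:
        return "inactive_key"
    if api_key.revoked_at is not None:
        return "revoked_key"
    if not api_key.tenant.is_active:
        return "inactive_tenant"
    if required_scopes and not set(required_scopes) & set(api_key.scopes):
        return "missing_scope"
    return None  # valid key
```

The ordering matters: checking the key before the tenant means a revoked key on a disabled tenant logs as "revoked_key", matching what validate_key would have rejected first.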
Two Apps, One Backend
The key split had a natural downstream effect on the application structure. The widget and the admin dashboard have different authentication mechanisms, different CORS policies, and different route sets. Cramming them into one FastAPI app with a bunch of conditional middleware felt wrong.
Instead, I split them into two sub-applications mounted under the same FastAPI instance:
```python
# main.py
app = FastAPI(title="RAG Chatbot API")

app.mount("/widget", widget_app)
app.mount("/admin", admin_app)
```
The widget app allows only specific origins (the tenant's website), exposes only X-Chatbot-Key in its CORS headers, and serves two routes: session initialization and WebSocket connections. Its authentication dependency reads the pk_live_ key from the X-Chatbot-Key header:
```python
# widget_app.py
widget_app.add_middleware(
    CORSMiddleware,
    allow_origins=settings.widget_cors_origins_list,
    allow_credentials=False,
    allow_methods=["GET", "POST", "OPTIONS"],
    allow_headers=["Content-Type", "X-Chatbot-Key"],
)
```
The admin app has wider CORS allowances (for the dashboard frontend), uses X-API-Key with sk_live_ keys, and will eventually serve document management, tenant configuration, and OAuth connection routes.
This separation also made the WebSocket flow cleaner. The widget app's WebSocket endpoint delegates to a SessionManager that orchestrates the entire connection lifecycle:
```python
context = await session_manager.establish_session(websocket, token)
if not context:
    return  # Connection already closed with reason

# context now holds: session_id, tenant_id, tenant config, websocket
# Ready to process messages
```
Under the hood, establish_session verifies the JWT, validates the session, loads the tenant config, and registers the connection. Each step can fail independently and close the connection with a specific WebSocket close code. Without the sub-app boundary making ownership clear, this orchestration logic would have been tangled up with admin concerns.
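A skeletal version of that orchestration, with injected stand-ins for the JWT check and session lookup. The close codes and names below are my guesses for illustration, not the project's actual values (4000-4999 is the application-defined range in RFC 6455):

```python
import asyncio
from dataclasses import dataclass

CLOSE_BAD_TOKEN = 4001   # JWT missing, malformed, or expired
CLOSE_NO_SESSION = 4004  # session unknown or expired

@dataclass
class Context:
    session_id: str
    tenant_id: str
    websocket: object

class FakeWebSocket:
    """Stand-in for a Starlette WebSocket; just records the close code."""
    def __init__(self):
        self.close_code = None

    async def close(self, code):
        self.close_code = code

class SessionManager:
    def __init__(self, verify_jwt, load_session):
        # Injected callables; real versions hit the JWT library and the DB.
        self.verify_jwt = verify_jwt
        self.load_session = load_session

    async def establish_session(self, websocket, token):
        payload = self.verify_jwt(token)
        if payload is None:
            await websocket.close(code=CLOSE_BAD_TOKEN)
            return None
        session = self.load_session(payload["session_id"])
        if session is None:
            await websocket.close(code=CLOSE_NO_SESSION)
            return None
        return Context(session["id"], session["tenant_id"], websocket)
```

Each failure path closes the socket itself with a distinct code and returns None, which is what lets the endpoint body collapse to the three-line guard shown above.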
What I'd Do Differently
Rate limiting per publishable key from day one. Right now, rate limiting happens per session. But a bad actor with the publishable key could create sessions in a loop. I'd add a rate limit at the key level — say, 100 session creations per hour per publishable key — before it becomes a problem.
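A per-key limit could be as simple as a sliding window over session-creation timestamps, keyed by the key hash. This is an in-memory sketch only; a production version would live in Redis so it holds across workers:

```python
import time
from collections import defaultdict, deque

class KeyRateLimiter:
    """Sliding-window limiter: at most `limit` events per `window_seconds`
    for each key. In-memory, single-process; illustrative only."""

    def __init__(self, limit: int = 100, window_seconds: float = 3600):
        self.limit = limit
        self.window = window_seconds
        self.events = defaultdict(deque)

    def allow(self, key_hash: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.events[key_hash]
        # Drop timestamps that have fallen out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True
```

Keying on the hash rather than the raw key means the limiter can share storage with the api_keys lookup without ever holding plaintext keys.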
A tighter localStorage cleanup strategy. The namespaced keys work, but old sessions accumulate. I'd add a TTL check on widget initialization that clears stale entries — anything with an expired token older than 24 hours.
Consider key rotation automation. Right now, key rotation is manual: revoke the old one, generate a new one, update the widget embed. For enterprise tenants, an automated rotation flow with a grace period (both old and new keys active for 24 hours) would reduce friction.
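The grace-period idea can be modeled with a revoke_after timestamp instead of flipping is_active immediately. The field names here are hypothetical, not the schema above:

```python
from datetime import datetime, timedelta

GRACE = timedelta(hours=24)

def rotate(old_key: dict, now: datetime) -> None:
    # Don't kill the old key outright; give cached embeds 24h to update.
    old_key["revoke_after"] = now + GRACE

def is_usable(key: dict, now: datetime) -> bool:
    if not key.get("is_active", False):
        return False
    revoke_after = key.get("revoke_after")
    return revoke_after is None or now < revoke_after
```

Validation would check `is_usable` instead of is_active alone, so both the old and new key authenticate during the overlap and the old one expires on its own.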
Recap
Here's the chain of decisions, each one causing the next:
- The widget needed to survive tab closes → localStorage, not sessionStorage.
- Two tenants on the same origin would collide → namespace storage keys with the API key prefix.
- The API key is permanently public in the widget → it shouldn't have admin permissions.
- You need admin operations too → create a second key type with separate scopes.
- Two key types need lifecycle tracking → dedicated api_keys table, not columns on the tenant.
- Two authentication contexts → two sub-apps with isolated middleware and routes.
None of this was planned upfront. Each decision was forced by the one before it. That's usually a sign you're solving real problems rather than imaginary ones.
If you've hit similar decisions building multi-tenant systems or solved them differently, I'd love to hear about it in the comments.
This is Part 2 in a series on building a multi-tenant AI chatbot platform. Part 1 covered How I Solved WebSocket Authentication in FastAPI (And Why Depends() Wasn't Enough). The stack is Python/FastAPI, Supabase (Postgres + pgvector), and Gemini/OpenAI for LLM and embeddings.