I spent the past few weeks integrating Supabase as an external data source into Keboola Connection. The result is a complete driver with OAuth 2.0, automatic schema discovery, and Supabase Marketplace integration. I also hit some bugs that nearly drove me crazy -- here's the whole story.
Supabase + Keboola
Supabase is an open-source Firebase alternative built on PostgreSQL -- authentication, real-time subscriptions, storage, and most importantly a full-featured PostgreSQL database. Keboola is a data pipeline platform for ETL processes over various data sources.
The goal was simple: let users connect their Supabase project as an external data source in Keboola and have data automatically flow into their pipelines.
From CLI to Web Controller
Originally I wanted a CLI command to register OAuth credentials. But testing a full OAuth flow from the terminal is painful -- you're copying URLs back and forth, handling redirects manually, no visual feedback at all.
So I pivoted early: instead of CLI, I built a web-based test harness at /supabase/connect. A simple form where you enter client_id and client_secret, it generates the redirect URL, sends you to Supabase for authorization, and handles the callback. This turned out to be crucial -- the iterative debugging that followed would have been impossible from a terminal.
The Stateless OAuth Puzzle
The first real challenge came from Symfony's security architecture. Routes with #[AsPublicAction] run on a stateless firewall -- no session available. But OAuth flows typically need to store the PKCE verifier between the authorization request and the callback. Symfony threw: "Session was used while the request was declared stateless."
So what do you do? Encode everything -- client credentials, PKCE verifier, redirect URI -- directly into the OAuth state parameter, signed with HMAC-SHA256 using kernel.secret. The callback decodes the state, verifies the signature, and extracts the PKCE verifier. No session, no database, HMAC prevents tampering, and the state parameter inherently protects against CSRF. Elegant.
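The idea can be sketched in a few lines of Python (the actual implementation is PHP/Symfony, so every name here is illustrative -- SECRET stands in for kernel.secret, and the payload fields mirror what the flow needs to round-trip):

```python
import base64
import hashlib
import hmac
import json
import os

# Stand-in for Symfony's kernel.secret; never hard-code this in real life.
SECRET = b"kernel-secret-equivalent"

def b64url(data: bytes) -> str:
    """Base64url without padding, safe to embed in a query parameter."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def encode_state(payload: dict) -> str:
    """Pack the OAuth state: base64url(JSON) + '.' + HMAC-SHA256 signature."""
    body = b64url(json.dumps(payload, sort_keys=True).encode())
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def decode_state(state: str) -> dict:
    """Verify the signature before trusting anything inside the state."""
    body, sig = state.rsplit(".", 1)
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("state signature mismatch -- possible tampering")
    padded = body + "=" * (-len(body) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

# Authorization request: stash everything the callback will need.
state = encode_state({
    "client_id": "abc123",
    "pkce_verifier": b64url(os.urandom(32)),
    "redirect_uri": "https://app.example.com/supabase/callback",
})

# Callback: the full context round-trips without any session storage.
payload = decode_state(state)
```

Because the state is opaque to the browser and signed with a server-side secret, a user can see it but cannot forge or modify it -- which is exactly the property the CSRF protection relies on.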
OAuth API Quirks
Then came two smaller hurdles:
The approval_prompt mystery. The League OAuth2 library adds approval_prompt to authorization requests by default (a Google OAuth convention). Supabase rejected it: "Unrecognized key(s) in object: 'approval_prompt'." Fix: override getAuthorizationParameters() and filter it out.
Project vs. account OAuth apps. I created the OAuth app at project level (/project/{ref}/auth/oauth-apps) and kept getting "Unrecognized client_id." Turns out Supabase has two distinct OAuth scopes -- project-specific apps and integration apps at the account level (/account/integrations). For a marketplace integration with cross-project access, you need the latter. Easy to miss in the docs.
The PKCE Double-Handling Bug
This was the trickiest bug of the entire project. After solving the stateless flow and API quirks, token exchange kept failing: "Invalid or expired OAuth authorization."
I tried everything systematically:
- Detailed error logging -- same error
- HTTP Basic Auth for the token endpoint -- still nothing
- Accept: application/json header -- still nothing
- A completely custom exchangeCodeForTokens() bypassing the library -- still nothing
The authorization flow worked. The callback received the code. But token exchange kept returning the same cryptic error. I was going mad.
The breakthrough: the League OAuth2 library has built-in PKCE support. My code was also handling PKCE manually for the stateless flow. Two correct PKCE implementations running simultaneously -- the library sent one code_verifier, my stateless code sent a different one. Supabase saw a mismatch and rejected the exchange.
The fix? One line:
protected $pkceMethod = null; // Disable library's built-in PKCE
A classic integration bug -- two systems, each correct on its own, breaking when combined. Debugging took hours because every individual piece looked fine.
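For context on why the mismatch is fatal: PKCE (RFC 7636, S256 method) binds the token exchange to whichever verifier produced the code_challenge in the authorization request. A minimal Python sketch of the derivation makes the failure mode obvious -- two independently generated verifiers can never satisfy one challenge:

```python
import base64
import hashlib
import os

def make_verifier() -> str:
    # 32 random bytes -> 43-character base64url string, per RFC 7636
    return base64.urlsafe_b64encode(os.urandom(32)).rstrip(b"=").decode()

def challenge_s256(verifier: str) -> str:
    # S256 method: code_challenge = base64url(sha256(code_verifier))
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode()

verifier_a = make_verifier()  # the one my stateless code generated
verifier_b = make_verifier()  # the one the library generated on its own

# The authorization request carried challenge_s256(verifier_a)...
challenge = challenge_s256(verifier_a)

# ...so the token exchange must present verifier_a, and nothing else.
assert challenge_s256(verifier_a) == challenge  # correct exchange
assert challenge_s256(verifier_b) != challenge  # what Supabase kept seeing
```

When the library silently attached its own code_challenge to the authorization URL while my code sent its own code_verifier at exchange time, the server-side check above could never pass -- hence the generic "Invalid or expired OAuth authorization" with nothing else to go on.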
Two Connection Modes
The driver supports two ways to access Supabase data:
Direct PostgreSQL connection -- the classic approach via connection pooler (port 6543). Full SQL access, good for larger data volumes.
REST API (PostgREST) -- via Supabase REST endpoint with a service_role key. Simpler setup without exposing database passwords.
Both modes support encrypted credential storage with separate encryption keys for passwords, API keys, and OAuth tokens.
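Side by side, the two credential shapes look roughly like this (a Python sketch; the pooler hostname, username convention, and field names are illustrative assumptions -- only the 6543 port and the service_role key usage come from the text above):

```python
def direct_dsn(pooler_host: str, project_ref: str, password: str) -> str:
    """Direct PostgreSQL mode: a connection-pooler DSN on port 6543.
    The 'postgres.<ref>' username form is an assumption, not gospel."""
    return f"postgresql://postgres.{project_ref}:{password}@{pooler_host}:6543/postgres"

def rest_headers(service_role_key: str) -> dict:
    """REST (PostgREST) mode: the service_role key authenticates requests,
    so no database password ever leaves Supabase."""
    return {
        "apikey": service_role_key,
        "Authorization": f"Bearer {service_role_key}",
    }
```

The trade-off is the one described above: the DSN buys full SQL and bulk throughput at the cost of handling a database password; the header pair buys a simpler, password-free setup at the cost of going through PostgREST.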
Automatic Schema Discovery
After a successful OAuth connection, a SupabaseProjectSetupJob runs in the background:
- Calls ListSchemasCommand on the Supabase driver via protobuf
- Discovers available database schemas
- Creates an external bucket in Keboola for each schema
- Sets up auto-refresh every 6 hours
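The job's control flow, reduced to a hedged Python sketch (the three callables are stand-ins for the protobuf command and Keboola APIs, not real signatures):

```python
REFRESH_INTERVAL_HOURS = 6

def setup_project(list_schemas, create_bucket, schedule_refresh):
    """Sketch of SupabaseProjectSetupJob: discover schemas, register one
    external bucket per schema, then arm the periodic refresh.
    All three collaborators are injected, so this stays a pure skeleton."""
    buckets = []
    for schema in list_schemas():              # ListSchemasCommand over protobuf
        buckets.append(create_bucket(schema))  # one external bucket per schema
    schedule_refresh(hours=REFRESH_INTERVAL_HOURS)
    return buckets
```

Keeping the job a thin orchestrator like this is what the protobuf boundary enables: the driver owns schema discovery, Keboola owns bucket registration, and neither needs to know the other's internals.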
The protobuf communication is architecturally neat -- commands are defined in a shared storage-driver-common monorepo and implemented in storage-driver-postgres. Clean boundaries, no direct coupling.
Users see their data in Keboola immediately after connecting, no manual configuration needed.
Supabase Management API
The integration includes a client for the Supabase Management API:
- Project details (region, configuration)
- API keys (anon, service_role)
- Connection pooler configuration
- List of projects and organizations
This data is used during automatic credential setup after the OAuth flow.
The Full User Flow
Here's what the end-to-end experience looks like:
- User clicks "Connect Supabase" and authorizes via OAuth
- Callback stores tokens, redirects to setup page
- User picks their Supabase project, optionally provides a PostgreSQL DSN
- Keboola creates an organization and project with a Supabase backend
- Background job discovers tables, registers buckets, sets up auto-refresh
- User lands on their project dashboard with data already flowing
Scale
The entire driver was built in one week, roughly 60 hours of intensive work with Claude Code. The result is around 25,000 changes -- classes, controllers, migrations, tests, CLI commands, and documentation. From zero to a complete integration.
The workflow was highly iterative: figure out what's needed, implement it, test immediately in the browser, debug from real error messages. Fast feedback loops made it possible to crack even the PKCE double-handling issue in hours rather than days.
The AI agent massively accelerated repetitive patterns (controllers, DTOs, tests), boilerplate generation, and navigating the large Keboola codebase. But key architectural decisions and the security model obviously required human judgment.
What I Learned
- Stateless OAuth works -- encode everything in an HMAC-signed state parameter, no session needed
- Watch your libraries -- the PKCE double-handling taught me to always check what the library does automatically behind your back
- OAuth APIs differ wildly between providers -- approval_prompt, auth methods, app scopes, nothing is standard
- PKCE is a must for flows where the client secret can't be fully secured
- A web test harness beats CLI testing for OAuth flows by an order of magnitude
- AI agents change the game -- 25,000 changes in a week wouldn't have been possible without one
The integration is currently in pull request and going through code review. Looking forward to seeing it in production.