Author: Prithvi S
Date: April 23, 2026
Topic: Apache Polaris, Credential Vending, Security Architecture
Word Count: ~2,100 words
The Problem Nobody Wants to Talk About
You're a data engineer at a mid-sized company. Your team needs access to production data for analytics, ML pipelines, and ad-hoc queries. So you do what everyone does: you create long-lived AWS credentials (access key + secret), store them in a vault (or worse, environment variables), and distribute them to your team.
Then you pray.
You pray nobody copies them to Slack. You pray an engineer doesn't accidentally commit them to GitHub. You pray that when someone leaves, you remember to rotate them. You pray a compromised machine doesn't expose them to attackers.
This is the status quo. And it's broken.
For years, data catalogs have accepted this reality: want to access data? Here's a credential. Use it however you want. Rotation? Access control? Audit trails? Maybe in the next version.
Apache Polaris just threw that playbook in the trash.
Instead of handing out credentials, Polaris mints them on demand. Every request for data gets a fresh, short-lived token scoped to exactly what's needed. No long-lived keys. No distribution. No prayer.
This is credential vending. And it's about to change how we think about data security.
What Is Credential Vending?
Credential vending is simple in concept but elegant in execution: instead of pre-issuing static credentials, a system dynamically generates temporary, scoped credentials when you request access to data.
Here's how it works in Polaris:
- You request access: A Spark engine asks Polaris for permission to read a specific table
- Polaris authorizes: It checks your identity, verifies your role, confirms you have TABLE_READ_DATA privilege
- Polaris mints a credential: It calls AWS STS, GCS token service, or Azure token service and gets a temporary credential
- Polaris scopes it: That credential is locked to the specific table path, read-only, and expires in ~15 minutes
- Your engine uses it: Spark gets the scoped token, reads exactly what it needs, then the credential expires automatically
No long-lived keys. No distribution. No rotation headaches.
The genius is in the details.
Why This Matters: The Security Cascade
1. No Long-Lived Credentials to Steal
Traditional approach: You create an AWS access key with read+write permissions to your S3 bucket. You give it to 20 engineers. It lives in vaults, notebooks, CI/CD pipelines. Anywhere there's a copy, there's a vulnerability.
Attack surface = number of copies × time each copy exists.
Polaris approach: Your team never touches long-lived credentials. The only keys that exist are:
- Polaris' own cloud provider credentials (stored securely, rotated regularly)
- Temporary tokens minted per-request, valid for 15 minutes, then deleted
Attack surface = 1 system × 15 minutes at a time.
That's an orders-of-magnitude reduction in exposure.
2. Instant Revocation
With long-lived credentials, revoking access means updating IAM policies, rotating keys, or waiting for vault secrets to refresh. By then, a compromised key might already be in use.
With credential vending, revocation is near-instant: Polaris stops issuing credentials for that principal, and their next data request fails. The worst case is the remaining lifetime of any already-issued token, at most ~15 minutes, instead of the days or weeks a long-lived key can linger.
3. Path-Level Scoping
Your Spark job needs to read s3://data-lake/customers/transactions. With traditional credentials, you'd give it broad S3 access. Polaris? It mints a credential valid only for that exact table path.
Even if that credential leaks, an attacker can only read transactions, not your employee records, financial data, or anything else in the bucket.
Write operations get the same treatment: TABLE_WRITE_DATA privilege generates a credential that can only write to that specific table, not drop it, not truncate it, not write to other tables.
Privilege mapped to cloud permissions. Boundaries enforced at the storage layer.
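The privilege-to-permission mapping described above can be written down as a small table. The privilege names (TABLE_READ_DATA, TABLE_WRITE_DATA) are real Polaris grants; the dict itself is a sketch of the idea, not Polaris internals.

```python
# Illustrative mapping from Polaris table privileges to the S3 actions
# a vended credential carries. Note that no privilege ever maps to
# s3:DeleteObject: deletes go through Polaris' own atomic metadata
# operations, never through a vended storage credential.
PRIVILEGE_TO_S3_ACTIONS = {
    "TABLE_READ_DATA":  ["s3:GetObject", "s3:ListBucket"],
    "TABLE_WRITE_DATA": ["s3:GetObject", "s3:ListBucket", "s3:PutObject"],
}

def allowed_actions(privilege: str) -> list[str]:
    """Return the cloud actions a credential for this privilege may use."""
    return PRIVILEGE_TO_S3_ACTIONS.get(privilege, [])
```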
4. Compliance & Audit Trail
Regulators (GDPR, HIPAA, SOX) love paper trails. Credential vending creates one automatically:
- Every credential vend is logged with who requested it, what table, when it was issued, when it expired
- You can correlate data access with user identity without relying on IAM logs that are often delayed or incomplete
- Breaches are traceable: "Which credentials were active when this data left the building?" Answer: the ones issued to that specific user for that specific 15-minute window
No more "we distributed keys, so we don't know who accessed what" handwaving.
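The audit record a vend produces boils down to a handful of fields. A sketch, with field names that are illustrative rather than Polaris' actual log schema:

```python
from datetime import datetime, timedelta, timezone

def audit_record(principal: str, table: str, privilege: str,
                 ttl_minutes: int = 15) -> dict:
    """Sketch of the audit trail one credential vend produces:
    who requested it, what table, when it was issued, when it expired."""
    issued = datetime.now(timezone.utc)
    return {
        "principal": principal,
        "table": table,
        "privilege": privilege,
        "issued_at": issued.isoformat(),
        "expires_at": (issued + timedelta(minutes=ttl_minutes)).isoformat(),
    }
```

Answering "which credentials were active when this data left the building?" becomes a filter over issued_at/expires_at windows rather than forensics across distributed key copies.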
Under the Hood: How Polaris Mints Credentials
Let's get technical. Here's the flow for an S3-backed Polaris catalog:
The Request
A Trino query hits your Polaris catalog asking to read table prod.warehouse.users. Polaris receives:
- Your identity (service principal or user)
- The table you want to access
- The operation (read, write, etc.)
Authorization Check
Polaris checks:
- Does this principal have a role assigned?
- Does that role have TABLE_READ_DATA privilege for this table (or its parent namespace)?
- If yes, proceed. If no, fail fast.
This is enforced by Polaris' two-tier RBAC model: Principal Roles (identity) separate from Catalog Roles (permissions). More on that in a future post, but the key insight: authorization happens before any credential is minted.
Credential Minting
Assuming authorization passes, Polaris looks up the storage configuration for this catalog. For S3, it has:
- An AWS account ID
- A role ARN (e.g., arn:aws:iam::123456789:role/polaris-data-lake)
- An external ID (for added security)
Polaris calls STS:AssumeRole with parameters:
- Role ARN
- Duration: 900 seconds (15 minutes)
- Session policy: A JSON policy restricting the token to s3:GetObject on paths matching s3://data-lake/prod/warehouse/users/*
AWS returns a temporary security credential (access key, secret key, session token).
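The AssumeRole parameters above can be assembled like this. The parameter names (RoleArn, ExternalId, DurationSeconds, Policy) are the real STS AssumeRole API fields; the session name and the helper itself are illustrative. A real implementation would pass this dict to boto3's sts_client.assume_role(**kwargs).

```python
import json

def build_assume_role_request(role_arn: str, external_id: str,
                              table_prefix: str) -> dict:
    """Build the STS AssumeRole parameters for a read-scoped vend."""
    session_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{table_prefix}/*"],
        }],
    }
    return {
        "RoleArn": role_arn,
        "RoleSessionName": "polaris-vend",        # illustrative name
        "ExternalId": external_id,                # confused-deputy protection
        "DurationSeconds": 900,                   # 15-minute lifetime
        "Policy": json.dumps(session_policy),     # narrows the role further
    }
```

The Policy parameter is the important part: the effective permissions are the intersection of the role's own policy and this session policy, so the token can never exceed what the role allows, only narrow it.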
Scoping the Credential
Here's where it gets clever. Polaris doesn't just pass through the STS response. It crafts a session policy that restricts the token further:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:ListBucket"],
"Resource": [
"arn:aws:s3:::data-lake/prod/warehouse/users/*"
]
}
]
}
For TABLE_WRITE_DATA, the policy includes s3:PutObject:
{
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:ListBucket", "s3:PutObject"],
"Resource": [
"arn:aws:s3:::data-lake/prod/warehouse/users/*"
]
}
Notice: s3:DeleteObject is never included. You can write to the table, but you can't delete it or its backing files. Polaris itself controls deletes through atomic metadata operations.
Response to Engine
Polaris returns the temporary credential to your Spark/Trino/Flink engine. The engine uses it to read data. After 15 minutes, the token expires automatically.
No revocation needed. No cleanup. No leftover keys.
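From the engine's side this is mostly invisible. With an Iceberg REST catalog, Spark can opt into vended credentials via a standard header from the Iceberg REST spec; a minimal configuration sketch (the catalog name, URI, and client credential are placeholders for your deployment):

```properties
spark.sql.catalog.polaris=org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.polaris.type=rest
spark.sql.catalog.polaris.uri=https://polaris.example.com/api/catalog
spark.sql.catalog.polaris.credential=<client-id>:<client-secret>
# Ask the catalog to vend storage credentials with each table load:
spark.sql.catalog.polaris.header.X-Iceberg-Access-Delegation=vended-credentials
```

With that in place, Spark receives the scoped token alongside the table metadata and never holds a long-lived storage key of its own.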
The Federation Game Changer (v1.3.0)
Here's where it gets even more interesting.
Many organizations don't run pure Iceberg. You have Snowflake for analytics, Delta Lake in Databricks, Hudi on Hadoop, and Iceberg in your data lake. Managing credentials across all of them is a nightmare.
Polaris v1.3.0 introduces federated credential vending.
Instead of each external system managing its own credentials, Polaris can mint credentials on behalf of external catalogs. Your Snowflake-to-Iceberg migration? Polaris handles credential vending for both. Your Databricks Delta table accessed through Polaris? Same story.
This is huge for:
- Data mesh architectures: One source of truth for credential vending across multiple catalog types
- Migrations: Seamlessly bridge old systems (Snowflake, Glue) with new ones (Iceberg) without credential sprawl
- Multi-cloud setups: Mint GCS credentials for BigQuery, S3 credentials for Iceberg, all from a single Polaris instance
Performance: The Caching Question
"But doesn't minting a credential for every request add latency?"
Yes, a little. An STS call typically takes on the order of 100-200 ms.
Polaris solves this with intelligent caching:
- For repeated requests from the same principal to the same table, Polaris reuses the cached credential (until near expiration)
- This reduces cloud provider API calls significantly
- The tradeoff: earlier revocation (e.g., if permissions change mid-session) requires a cache flush
For most workloads (batch jobs, dashboards, recurring queries), this is a net win. Latency is imperceptible; security is massively improved.
Implementing Credential Vending in Your Catalog
If you're building a catalog or evaluating options, here's what to look for:
Cloud-native credential generation: Does the system call your cloud provider's token service (STS, GCS, Azure), or does it generate its own tokens? Cloud-native is better (leverages existing IAM, auditable).
Scoping mechanism: Are credentials scoped to table paths, operations, or both? Path + operation = maximum security.
Expiration: How short can tokens be? 15 minutes is ideal for security; anything longer risks exposing stale credentials.
Caching strategy: How does the system balance revocation latency with performance? Intelligent caching (by principal + table) is the sweet spot.
Multi-cloud support: Do you need GCS, S3, and Azure all at once? Credential vending should work across cloud providers.
Polaris nails all five. That's why it's becoming the standard for open-source Iceberg deployments.
What's Next?
Credential vending is the foundation. On top of it, Polaris builds:
- Two-tier RBAC for fine-grained access control
- OPA integration for externalized policies
- Metrics reporting for observability
- Generic table support for non-Iceberg formats
But credential vending is the security core. It's why Polaris is uniquely positioned for zero-trust data architectures.
If your organization is scaling data access, if compliance is a concern, or if you're tired of rotating long-lived credentials, Polaris' credential vending approach is worth the migration.
No more praying.
Questions? Thoughts on credential vending, Polaris, or data security architecture?
Drop a comment below or find me on GitHub: https://github.com/iprithv
Image Placeholders
Image 1 (Hero): Stack of security keys/credentials with a red X through them, representing the elimination of long-lived keys. (Unsplash: "security-keys" or "credential-management")
Image 2 (Diagram): Timeline showing traditional credential lifecycle (created, distributed, used, rotated) vs Polaris credential vending (minted, used, expired). Shows the reduced attack surface visually.
Suggested Unsplash searches:
- "security credentials vault"
- "access control cloud"
- "cloud security architecture"
About the author: I'm Prithvi S, Staff Software Engineer at Cloudera and open-source enthusiast. Follow my work on GitHub.