DEV Community

duke

[Databricks on AWS #2] RBAC on Databricks: Function-Role Groups, Workspace Assignment, and Why USER/ADMIN Isn't the Whole Story

📚 Series: Databricks on AWS (Part 2)

  1. Building a Databricks AI Platform on AWS
  2. RBAC with Function-Role Groups ← you are here
  3. Compute Governance: Pools, Policies, Clusters
  4. The BOOTSTRAP_TIMEOUT Mystery
  5. Fixing It with AWS PrivateLink
  6. How We Structure the Terraform

Part 1 built the environment. Now we hand out the keys: three account-level groups, two workspaces, and a permission model you mostly don't have to invent.

Here's the trap most Databricks RBAC posts fall into: they treat access control like a thing you design from scratch. You don't. Databricks already hands you USER and ADMIN at the workspace level, entitlements, object ACLs, and Unity Catalog grants — all built in. The only piece you actually create is the groups. Get that mental split right and RBAC stops feeling like a maze.

If you're standing up a fresh Databricks account and wondering where to draw the first lines, this is the layer to get right before you touch a single catalog.


The model in one line

Everything flows through groups:


User ──▶ Function-Role group ──▶ workspace (USER / ADMIN) + (later) data grants

A user gets nothing directly. They land in a function-role group, and the group carries the permissions. That indirection is the whole point — it means "who is this person" and "what can this role do" are two separate problems you can solve independently.

We started with the smallest set that still maps to real jobs:

| Group | Intent | Workspace level |
| --- | --- | --- |
| ai_admin | Platform admins: run the place | ADMIN |
| ai_engineer | ML / data engineers: build things | USER |
| ai_analyst | Analysts: read and query | USER |

Three groups. Not thirty. You can add ai_scientist, ai_guest, whatever later — each is one line of YAML plus an assignment. Resist the urge to pre-build a role for every hypothetical persona; churn kills that plan fast.

These are account-level groups, not workspace-local ones. That matters: one group definition can be assigned to many workspaces, which is exactly what you want when you have more than one.
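As a rough sketch of what those group definitions might look like in Terraform, assuming the Databricks provider is pointed at the account rather than at a workspace. The provider alias, the `databricks_account_id` variable, the inline `function_roles` list, and the `group_ids` output name are placeholders for illustration; the real setup reads its group list from YAML, which isn't shown here.

```hcl
terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

variable "databricks_account_id" {
  type = string # hypothetical variable name
}

# Account-level provider: groups created through it are account-level
# groups, not workspace-local ones. Authentication settings are omitted.
provider "databricks" {
  alias      = "account" # hypothetical alias
  host       = "https://accounts.cloud.databricks.com"
  account_id = var.databricks_account_id
}

locals {
  # Stand-in for the YAML file mentioned above: a new role is one more entry.
  function_roles = ["ai_admin", "ai_engineer", "ai_analyst"]
}

resource "databricks_group" "function_role" {
  provider = databricks.account
  for_each = toset(local.function_roles)

  display_name = each.key
}

# Group name => group ID, consumed by whatever assigns groups to workspaces.
output "group_ids" {
  value = { for name, group in databricks_group.function_role : name => group.id }
}
```

Note that nothing here lists members. The resource only declares that the group exists.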

Groups are the only thing you create

This is the part worth internalizing. Line up the permission layers and mark who owns each:

| Layer | Values | Who defines it |
| --- | --- | --- |
| Workspace assignment | USER / ADMIN | Databricks built-in |
| Entitlements | workspace_access, databricks_sql_access, allow_cluster_create, ... | Databricks built-in |
| Object ACLs | CAN_MANAGE / CAN_USE / CAN_ATTACH_TO / ... | Databricks built-in |
| Unity Catalog grants | USE CATALOG / SELECT / MODIFY / ALL PRIVILEGES | Databricks built-in |
| Function-role groups | ai_admin, ai_engineer, ... | You |

Four of the five rows are Databricks primitives. You don't design SELECT or CAN_MANAGE — they exist. What you design is the subjects: the groups those permissions attach to. Everything else in this series (entitlements in Part 3, Unity Catalog grants later) is you handing built-in permissions to the groups you made here.

So your IaC surface for RBAC is genuinely small. You define the groups in Terraform, and you wire them to workspaces. Membership — the actual humans — lives somewhere else. More on that in a second.

USER vs ADMIN: built-in, and you just pick

Once the groups exist, you assign each one to a workspace at a permission level. Databricks gives you exactly two at this layer:

  • ADMIN — full workspace admin. Manage users, clusters, settings, everything.
  • USER — can log in and work, no admin surface.

That's it. You're not inventing a permission scheme; you're choosing which of two built-in levels each group gets, per workspace. In Terraform this is databricks_mws_permission_assignment, an account-level resource (applied through an account-level provider, not a workspace one) that binds a group to a workspace at one of those levels.

With two workspaces — call them landing and pipeline — the matrix is small and readable:

| Group | landing | pipeline |
| --- | --- | --- |
| ai_admin | ADMIN | ADMIN |
| ai_engineer | USER | USER |
| ai_analyst | USER | USER |

Admins are admins everywhere; engineers and analysts are users everywhere. The point isn't the specific grid — it's that a whole workspace access policy is this compact when the subjects are groups instead of individuals.
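One way that matrix could be expressed in Terraform is sketched below. It assumes the `databricks/databricks` provider is already configured at the account level, and that group IDs and workspace IDs arrive as input maps; the variable names `group_ids` and `workspace_ids` and the `group_levels` local are illustrative, not taken from the original code.

```hcl
variable "group_ids" {
  type        = map(string) # hypothetical: group name => account-level group ID
  description = "IDs of the function-role groups."
}

variable "workspace_ids" {
  type        = map(string) # hypothetical: { landing = "...", pipeline = "..." }
  description = "Workspace name => workspace ID."
}

locals {
  # Workspace-level permission per group. Every group gets the same level
  # in both workspaces here, so one value per group is enough.
  group_levels = {
    ai_admin    = "ADMIN"
    ai_engineer = "USER"
    ai_analyst  = "USER"
  }

  # One entry per (group, workspace) pair, keyed like "ai_admin/landing".
  assignments = {
    for pair in setproduct(keys(local.group_levels), keys(var.workspace_ids)) :
    "${pair[0]}/${pair[1]}" => {
      group     = pair[0]
      workspace = pair[1]
    }
  }
}

resource "databricks_mws_permission_assignment" "this" {
  for_each = local.assignments

  workspace_id = var.workspace_ids[each.value.workspace]
  principal_id = var.group_ids[each.value.group]
  permissions  = [local.group_levels[each.value.group]]
}
```

If a group ever needs a different level in one workspace, `group_levels` would have to become a nested map keyed by workspace. For a uniform grid the flat map keeps the whole policy in three lines.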

Where membership actually lives (hint: not IaC)

Here's the decision that surprises people: user-to-group membership does not go in Terraform.

It's tempting. You've got groups in code, you've got assignments in code — why not list the members in code too? Because joiners and leavers churn constantly, and every one of them would be a pull request, a plan, an apply. You'd be running infrastructure deploys to onboard an intern. That's the wrong tool.

So membership lives in the Account Console / SCIM, managed by ops (or synced from your IdP):

  1. Account Console → User management → Users → Add user (company email = login ID).
  2. Open the target group → Members tab → Add members.

One gotcha worth calling out: add people on the Members tab, not the Roles tab. The Roles tab is account-level roles (account admin, etc.) — a completely different thing, and easy to click by accident. And if you have SSO/SCIM provisioning, let the IdP own membership; manual adds will fight the sync.

The clean split, then:

  • In IaC (rarely changes): group definitions, workspace assignments, and later the grants.
  • Out of IaC (changes daily): who's in each group.

Structure in code, people in the console. That line is the single most useful RBAC decision in this whole post.

The apply-order gotcha that eats an afternoon

In our Terraform (run through Terragrunt), this is split into two stacks: one that creates the groups, and one that creates the workspace assignments. The assignment stack depends on the group stack's output, because it needs the group IDs to bind them to workspaces.

Here's the trap. If you apply the assignment stack before the groups exist, it doesn't error. Terragrunt quietly resolves the dependency to its mock (empty) output and produces:

assignments = {}

An empty result. No groups, no bindings, no complaint. You think you shipped RBAC; you shipped nothing. Then you spend an hour wondering why nobody can log in.
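One possible way to make that empty result loud instead of silent is a validation rule on the input that carries the group IDs into the assignment module. This is a sketch, not the original code: the variable name `group_ids` is an assumption, and the check only helps if the module really receives the IDs as a map.

```hcl
variable "group_ids" {
  type        = map(string) # hypothetical: group name => account-level group ID
  description = "IDs of the function-role groups, produced by the groups stack."

  # Reject an empty map so a missing groups stack fails the run
  # instead of producing zero assignments.
  validation {
    condition     = length(var.group_ids) > 0
    error_message = "The group_ids map is empty. Apply the groups stack before the workspace assignments."
  }
}
```

The trade-off is that any plan run against an empty placeholder fails as well, which is arguably the behavior you want here.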

The order is non-negotiable:

# 1. groups FIRST
atlantis apply -d .../groups

# 2. THEN the assignment (references group IDs)
atlantis apply -d .../workspace/workspace-assignment

If you ever see assignments = {} on a plan you expected to be full, this is why: the group output wasn't there yet, the dependency fell back to its mock, and the plan built against thin air. Apply groups, then re-plan. It's the RBAC cousin of "create the table before you grant on it."
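Terragrunt can also enforce the ordering on its own. The sketch below assumes the assignment stack pulls the group IDs through a `dependency` block with `mock_outputs`; the relative `config_path`, the `group_ids` output name, and the file location are guesses for illustration. Restricting the mock with `mock_outputs_allowed_terraform_commands` means it can still be used for `validate` and `plan`, but an `apply` against a groups stack with no outputs fails instead of binding nothing.

```hcl
# terragrunt.hcl of the workspace-assignment stack (hypothetical layout)

dependency "groups" {
  config_path = "../../groups" # assumed relative path to the groups stack

  # Placeholder used only when the groups stack has no outputs yet.
  mock_outputs = {
    group_ids = {}
  }

  # Permit the placeholder for validate and plan only. On apply, a groups
  # stack without real outputs becomes an error rather than a silent mock.
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  group_ids = dependency.groups.outputs.group_ids
}
```

A plan that shows assignments = {} is then a visible hint that the mock was used, and the apply that would have shipped nothing stops before it starts.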

Takeaways

  • Groups are the only thing you invent. USER/ADMIN, entitlements, ACLs, and Unity Catalog grants are all Databricks built-ins — you attach them, you don't design them.
  • Function-role groups are account-level, so one definition assigns cleanly to many workspaces. Start with three (admin/engineer/analyst); add more as one-liners when you actually need them.
  • USER vs ADMIN is a built-in binary at the workspace layer — pick per group, per workspace, and let the group indirection keep the matrix tiny.
  • Membership belongs in the console/SCIM, not IaC. Joiner/leaver churn would turn onboarding into infrastructure deploys. Structure in code, people out of code.
  • Create groups before workspace assignment. Do it backwards and the dependency resolves to a mock, assignments = {} ships silently, and nobody can log in.

With the subjects in place and the workspaces handed out, the next question is what those users are actually allowed to do with compute — who can spin up a cluster, which entitlements gate SQL, and how you keep costs from getting out of hand. That's Part 3.

Next: Compute governance — entitlements, cluster policies, and keeping a self-serve platform from becoming a self-serve bill.
