Week 2 of a series on AWS Lake Formation — from fundamentals to real-world implementation
If you read Week 1, you know Lake Formation gives you a centralized place to manage data lake permissions. This week we get into the decision that will shape how your entire permission model scales — or doesn't.
Lake Formation gives you two ways to grant permissions: Named Resource access control (NBAC) and Tag-Based access control (TBAC). The AWS documentation calls these the named resource method and LF-TBAC. Most teams start with NBAC because it's intuitive. Most teams that scale past a few dozen tables eventually wish they hadn't.
Let me explain why.
Named Resource Access Control (NBAC): The Default
NBAC is exactly what it sounds like. You grant a permission by naming the resource explicitly.
Grant SELECT on database: analytics, table: orders → role: data_science_role
Simple. Readable. Works fine when you have 3 databases and 2 teams.
The problem is that permissions are attached to individual resources. Every time you create a new table, you have to go back and grant access to every role that should see it. New database? Same thing. Spinning up a new team that needs read access across 30 tables? Thirty grants.
At scale, NBAC becomes a maintenance problem disguised as a permissions model. You end up with hundreds of individual grants and no clear pattern, and the dreaded question "does team X have access to this table?" becomes genuinely difficult to answer.
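To make the scaling problem concrete, here is a minimal sketch that builds the request payloads for named-resource grants in the shape boto3's `grant_permissions` expects. The account ID, role names, and table names are made up for illustration, and the actual API call is left in a comment so the snippet runs without AWS credentials.

```python
from itertools import product


def named_table_grant(role_arn, database, table, permissions=("SELECT",)):
    """Build kwargs for one named-resource grant on a single table."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        "Permissions": list(permissions),
    }


# Hypothetical account ID, roles, and tables, for illustration only.
ROLES = [
    f"arn:aws:iam::111122223333:role/{name}"
    for name in ("data_science_role", "finance_analyst")
]
TABLES = [("analytics", f"table_{i:02d}") for i in range(30)]

# NBAC needs one grant per (role, table) pair.
grants = [
    named_table_grant(role, db, tbl)
    for role, (db, tbl) in product(ROLES, TABLES)
]
print(len(grants))  # 60

# To apply them (needs boto3 and Lake Formation admin rights):
#   import boto3
#   lf = boto3.client("lakeformation")
#   for kwargs in grants:
#       lf.grant_permissions(**kwargs)
```

Two roles and thirty tables already mean sixty grants, and every new table or role multiplies the list again.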
Tag-Based Access Control (TBAC): The Model That Actually Scales
TBAC flips the model. Instead of granting access to named resources, you:
- Define LF-tags (key-value pairs) and attach them to your databases, tables, and columns
- Grant principals permission to resources that match a tag expression
A concrete example. You tag all tables in your finance domain:
domain = finance
sensitivity = confidential
Then you grant:
role: finance_analyst → SELECT on resources where domain=finance AND sensitivity=confidential
Now every table you create in the future that gets tagged domain=finance, sensitivity=confidential is automatically accessible to finance_analyst. No additional grants. No drift. No manual catch-up.
The permission lives at the tag level, not the resource level.
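As a rough sketch, the finance example maps onto three boto3 Lake Formation calls: `create_lf_tag`, `add_lf_tags_to_resource`, and `grant_permissions` with an `LFTagPolicy` resource. The helper names, the role ARN, and the `finance.ledger` table are assumptions for illustration. The code only builds the request payloads, with the real calls shown in comments.

```python
FINANCE_TAGS = {"domain": "finance", "sensitivity": "confidential"}


def create_tag_requests(taxonomy):
    """kwargs for create_lf_tag: one call per tag key with its allowed values."""
    return [
        {"TagKey": key, "TagValues": list(values)}
        for key, values in taxonomy.items()
    ]


def tag_table_request(database, table, tags):
    """kwargs for add_lf_tags_to_resource on a single table."""
    return {
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        "LFTags": [{"TagKey": k, "TagValues": [v]} for k, v in tags.items()],
    }


def tag_policy_grant(principal_arn, tags, permissions=("SELECT",)):
    """kwargs for grant_permissions on every table matching all of `tags`."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "LFTagPolicy": {
                "ResourceType": "TABLE",
                # Entries in the expression are ANDed together.
                "Expression": [
                    {"TagKey": k, "TagValues": [v]} for k, v in tags.items()
                ],
            }
        },
        "Permissions": list(permissions),
    }


# Hypothetical role ARN and table name.
ANALYST = "arn:aws:iam::111122223333:role/finance_analyst"
definitions = create_tag_requests(
    {"domain": ["finance"], "sensitivity": ["confidential"]}
)
tagging = tag_table_request("finance", "ledger", FINANCE_TAGS)
grant = tag_policy_grant(ANALYST, FINANCE_TAGS)

# With boto3 and sufficient rights, roughly:
#   lf = boto3.client("lakeformation")
#   for d in definitions: lf.create_lf_tag(**d)
#   lf.add_lf_tags_to_resource(**tagging)
#   lf.grant_permissions(**grant)
```

The grant is issued once. Only the tagging call is repeated when a new table appears.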
The Three Things That Make TBAC Worth It
1. New resources inherit access automatically
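This is the one that matters most in a fast-moving data environment. When your data engineering team creates a new table in the finance database and tags it correctly, access provisioning is done. The grants already exist. Tables also inherit LF-tags from their database by default (and columns from their table), so tags set at the database level carry over to new tables unless you override them.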
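The effect is easy to simulate locally. The snippet below is a toy model of tag-expression matching, not Lake Formation's actual evaluator: it assumes keys in an expression are ANDed and values for a key are ORed, and the table names are invented.

```python
def matches(expression, resource_tags):
    """True if the resource satisfies every key in the expression.

    expression: {tag_key: set of accepted values}; keys ANDed, values ORed.
    resource_tags: {tag_key: value} as attached to one table.
    """
    return all(
        resource_tags.get(key) in accepted
        for key, accepted in expression.items()
    )


def visible_tables(expression, catalog):
    """Names of all tables in the catalog that match the expression."""
    return sorted(
        name for name, tags in catalog.items() if matches(expression, tags)
    )


# The grant is defined once, against tags rather than tables.
grant_expression = {"domain": {"finance"}, "sensitivity": {"confidential"}}

# Hypothetical catalog: table name -> attached tags.
catalog = {
    "finance.ledger": {"domain": "finance", "sensitivity": "confidential"},
    "marketing.campaigns": {"domain": "marketing", "sensitivity": "internal"},
}

before = visible_tables(grant_expression, catalog)

# A new table arrives and is tagged; the grant itself is not touched.
catalog["finance.invoices_2025"] = {
    "domain": "finance",
    "sensitivity": "confidential",
}
after = visible_tables(grant_expression, catalog)

print(before)  # ['finance.ledger']
print(after)   # ['finance.invoices_2025', 'finance.ledger']
```

The new table becomes visible without any change to `grant_expression`, which is the whole point of the tag-based model.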
2. Cross-account sharing becomes tractable
In a multi-account architecture — say a central data lake account sharing to consumer accounts — NBAC requires you to manage explicit grants across account boundaries for every resource. With TBAC, you share tag definitions and tag expressions. One cross-account grant covering a tag expression handles an entire domain's worth of tables.
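A hedged sketch of what the producer-side grants might look like as boto3 `grant_permissions` payloads: one that lets the consumer account see the tag values, and one that covers every table matching the expression. The account ID and helper names are hypothetical, and this is only a fragment of the full cross-account sequence, which involves more steps on both sides.

```python
CONSUMER_ACCOUNT_ID = "444455556666"  # hypothetical consumer account


def share_tag_values(account_id, tag_key, tag_values):
    """kwargs for grant_permissions: let another account see an LF-tag."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": account_id},
        "Resource": {"LFTag": {"TagKey": tag_key, "TagValues": list(tag_values)}},
        "Permissions": ["DESCRIBE"],
    }


def share_tagged_tables(account_id, tags, permissions=("SELECT", "DESCRIBE")):
    """kwargs for grant_permissions: one grant covering all matching tables."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": account_id},
        "Resource": {
            "LFTagPolicy": {
                "ResourceType": "TABLE",
                "Expression": [
                    {"TagKey": k, "TagValues": [v]} for k, v in tags.items()
                ],
            }
        },
        "Permissions": list(permissions),
        # Assumption: the consumer admin should be able to re-grant
        # to principals inside the consumer account.
        "PermissionsWithGrantOption": list(permissions),
    }


tag_share = share_tag_values(CONSUMER_ACCOUNT_ID, "domain", ["finance"])
table_share = share_tagged_tables(CONSUMER_ACCOUNT_ID, {"domain": "finance"})
```

However many finance tables exist, the number of cross-account grants stays the same.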
3. Permissions become auditable and intentional
With NBAC, understanding "who has access to what" requires reading through hundreds of individual grants. With TBAC, your access model is expressed in a handful of tag expressions. It reads like policy. It's reviewable. It's explainable to stakeholders.
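One way to see the difference is to turn grant entries into a per-principal report. The sketch below assumes entries shaped like the `PrincipalResourcePermissions` items returned by boto3's `list_permissions`. The sample data is hand-written for illustration, not real output.

```python
from collections import defaultdict


def describe_resource(resource):
    """Render one Resource dict as a short human-readable string."""
    if "LFTagPolicy" in resource:
        clauses = []
        for tag in resource["LFTagPolicy"]["Expression"]:
            values = "|".join(tag["TagValues"])
            clauses.append(f"{tag['TagKey']}={values}")
        return "tags: " + " AND ".join(clauses)
    if "Table" in resource:
        table = resource["Table"]
        return f"table: {table['DatabaseName']}.{table['Name']}"
    return "other"


def access_report(entries):
    """Group grants by principal: {principal: [readable grant, ...]}."""
    report = defaultdict(list)
    for entry in entries:
        principal = entry["Principal"]["DataLakePrincipalIdentifier"]
        perms = ",".join(entry["Permissions"])
        report[principal].append(
            f"{perms} on {describe_resource(entry['Resource'])}"
        )
    return {principal: sorted(lines) for principal, lines in report.items()}


# Hand-written sample entries with hypothetical ARNs.
FINANCE = "arn:aws:iam::111122223333:role/finance_analyst"
SCIENCE = "arn:aws:iam::111122223333:role/data_science_role"
SAMPLE = [
    {
        "Principal": {"DataLakePrincipalIdentifier": FINANCE},
        "Resource": {
            "LFTagPolicy": {
                "ResourceType": "TABLE",
                "Expression": [
                    {"TagKey": "domain", "TagValues": ["finance"]},
                    {"TagKey": "sensitivity", "TagValues": ["confidential"]},
                ],
            }
        },
        "Permissions": ["SELECT"],
    },
    {
        "Principal": {"DataLakePrincipalIdentifier": SCIENCE},
        "Resource": {"Table": {"DatabaseName": "analytics", "Name": "orders"}},
        "Permissions": ["SELECT"],
    },
]

report = access_report(SAMPLE)
for principal, lines in report.items():
    print(principal, lines)
```

A TBAC principal's report stays a few lines long no matter how many tables sit behind each expression, while an NBAC principal gets one line per table.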
When NBAC Still Makes Sense
TBAC isn't always the right choice. Use NBAC when:
- You have a small number of resources (under 20–30 tables) with stable access patterns
- You need one-off, exception-style grants — specific roles that need access to exactly one table for a specific reason
- You're just getting started and want to validate your Lake Formation setup before investing in a tagging taxonomy
NBAC is also the right tool for mixing with TBAC. Your baseline access model can be TBAC-driven; edge cases and exceptions can be handled with targeted NBAC grants on top.
The Mistake Most Teams Make
They start with NBAC because it's intuitive and the AWS console makes it the default path. Six months later they have 200+ grants, no coherent structure, and a bloated permissions model that nobody wants to touch.
Refactoring from NBAC to TBAC after the fact is painful. You're redesigning your taxonomy, re-tagging resources, migrating grants, and hoping nothing breaks in the process.
If you're building a new data lake or are early in your Lake Formation adoption: design your tag taxonomy first. Even a simple one — domain, sensitivity, environment — gives you a structure that will hold as you scale.
What a Minimal Tag Taxonomy Looks Like
You don't need to over-engineer this. A starting point:
| Tag Key | Example Values |
|---|---|
| domain | finance, marketing, ops, engineering |
| sensitivity | public, internal, confidential, restricted |
| environment | dev, staging, prod |
Three tag keys. That's enough to cover most access patterns in a mid-sized organization. Add more only when you have a concrete reason.
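If you keep the taxonomy in code, you can check tags before they ever reach Lake Formation. This is a minimal sketch. The rule that every resource must carry all three keys is my assumption, not a Lake Formation requirement, so relax it if your model allows untagged dimensions.

```python
TAXONOMY = {
    "domain": ["finance", "marketing", "ops", "engineering"],
    "sensitivity": ["public", "internal", "confidential", "restricted"],
    "environment": ["dev", "staging", "prod"],
}


def validate_tags(tags, taxonomy=TAXONOMY):
    """Return a list of problems; an empty list means the tags conform."""
    problems = []
    # Assumed policy: every key in the taxonomy must be present.
    for key in taxonomy:
        if key not in tags:
            problems.append(f"missing tag key: {key}")
    for key, value in tags.items():
        if key not in taxonomy:
            problems.append(f"unknown tag key: {key}")
        elif value not in taxonomy[key]:
            problems.append(f"invalid value for {key}: {value}")
    return problems


good = validate_tags(
    {"domain": "finance", "sensitivity": "confidential", "environment": "prod"}
)
bad = validate_tags({"domain": "sales", "sensitivity": "internal"})

print(good)  # []
print(bad)   # ['missing tag key: environment', 'invalid value for domain: sales']
```

A check like this fits naturally in whatever pipeline creates tables, so a mistyped tag value fails loudly instead of silently leaving a table outside every grant.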
Coming Up
Next week: cross-account data sharing in Lake Formation. The documentation makes it look straightforward. It isn't. I'll walk through the actual architecture, the grant sequence that trips people up, and a real bug I hit that cost me hours to trace.
Follow along if you want the practical version of Lake Formation — not the tutorial, the real thing.