<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Soumyadeep Basu</title>
    <description>The latest articles on DEV Community by Soumyadeep Basu (@soumyadeep_basu_3101d3ac1).</description>
    <link>https://dev.to/soumyadeep_basu_3101d3ac1</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3870577%2F1faa3124-25b0-4484-b519-5d8789705689.jpg</url>
      <title>DEV Community: Soumyadeep Basu</title>
      <link>https://dev.to/soumyadeep_basu_3101d3ac1</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/soumyadeep_basu_3101d3ac1"/>
    <language>en</language>
    <item>
      <title>AWS Lake Formation: TBAC vs NBAC — The Permission Model Decision That Will Define Your Data Lake</title>
      <dc:creator>Soumyadeep Basu</dc:creator>
      <pubDate>Thu, 16 Apr 2026 18:01:34 +0000</pubDate>
      <link>https://dev.to/soumyadeep_basu_3101d3ac1/aws-lake-formation-tbac-vs-nbac-the-permission-model-decision-that-will-define-your-data-lake-38l7</link>
      <guid>https://dev.to/soumyadeep_basu_3101d3ac1/aws-lake-formation-tbac-vs-nbac-the-permission-model-decision-that-will-define-your-data-lake-38l7</guid>
      <description>&lt;p&gt;&lt;em&gt;Week 2 of a series on AWS Lake Formation — from fundamentals to real-world implementation&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;If you read Week 1, you know Lake Formation gives you a centralized place to manage data lake permissions. This week we get into the decision that will shape how your entire permission model scales — or doesn't.&lt;/p&gt;

&lt;p&gt;Lake Formation gives you two ways to grant permissions: &lt;strong&gt;Named Resource&lt;/strong&gt; (NBAC) and &lt;strong&gt;Tag-Based&lt;/strong&gt; (TBAC). Most teams start with NBAC because it's intuitive. Most teams that scale past a few dozen tables eventually wish they hadn't.&lt;/p&gt;

&lt;p&gt;Let me explain why.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Named Resource Access Control (NBAC): The Default&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;NBAC is exactly what it sounds like. You grant a permission by naming the resource explicitly.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Grant SELECT on database: analytics, table: orders → role: data_science_role
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Simple. Readable. Works fine when you have 3 databases and 2 teams.&lt;/p&gt;

&lt;p&gt;The problem is that permissions are &lt;strong&gt;attached to individual resources&lt;/strong&gt;. Every time you create a new table, you have to go back and grant access to every role that should see it. New database? Same thing. Spinning up a new team that needs read access across 30 tables? Thirty grants.&lt;/p&gt;

&lt;p&gt;At scale, NBAC becomes a maintenance problem disguised as a permissions model. You end up with hundreds of individual grants and no clear pattern, and the dreaded question "does team X have access to this table?" becomes genuinely difficult to answer.&lt;/p&gt;
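&lt;p&gt;To make the growth concrete, here is a minimal sketch that builds one named-resource grant payload per (role, table) pair. The dict layout follows the keyword arguments of boto3's &lt;code&gt;lakeformation.grant_permissions&lt;/code&gt;; the account ID, role names, and table names are hypothetical, and nothing is sent to AWS.&lt;/p&gt;

```python
from itertools import product


def nbac_grant_requests(role_arns, database, tables, permissions=("SELECT",)):
    """Build one named-resource grant payload per (role, table) pair.

    The dict layout follows the keyword arguments of boto3's
    lakeformation.grant_permissions; nothing is sent to AWS here.
    """
    return [
        {
            "Principal": {"DataLakePrincipalIdentifier": role_arn},
            "Resource": {"Table": {"DatabaseName": database, "Name": table}},
            "Permissions": list(permissions),
        }
        for role_arn, table in product(role_arns, tables)
    ]


# Hypothetical account ID, roles, and table names, for illustration only.
roles = [
    f"arn:aws:iam::111122223333:role/{name}"
    for name in ("data_science_role", "analytics_role")
]
tables = [f"table_{i:02d}" for i in range(30)]

requests = nbac_grant_requests(roles, "analytics", tables)
print(len(requests))  # 2 roles x 30 tables = 60 separate grants
```

&lt;p&gt;Every new role or table multiplies the list, which is why the grant count tends to run ahead of anyone's ability to review it.&lt;/p&gt;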




&lt;p&gt;&lt;strong&gt;Tag-Based Access Control (TBAC): The Model That Actually Scales&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;TBAC flips the model. Instead of granting access to named resources, you:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Attach &lt;strong&gt;LF-tags&lt;/strong&gt; (key-value pairs) to your databases, tables, and columns&lt;/li&gt;
&lt;li&gt;Grant &lt;strong&gt;principals permission to resources that match a tag expression&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A concrete example. You tag all tables in your finance domain:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="py"&gt;domain&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;finance&lt;/span&gt;
&lt;span class="py"&gt;sensitivity&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;confidential&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then you grant:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;role: finance_analyst → SELECT on resources where domain=finance AND sensitivity=confidential
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now every table you create in the future that gets tagged &lt;code&gt;domain=finance, sensitivity=confidential&lt;/code&gt; is &lt;strong&gt;automatically accessible&lt;/strong&gt; to &lt;code&gt;finance_analyst&lt;/code&gt;. No additional grants. No drift. No manual catch-up.&lt;/p&gt;

&lt;p&gt;The permission lives at the tag level, not the resource level.&lt;/p&gt;
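&lt;p&gt;As a rough sketch of what that grant looks like in API terms, the helper below builds a payload in the shape boto3's &lt;code&gt;lakeformation.grant_permissions&lt;/code&gt; accepts for an LF-tag policy. The role ARN and account ID are hypothetical, the tags are assumed to exist already, and the function only builds the dict without calling AWS.&lt;/p&gt;

```python
def tbac_grant_request(role_arn, expression, permissions=("SELECT",)):
    """Build a tag-based grant payload for tables matching the expression.

    expression maps each tag key to the list of values it accepts.
    Layout follows boto3's lakeformation.grant_permissions keyword
    arguments (LFTagPolicy); no AWS call is made here.
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {
            "LFTagPolicy": {
                "ResourceType": "TABLE",
                "Expression": [
                    {"TagKey": key, "TagValues": list(values)}
                    for key, values in expression.items()
                ],
            }
        },
        "Permissions": list(permissions),
    }


# Hypothetical role ARN; tag keys and values come from the example above.
request = tbac_grant_request(
    "arn:aws:iam::111122223333:role/finance_analyst",
    {"domain": ["finance"], "sensitivity": ["confidential"]},
)
```

&lt;p&gt;Note that the payload names no table at all. That is the whole point: the grant refers to tags, and the set of tables behind those tags can change freely.&lt;/p&gt;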




&lt;p&gt;&lt;strong&gt;The Three Things That Make TBAC Worth It&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. New resources inherit access automatically&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the one that matters most in a fast-moving data environment. When your data engineering team creates a new table in the &lt;code&gt;finance&lt;/code&gt; database and tags it correctly, access provisioning is done. The grants already exist.&lt;/p&gt;
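&lt;p&gt;A toy model of that behaviour, assuming the usual tag-expression semantics (keys are ANDed, values under one key are ORed). This is an illustration in plain Python, not Lake Formation's evaluation engine, and the catalog contents are hypothetical.&lt;/p&gt;

```python
def matches(expression, resource_tags):
    """Return True when every key in the expression is satisfied.

    expression: dict of tag key to the set of accepted values.
    resource_tags: dict of tag key to the single value on the resource.
    """
    return all(
        resource_tags.get(key) in values for key, values in expression.items()
    )


def accessible_tables(expression, catalog):
    """Names of catalog tables whose tags satisfy the expression."""
    return sorted(
        name for name, tags in catalog.items() if matches(expression, tags)
    )


# Hypothetical catalog and grant, mirroring the finance example above.
catalog = {
    "ledger": {"domain": "finance", "sensitivity": "confidential"},
    "campaigns": {"domain": "marketing", "sensitivity": "internal"},
}
finance_analyst_grant = {"domain": {"finance"}, "sensitivity": {"confidential"}}

before = accessible_tables(finance_analyst_grant, catalog)

# A new table arrives with the right tags; the grant itself is untouched.
catalog["invoices"] = {"domain": "finance", "sensitivity": "confidential"}
after = accessible_tables(finance_analyst_grant, catalog)

print(before)  # ['ledger']
print(after)   # ['invoices', 'ledger']
```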

&lt;p&gt;&lt;strong&gt;2. Cross-account sharing becomes tractable&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In a multi-account architecture — say a central data lake account sharing to consumer accounts — NBAC requires you to manage explicit grants across account boundaries for every resource. With TBAC, you share tag definitions and tag expressions. One cross-account grant covering a tag expression handles an entire domain's worth of tables.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Permissions become auditable and intentional&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With NBAC, understanding "who has access to what" requires reading through hundreds of individual grants. With TBAC, your access model is expressed in a handful of tag expressions. It reads like policy. It's reviewable. It's explainable to stakeholders.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;When NBAC Still Makes Sense&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;TBAC isn't always the right choice. Use NBAC when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have a small number of resources (under 20–30 tables) with stable access patterns&lt;/li&gt;
&lt;li&gt;You need one-off, exception-style grants — specific roles that need access to exactly one table for a specific reason&lt;/li&gt;
&lt;li&gt;You're just getting started and want to validate your Lake Formation setup before investing in a tagging taxonomy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;NBAC also &lt;strong&gt;combines well with TBAC&lt;/strong&gt;. Your baseline access model can be TBAC-driven, with edge cases and exceptions handled by targeted NBAC grants on top.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The Mistake Most Teams Make&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;They start with NBAC because it's intuitive and the AWS console makes it the default path. Six months later they have 200+ grants, no coherent structure, and a bloated permissions model that nobody wants to touch.&lt;/p&gt;

&lt;p&gt;Refactoring from NBAC to TBAC after the fact is painful. You're redesigning your taxonomy, re-tagging resources, migrating grants, and hoping nothing breaks in the process.&lt;/p&gt;

&lt;p&gt;If you're building a new data lake or are early in your Lake Formation adoption: &lt;strong&gt;design your tag taxonomy first&lt;/strong&gt;. Even a simple one — &lt;code&gt;domain&lt;/code&gt;, &lt;code&gt;sensitivity&lt;/code&gt;, &lt;code&gt;environment&lt;/code&gt; — gives you a structure that will hold as you scale.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;What a Minimal Tag Taxonomy Looks Like&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You don't need to over-engineer this. A starting point:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tag Key&lt;/th&gt;
&lt;th&gt;Example Values&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;domain&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;finance, marketing, ops, engineering&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sensitivity&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;public, internal, confidential, restricted&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;environment&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;dev, staging, prod&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Three tag keys. That's enough to cover most access patterns in a mid-sized organization. Add more only when you have a concrete reason.&lt;/p&gt;
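&lt;p&gt;One way to keep a taxonomy like this honest is to check proposed tags against it before they are applied. The sketch below is plain Python with no AWS calls. It assumes every table must carry all three keys, which is a policy choice for this example rather than a Lake Formation requirement, and the function name is hypothetical.&lt;/p&gt;

```python
# Allowed values per tag key, copied from the table above.
TAXONOMY = {
    "domain": {"finance", "marketing", "ops", "engineering"},
    "sensitivity": {"public", "internal", "confidential", "restricted"},
    "environment": {"dev", "staging", "prod"},
}


def taxonomy_errors(tags, taxonomy=TAXONOMY):
    """Return a list of problems with a proposed tag assignment.

    Assumes (for this sketch) that every key in the taxonomy is mandatory.
    """
    errors = []
    for key in sorted(taxonomy.keys() - tags.keys()):
        errors.append(f"missing tag key: {key}")
    for key, value in sorted(tags.items()):
        if key not in taxonomy:
            errors.append(f"unknown tag key: {key}")
        elif value not in taxonomy[key]:
            errors.append(f"invalid value for {key}: {value}")
    return errors


good = {"domain": "finance", "sensitivity": "internal", "environment": "prod"}
bad = {"domain": "finance", "sensitivity": "secret"}

print(taxonomy_errors(good))  # []
print(taxonomy_errors(bad))
```

&lt;p&gt;A check like this could run in CI or in a table-creation script, so a mistyped tag value fails loudly instead of silently leaving a table outside every grant.&lt;/p&gt;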




&lt;p&gt;&lt;strong&gt;Coming Up&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Next week: cross-account data sharing in Lake Formation. The documentation makes it look straightforward. It isn't. I'll walk through the actual architecture, the grant sequence that trips people up, and a real bug I hit that cost me hours to trace.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow along if you want the practical version of Lake Formation — not the tutorial, the real thing.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>dataengineering</category>
      <category>terraform</category>
      <category>cloud</category>
    </item>
    <item>
      <title>AWS Lake Formation: Why Your Data Lake Permissions Are Probably a Mess (And How to Fix That)</title>
      <dc:creator>Soumyadeep Basu</dc:creator>
      <pubDate>Thu, 09 Apr 2026 21:34:51 +0000</pubDate>
      <link>https://dev.to/soumyadeep_basu_3101d3ac1/aws-lake-formation-why-your-data-lake-permissions-are-probably-a-mess-and-how-to-fix-that-1pl</link>
      <guid>https://dev.to/soumyadeep_basu_3101d3ac1/aws-lake-formation-why-your-data-lake-permissions-are-probably-a-mess-and-how-to-fix-that-1pl</guid>
      <description>&lt;p&gt;&lt;em&gt;Week 1 of a series on AWS Lake Formation: from fundamentals to real-world implementation&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If you've been working with AWS data infrastructure for any length of time, you've probably set up an S3-based data lake. You created some buckets, wrote IAM policies, maybe added some bucket policies on top, and it worked.&lt;/p&gt;

&lt;p&gt;Until it didn't.&lt;/p&gt;

&lt;p&gt;Maybe your team grew. Maybe you added a second AWS account. Maybe someone asked "who exactly has access to this data?" and you spent two hours trying to piece together an answer from a maze of IAM policies.&lt;/p&gt;

&lt;p&gt;That's the moment AWS Lake Formation starts making sense.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What Lake Formation Actually Is&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Lake Formation is not a storage service. Your data still lives in S3. Lake Formation is a permissions and governance layer that sits on top of your data lake and gives you one centralized place to define, manage, and audit who can access what.&lt;/p&gt;

&lt;p&gt;Think of S3 + IAM as the raw infrastructure. Lake Formation is the control plane on top of it.&lt;/p&gt;

&lt;p&gt;With Lake Formation you can grant permissions at four levels:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Database level&lt;/strong&gt;: this team can see this entire database&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Table level&lt;/strong&gt;: this role can query this specific table&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Column level&lt;/strong&gt;: this user can see everything except these sensitive columns&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Row level&lt;/strong&gt;: this team can only see rows where region = 'US'&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That kind of granularity with pure IAM is technically possible. In practice it becomes unmaintainable fast.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why IAM Alone Breaks at Scale&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let me give you a concrete scenario.&lt;/p&gt;

&lt;p&gt;You have a data lake with 40 tables across 6 databases. You have 5 teams: data science, analytics, finance, marketing, and engineering. Each team needs different access to different tables, and some tables have columns that only certain roles should see.&lt;/p&gt;

&lt;p&gt;With pure IAM you're writing and maintaining dozens of policies, attaching them to roles, keeping bucket policies in sync, and hoping nothing drifts. When someone gets an access denied error at 9am on a Monday, good luck tracing exactly which policy is the problem.&lt;/p&gt;

&lt;p&gt;With Lake Formation, you open one console (or run one API call / Terraform resource) and say: data science role has SELECT on these 12 tables, excluding this column. Done. Auditable. Revokable in seconds.&lt;/p&gt;

&lt;p&gt;The mental model shift is significant: you stop thinking about who can access S3 paths and start thinking about who can access data assets.&lt;/p&gt;
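&lt;p&gt;For a sense of what "SELECT on a table, excluding this column" looks like as an API payload, here is a minimal sketch. The dict follows the keyword arguments of boto3's &lt;code&gt;lakeformation.grant_permissions&lt;/code&gt;; the role ARN, account ID, table, and column names are hypothetical, and the helper only builds the dict without calling AWS.&lt;/p&gt;

```python
def select_excluding_columns(role_arn, database, table, excluded_columns):
    """Build a SELECT grant payload that hides the listed columns.

    Layout follows boto3's lakeformation.grant_permissions keyword
    arguments (TableWithColumns + ColumnWildcard); no AWS call is made.
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnWildcard": {"ExcludedColumnNames": list(excluded_columns)},
            }
        },
        "Permissions": ["SELECT"],
    }


# Hypothetical role, table, and column, for illustration only.
grant = select_excluding_columns(
    "arn:aws:iam::111122223333:role/data_science_role",
    database="analytics",
    table="orders",
    excluded_columns=["customer_email"],
)
```

&lt;p&gt;Revoking is the mirror image: the same payload shape goes to &lt;code&gt;revoke_permissions&lt;/code&gt;, which is what makes the grant easy to audit and undo.&lt;/p&gt;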

&lt;p&gt;&lt;strong&gt;The Core Concepts You Need to Know&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before you touch anything in Lake Formation, get these four concepts clear:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Data Catalog&lt;/strong&gt;: the metadata layer. Lake Formation uses the AWS Glue Data Catalog to store database and table definitions. Those tables point at S3 locations, and once the locations are registered Lake Formation governs access to them.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Data Lake Administrator&lt;/strong&gt;: an IAM user or role designated as having full control over Lake Formation. You'll set this up first. Don't confuse it with an IAM admin; it's a separate Lake Formation concept.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Permissions&lt;/strong&gt;: Lake Formation has its own permission model (DESCRIBE, SELECT, ALTER, DROP, etc.) that maps roughly to what you'd expect from a database. These are what you grant to IAM principals.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Registered Locations&lt;/strong&gt;: before Lake Formation can govern an S3 path, you have to register it. This tells LF "I want you to manage access to data in this location." After that, IAM alone is no longer enough to query the data through integrated services such as Athena; LF permissions are required too. Two caveats: this only holds once the default &lt;code&gt;IAMAllowedPrincipals&lt;/code&gt; grants have been revoked, and a principal whose IAM policy allows direct S3 reads can still fetch the objects outside Lake Formation.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;What This Series Covers&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is week one of a practical series on Lake Formation. No fluff, just the things that actually matter when you're implementing this in a real environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Here's where we're going:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Week 2 (TBAC vs NBAC)&lt;/strong&gt;: the two permission models, when to use each, and why the choice matters more than AWS lets on&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Week 3 (Cross-account data sharing)&lt;/strong&gt;: the gotchas, the traps, and a real bug that cost me hours&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Week 4 (Automating Lake Formation with Terraform)&lt;/strong&gt;: what works, what doesn't, and how to structure your modules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you work with AWS data infrastructure, whether you're building a data lake from scratch or trying to bring governance to an existing one, this series is for you.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;I'm a Data Platform Engineer working with AWS data infrastructure daily. Follow along if you want the practical version of Lake Formation: not the tutorial, the real thing.&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>awsdatalake</category>
      <category>aws</category>
    </item>
  </channel>
</rss>
