<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Philip Hern</title>
    <description>The latest articles on DEV Community by Philip Hern (@shrouwoods).</description>
    <link>https://dev.to/shrouwoods</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3858493%2F9a8ff83c-d5c3-493f-9943-b91a2f0a61d8.jpg</url>
      <title>DEV Community: Philip Hern</title>
      <link>https://dev.to/shrouwoods</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/shrouwoods"/>
    <language>en</language>
    <item>
      <title>back at it</title>
      <dc:creator>Philip Hern</dc:creator>
      <pubDate>Thu, 23 Apr 2026 12:27:13 +0000</pubDate>
      <link>https://dev.to/shrouwoods/back-at-it-4aa7</link>
      <guid>https://dev.to/shrouwoods/back-at-it-4aa7</guid>
      <description>&lt;h2&gt;
  
  
  thesis
&lt;/h2&gt;

&lt;p&gt;this is a small checkpoint post. the heavy lift is not finished, but i am out of the weeds for now, and that is worth naming out loud.&lt;/p&gt;

&lt;h2&gt;
  
  
  context
&lt;/h2&gt;

&lt;p&gt;the same work i was carrying in &lt;a href="https://philliant.com/posts/20260416-stick-with-it/" rel="noopener noreferrer"&gt;stick with it&lt;/a&gt; kept growing in weight and surface area. for a while it felt like one endless tangle. i stayed with it anyway, and eventually i approached it the way i should have from the start, as smaller chunks that stack into the much larger change. each piece still had to be real, but the sequencing and scope finally matched how my head and the system can tolerate change.&lt;/p&gt;

&lt;h2&gt;
  
  
  argument
&lt;/h2&gt;

&lt;p&gt;getting to a stable point did not erase the backlog. i still have more testing to run, more simulations to exercise, and real user acceptance testing ahead. the difference is that the foundation is no longer thrashing. errors and surprises have a place to land without undoing everything at once.&lt;/p&gt;

&lt;p&gt;that stability is what gave me room to breathe. i can take a short break on purpose, look at the whole arc with a little distance, and come back to the tuning work with less panic and more optimism. the remaining work is still serious, but it is the kind of serious that fits a calendar instead of the kind that owns every waking hour.&lt;/p&gt;

&lt;h3&gt;
  
  
  tension or counterpoint
&lt;/h3&gt;

&lt;p&gt;a stable checkpoint is not the same as done. if i confuse relief for completion, i will skip validation i still need. the discipline now is to rest without pretending the job is closed.&lt;/p&gt;

&lt;h2&gt;
  
  
  closing
&lt;/h2&gt;

&lt;p&gt;so i am back at it in a different posture, not firefighting the whole shape at once, but finishing the test matrix, listening to users, and dialing things in with a clearer mind. sticking with it got me here. the next stretch is about proving it in the world, calmly.&lt;/p&gt;

&lt;h2&gt;
  
  
  further reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://en.wikipedia.org/wiki/Chunking_(psychology)" rel="noopener noreferrer"&gt;chunking (psychology)&lt;/a&gt;, on breaking information and work into manageable units&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  related on this site
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260416-stick-with-it/" rel="noopener noreferrer"&gt;stick with it&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260406-little-by-little-a-little-becomes-a-lot/" rel="noopener noreferrer"&gt;little by little, a little becomes a lot&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/series/commentary/" rel="noopener noreferrer"&gt;commentary series&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>persistence</category>
      <category>workflow</category>
      <category>testing</category>
      <category>stress</category>
    </item>
    <item>
      <title>stick with it</title>
      <dc:creator>Philip Hern</dc:creator>
      <pubDate>Fri, 17 Apr 2026 13:24:41 +0000</pubDate>
      <link>https://dev.to/shrouwoods/stick-with-it-3c6i</link>
      <guid>https://dev.to/shrouwoods/stick-with-it-3c6i</guid>
      <description>&lt;h2&gt;
  
  
  thesis
&lt;/h2&gt;

&lt;p&gt;this one is for me as much as anyone reading. the single most important thing i can do on a long, hard project is keep showing up for it. motivation rises and falls, energy comes in waves, and neither of those things matters as much as continuity. if i stay with the work long enough, the payoff arrives, even when progress is invisible for stretches in the middle.&lt;/p&gt;

&lt;h2&gt;
  
  
  context
&lt;/h2&gt;

&lt;p&gt;i am in the middle of a very heavy lift right now. it started as a change i thought i would finish quickly, and it has turned into something much bigger. the effort, concentration, and validation required are more than i am used to, and the timeline has stretched well past what a typical change would take. the stress is real. i feel it in how i think about the project before bed and how quickly i reach for my laptop in the morning.&lt;/p&gt;

&lt;p&gt;i am still in it, though, because the value at the end is worth the cost. when this lands, i will have more stable and explainable historical data, which means my ongoing workload of troubleshooting data validity questions drops. less firefighting later is worth more pressure now, and that tradeoff is the only reason i would keep going through a change this heavy.&lt;/p&gt;

&lt;h2&gt;
  
  
  argument
&lt;/h2&gt;

&lt;h3&gt;
  
  
  continuity beats intensity
&lt;/h3&gt;

&lt;p&gt;motivation is a wave, not a rope. it pulls me forward for a while, then it lets go, then it comes back later with a different shape. if i tie my progress to the wave, i stop whenever the wave stops. if i tie my progress to the habit of showing up, the wave cannot take the project down with it. that is the same pattern i wrote about in &lt;a href="https://philliant.com/posts/20260406-little-by-little-a-little-becomes-a-lot/" rel="noopener noreferrer"&gt;little by little, a little becomes a lot&lt;/a&gt;, just applied to a single long problem instead of a daily practice.&lt;/p&gt;

&lt;h3&gt;
  
  
  isolate your changes, even in your own playground
&lt;/h3&gt;

&lt;p&gt;the hardest lesson from this round is about isolation. i have been testing work in an environment i consider my playground, and for a long time that has been fine. this time, my changes broke downstream consumers, and the pressure immediately escalated because other people were suddenly blocked. the takeaway is simple. if my changes can reach downstream consumers, i need to separate my testing from a shared test environment, regardless of how freely i am used to moving in that space. a playground still has neighbors.&lt;/p&gt;

&lt;h3&gt;
  
  
  do not try to lift several objects at once
&lt;/h3&gt;

&lt;p&gt;i also tried to move multiple pieces of the system at the same time. i thought bundling them would be faster. what actually happened is that each piece depended on the others in a way that made every single one harder to validate, and the total stress grew faster than the total work. smaller, sequential chunks would have finished sooner and felt calmer. one object at a time, even if it feels slower on paper, is almost always faster in practice.&lt;/p&gt;

&lt;h3&gt;
  
  
  better preparation shrinks the stress
&lt;/h3&gt;

&lt;p&gt;the last lesson is about preparation. i went in expecting a small change and i prepared like it was a small change. when the scope grew, my preparation did not grow with it, and that mismatch is where the break points appeared. better preparation up front, regardless of how small i thought the task was, would have reduced both the stress and the number of places things could go wrong. the cost of preparing for a bigger job than you need is tiny. the cost of not preparing for the job you actually have is not.&lt;/p&gt;

&lt;h3&gt;
  
  
  tension or counterpoint
&lt;/h3&gt;

&lt;p&gt;persistence is not the same as refusing to reassess. sticking with every hard thing forever is just the sunk cost fallacy wearing a motivational t-shirt. the honest check i keep running is whether the value at the end is still real and still mine. if the answer is yes, i keep going. if the answer turns into no, i stop, and that is not quitting, that is discernment.&lt;/p&gt;

&lt;p&gt;there is also a stress cost to "push through" language. if the pressure is spilling into health, relationships, or judgment, that is a signal to change the pace, not a signal to try harder. pushing through is a tool, not a strategy, and it only works when i also rest and isolate the work properly. that is part of why i think it helps to get &lt;a href="https://philliant.com/posts/20260410-comfortable-being-uncomfortable/" rel="noopener noreferrer"&gt;comfortable being uncomfortable&lt;/a&gt; without confusing discomfort for permission to keep grinding.&lt;/p&gt;

&lt;h2&gt;
  
  
  closing
&lt;/h2&gt;

&lt;p&gt;so this is my note to myself. keep going. the work is real, the value is real, and the lessons i am collecting on the way are already paying off for the next change. next time i will isolate my testing better, break the work into one object at a time, and prepare like the task is bigger than i think it is, because it almost always is.&lt;/p&gt;

&lt;p&gt;and if the wave of motivation dips again tomorrow, that is fine. waves dip. what matters is that i still show up, finish one more piece, and trust that continuity is the actual engine. stick with it.&lt;/p&gt;

&lt;h2&gt;
  
  
  further reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://en.wikipedia.org/wiki/Grit_(personality_trait)" rel="noopener noreferrer"&gt;grit (personality trait)&lt;/a&gt;, angela duckworth on long-term persistence&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://en.wikipedia.org/wiki/Sunk_cost" rel="noopener noreferrer"&gt;sunk cost fallacy&lt;/a&gt;, useful balance for deciding when to keep going versus when to stop&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  related on this site
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260406-little-by-little-a-little-becomes-a-lot/" rel="noopener noreferrer"&gt;little by little, a little becomes a lot&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260410-comfortable-being-uncomfortable/" rel="noopener noreferrer"&gt;comfortable being uncomfortable&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260327-adaptability/" rel="noopener noreferrer"&gt;adaptability&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/series/commentary/" rel="noopener noreferrer"&gt;commentary series&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>persistence</category>
      <category>consistency</category>
      <category>stress</category>
      <category>selftalk</category>
    </item>
    <item>
      <title>comfortable being uncomfortable</title>
      <dc:creator>Philip Hern</dc:creator>
      <pubDate>Fri, 10 Apr 2026 19:29:32 +0000</pubDate>
      <link>https://dev.to/shrouwoods/comfortable-being-uncomfortable-5dac</link>
      <guid>https://dev.to/shrouwoods/comfortable-being-uncomfortable-5dac</guid>
      <description>&lt;h2&gt;
  
  
  thesis
&lt;/h2&gt;

&lt;p&gt;i want to normalize a simple idea that still feels hard in practice. getting outside of your comfort zone is not a side quest. it is the main mechanism by which you stretch, learn, and see the world with more room in it for other people. yes, it is uncomfortable, and that is exactly the point.&lt;/p&gt;

&lt;h2&gt;
  
  
  context
&lt;/h2&gt;

&lt;p&gt;most of us are trained to seek stability. stability is not bad, but it is also not where the adaptation happens. when everything feels familiar, your brain is mostly rehearsing what it already knows. the moment you step into something new, the cost shows up immediately as awkwardness, uncertainty, or fear of looking foolish. that friction is not a sign you chose wrong. it is often a sign you chose honestly.&lt;/p&gt;

&lt;h2&gt;
  
  
  argument
&lt;/h2&gt;

&lt;p&gt;change is disruptive by definition. if it did not interrupt your default patterns, it would not be change. i think we should embrace that disruption more often, because it is where new experiences actually enter your life. without that interruption, you mostly get repetition with better packaging.&lt;/p&gt;

&lt;p&gt;the growth part is not theoretical. discomfort is where skills get pressure-tested. you learn not only how to do things, but how &lt;strong&gt;not&lt;/strong&gt; to do things, which is just as valuable and often faster feedback. mistakes in public or under stress are expensive emotionally, but they are also unusually clear. they show you boundaries, preferences, and limits in a way that a comfortable afternoon rarely will.&lt;/p&gt;

&lt;p&gt;more experiences also broaden your worldview in a practical sense. when you have seen more contexts, constraints, and ways people solve problems, it becomes harder to treat your own habits as universal law. that widening tends to produce more tolerant and compassionate attitudes, not because tolerance is a slogan, but because you have more firsthand evidence that reasonable people can live and work in very different, equally valid ways.&lt;/p&gt;

&lt;p&gt;so my encouragement is simple. expose yourself to new experiences on purpose. seek situations where the pressure is on you to perform, because that is where you rise to the occasion and discover how capable you can be. it is also where you might discover that this is not your thing, and you should move on. either outcome is a win, because both give you self-insight you cannot fake. you learn what energizes you, what drains you, and what you are willing to practice until it gets easier.&lt;/p&gt;

&lt;p&gt;this connects to how i think about &lt;a href="https://philliant.com/posts/20260327-adaptability/" rel="noopener noreferrer"&gt;adaptability&lt;/a&gt; in general. comfort is a resting state. adaptation requires movement.&lt;/p&gt;

&lt;h3&gt;
  
  
  tension or counterpoint
&lt;/h3&gt;

&lt;p&gt;there is a real downside to glorifying discomfort without boundaries. not every challenge is worth the cost, and not every "growth opportunity" is ethical or safe. pushing yourself is different from letting yourself be pushed past your values or health. the goal is not suffering for its own sake. the goal is chosen stretch, with recovery and discernment built in.&lt;/p&gt;

&lt;h2&gt;
  
  
  closing
&lt;/h2&gt;

&lt;p&gt;i am not asking for constant chaos. i am asking for a bias toward the new when you can afford it, and toward the high-stakes try when you are ready. the uncomfortable path is where you find out who you are when the easy defaults are not available, and that knowledge is about as practical as it gets.&lt;/p&gt;

&lt;h2&gt;
  
  
  further reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://en.wikipedia.org/wiki/Comfort_zone" rel="noopener noreferrer"&gt;comfort zone&lt;/a&gt; (psychology of performance and anxiety)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  related on this site
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260327-adaptability/" rel="noopener noreferrer"&gt;adaptability&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260406-little-by-little-a-little-becomes-a-lot/" rel="noopener noreferrer"&gt;little by little, a little becomes a lot&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/series/commentary/" rel="noopener noreferrer"&gt;commentary series&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>growth</category>
      <category>change</category>
      <category>comfortzone</category>
      <category>learning</category>
    </item>
    <item>
      <title>dbt snapshots: moving from merges to native history</title>
      <dc:creator>Philip Hern</dc:creator>
      <pubDate>Fri, 10 Apr 2026 19:29:31 +0000</pubDate>
      <link>https://dev.to/shrouwoods/dbt-snapshots-moving-from-merges-to-native-history-cjd</link>
      <guid>https://dev.to/shrouwoods/dbt-snapshots-moving-from-merges-to-native-history-cjd</guid>
      <description>&lt;h2&gt;
  
  
  quick answer
&lt;/h2&gt;

&lt;p&gt;dbt snapshots provide a native way to track slowly changing dimensions (type 2) over time. by migrating from custom merge statements to native dbt snapshots, you can simplify your codebase, rely on built-in history tracking, and ensure your downstream models always have access to point-in-time records.&lt;/p&gt;

&lt;h2&gt;
  
  
  who this is for
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;audience: data engineers and analytics engineers using dbt&lt;/li&gt;
&lt;li&gt;prerequisites: basic knowledge of dbt models, sql, and data warehousing concepts&lt;/li&gt;
&lt;li&gt;when to use this guide: when you need to track historical changes to mutable source records and want to move away from manual merge logic&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  why this matters
&lt;/h2&gt;

&lt;p&gt;tracking historical changes is a common requirement in data warehousing. building custom merge logic to handle inserts, updates, and history tracking is error-prone and difficult to maintain. dbt snapshots handle the heavy lifting of history tracking out of the box. this ensures you do not lose historical context when source systems overwrite data.&lt;/p&gt;

&lt;h2&gt;
  
  
  moving from merge to snapshot
&lt;/h2&gt;

&lt;p&gt;recently, i migrated several historical tables from a custom merge strategy to native dbt snapshots. the previous approach relied on complex merge statements that manually checked for changes and inserted or updated rows to maintain history. this was difficult to read and even harder to debug.&lt;/p&gt;

&lt;p&gt;by adopting native dbt snapshots, the logic became declarative. instead of writing the exact update and insert commands, i only needed to define the source query and configure how dbt should detect changes. the downstream consumer views then filter the snapshot output to return the current row or a point-in-time record.&lt;/p&gt;

&lt;h3&gt;
  
  
  the core shift in thinking
&lt;/h3&gt;

&lt;p&gt;when using snapshots, your snapshot definition should remain source-representative. do not apply business date-window filtering in the snapshot definition itself. instead, capture the raw history and apply your logic for which rows to return in downstream consumer views.&lt;/p&gt;

&lt;p&gt;for example, to get the current row in a downstream model, you filter using the sentinel value:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="p"&gt;{{&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'my_snapshot_st'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}}&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;dbt_valid_to&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'9999-12-31 23:59:59'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;to get a freeze record for a specific point in time, you derive a freeze timestamp and filter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="p"&gt;{{&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'my_snapshot_st'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}}&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;dbt_valid_from&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;freeze_ts&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;dbt_valid_to&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;freeze_ts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
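
&lt;p&gt;the &lt;code&gt;freeze_ts&lt;/code&gt; above has to come from somewhere. as a minimal sketch, assuming a hard-coded cutoff purely for illustration (in practice it might come from a dbt var or a reference table):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;with freeze as (
    -- hypothetical cutoff; swap in your real freeze logic
    select to_timestamp_ntz('2026-01-01 00:00:00') as freeze_ts
)
select s.*
from {{ ref('my_snapshot_st') }} s
cross join freeze
where s.dbt_valid_from &amp;lt;= freeze.freeze_ts
  and s.dbt_valid_to &amp;gt; freeze.freeze_ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;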



&lt;h2&gt;
  
  
  basic example
&lt;/h2&gt;

&lt;p&gt;here is a basic example of a dbt snapshot using the check strategy. this snapshot tracks changes to a practice affiliation table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;snapshot&lt;/span&gt; &lt;span class="n"&gt;practice_affiliation_st&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;{{&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;target_schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'snapshots'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;strategy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'check'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;unique_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'fmno'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'cycle'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'committee'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'hierarchy'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;check_cols&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'all'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;hard_deletes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'invalidate'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;dbt_valid_to_current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;"to_timestamp_ntz('9999-12-31 23:59:59')"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}}&lt;/span&gt;

&lt;span class="k"&gt;select&lt;/span&gt;
    &lt;span class="n"&gt;fmno&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;cycle&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;committee&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;practice_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;hierarchy&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="p"&gt;{{&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'source_practice_affiliation_v'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}}&lt;/span&gt;

&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;endsnapshot&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  configuration options
&lt;/h2&gt;

&lt;p&gt;dbt snapshots offer several configuration options that control how changes are detected and recorded. you can read more about these in the &lt;a href="https://docs.getdbt.com/docs/build/snapshots" rel="noopener noreferrer"&gt;official dbt snapshot documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;here are the key options and what they control:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;target_schema&lt;/strong&gt;: the schema where the snapshot table will be built&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;strategy&lt;/strong&gt;: determines how dbt detects changes, with the two main options being timestamp and check&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;unique_key&lt;/strong&gt;: the primary key of the record, which can be a single column or a list of columns for a composite key&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;check_cols&lt;/strong&gt;: used with the check strategy to specify which columns to monitor for changes, accepting a list of column names or the word all&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;updated_at&lt;/strong&gt;: used with the timestamp strategy to specify the column that indicates when the source row was last modified&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;hard_deletes&lt;/strong&gt;: controls how dbt handles rows that disappear from the source, such as setting it to invalidate to close the current row when a key is no longer present&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;dbt_valid_to_current&lt;/strong&gt;: overrides the default null value for current records, allowing you to set a far-future date to make downstream filtering easier&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  timestamp vs check strategy
&lt;/h3&gt;

&lt;p&gt;the choice between timestamp and check strategies is critical.&lt;/p&gt;

&lt;p&gt;use the timestamp strategy when your source has a reliable updated column that changes whenever the row changes. dbt compares the source timestamp to the snapshot timestamp to decide if a new version is needed.&lt;/p&gt;
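
&lt;p&gt;as a minimal sketch of a timestamp-strategy snapshot, assuming a hypothetical orders source with a reliable &lt;code&gt;last_modified_at&lt;/code&gt; column:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;{% snapshot orders_st %}

{{
    config(
        target_schema = 'snapshots',
        strategy = 'timestamp',
        unique_key = 'order_id',
        updated_at = 'last_modified_at'
    )
}}

-- hypothetical source; last_modified_at must change whenever the row changes
select order_id, status, last_modified_at
from {{ ref('source_orders_v') }}

{% endsnapshot %}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;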

&lt;p&gt;use the check strategy when you do not have a reliable updated timestamp, or when you want to detect any change in a specific set of columns. dbt compares the actual values of the check columns between the source and the current snapshot row. if any checked column differs, dbt closes the current row and inserts a new version.&lt;/p&gt;

&lt;p&gt;in my recent work, i found that the check strategy with all columns checked and a composite unique key was the most robust approach for sources where the updated timestamp was synthetic or not authoritative.&lt;/p&gt;

&lt;h2&gt;
  
  
  gotchas and lessons learned
&lt;/h2&gt;

&lt;p&gt;migrating to snapshots surfaced a few important lessons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;upstream scope gating&lt;/strong&gt;: if your upstream source query includes filters that remove keys, and you have hard deletes configured to invalidate, dbt will intentionally close the current rows for those missing keys&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;composite keys&lt;/strong&gt;: dbt fully supports composite unique keys, and passing a list of columns ensures that dbt tracks history at the correct grain&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;duplicate source rows&lt;/strong&gt;: snapshots expect the source data to be unique at the unique key grain, so if your source contains duplicate keys, the snapshot will fail or bloat&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;defensive deduplication&lt;/strong&gt;: in some cases, i had to add a defensive qualify row number guard in the snapshot definition to collapse known duplicate-key source rows before dbt processed them (see the sketch after this list)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;sentinel values&lt;/strong&gt;: using a sentinel value for current rows instead of null makes downstream queries much cleaner, allowing you to use an equals operator instead of checking for nulls&lt;/li&gt;
&lt;/ul&gt;
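
&lt;p&gt;a minimal sketch of that guard, assuming snowflake's qualify syntax and an arbitrary but deterministic tie-breaker (adjust the ordering to your own precedence rules):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;select
    fmno,
    cycle,
    committee,
    practice_name,
    type,
    hierarchy
from {{ ref('source_practice_affiliation_v') }}
-- collapse duplicate-key rows so the snapshot sees exactly one row per key
qualify row_number() over (
    partition by fmno, cycle, committee, hierarchy
    order by practice_name
) = 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;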

&lt;h2&gt;
  
  
  deployment and automation
&lt;/h2&gt;

&lt;p&gt;snapshots are not updated when you run &lt;code&gt;dbt run&lt;/code&gt;. they require the dedicated &lt;code&gt;dbt snapshot&lt;/code&gt; command (&lt;code&gt;dbt build&lt;/code&gt; does include snapshots in the dag, but a plain model run does not).&lt;/p&gt;

&lt;p&gt;if you do not automate this, your history tracking will be manual and prone to gaps. to ensure continuous history capture, you must schedule the snapshot command to run on a regular cadence.&lt;/p&gt;

&lt;p&gt;in a production environment, this usually means setting up a continuous integration workflow or an orchestrator task. for example, you can use automated workflows to run snapshots by tag on daily, hourly, or monthly schedules.&lt;/p&gt;

&lt;p&gt;a typical automated workflow might look like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;a scheduled trigger fires the workflow&lt;/li&gt;
&lt;li&gt;the workflow checks out the repository and sets up the dbt environment&lt;/li&gt;
&lt;li&gt;the workflow executes the snapshot command for specific tags&lt;/li&gt;
&lt;li&gt;dbt connects to the warehouse, compares the source data to the existing snapshot tables, and applies any necessary inserts or updates&lt;/li&gt;
&lt;/ol&gt;
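
&lt;p&gt;for example, if each snapshot carries a cadence tag, the scheduled step might run something like &lt;code&gt;dbt snapshot --select tag:daily&lt;/code&gt;.&lt;/p&gt;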

&lt;p&gt;by decoupling the snapshot schedule from your standard model runs, you can capture history at the exact frequency your business logic requires.&lt;/p&gt;

&lt;h2&gt;
  
  
  references
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.getdbt.com/docs/build/snapshots" rel="noopener noreferrer"&gt;dbt snapshots documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.getdbt.com/reference/snapshot-configs" rel="noopener noreferrer"&gt;dbt snapshot configurations&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  related reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/" rel="noopener noreferrer"&gt;dbt models&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dbt</category>
      <category>dataengineering</category>
      <category>snowflake</category>
      <category>snapshots</category>
    </item>
    <item>
      <title>little by little, a little becomes a lot</title>
      <dc:creator>Philip Hern</dc:creator>
      <pubDate>Tue, 07 Apr 2026 23:13:24 +0000</pubDate>
      <link>https://dev.to/shrouwoods/little-by-little-a-little-becomes-a-lot-2acf</link>
      <guid>https://dev.to/shrouwoods/little-by-little-a-little-becomes-a-lot-2acf</guid>
      <description>&lt;h2&gt;
  
  
  thesis
&lt;/h2&gt;

&lt;p&gt;the importance of just trying to do a little each day cannot be overstated. we often overestimate what we can accomplish in a single afternoon, but we vastly underestimate what we can build over a year of sustained effort. incremental changes add up, and what seems like a drop in the bucket today becomes a reservoir over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  context
&lt;/h2&gt;

&lt;p&gt;whether it is work, fitness, or building new routines, new habits often feel like they take forever to take hold. we live in a world that expects immediate results, and we naturally get frustrated when the scale does not move or the project does not finish overnight. when that initial burst of motivation inevitably fades, the reality of the daily grind sets in, and that is exactly when most people decide to walk away.&lt;/p&gt;

&lt;h2&gt;
  
  
  argument
&lt;/h2&gt;

&lt;p&gt;this process is a little easier when you understand you are working toward continuity and not perfection. it is not about executing flawlessly every single day, nor is it about never taking a break. it is about showing up and putting in the reps, even when the effort feels small or uninspired. missing one day is just a bump in the road, as long as you do not let it become two days in a row.&lt;/p&gt;

&lt;p&gt;i am starting to see the rewards of that mindset now. little by little, my experience has added up into a foundation i can actually rely on. little by little, i have started sharing via my website, turning scattered thoughts into a structured body of work. little by little, i am starting to have a greater reach and help more people, simply because i chose to publish something small rather than waiting for the perfect masterpiece.&lt;/p&gt;

&lt;h3&gt;
  
  
  tension or counterpoint
&lt;/h3&gt;

&lt;p&gt;the hardest part is trusting the process when the visible progress is zero. it is incredibly easy to quit when you do not see the immediate payoff of your daily effort. it feels like you are just watering dirt for weeks on end. but the compounding effect of showing up is real, even if it remains completely invisible in the short term.&lt;/p&gt;

&lt;h2&gt;
  
  
  closing
&lt;/h2&gt;

&lt;p&gt;so right in theme, i will keep this short and keep focusing on the small, daily inputs rather than the distant outputs. the goal is simply to keep the chain going, trusting that a little becomes a lot when you give it enough time.&lt;/p&gt;

&lt;h2&gt;
  
  
  further reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://jamesclear.com/atomic-habits" rel="noopener noreferrer"&gt;atomic habits&lt;/a&gt; (james clear)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.tinyhabits.com/" rel="noopener noreferrer"&gt;tiny habits&lt;/a&gt; (bj fogg)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://en.wikipedia.org/wiki/Small_wins" rel="noopener noreferrer"&gt;small wins&lt;/a&gt; (karl weick, organizational change)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  related on this site
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260402-sharing-is-caring/" rel="noopener noreferrer"&gt;sharing is caring&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260405-automated-devto-linkedin-visibility/" rel="noopener noreferrer"&gt;how i automated dev.to and linkedin publishing so visibility stops depending on memory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260327-adaptability/" rel="noopener noreferrer"&gt;adaptability&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>habits</category>
      <category>consistency</category>
      <category>growth</category>
    </item>
    <item>
      <title>how i automated dev.to and linkedin publishing so visibility stops depending on memory</title>
      <dc:creator>Philip Hern</dc:creator>
      <pubDate>Sun, 05 Apr 2026 14:03:19 +0000</pubDate>
      <link>https://dev.to/shrouwoods/how-i-automated-devto-and-linkedin-publishing-so-visibility-stops-depending-on-memory-2g2i</link>
      <guid>https://dev.to/shrouwoods/how-i-automated-devto-and-linkedin-publishing-so-visibility-stops-depending-on-memory-2g2i</guid>
      <description>&lt;p&gt;after i started writing more consistently, it became obvious that writing is only half the work; distribution is the other half. i wanted a system where i can publish from one canonical source and let automation push the same story to dev.to and linkedin.&lt;/p&gt;

&lt;h2&gt;
  
  
  quick answer
&lt;/h2&gt;

&lt;p&gt;i set up two publish automations that watch my post changes and sync them to dev.to and linkedin. the first publish creates the post on each platform, and later edits update the same external post instead of creating duplicates. this gives me consistent visibility without adding manual publishing steps after every article.&lt;/p&gt;

&lt;h2&gt;
  
  
  who this is for
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;people who publish technical writing and keep forgetting cross-posting&lt;/li&gt;
&lt;li&gt;creators who want one canonical source plus repeatable distribution&lt;/li&gt;
&lt;li&gt;builders who care about discoverability as much as writing quality&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  why this matters
&lt;/h2&gt;

&lt;p&gt;if distribution is manual, it eventually slips. then strong posts sit unread because i forgot to copy, paste, format, and re-share them across platforms. automation solves that by making visibility part of the same delivery path as the content itself.&lt;/p&gt;

&lt;p&gt;this is the same pattern i described in &lt;a href="https://philliant.com/posts/20260319-practical-ai-workflow-jira-github-mcp/" rel="noopener noreferrer"&gt;a practical ai workflow: jira, github, and mcp&lt;/a&gt;: define one clear source of truth, then automate the handoff steps so i can spend more time on thinking and less time on clerical work.&lt;/p&gt;

&lt;h2&gt;
  
  
  step-by-step
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1) define the starting point
&lt;/h3&gt;

&lt;p&gt;i chose my site post as the only canonical source. every external platform receives content from that source, not from separate drafts. this keeps language, links, and updates aligned over time.&lt;/p&gt;

&lt;h3&gt;
  
  
  2) apply the change
&lt;/h3&gt;

&lt;p&gt;i added automation for both targets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;trigger on post updates and support manual runs when i want a full backfill&lt;/li&gt;
&lt;li&gt;create posts when no external mapping exists&lt;/li&gt;
&lt;li&gt;update existing external posts when a mapping already exists&lt;/li&gt;
&lt;li&gt;keep a small state map so each canonical url stays attached to one external post id&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;the practical result is that i can keep writing in one place and trust the sync layer to handle distribution. this complements the writing habits from &lt;a href="https://philliant.com/posts/20260313-my-cursor-setup/" rel="noopener noreferrer"&gt;my cursor setup&lt;/a&gt;, where reusable workflows remove repeated manual work.&lt;/p&gt;

&lt;h3&gt;
  
  
  3) validate the result
&lt;/h3&gt;

&lt;p&gt;i test in three passes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;dry run to confirm detection and decisions without publishing&lt;/li&gt;
&lt;li&gt;publish-all run to verify initial backfill behavior&lt;/li&gt;
&lt;li&gt;normal change-trigger run to verify incremental updates on later edits&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;when all three pass, i know the pipeline is reliable enough for daily use.&lt;/p&gt;

&lt;h2&gt;
  
  
  faq
&lt;/h2&gt;

&lt;h3&gt;
  
  
  what was the biggest setup mistake?
&lt;/h3&gt;

&lt;p&gt;token and redirect mismatches during oauth were the main failure point at first. once i aligned scopes, callback values, and secret placement, the automation became stable.&lt;/p&gt;

&lt;h3&gt;
  
  
  should i keep manual publishing as a fallback?
&lt;/h3&gt;

&lt;p&gt;yes, especially while you are in early setup. after the workflow proves stable, manual publishing becomes a recovery path instead of a default habit.&lt;/p&gt;

&lt;h2&gt;
  
  
  references
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://developers.forem.com/api" rel="noopener noreferrer"&gt;dev.to api docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.linkedin.com/developers/" rel="noopener noreferrer"&gt;linkedin developer platform&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.github.com/en/actions" rel="noopener noreferrer"&gt;github actions documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  related reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260319-practical-ai-workflow-jira-github-mcp/" rel="noopener noreferrer"&gt;a practical ai workflow: jira, github, and mcp&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260313-my-cursor-setup/" rel="noopener noreferrer"&gt;my cursor setup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260315-starter-templates-for-ai-rules-skills-and-commands/" rel="noopener noreferrer"&gt;starter templates for ai rules, skills, and commands&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>automation</category>
      <category>devto</category>
      <category>linkedin</category>
      <category>publishing</category>
    </item>
    <item>
      <title>the future of data engineering workflows with ai</title>
      <dc:creator>Philip Hern</dc:creator>
      <pubDate>Fri, 03 Apr 2026 14:11:59 +0000</pubDate>
      <link>https://dev.to/shrouwoods/the-future-of-data-engineering-workflows-with-ai-42mb</link>
      <guid>https://dev.to/shrouwoods/the-future-of-data-engineering-workflows-with-ai-42mb</guid>
      <description>&lt;h2&gt;
  
  
  quick answer
&lt;/h2&gt;

&lt;p&gt;the future of data engineering workflows with ai is about moving from manual coding to intelligent orchestration. ai agents will handle boilerplate code, pipeline generation, and data quality checks, allowing data engineers to focus on architecture, governance, and business value.&lt;/p&gt;

&lt;h2&gt;
  
  
  who this is for
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;audience: data engineers, analytics engineers, data architects, and technical leaders.&lt;/li&gt;
&lt;li&gt;prerequisites: an understanding of modern data stack concepts and basic ai principles.&lt;/li&gt;
&lt;li&gt;when to use this guide: when planning your data strategy and evaluating how to integrate ai into your engineering practices.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  why this matters
&lt;/h2&gt;

&lt;p&gt;the volume and complexity of data are growing faster than engineering teams can scale. relying solely on manual workflows leads to bottlenecks, technical debt, and delayed insights. embracing ai is not just about efficiency; it is a strategic imperative to remain competitive.&lt;/p&gt;

&lt;h2&gt;
  
  
  step-by-step
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1) define the starting point
&lt;/h3&gt;

&lt;p&gt;traditionally, data engineering has been a highly manual discipline. engineers spend countless hours writing sql, configuring orchestrators like airflow, and debugging failed pipelines. this approach is brittle and scales poorly as the organization grows.&lt;/p&gt;

&lt;h3&gt;
  
  
  2) apply the change
&lt;/h3&gt;

&lt;p&gt;the integration of ai changes this paradigm. large language models can now generate complex sql queries, translate between dialects, and even suggest optimal data models based on source schemas. ai agents can monitor pipeline health, automatically retry transient failures, and alert engineers only when human intervention is necessary. this shift transforms the engineer from a coder into a system architect.&lt;/p&gt;

&lt;h3&gt;
  
  
  3) validate the result
&lt;/h3&gt;

&lt;p&gt;the impact of this transformation is measurable. development cycles shorten, data quality improves through automated testing, and the overall reliability of the platform increases. engineers spend less time firefighting and more time building scalable, resilient architectures that drive business decisions.&lt;/p&gt;

&lt;h2&gt;
  
  
  faq
&lt;/h2&gt;

&lt;h3&gt;
  
  
  what is the most important caveat?
&lt;/h3&gt;

&lt;p&gt;ai is a tool, not a replacement for fundamental engineering principles. you still need a strong understanding of data modeling, governance, and security to build a robust platform.&lt;/p&gt;

&lt;h3&gt;
  
  
  what should i do first?
&lt;/h3&gt;

&lt;p&gt;start by identifying the most repetitive tasks in your workflow, such as writing documentation or basic transformations. experiment with ai tools to automate these specific areas before attempting to overhaul your entire architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  references
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://a16z.com/2020/10/15/the-emerging-architectures-for-modern-data-infrastructure/" rel="noopener noreferrer"&gt;the modern data stack&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  related reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260318-from-prototype-to-production-ai/" rel="noopener noreferrer"&gt;from prototype to production ai&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dataengineering</category>
      <category>ai</category>
      <category>workflow</category>
      <category>future</category>
    </item>
    <item>
      <title>how i use cursor and ai agents to write dbt tests and documentation</title>
      <dc:creator>Philip Hern</dc:creator>
      <pubDate>Fri, 03 Apr 2026 14:07:49 +0000</pubDate>
      <link>https://dev.to/shrouwoods/how-i-use-cursor-and-ai-agents-to-write-dbt-tests-and-documentation-46od</link>
      <guid>https://dev.to/shrouwoods/how-i-use-cursor-and-ai-agents-to-write-dbt-tests-and-documentation-46od</guid>
      <description>&lt;h2&gt;
  
  
  quick answer
&lt;/h2&gt;

&lt;p&gt;writing dbt tests and documentation is often the most neglected part of data engineering. i use cursor and custom ai agents to automate this process by reading my sql models, inferring the business logic, and generating the corresponding yaml files. this ensures high-quality data pipelines without the manual overhead.&lt;/p&gt;

&lt;h2&gt;
  
  
  who this is for
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;audience: data engineers, analytics engineers, and developers using dbt&lt;/li&gt;
&lt;li&gt;prerequisites: basic knowledge of dbt, sql, and cursor&lt;/li&gt;
&lt;li&gt;when to use this guide: when you want to scale your data engineering practices and reduce the time spent on writing boilerplate yaml&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  why this matters
&lt;/h2&gt;

&lt;p&gt;documentation and testing are critical for data trust, but they are tedious to write manually. when these steps are skipped, data quality suffers and debugging becomes a nightmare. by automating this with ai, you get the benefits of rigorous testing and clear documentation while freeing up your time for higher-value architectural work.&lt;/p&gt;

&lt;h2&gt;
  
  
  step-by-step
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1) define the starting point
&lt;/h3&gt;

&lt;p&gt;most data engineers start with a raw sql model and a blank slate for their &lt;code&gt;schema.yml&lt;/code&gt; file. the traditional approach requires manually typing out every column name, description, and test. this is prone to human error and inconsistency, and it almost always falls out of sync with the models after the first change.&lt;/p&gt;

&lt;h3&gt;
  
  
  2) apply the change
&lt;/h3&gt;

&lt;p&gt;i use cursor to bridge this gap. by creating specific ai rules and skills, i can highlight a dbt model and ask the agent to generate the documentation. the agent reads the sql, understands the joins and transformations, and produces a complete yaml file with standard tests like &lt;code&gt;not_null&lt;/code&gt; and &lt;code&gt;unique&lt;/code&gt;. it can even infer complex relationships and suggest custom tests based on the data domain.&lt;/p&gt;

&lt;h3&gt;
  
  
  3) validate the result
&lt;/h3&gt;

&lt;p&gt;once the ai generates the yaml, i review it for accuracy. i then run &lt;code&gt;dbt test&lt;/code&gt; and &lt;code&gt;dbt docs generate&lt;/code&gt; to ensure everything compiles correctly. the ai rarely makes syntax errors, so the validation step is mostly about confirming the business logic aligns with the documentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  faq
&lt;/h2&gt;

&lt;h3&gt;
  
  
  what is the most important caveat?
&lt;/h3&gt;

&lt;p&gt;you must still review the generated output. ai is excellent at scaffolding and inferring patterns, but it does not possess the full business context that you do.&lt;/p&gt;

&lt;h3&gt;
  
  
  what should i do first?
&lt;/h3&gt;

&lt;p&gt;start by creating a simple cursor skill that defines your team's standards for dbt documentation. feed it a few examples of your best &lt;code&gt;schema.yml&lt;/code&gt; files so it learns your preferred style.&lt;/p&gt;

&lt;h2&gt;
  
  
  references
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.getdbt.com/" rel="noopener noreferrer"&gt;dbt documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  related reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260313-my-cursor-setup/" rel="noopener noreferrer"&gt;my cursor setup&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dbt</category>
      <category>cursor</category>
      <category>ai</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>sharing is caring</title>
      <dc:creator>Philip Hern</dc:creator>
      <pubDate>Thu, 02 Apr 2026 12:11:05 +0000</pubDate>
      <link>https://dev.to/shrouwoods/sharing-is-caring-2303</link>
      <guid>https://dev.to/shrouwoods/sharing-is-caring-2303</guid>
      <description>&lt;h2&gt;
  
  
  the value of early adoption
&lt;/h2&gt;

&lt;p&gt;i have always found a unique kind of energy in being an early adopter. when a new tool emerges, especially something as transformative as cursor and artificial intelligence, diving in headfirst is not just about personal efficiency. it is about understanding the landscape before the map is fully drawn. by spending the hours required to become a high-level user, i build a deep familiarity with the edges of what the technology can do.&lt;/p&gt;

&lt;p&gt;this mastery translates directly into value for my colleagues. when you understand the high-level nuance of a complex tool, you naturally become the point person for your team. people have onboarding questions, they hit roadblocks, and they need someone who has already navigated those early frustrations. being that resource is incredibly rewarding. it shifts my role from an individual contributor to a multiplier, helping the entire team elevate their workflow and avoid the pitfalls i have already solved.&lt;/p&gt;

&lt;h2&gt;
  
  
  the responsibility to share
&lt;/h2&gt;

&lt;p&gt;this dynamic reminds me of a principle i have heard often over the years regarding the importance of using your voice and your platform to share. this is exactly why i started this website. i wanted a dedicated space to share my voice, my knowledge, my opinions, my experience, and the solutions i have discovered along the way.&lt;/p&gt;

&lt;p&gt;when you hold onto knowledge, its impact is limited to your own output. when you share it, the impact scales infinitely. writing about these tools, documenting my workflows, and answering the nuanced questions my colleagues ask are all extensions of the same core belief. knowledge is meant to be distributed.&lt;/p&gt;

&lt;h2&gt;
  
  
  stepping into mentorship
&lt;/h2&gt;

&lt;p&gt;i have reached a point of mastery and experience where the natural next step for me is to mentor others and deliberately increase my visibility and presence. it is no longer enough to simply be good at what i do behind the scenes. the real work now is in lifting others up.&lt;/p&gt;

&lt;p&gt;in fact, i am starting to feel the weight of this realization. it feels almost selfish not to share what i have learned. when you spend years honing a craft or mastering a paradigm-shifting tool like ai-assisted development, you accumulate a wealth of invisible context. keeping that context locked away serves no one. stepping into a mentorship role, both directly with my colleagues and publicly through this platform, is how i honor the effort it took to gain that experience in the first place.&lt;/p&gt;

&lt;h2&gt;
  
  
  looking forward
&lt;/h2&gt;

&lt;p&gt;my goal is to continue exploring the bleeding edge of these tools, but with a renewed focus on how i can translate those discoveries into accessible guidance for others. whether it is through answering a quick onboarding question about cursor, writing a detailed guide on this site, or simply being a sounding board for a colleague, the objective remains the same. i want to use my experience to make the path easier for those who follow.&lt;/p&gt;

&lt;h2&gt;
  
  
  further reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.cursor.com/" rel="noopener noreferrer"&gt;cursor documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  related on this site
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/" rel="noopener noreferrer"&gt;post title&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>mentorship</category>
      <category>ai</category>
      <category>cursor</category>
      <category>earlyadoption</category>
    </item>
    <item>
      <title>what is art?</title>
      <dc:creator>Philip Hern</dc:creator>
      <pubDate>Mon, 30 Mar 2026 23:38:30 +0000</pubDate>
      <link>https://dev.to/shrouwoods/what-is-art-1ofe</link>
      <guid>https://dev.to/shrouwoods/what-is-art-1ofe</guid>
      <description>&lt;h2&gt;
  
  
  thesis
&lt;/h2&gt;

&lt;p&gt;i keep pondering lately: what are we actually defending when we say "ai art is not real art"?&lt;/p&gt;

&lt;p&gt;i do not have a final position yet. i am writing this to think in public, not to close the debate.&lt;/p&gt;

&lt;h2&gt;
  
  
  context
&lt;/h2&gt;

&lt;p&gt;while driving on a family vacation, i asked my wife to fulfill her duty as the passenger and dj some motown bangers. she searched on spotify and found something that seemed to fit the bill. most of the songs were recognizable, memories from my childhood, riding in the backseat listening to my parents' favorites. however, the first song on the playlist was by an artist called the &lt;strong&gt;19s soulers&lt;/strong&gt;, one i did not recognize. this was a user-created playlist, so not everything had to fit perfectly into the motown mold i was asking for, and that was ok. the song started and it was &lt;strong&gt;SOOO&lt;/strong&gt; good. &lt;strong&gt;TOO&lt;/strong&gt; good. i had my suspicions, but the music caught me so hard that i completely forgot. i asked my boys in the back seat to look up the artist and they did not even search, just responded "AI DAD - IT IS AI". i felt so many conflicting emotions, including pride that my boys could tell the difference and have some defense against being fooled.&lt;/p&gt;

&lt;p&gt;the conversation around ai-generated images and music feels hotter every week, especially when a new ai music act gets attention or a contract. the reaction is often immediate and predictable: outrage, fear, dismissal, and arguments about stolen style.&lt;/p&gt;

&lt;p&gt;at the same time, many of us use ai to help write code, review pull requests, or shape architecture notes without the same emotional response. that contrast is interesting to me.&lt;/p&gt;

&lt;p&gt;if i call code a craft, and sometimes an art form, then why does ai help feel acceptable there for so many people, but unacceptable when the ai helps write song lyrics? and if code can be expressive, why is the outrage concentrated in painting, illustration, and music?&lt;/p&gt;

&lt;p&gt;my code has my fingerprints all over it, just as much as this website and the way i speak and write. it definitely qualifies as expressive. i make choices about style, logic, and function that suit my taste. how is this different from writing a book? but if you asked which one is acceptable to use ai for and which one is not, i could guess your answer 99% of the time.&lt;/p&gt;

&lt;h2&gt;
  
  
  argument
&lt;/h2&gt;

&lt;p&gt;i see a few possible reasons, and none of them feel complete on their own:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;visual art and music are tied to identity in a very direct way&lt;/li&gt;
&lt;li&gt;audiences often connect to the "maker story", not only the artifact&lt;/li&gt;
&lt;li&gt;creative labor markets in those fields already felt fragile before ai&lt;/li&gt;
&lt;li&gt;software teams have normalized tool-assisted output for decades&lt;/li&gt;
&lt;li&gt;code is often judged by function first, while art is judged by intention and feeling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;still, even with those differences, i cannot shake the inconsistency.&lt;/p&gt;

&lt;p&gt;when i use ai in code, i still feel like the author because i set constraints, reject bad output, and own the result. i do not think that is very different from guiding a visual generator, editing outputs, and curating a final piece. maybe the difference is only social permission, not creative mechanics.&lt;/p&gt;

&lt;p&gt;this question also links to my concern about ownership in &lt;a href="https://philliant.com/posts/20260326-the-danger-of-trusting-the-ai-agent/" rel="noopener noreferrer"&gt;the danger of trusting the ai agent&lt;/a&gt;, where speed is useful but responsibility still has to stay human.&lt;/p&gt;

&lt;h3&gt;
  
  
  tension or counterpoint
&lt;/h3&gt;

&lt;p&gt;there is also a strong counterpoint i take seriously: in code, wrong answers fail in visible ways. tests fail, services break, users complain, and teams can trace accountability. in art, value is less binary, and that makes authorship feel more central and more vulnerable.&lt;/p&gt;

&lt;p&gt;another counterpoint is economic, not philosophical. people may not be reacting to "is this art" at all. they may be reacting to "will this replace my livelihood".&lt;/p&gt;

&lt;p&gt;both of those points feel real to me.&lt;/p&gt;

&lt;p&gt;and i think the latter point is worth exploring, because the widespread availability of ai has "democratized" creative and technical work for people who might have great ideas, but not the musical or technical skill to carry out the plan. well, now they do. and that instant competition, which was not present before, can certainly feel intimidating and encroaching.&lt;/p&gt;

&lt;p&gt;i am currently mostly pro-ai, but with caution. we should have caution regarding how the models are being trained (and on what data) and regulated. we should exercise caution surrounding &lt;em&gt;who&lt;/em&gt; is doing the regulating, as well. ai is a powerful assistant, and as we all know from spiderman, with great power comes great responsibility.&lt;/p&gt;

&lt;h2&gt;
  
  
  closing
&lt;/h2&gt;

&lt;p&gt;i am left with questions, not conclusions.&lt;/p&gt;

&lt;p&gt;maybe we value human touch most where we believe the human story is the product. maybe we accept ai more where we believe the product is utility. maybe those boundaries are changing and we are all reacting in real time.&lt;/p&gt;

&lt;p&gt;for now, i am trying to keep the question open: when ai is part of the process, what still makes something mine, yours, or ours?&lt;/p&gt;

&lt;h2&gt;
  
  
  further reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.copyright.gov/ai/" rel="noopener noreferrer"&gt;copyright and artificial intelligence, u.s. copyright office&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Generative_art" rel="noopener noreferrer"&gt;generative art&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Computer_music" rel="noopener noreferrer"&gt;computer music&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  related on this site
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260318-from-prototype-to-production-ai/" rel="noopener noreferrer"&gt;from prototype to production: my early adopter view of ai&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260326-the-danger-of-trusting-the-ai-agent/" rel="noopener noreferrer"&gt;the danger of trusting the ai agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/series/commentary/" rel="noopener noreferrer"&gt;commentary series&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>art</category>
      <category>creativity</category>
      <category>music</category>
    </item>
    <item>
      <title>dbt tests</title>
      <dc:creator>Philip Hern</dc:creator>
      <pubDate>Mon, 30 Mar 2026 23:34:36 +0000</pubDate>
      <link>https://dev.to/shrouwoods/dbt-tests-26bb</link>
      <guid>https://dev.to/shrouwoods/dbt-tests-26bb</guid>
<description>&lt;p&gt;we all know testing is valuable, but almost all dbt projects still underinvest in it. i am guilty of this myself: shipping the model, promising i will add tests later, then moving on to the next urgent request.&lt;/p&gt;

&lt;p&gt;that pattern feels fast in the moment, but it is expensive over time. dbt tests are one of the easiest ways to protect trust in your data, and the setup cost is usually smaller than people expect.&lt;/p&gt;

&lt;h2&gt;
  
  
  quick answer
&lt;/h2&gt;

&lt;p&gt;dbt tests are assertions about your data that run inside your transformation workflow. they verify assumptions like uniqueness, non-null keys, valid categorical values, and referential integrity. when a test fails, dbt surfaces the exact failing records so you can debug quickly. if you are new to the feature, start with the official &lt;a href="https://docs.getdbt.com/docs/build/data-tests" rel="noopener noreferrer"&gt;dbt data tests documentation&lt;/a&gt; and add a few high-signal tests to your most consumed models first.&lt;/p&gt;

&lt;h2&gt;
  
  
  who this is for
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;analytics engineers who already build dbt models but still rely on manual spot checks&lt;/li&gt;
&lt;li&gt;data teams that have recurring data quality incidents in dashboards or reports&lt;/li&gt;
&lt;li&gt;anyone who wants a practical starting point instead of a perfect testing framework&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  why this matters
&lt;/h2&gt;

&lt;p&gt;the cost of bad data is usually delayed, not immediate. a broken metric can sit in production for days before someone notices, and by then that number has already been used in a deck, decision, or leadership update.&lt;/p&gt;

&lt;p&gt;the frustrating part is that most of these issues are predictable. duplicate primary keys, null foreign keys, unexpected status values, and invalid date ranges are all common failures. dbt tests can catch these early, near the model that introduced the issue.&lt;/p&gt;

&lt;p&gt;i also think testing helps teams move faster, not slower. when tests are in place, i can refactor a model with more confidence because i have a safety net. without tests, every change feels risky and review cycles become slower because everyone is relying on intuition.&lt;/p&gt;

&lt;h2&gt;
  
  
  step-by-step
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1) define the starting point
&lt;/h3&gt;

&lt;p&gt;pick one model that is heavily consumed, for example an order fact table or a customer dimension. identify three assumptions that must always be true:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the model key is unique&lt;/li&gt;
&lt;li&gt;important keys are never null&lt;/li&gt;
&lt;li&gt;status fields only contain known values&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;then encode those assumptions directly in your yml.&lt;/p&gt;

&lt;h3&gt;
  
  
  2) apply the change
&lt;/h3&gt;

&lt;p&gt;start with generic tests in your schema file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;

&lt;span class="na"&gt;models&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;fct_orders&lt;/span&gt;
    &lt;span class="na"&gt;columns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;order_id&lt;/span&gt;
        &lt;span class="na"&gt;tests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;not_null&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;unique&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;customer_id&lt;/span&gt;
        &lt;span class="na"&gt;tests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;not_null&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;relationships&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;to&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ref('dim_customers')&lt;/span&gt;
              &lt;span class="na"&gt;field&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;customer_id&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;order_status&lt;/span&gt;
        &lt;span class="na"&gt;tests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;accepted_values&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;values&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;placed"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shipped"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cancelled"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;returned"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;this single block gives you strong baseline coverage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;not_null&lt;/code&gt; protects required fields&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;unique&lt;/code&gt; protects grain&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;relationships&lt;/code&gt; protects joins and referential integrity&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;accepted_values&lt;/code&gt; protects enum-like business states&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;next, add one singular test for a business rule that generic tests cannot express cleanly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- tests/orders_non_negative_amount.sql&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;order_amount&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;{{&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'fct_orders'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}}&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_amount&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;this test fails only when the query returns rows. singular tests are ideal for custom rules like range checks, cross-column logic, and impossible combinations.&lt;/p&gt;
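
&lt;p&gt;singular tests also handle cross-column logic well. here is a minimal sketch of that idea, assuming hypothetical &lt;code&gt;ordered_at&lt;/code&gt; and &lt;code&gt;shipped_at&lt;/code&gt; timestamp columns, so rename them to match your model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- tests/orders_shipped_after_ordered.sql
-- hypothetical columns: ordered_at / shipped_at are illustrative names
SELECT
    order_id,
    ordered_at,
    shipped_at
FROM {{ ref('fct_orders') }}
WHERE shipped_at IS NOT NULL
  AND shipped_at &amp;lt; ordered_at
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;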

&lt;h3&gt;
  
  
  3) validate the result
&lt;/h3&gt;

&lt;p&gt;run tests in a tight loop while you are developing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dbt &lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="nt"&gt;--select&lt;/span&gt; fct_orders
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;for a broader gate in CI or before merging, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dbt build &lt;span class="nt"&gt;--select&lt;/span&gt; fct_orders+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;dbt build&lt;/code&gt; runs models and tests together, which is useful when you want to validate both transformation logic and data quality in one pass. the &lt;code&gt;+&lt;/code&gt; after the model name tells dbt to also build and test everything downstream of &lt;code&gt;fct_orders&lt;/code&gt; in the dag, so you know the models that depend on it still pass.&lt;/p&gt;
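
&lt;p&gt;the same graph selector syntax also supports a leading &lt;code&gt;+&lt;/code&gt; for upstream parents, which i find useful when a refactor touches a model's inputs as well:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# build and test fct_orders plus everything upstream of it
dbt build --select +fct_orders

# build and test fct_orders plus everything upstream and downstream
dbt build --select +fct_orders+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;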

&lt;h2&gt;
  
  
  a practical prioritization rule
&lt;/h2&gt;

&lt;p&gt;when time is limited, i prioritize tests in this order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;key integrity on high-consumption models (&lt;code&gt;unique&lt;/code&gt; plus &lt;code&gt;not_null&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;foreign key integrity (&lt;code&gt;relationships&lt;/code&gt;) on joins that power dashboards&lt;/li&gt;
&lt;li&gt;controlled fields (&lt;code&gt;accepted_values&lt;/code&gt;) where business logic depends on a finite set of values&lt;/li&gt;
&lt;li&gt;one custom singular test for the highest-risk metric or business rule&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;this sequence catches a large share of real incidents with minimal setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  faq
&lt;/h2&gt;

&lt;h3&gt;
  
  
  what should i test first in a mature project with little coverage?
&lt;/h3&gt;

&lt;p&gt;start where breakage is most expensive, not where modeling is most elegant. choose one or two heavily consumed models and add key integrity plus relationships first. then add one singular test for the business rule that has caused the most historical pain.&lt;/p&gt;

&lt;h3&gt;
  
  
  do dbt tests slow down delivery?
&lt;/h3&gt;

&lt;p&gt;they add some upfront work, but they usually reduce cycle time later. test failures are cheaper during development than after release, and tests make refactors safer because you can verify assumptions continuously instead of rediscovering issues in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  references
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.getdbt.com/docs/build/data-tests" rel="noopener noreferrer"&gt;dbt docs, data tests&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.getdbt.com/reference/resource-properties/data-tests" rel="noopener noreferrer"&gt;dbt docs, test properties&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.getdbt.com/best-practices/writing-custom-generic-tests" rel="noopener noreferrer"&gt;dbt docs, writing custom generic data tests&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  related reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260330-dbt-docs/" rel="noopener noreferrer"&gt;dbt docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/series/dbt/" rel="noopener noreferrer"&gt;dbt series&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dbt</category>
      <category>testing</category>
      <category>dataquality</category>
      <category>analyticsengineering</category>
    </item>
    <item>
      <title>dbt docs</title>
      <dc:creator>Philip Hern</dc:creator>
      <pubDate>Mon, 30 Mar 2026 11:39:08 +0000</pubDate>
      <link>https://dev.to/shrouwoods/dbt-docs-3h12</link>
      <guid>https://dev.to/shrouwoods/dbt-docs-3h12</guid>
      <description>&lt;p&gt;most data engineers i know will spend hours getting a model right, then skip the one step that makes it discoverable to everyone else. dbt docs are that step, and they are worth the effort.&lt;/p&gt;

&lt;h2&gt;
  
  
  quick answer
&lt;/h2&gt;

&lt;p&gt;dbt docs is a built-in feature that generates a browsable website from your dbt project. it pulls descriptions from your yml files, renders a searchable model catalog, and draws a lineage graph showing how every model connects. running &lt;code&gt;dbt docs generate&lt;/code&gt; followed by &lt;code&gt;dbt docs serve&lt;/code&gt; gives you a local site instantly. the real payoff is that teammates who never open your sql files can still understand what each model does, what columns it exposes, and where the data comes from. this is especially useful to downstream consumers because they can see the exact shape of each object, including column names and data types.&lt;/p&gt;
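
&lt;p&gt;to see it for yourself, the whole loop is two commands from your project root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# build manifest.json, catalog.json, and index.html into target/
dbt docs generate

# serve the generated site locally (port 8080 by default)
dbt docs serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;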

&lt;h2&gt;
  
  
  who this is for
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;data engineers who use dbt but skip writing descriptions&lt;/li&gt;
&lt;li&gt;analysts, product managers, or business users who need to understand available data without reading sql&lt;/li&gt;
&lt;li&gt;team leads looking for a low-effort way to make data structures discoverable&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  why this matters
&lt;/h2&gt;

&lt;p&gt;when you build a dbt project, the models represent real business concepts. customers, products, sales, inventory, whatever the domain is. the people who consume those models in dashboards or reports often have no involvement in building them and no reason to read raw sql.&lt;/p&gt;

&lt;p&gt;without documentation, those consumers rely on tribal knowledge, slack messages, and guesswork. that does not scale. dbt docs solve this by turning the metadata you already maintain (yml files, project config, source definitions) into a navigable reference that anyone on the team can use. the effort to write a good description is small, and the compound value to the rest of the organization grows with every model you add.&lt;/p&gt;

&lt;p&gt;i think of it like this: if i write a model and do not document it, the only person who truly understands it is me, and even that fades after a few months. if i write a two-sentence description and add column-level context, that knowledge lives in the project permanently and serves everyone who touches the data.&lt;/p&gt;

&lt;h2&gt;
  
  
  what dbt docs generates
&lt;/h2&gt;

&lt;p&gt;when you run &lt;code&gt;dbt docs generate&lt;/code&gt;, dbt produces two main artifacts in your &lt;code&gt;target/&lt;/code&gt; directory:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;manifest.json&lt;/strong&gt; contains the full project graph, including every model, source, seed, snapshot, and macro, along with their descriptions, tags, and configuration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;catalog.json&lt;/strong&gt; contains the schema-level metadata pulled from your warehouse, including column names, data types, and row counts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;together with a bundled &lt;code&gt;index.html&lt;/code&gt;, these files power a static site that you can open locally with &lt;code&gt;dbt docs serve&lt;/code&gt; or host anywhere that serves static files.&lt;/p&gt;

&lt;h3&gt;
  
  
  the site includes
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;a searchable list of every model and data source in your project&lt;/li&gt;
&lt;li&gt;model-level and column-level descriptions pulled from your yml files&lt;/li&gt;
&lt;li&gt;the full sql compiled for each model (in each environment)&lt;/li&gt;
&lt;li&gt;a lineage graph (a dag, or directed acyclic graph) that shows upstream sources, intermediate models, and downstream consumers for any selected node&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;the lineage graph is especially useful when someone asks "where does this column come from" or "what breaks if i change this source table". instead of tracing through sql files manually, the graph answers it visually.&lt;/p&gt;

&lt;h2&gt;
  
  
  how to document your models
&lt;/h2&gt;

&lt;p&gt;dbt reads documentation from yml files that live alongside your models. you are probably already using these for configuration and &lt;a href="https://philliant.com/posts/20260328-what-is-sql-and-why-it-still-works/" rel="noopener noreferrer"&gt;source definitions&lt;/a&gt;, so adding descriptions is a natural extension.&lt;/p&gt;

&lt;h3&gt;
  
  
  model and column descriptions in yml
&lt;/h3&gt;

&lt;p&gt;the most common approach is adding &lt;code&gt;description&lt;/code&gt; fields directly in your yml files. here is what that looks like for a view in an information delivery layer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;

&lt;span class="na"&gt;models&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ORDER_SUMMARY_V&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="s"&gt;aggregated view of customer orders with totals and status&lt;/span&gt;
      &lt;span class="s"&gt;breakdowns, consumed by the reporting dashboard and the&lt;/span&gt;
      &lt;span class="s"&gt;customer details page in the application.&lt;/span&gt;
    &lt;span class="na"&gt;columns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CUSTOMER_ID&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;unique&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;identifier&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;customer"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TOTAL_ORDERS&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;count&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;of&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;all&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;orders&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;placed&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;by&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;this&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;customer"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TOTAL_SPEND&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sum&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;of&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;order&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;amounts&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;across&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;all&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;completed&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;orders"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LAST_ORDER_DATE&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;most&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;recent&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;order&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;date&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;this&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;customer"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;every model and every column can have a description. the more specific you are, the more useful the generated docs become. "id" as a column description does not help anyone. "unique identifier for the customer, sourced from the application database" does.&lt;/p&gt;

&lt;h3&gt;
  
  
  source descriptions
&lt;/h3&gt;

&lt;p&gt;sources benefit from the same treatment. when your project ingests raw data from an external system, describing those sources in your shared sources file makes the lineage graph meaningful from the very first node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;

&lt;span class="na"&gt;sources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;RAW_ORDERS&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;raw&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;order&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;ingested&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;from&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;transactional&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;database&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;via&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;kafka"&lt;/span&gt;
    &lt;span class="na"&gt;database&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ANALYTICS_DEV_DB&lt;/span&gt;
    &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;RAW_DATA&lt;/span&gt;
    &lt;span class="na"&gt;tables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ORDERS&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;one&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;row&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;per&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;order,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;includes&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;order&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;timestamps"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ORDER_ITEMS&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;line&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;items&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;each&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;order,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;one&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;row&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;per&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;product&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;per&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;order"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  doc blocks for longer descriptions
&lt;/h3&gt;

&lt;p&gt;when a model needs more than a sentence or two of context, dbt supports doc blocks. these are markdown files (&lt;code&gt;.md&lt;/code&gt;) that live in your project and can be referenced from yml descriptions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;{% docs order_summary_description %}

this view surfaces the aggregated order history for each customer.
it joins the hub, satellite, and link tables from the refined data
layer to produce a single wide row per customer.

&lt;span class="gs"&gt;**grain:**&lt;/span&gt; one row per customer.

&lt;span class="gs"&gt;**consumers:**&lt;/span&gt; reporting dashboard, customer details api endpoint.

{% enddocs %}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;then in your yml:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;models&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ORDER_SUMMARY_V&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;doc("order_summary_description")&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;}}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;doc blocks are useful when the context is long enough that embedding it inline in yaml becomes awkward. they also let you reuse the same description across multiple references if needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  a few useful options in dbt docs
&lt;/h2&gt;

&lt;h3&gt;
  
  
  persist_docs
&lt;/h3&gt;

&lt;p&gt;by default, dbt docs only live in the generated static site. if you want the descriptions to also appear in your warehouse catalog (so someone querying &lt;a href="https://philliant.com/posts/20260328-the-difference-between-snowflake-and-the-other-databases/" rel="noopener noreferrer"&gt;snowflake&lt;/a&gt; information_schema can see them), you can enable &lt;code&gt;persist_docs&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;models&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;my_data_product&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;+persist_docs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;relation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;columns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;with this enabled, &lt;code&gt;dbt run&lt;/code&gt; pushes your yml descriptions into the &lt;code&gt;COMMENT&lt;/code&gt; property on the table or view and on each column in the warehouse. this is valuable because it means the documentation is available even outside the dbt docs site, directly in the database catalog that tools like snowflake and bi platforms already read.&lt;/p&gt;
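
&lt;p&gt;a quick way to confirm the comments actually landed is to query the catalog directly. a snowflake-flavored sketch, reusing the &lt;code&gt;ORDER_SUMMARY_V&lt;/code&gt; example from earlier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- check that persist_docs pushed the column comments (snowflake)
SELECT column_name, comment
FROM information_schema.columns
WHERE table_name = 'ORDER_SUMMARY_V';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;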

&lt;h3&gt;
  
  
  the lineage graph
&lt;/h3&gt;

&lt;p&gt;the generated site includes an interactive dag that visualizes every model, source, and their connections. you can click on any node to see its upstream dependencies and downstream consumers. this is one of the most powerful features in dbt docs because it makes the data flow tangible for people who do not read sql (or who simply do not have access to your workspace but still need to understand it).&lt;/p&gt;

&lt;p&gt;when you have a project with dozens or hundreds of models organized into layers (raw, refined, business, information delivery), the lineage graph shows how a raw source table flows through transformations into the final views that analysts query. it replaces the need for manually maintained architecture diagrams that go stale the moment someone adds a new model.&lt;/p&gt;

&lt;h3&gt;
  
  
  exposures
&lt;/h3&gt;

&lt;p&gt;exposures let you document where your dbt models are consumed outside of dbt. dashboards, applications, api endpoints, anything downstream. defining them makes the lineage graph extend beyond the dbt project boundary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;exposures&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;customer_dashboard&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dashboard&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;executive&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;dashboard&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;showing&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;customer&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;order&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;trends&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;retention"&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ref('ORDER_SUMMARY_V')&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ref('CUSTOMER_RETENTION_V')&lt;/span&gt;
    &lt;span class="na"&gt;owner&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;analytics team&lt;/span&gt;
      &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;analytics@example.com&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;exposures show up in the lineage graph as leaf nodes, making it clear which models are actively consumed and by what. this is helpful when you are deciding whether it is safe to refactor or deprecate a model.&lt;/p&gt;

&lt;h2&gt;
  
  
  hosting dbt docs on github pages
&lt;/h2&gt;

&lt;p&gt;running &lt;code&gt;dbt docs serve&lt;/code&gt; is great for local browsing, but the real value comes from hosting the site where the whole team can access it without installing dbt or cloning the repo. github pages is a straightforward, free option for this.&lt;/p&gt;

&lt;h3&gt;
  
  
  how it works
&lt;/h3&gt;

&lt;p&gt;after &lt;code&gt;dbt docs generate&lt;/code&gt; runs, the &lt;code&gt;target/&lt;/code&gt; directory contains everything needed to serve the site: &lt;code&gt;index.html&lt;/code&gt;, &lt;code&gt;manifest.json&lt;/code&gt;, and &lt;code&gt;catalog.json&lt;/code&gt;. you copy those files to a branch or directory that github pages serves, and the docs are live.&lt;/p&gt;

&lt;h3&gt;
  
  
  a basic github actions workflow
&lt;/h3&gt;

&lt;p&gt;here is a minimal workflow that generates the docs on every push to main and deploys them to github pages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deploy dbt docs&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;
  &lt;span class="na"&gt;pages&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;
  &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;github-pages&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ steps.deployment.outputs.page_url }}&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;set up python&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-python@v5&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;python-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3.11"&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;install dbt&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pip install dbt-snowflake&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;generate docs&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dbt docs generate --profiles-dir .&lt;/span&gt;
        &lt;span class="na"&gt;working-directory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my_dbt_project&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;DBT_SNOWFLAKE_ACCOUNT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.SNOWFLAKE_ACCOUNT }}&lt;/span&gt;
          &lt;span class="na"&gt;DBT_SNOWFLAKE_USER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.SNOWFLAKE_USER }}&lt;/span&gt;
          &lt;span class="na"&gt;DBT_SNOWFLAKE_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.SNOWFLAKE_PASSWORD }}&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;prepare pages artifact&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;mkdir -p pages&lt;/span&gt;
          &lt;span class="s"&gt;cp my_dbt_project/target/index.html pages/&lt;/span&gt;
          &lt;span class="s"&gt;cp my_dbt_project/target/manifest.json pages/&lt;/span&gt;
          &lt;span class="s"&gt;cp my_dbt_project/target/catalog.json pages/&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;upload pages artifact&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/upload-pages-artifact@v3&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pages&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deploy to github pages&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deployment&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/deploy-pages@v4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;once this runs, your dbt docs are available at &lt;code&gt;https://&amp;lt;org&amp;gt;.github.io/&amp;lt;repo&amp;gt;/&lt;/code&gt; and automatically update every time someone merges to main. no one needs to install anything or run any commands to browse the documentation.&lt;/p&gt;

&lt;h3&gt;
  
  
  keep the credentials out of the repo
&lt;/h3&gt;

&lt;p&gt;the workflow above uses github secrets for warehouse credentials. never commit profiles with real credentials. use environment variables or a ci-specific &lt;code&gt;profiles.yml&lt;/code&gt; that references secrets, and make sure your &lt;code&gt;.gitignore&lt;/code&gt; excludes any local profiles that contain actual passwords or tokens.&lt;/p&gt;
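
&lt;p&gt;one pattern that fits the workflow above is a committed ci &lt;code&gt;profiles.yml&lt;/code&gt; that reads everything through &lt;code&gt;env_var()&lt;/code&gt;. a minimal sketch, assuming the &lt;code&gt;DBT_SNOWFLAKE_*&lt;/code&gt; variables from the workflow, a profile name that matches your &lt;code&gt;dbt_project.yml&lt;/code&gt;, and a hypothetical warehouse name:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# profiles.yml committed for ci, no literal credentials anywhere
my_dbt_project:
  target: ci
  outputs:
    ci:
      type: snowflake
      account: "{{ env_var('DBT_SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('DBT_SNOWFLAKE_USER') }}"
      password: "{{ env_var('DBT_SNOWFLAKE_PASSWORD') }}"
      database: ANALYTICS_DEV_DB
      schema: RAW_DATA
      warehouse: COMPUTE_WH   # hypothetical warehouse name
      threads: 4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;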

&lt;h2&gt;
  
  
  faq
&lt;/h2&gt;

&lt;h3&gt;
  
  
  do i need to write descriptions for every column?
&lt;/h3&gt;

&lt;p&gt;you do not need to, but the columns that matter most to consumers deserve it. at minimum, describe the primary key, any business key, and any column whose meaning is not obvious from the name alone. over time, filling in the rest pays off as the team grows.&lt;/p&gt;

&lt;h3&gt;
  
  
  can i generate docs without connecting to the warehouse?
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;dbt docs generate&lt;/code&gt; pulls catalog metadata from the warehouse, so it does need a connection. however, if you already have a &lt;code&gt;catalog.json&lt;/code&gt; from a previous run, you can serve the site locally with just those files.&lt;/p&gt;
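
&lt;p&gt;since the generated site is just static files, any static file server works for browsing the last run. a minimal sketch using python's built-in server, assuming &lt;code&gt;target/&lt;/code&gt; was already populated by a previous &lt;code&gt;dbt docs generate&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# serve the existing index.html, manifest.json, and catalog.json
cd target
python -m http.server 8080
# then open http://localhost:8080 in a browser
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;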

&lt;h3&gt;
  
  
  how is this different from a wiki or confluence page?
&lt;/h3&gt;

&lt;p&gt;dbt docs stay in sync with your code automatically. a wiki page about your data model goes stale the moment someone adds a column or renames a table. dbt docs regenerate from the source of truth (your yml files and your warehouse) every time you run the command, so the documentation and the code never drift apart.&lt;/p&gt;

&lt;h2&gt;
  
  
  references
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.getdbt.com/docs/collaborate/documentation" rel="noopener noreferrer"&gt;dbt docs overview (dbt documentation)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.getdbt.com/reference/commands/cmd-docs" rel="noopener noreferrer"&gt;dbt docs generate command&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.getdbt.com/reference/resource-configs/persist_docs" rel="noopener noreferrer"&gt;persist_docs config&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.getdbt.com/docs/build/exposures" rel="noopener noreferrer"&gt;exposures (dbt documentation)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.github.com/en/pages" rel="noopener noreferrer"&gt;github pages documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  related reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://philliant.com/series/dbt/" rel="noopener noreferrer"&gt;dbt series&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260328-what-is-sql-and-why-it-still-works/" rel="noopener noreferrer"&gt;what is sql, and why it still works&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://philliant.com/posts/20260328-the-difference-between-snowflake-and-the-other-databases/" rel="noopener noreferrer"&gt;the difference between snowflake and the "other" databases&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dbt</category>
      <category>documentation</category>
      <category>dataengineering</category>
      <category>githubpages</category>
    </item>
  </channel>
</rss>
