<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aditya satrio nugroho</title>
    <description>The latest articles on DEV Community by Aditya satrio nugroho (@adityasatrio).</description>
    <link>https://dev.to/adityasatrio</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F256758%2Ffeadf5ff-c84a-4ed0-9769-b1e1a063c7c4.jpeg</url>
      <title>DEV Community: Aditya satrio nugroho</title>
      <link>https://dev.to/adityasatrio</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/adityasatrio"/>
    <language>en</language>
    <item>
      <title>Why Your Team Keeps Ignoring Your Instructions (And What Actually Works)</title>
      <dc:creator>Aditya satrio nugroho</dc:creator>
      <pubDate>Thu, 28 May 2026 12:46:02 +0000</pubDate>
      <link>https://dev.to/adityasatrio/why-your-team-keeps-ignoring-your-instructions-and-what-actually-works-49fk</link>
      <guid>https://dev.to/adityasatrio/why-your-team-keeps-ignoring-your-instructions-and-what-actually-works-49fk</guid>
      <description>&lt;p&gt;You've said it in the sprint planning. You've said it in the 1:1. You've put it in the team wiki. And yet — the same thing keeps happening.&lt;/p&gt;

&lt;p&gt;PRs without ticket references. Deployments without checklists. Steps skipped "just this once" because there was deadline pressure.&lt;/p&gt;

&lt;p&gt;Sound familiar?&lt;/p&gt;

&lt;p&gt;This isn't a knowledge problem. Your team knows the rule. This is a behavior problem. And behavior problems require a completely different approach than just explaining things better or louder.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Three Levels Most Managers Go Through
&lt;/h2&gt;

&lt;p&gt;Most engineering managers — especially in fast-moving startups — tend to escalate through the same pattern when trying to enforce team discipline:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Level 1 — Verbal instruction:&lt;/strong&gt; You explain the rule, why it exists, what benefit it brings. You do this in team meetings, in Slack, maybe even in a nicely formatted Confluence page.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Level 2 — Micromanagement:&lt;/strong&gt; When level 1 doesn't stick, you start observing more closely. You remind people in code reviews. You bring it up again in standups. You personally check that things are done.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Level 3 — System enforcement:&lt;/strong&gt; You build the guardrail. The pipeline rejects PRs without Jira codes. The checklist is a required form. The system blocks the wrong behavior automatically.&lt;/p&gt;

&lt;p&gt;Most managers spend too long in Level 2, burning themselves out, before getting to Level 3. And the worst part? Level 2 only works while you're watching.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Explaining Isn't Enough
&lt;/h2&gt;

&lt;p&gt;Here's the uncomfortable truth: your team isn't ignoring you because they don't understand. They're ignoring you because &lt;strong&gt;the cost of non-compliance is zero until you enforce it&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Human brains are efficiency machines. We take the path of least resistance. If skipping the Jira code on a PR takes 5 seconds and saves mental overhead during a deadline crunch — and nothing happens — the brain logs that as: &lt;em&gt;acceptable shortcut&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Your explanation created awareness. It did not create consequences. And without consequences, awareness alone fades fast.&lt;/p&gt;

&lt;p&gt;There's also a cultural layer to this, particularly in startup environments. Teams that have been through multiple managers, multiple "initiatives of the month," learn something: &lt;strong&gt;instructions eventually fade&lt;/strong&gt;. So they wait to see if this one is real. They're not being malicious. They're being rational based on past experience.&lt;/p&gt;




&lt;h2&gt;
  
  
  The WIIFM Problem
&lt;/h2&gt;

&lt;p&gt;One of the biggest mistakes in process enforcement is framing everything around what benefits the manager or the organization.&lt;/p&gt;

&lt;p&gt;"We need Jira codes so I can track velocity." &lt;br&gt;
"We need this checklist so the audit is clean."&lt;br&gt;
"We need this because compliance requires it."&lt;/p&gt;

&lt;p&gt;Your team hears: &lt;em&gt;this is overhead that helps the boss, not me.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;WIIFM — What's In It For Me&lt;/strong&gt; — is the real question every engineer is silently asking. Until you answer it from their perspective, you're asking people to add friction to their day for someone else's benefit.&lt;/p&gt;

&lt;p&gt;Some reframes that actually land:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Jira code on a PR = your work gets credited.&lt;/strong&gt; If you use any engineering metrics tooling, unlinked PRs are invisible contributions. Your output disappears from the data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Linked PRs = faster code reviews.&lt;/strong&gt; Reviewers understand context immediately. Fewer back-and-forth questions. Less time blocked waiting for review.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit trail = protection during incidents.&lt;/strong&gt; When something breaks in production, the person with clean, linked commit history is the one who can clearly show what they were working on and what was in scope.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But here's the thing — even if you nail the WIIFM framing, it still might not be enough. Because WIIFM changes motivation. It doesn't change habits. And habits need systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  Six Behavior Frameworks That Actually Explain What's Happening
&lt;/h2&gt;

&lt;p&gt;After going through this cycle enough times, it's worth understanding the underlying mechanics. These frameworks explain &lt;em&gt;why&lt;/em&gt; teams behave the way they do — and more importantly, what to do about it.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. BJ Fogg's Behavior Model
&lt;/h3&gt;

&lt;p&gt;Fogg's model says behavior only happens when three things converge at the same moment: &lt;strong&gt;Motivation + Ability + Prompt&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Most managers nail motivation (explaining why) but completely miss the prompt. You explain the Jira code rule in a Monday meeting. Three days later, an engineer is pushing a hotfix at 11pm under deadline pressure. The motivation has faded. The prompt isn't there. The behavior doesn't happen.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Put the prompt at the exact moment the behavior needs to occur. A PR template with a mandatory Jira field triggers at the right moment — when the PR is being opened, not three days earlier in a meeting.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. Nudge Theory
&lt;/h3&gt;

&lt;p&gt;From Thaler and Sunstein: you don't need to force behavior if you design the environment so the desired behavior is the default.&lt;/p&gt;

&lt;p&gt;Most process enforcement is designed as a wall — do the wrong thing and something blocks you. But walls require constant maintenance and create resentment. Nudges work differently — they make the right behavior the easiest behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; A PR template that pre-fills the Jira ticket field with a placeholder makes filling it in take 5 seconds. Leaving it blank takes more effort. You've flipped the friction. The wall (pipeline rejection) is the backup, not the primary mechanism.&lt;/p&gt;




&lt;h3&gt;
  
  
  3. The Consequences Model
&lt;/h3&gt;

&lt;p&gt;This one is simple but often ignored: &lt;strong&gt;behavior that has no immediate consequence doesn't change&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The consequence needs to be three things: immediate, consistent, and certain. Not severe — just unavoidable.&lt;/p&gt;

&lt;p&gt;A verbal reminder in a 1:1 next week is a delayed, inconsistent consequence. A pipeline that rejects a PR right now, in front of the team, before they can move on, is an immediate and certain consequence. The immediacy is what makes it register.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Automate the consequence. Don't rely on yourself to catch it and bring it up later. The system should be the one saying no — not you.&lt;/p&gt;
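&lt;p&gt;To make "the system says no" concrete, here is a minimal sketch of the kind of check a pipeline step could run against a PR title. The key format (&lt;code&gt;PAY-123&lt;/code&gt;), the function name &lt;code&gt;check_pr_title&lt;/code&gt;, and the messages are assumptions for illustration, not any particular CI product's API.&lt;/p&gt;

```python
import re

# Assumed convention: a Jira key is an uppercase project prefix, a dash, and a number.
JIRA_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def check_pr_title(title: str) -> tuple[bool, str]:
    """Return (passed, message) for a pull request title."""
    match = JIRA_KEY.search(title)
    if match is None:
        return False, "Rejected: add a Jira key such as PAY-123 to the PR title."
    return True, f"OK: linked to {match.group(0)}"


if __name__ == "__main__":
    # Hypothetical titles: one linked, one not.
    for title in ("PAY-482 Fix retry backoff", "quick fix for prod"):
        passed, message = check_pr_title(title)
        print(f"{title!r}: {message}")
```

&lt;p&gt;The rejection message tells the engineer exactly what to fix, so the consequence is immediate and certain without being personal.&lt;/p&gt;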




&lt;h3&gt;
  
  
  4. Renting vs. Owning Behavior
&lt;/h3&gt;

&lt;p&gt;This is the most important distinction for busy engineering managers.&lt;/p&gt;

&lt;p&gt;When &lt;em&gt;you&lt;/em&gt; are the enforcement mechanism, you're renting behavior. The team complies because you're watching. The moment you're heads-down on a hiring cycle, or in back-to-back stakeholder meetings, or on leave — behavior reverts. You were the rule, not the system.&lt;/p&gt;

&lt;p&gt;Owned behavior means the team follows the rule when you're not there. You don't get that from verbal instructions. You get it from consistent system enforcement over time, until the system becomes the authority — not the manager.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Ask yourself: "Would this process survive a 2-week vacation without me?" If the answer is no, you have rented behavior. Build the system.&lt;/p&gt;




&lt;h3&gt;
  
  
  5. Edmondson's Peer Compliance Effect
&lt;/h3&gt;

&lt;p&gt;Amy Edmondson's research on team dynamics shows something important: people calibrate their behavior to what they observe their peers doing — especially influential peers — not just what leaders say.&lt;/p&gt;

&lt;p&gt;If your most senior or most respected engineer submits a PR without a Jira code and it gets merged (even once, even with good reason), every other engineer reads that as the real rule: &lt;em&gt;senior engineers can skip this under pressure&lt;/em&gt;. And now everyone starts finding their own version of "pressure."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Make enforcement hierarchy-blind. The pipeline should reject a senior engineer's PR the same way it rejects a junior's. Impersonal, automated enforcement removes the social dynamics from the equation. It's not about punishing the senior — it's about making the rule real for everyone equally.&lt;/p&gt;




&lt;h3&gt;
  
  
  6. Rule Erosion
&lt;/h3&gt;

&lt;p&gt;Here's the one that managers cause themselves without realizing it.&lt;/p&gt;

&lt;p&gt;Every exception you allow — even with a completely valid reason — sends a signal: &lt;em&gt;this rule has conditions&lt;/em&gt;. Your team doesn't hear "the rule still applies, this was just an emergency." They hear: "I just need to find the right condition."&lt;/p&gt;

&lt;p&gt;Over time, the rule erodes. Not because anyone is being defiant, but because you've taught them the rule is negotiable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Separate the exception from the rule. A production incident happens and a hotfix needs to be merged fast? Fine. But require a follow-up: the Jira link must be added within 24 hours, or a retroactive ticket created. The rule still applies; only the timing is flexible. This closes the loophole without blocking production, and preserves the integrity of the rule.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Decision Framework: Which Tool For Which Problem
&lt;/h2&gt;

&lt;p&gt;Not every situation calls for the same response. Here's a simple way to map your problem to the right framework:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Your Problem&lt;/th&gt;
&lt;th&gt;Start With&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Team forgets the rule at execution time&lt;/td&gt;
&lt;td&gt;Fogg Behavior Model + Nudge Theory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Team knows but doesn't bother&lt;/td&gt;
&lt;td&gt;Consequences Model + Rule Erosion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Behavior improves when you're around, regresses when you're not&lt;/td&gt;
&lt;td&gt;Renting vs Owning + Consequences Model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A senior engineer is setting the wrong example&lt;/td&gt;
&lt;td&gt;Edmondson Peer Compliance&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In most real situations, you need &lt;strong&gt;Nudge + Consequences + Rule Erosion&lt;/strong&gt; running together. The others help you diagnose what's really happening.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Hardest Part: Protecting the System From Yourself
&lt;/h2&gt;

&lt;p&gt;You can build a perfect pipeline. You can design the right nudges. You can get buy-in from the team.&lt;/p&gt;

&lt;p&gt;And then you override the system once, with a good reason, under pressure.&lt;/p&gt;

&lt;p&gt;That one override costs more than you think. The team noticed. And they'll use it as a reference point the next time they need an exception.&lt;/p&gt;

&lt;p&gt;The real skill isn't building the enforcement system. It's having the discipline to protect it — including from your own judgment calls in the moment.&lt;/p&gt;

&lt;p&gt;If you need exceptions to exist (and you will), formalize them. Make the exception process explicit and documented. "Hotfixes can merge without a Jira code IF a follow-up ticket is created within 24 hours" is a better rule than "no exceptions" that gets violated, because it's honest and it closes the gap.&lt;/p&gt;
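&lt;p&gt;A formalized exception is easy to check automatically. The sketch below classifies a hotfix that merged without a Jira code against the 24-hour rule quoted above. The function name, the status labels, and the idea of running it from a scheduled job are assumptions for illustration.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

FOLLOW_UP_WINDOW = timedelta(hours=24)  # the window named in the rule above


def hotfix_status(merged_at: datetime,
                  ticket_linked_at: Optional[datetime],
                  now: datetime) -> str:
    """Classify a hotfix that was merged without a Jira key."""
    deadline = merged_at + FOLLOW_UP_WINDOW
    if ticket_linked_at is not None:
        # A ticket exists: was it linked inside the window?
        return "late" if ticket_linked_at > deadline else "compliant"
    # No ticket yet: is there still time?
    return "overdue" if now > deadline else "pending"


if __name__ == "__main__":
    merged = datetime(2025, 1, 1, 23, 0, tzinfo=timezone.utc)  # hypothetical merge time
    print(hotfix_status(merged, None, merged + timedelta(hours=30)))
```

&lt;p&gt;Anything reported as overdue or late is the signal to follow up, and it comes from the system rather than from the manager's memory.&lt;/p&gt;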




&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;Process discipline in engineering teams isn't really about process. It's about trust and predictability.&lt;/p&gt;

&lt;p&gt;When a team consistently does what they say they'll do — when the agreed process is actually followed — it creates a foundation where people can rely on each other, where metrics are trustworthy, and where quality compounds over time.&lt;/p&gt;

&lt;p&gt;That foundation doesn't come from better explanations. It comes from systems that enforce the right behavior consistently, leaders who protect those systems, and enough time for the behavior to become habit.&lt;/p&gt;

&lt;p&gt;The verbal instruction was never going to be enough. It was always going to end with a pipeline rejection. The question is just how long you spend in the middle before you get there.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Lessons from a MySQL Migration: What We Learned and How to Do It Better Next Time</title>
      <dc:creator>Aditya satrio nugroho</dc:creator>
      <pubDate>Sat, 04 Oct 2025 05:35:12 +0000</pubDate>
      <link>https://dev.to/adityasatrio/lessons-from-a-mysql-migration-what-we-learned-and-how-to-do-it-better-next-time-4j1h</link>
      <guid>https://dev.to/adityasatrio/lessons-from-a-mysql-migration-what-we-learned-and-how-to-do-it-better-next-time-4j1h</guid>
      <description>&lt;p&gt;We migrated our MySQL database. It “worked,” until it didn’t: the new DB’s size didn’t match the old one. Same schema, same rows—different footprint. That tiny mismatch pushed us to build a real migration playbook: understand what’s happening, prove data equality, and leave a paper trail that stakeholders actually trust.&lt;/p&gt;

&lt;p&gt;Here’s the journey—told as we lived it—with commands, expected outputs, and the why behind each step.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1 — Capture More Than Just Rows
&lt;/h2&gt;

&lt;p&gt;Before moving data, we grabbed the database’s shape and logic. Otherwise you carry the data but lose the rules that make it behave.&lt;/p&gt;

&lt;p&gt;Command&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mysqldump -h OLD_HOST -u root -p --no-data --skip-triggers OLD_DB &amp;gt; schema_only.sql
mysqldump -h OLD_HOST -u root -p --no-data --no-create-info --routines --triggers --events OLD_DB &amp;gt; routines.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Expected Output&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;schema_only.sql&lt;/code&gt; contains only &lt;code&gt;CREATE TABLE ...&lt;/code&gt; statements (no &lt;code&gt;INSERT INTO&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;routines.sql&lt;/code&gt; contains &lt;code&gt;CREATE PROCEDURE&lt;/code&gt;, &lt;code&gt;CREATE FUNCTION&lt;/code&gt;, &lt;code&gt;CREATE TRIGGER&lt;/code&gt;, and &lt;code&gt;CREATE EVENT&lt;/code&gt; statements.&lt;/li&gt;
&lt;li&gt;You’ll likely see &lt;code&gt;DEFINER=&lt;/code&gt; clauses—note them.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why this matters&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Schema/routine parity prevents silent logic drift (e.g., missing trigger = missing audit row).&lt;/li&gt;
&lt;li&gt;Pro tip: keep these files in source control for diffs across migrations.&lt;/li&gt;
&lt;/ul&gt;
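&lt;p&gt;To "note" the &lt;code&gt;DEFINER=&lt;/code&gt; clauses systematically, a small script can list every account they reference, so you can confirm those accounts exist on the new server before restoring. This is a sketch: the function name &lt;code&gt;count_definers&lt;/code&gt; and the sample dump text are illustrative assumptions, and the pattern only covers the backtick-quoted form.&lt;/p&gt;

```python
import re
from collections import Counter

# Matches clauses written as DEFINER=`user`@`host` (backtick-quoted form).
DEFINER_PATTERN = re.compile(r"DEFINER\s*=\s*`([^`]*)`@`([^`]*)`")


def count_definers(dump_text: str) -> Counter:
    """Count each user@host account that appears in a DEFINER clause."""
    return Counter(
        f"{user}@{host}" for user, host in DEFINER_PATTERN.findall(dump_text)
    )


if __name__ == "__main__":
    # Hypothetical excerpt standing in for the contents of routines.sql.
    sample = (
        "CREATE DEFINER=`root`@`localhost` PROCEDURE `close_day`() BEGIN END;\n"
        "CREATE DEFINER=`app`@`%` TRIGGER `orders_audit` AFTER INSERT ON `orders` "
        "FOR EACH ROW BEGIN END;\n"
    )
    for account, uses in sorted(count_definers(sample).items()):
        print(account, uses)
```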




&lt;h2&gt;
  
  
  Step 2 — Dump &amp;amp; Restore (Safely and Predictably)
&lt;/h2&gt;

&lt;p&gt;We wanted a consistent snapshot without table-level locks that freeze the app.&lt;/p&gt;

&lt;p&gt;Command&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mysqldump -h OLD_HOST -u root -p \
  --single-transaction --quick \
  --routines --triggers --events \
  --default-character-set=utf8mb4 \
  OLD_DB &amp;gt; old_db_dump.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mysql -h NEW_HOST -u root -p -e "CREATE DATABASE IF NOT EXISTS NEW_DB DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;"
mysql -h NEW_HOST -u root -p NEW_DB &amp;lt; old_db_dump.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Expected Output&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;old_db_dump.sql&lt;/code&gt; is large and full of &lt;code&gt;INSERT INTO ...&lt;/code&gt; lines.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SHOW TABLES FROM NEW_DB;&lt;/code&gt; lists the same tables as OLD_DB.&lt;/li&gt;
&lt;li&gt;Critical tables pass a spot-check: &lt;code&gt;SELECT COUNT(*) FROM big_table;&lt;/code&gt; (numbers line up or are close; exact checks later).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why this matters&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;--single-transaction&lt;/code&gt; gives a consistent InnoDB snapshot without blocking writers.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--quick&lt;/code&gt; streams rows one at a time instead of buffering each table in memory, which keeps memory flat.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now we had data in place—but sizes looked off. Time to measure, then normalize.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 3 — Snapshot the Raw Footprint
&lt;/h2&gt;

&lt;p&gt;We measured where space was going before any tuning.&lt;/p&gt;

&lt;p&gt;Command&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT table_name,
       ENGINE,
       TABLE_ROWS, -- estimate for InnoDB
       ROUND((data_length + index_length)/1024/1024, 2) AS size_mb
FROM information_schema.tables
WHERE table_schema = 'NEW_DB'
ORDER BY size_mb DESC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Expected Output (sample)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+------------+--------+-----------+---------+
| table_name | ENGINE | TABLE_ROWS| size_mb |
+------------+--------+-----------+---------+
| orders     | InnoDB | 1000000   | 350.12  |
| users      | InnoDB |  500000   |  80.45  |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why this matters&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In InnoDB, &lt;code&gt;TABLE_ROWS&lt;/code&gt; is an estimate; sizes reflect fragmentation and stale stats after bulk load.&lt;/li&gt;
&lt;li&gt;Don’t conclude inequality yet—stats come next.&lt;/li&gt;
&lt;li&gt;The “aha” moment came when we ran &lt;code&gt;ANALYZE&lt;/code&gt; and &lt;code&gt;OPTIMIZE&lt;/code&gt;. Here’s the deeper why.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 4 — The Deep Dive: ANALYZE and OPTIMIZE (What They Do, and Why You Should Care)
&lt;/h2&gt;

&lt;p&gt;After bulk inserts, InnoDB pages are fragmented and index statistics are stale. The optimizer can’t “see” reality, so it guesses—sometimes badly. Two tools fix that:&lt;/p&gt;

&lt;h3&gt;
  
  
  ANALYZE TABLE: refresh index statistics (cardinality)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;What it does: Re-samples index distributions and updates cardinality (estimated unique values per index).&lt;/li&gt;
&lt;li&gt;Why it matters: The optimizer chooses join order and index paths based largely on cardinality. Bad cardinality → bad plans.&lt;/li&gt;
&lt;li&gt;Where it lives: With &lt;code&gt;innodb_stats_persistent=ON&lt;/code&gt; (the default since MySQL 5.6, including MySQL 8), stats are stored persistently and survive restarts.&lt;/li&gt;
&lt;li&gt;Histograms: MySQL 8 supports column histograms to model non-indexed predicates:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ANALYZE TABLE my_table UPDATE HISTOGRAM ON col1, col2 WITH 128 BUCKETS;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT * FROM information_schema.COLUMN_STATISTICS
WHERE SCHEMA_NAME='NEW_DB' AND TABLE_NAME='my_table';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



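&lt;p&gt;(If the result is empty, no histogram exists for that table yet; run the &lt;code&gt;UPDATE HISTOGRAM&lt;/code&gt; statement above first. The &lt;code&gt;information_schema_stats_expiry&lt;/code&gt; setting does not affect this view: it controls how long table statistics in &lt;code&gt;information_schema.TABLES&lt;/code&gt; and &lt;code&gt;STATISTICS&lt;/code&gt; are cached.)&lt;/p&gt;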

&lt;h3&gt;
  
  
  OPTIMIZE TABLE: rebuild and defragment
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;What it does: For InnoDB, effectively rebuilds the table and its indexes (similar to &lt;code&gt;ALTER TABLE ... ENGINE=InnoDB&lt;/code&gt;), compacting pages and reclaiming space.&lt;/li&gt;
&lt;li&gt;Why it matters: You get a cleaner on-disk layout, tighter B-trees, and often a smaller file size that now resembles your source DB more closely.&lt;/li&gt;
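&lt;li&gt;Locking/perf: On large tables, it’s heavy. For regular InnoDB tables, &lt;code&gt;OPTIMIZE TABLE&lt;/code&gt; runs as online DDL, so reads and writes mostly continue during the rebuild, but it still copies the whole table and needs extra disk space and I/O. Plan it off-hours for big tables.&lt;/li&gt;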
&lt;/ul&gt;

&lt;p&gt;Commands&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- For a single table
ANALYZE TABLE my_table;
OPTIMIZE TABLE my_table;

-- Batch all tables
SELECT CONCAT('ANALYZE TABLE `', table_name, '`; OPTIMIZE TABLE `', table_name, '`;')
FROM information_schema.tables
WHERE table_schema='NEW_DB';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



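&lt;p&gt;(Run the generator with &lt;code&gt;mysql -N -B&lt;/code&gt; so the output has no column header or table borders, then pipe the generated statements back into &lt;code&gt;mysql&lt;/code&gt; to execute them in one go.)&lt;/p&gt;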
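&lt;p&gt;If you would rather review the statements in a script before running them, the same batch can be built outside the database. This is a sketch under assumptions: the helper names are hypothetical, and the table list is hard-coded here where a real run would read it from &lt;code&gt;information_schema.tables&lt;/code&gt;.&lt;/p&gt;

```python
def quote_identifier(name: str) -> str:
    """Backtick-quote a MySQL identifier, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def maintenance_statements(schema: str, tables: list[str]) -> list[str]:
    """Build ANALYZE and OPTIMIZE statements for each table, in order."""
    statements = []
    for table in tables:
        target = f"{quote_identifier(schema)}.{quote_identifier(table)}"
        statements.append(f"ANALYZE TABLE {target};")
        statements.append(f"OPTIMIZE TABLE {target};")
    return statements


if __name__ == "__main__":
    # Hypothetical table names used only to show the generated SQL.
    print("\n".join(maintenance_statements("NEW_DB", ["orders", "users"])))
```

&lt;p&gt;Quoting every identifier keeps the batch safe for table names that contain spaces or reserved words.&lt;/p&gt;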

&lt;p&gt;&lt;strong&gt;Expected Output&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
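&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+------------------+----------+----------+-------------------------------------------------------------------+
| Table            | Op       | Msg_type | Msg_text                                                          |
+------------------+----------+----------+-------------------------------------------------------------------+
| NEW_DB.my_table  | analyze  | status   | OK                                                                |
| NEW_DB.my_table  | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| NEW_DB.my_table  | optimize | status   | OK                                                                |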
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Verifying Cardinality Improved&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SHOW INDEX FROM my_table;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Expected Output (excerpt)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+----------+------------+----------+--------------+------------+
| Table    | Non_unique | Key_name | Column_name  | Cardinality|
+----------+------------+----------+--------------+------------+
| my_table |          0 | PRIMARY  | id           |   999800   |
| my_table |          1 | idx_cust | customer_id  |    54012   |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;After &lt;code&gt;ANALYZE&lt;/code&gt;, Cardinality should look realistic (not suspiciously tiny like 1 or 2 on huge tables).&lt;/li&gt;
&lt;li&gt;If predicates rely on non-indexed columns, consider histograms (above). They don’t change cardinality but drastically improve selectivity estimates for those columns.&lt;/li&gt;
&lt;/ul&gt;
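&lt;p&gt;The "suspiciously tiny" check can be automated once the &lt;code&gt;SHOW INDEX&lt;/code&gt; rows are loaded into a script. The sketch below assumes each row is a dict with &lt;code&gt;Key_name&lt;/code&gt; and an integer &lt;code&gt;Cardinality&lt;/code&gt;; the thresholds (&lt;code&gt;min_rows&lt;/code&gt;, &lt;code&gt;tiny&lt;/code&gt;) are illustrative defaults, not values from our migration.&lt;/p&gt;

```python
def suspicious_indexes(index_rows: list[dict],
                       table_rows: int,
                       min_rows: int = 100_000,
                       tiny: int = 2) -> list[str]:
    """Return index names whose cardinality is tiny on a table big enough to matter."""
    if min_rows > table_rows:
        return []  # small table: low cardinality is expected
    return [row["Key_name"] for row in index_rows if tiny >= row["Cardinality"]]


if __name__ == "__main__":
    # Values from the excerpt above, plus one hypothetical stale index.
    rows = [
        {"Key_name": "PRIMARY", "Cardinality": 999800},
        {"Key_name": "idx_cust", "Cardinality": 54012},
        {"Key_name": "idx_stale", "Cardinality": 1},
    ]
    print(suspicious_indexes(rows, table_rows=1_000_000))
```

&lt;p&gt;Treat the result as a review list, not a failure: an index on a genuinely low-cardinality column (a status flag, for example) will show up here legitimately.&lt;/p&gt;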

&lt;p&gt;With stats refreshed and fragmentation reduced, our sizes converged. But “looks good” isn’t enough—we wanted proofs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 5 — Prove Equality with Percona Toolkit (and More You Can Do)
&lt;/h2&gt;

&lt;p&gt;We rely on Percona Toolkit because it’s built for production-grade checks.&lt;/p&gt;

&lt;h3&gt;
  
  
  pt-table-checksum: detect row-level differences
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;How it works: Splits each table into chunks (ranges by PK), computes checksums (CRC32) per chunk on the source, then compares on the target (best with replication; otherwise compare results tables).&lt;/li&gt;
&lt;li&gt;Why it’s great: It scales. Every row is still read, but in small index-range chunks, so you get a precise answer without one long, blocking scan per table.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Command (typical replication setup)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pt-table-checksum \
  --host=PRIMARY_HOST --user=USER --password=PASS \
  --databases NEW_DB \
  --replicate=percona.checksums
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected Output&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
... 0      0     1000000     0     20     0    5.3 NEW_DB.orders
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;DIFFS = 0 across all rows/tables = ✅&lt;/li&gt;
&lt;/ul&gt;
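&lt;p&gt;With many tables, reading the report by eye gets error-prone. The sketch below parses captured output in the column layout shown above and lists tables that need attention. It assumes the header is the first line and every row has one value per column; flagging &lt;code&gt;ERRORS&lt;/code&gt; and &lt;code&gt;SKIPPED&lt;/code&gt; alongside &lt;code&gt;DIFFS&lt;/code&gt; is our own addition, on the reasoning that a skipped chunk was never compared. The second data row is made up for illustration.&lt;/p&gt;

```python
def tables_with_problems(report: str) -> list[str]:
    """Return tables whose ERRORS, DIFFS, or SKIPPED column is non-zero."""
    lines = [line.split() for line in report.strip().splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    flagged = []
    for row in rows:
        record = dict(zip(header, row))
        if any(int(record[column]) != 0 for column in ("ERRORS", "DIFFS", "SKIPPED")):
            flagged.append(record["TABLE"])
    return flagged


if __name__ == "__main__":
    # Hypothetical captured output; timestamps and the users row are invented.
    report = """
    TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
    10-04T05:35:12 0 0 1000000 0 20 0 5.3 NEW_DB.orders
    10-04T05:35:18 0 2 500000 7 10 0 2.1 NEW_DB.users
    """
    print(tables_with_problems(report))
```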

&lt;p&gt;No replication? Two independent servers?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Option A: Run pt-table-checksum on both and diff the percona.checksums tables.&lt;/li&gt;
&lt;li&gt;Option B: Use pt-table-sync directly to compare and optionally fix.&lt;/li&gt;
&lt;/ul&gt;
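&lt;p&gt;Option A boils down to joining the two result sets on table and chunk, then comparing checksum and row count. A minimal sketch, assuming both &lt;code&gt;percona.checksums&lt;/code&gt; tables were exported as lists of dicts with the &lt;code&gt;tbl&lt;/code&gt;, &lt;code&gt;chunk&lt;/code&gt;, &lt;code&gt;this_crc&lt;/code&gt;, and &lt;code&gt;this_cnt&lt;/code&gt; columns, and that both runs produced identical chunk boundaries (otherwise the comparison is meaningless). The function name and sample values are hypothetical.&lt;/p&gt;

```python
def diff_checksums(old_rows: list[dict], new_rows: list[dict]) -> list[tuple]:
    """Compare per-chunk checksums exported from two independent servers."""

    def index(rows: list[dict]) -> dict:
        # Key on (table, chunk); the schema name is ignored because it differs by design.
        return {(r["tbl"], r["chunk"]): (r["this_crc"], r["this_cnt"]) for r in rows}

    old, new = index(old_rows), index(new_rows)
    mismatches = []
    for key in sorted(old.keys() | new.keys()):
        if old.get(key) != new.get(key):
            mismatches.append((key, old.get(key), new.get(key)))
    return mismatches


if __name__ == "__main__":
    # Invented checksum values for illustration only.
    old = [{"tbl": "orders", "chunk": 1, "this_crc": "a1b2", "this_cnt": 1000},
           {"tbl": "orders", "chunk": 2, "this_crc": "c3d4", "this_cnt": 1000}]
    new = [{"tbl": "orders", "chunk": 1, "this_crc": "a1b2", "this_cnt": 1000},
           {"tbl": "orders", "chunk": 2, "this_crc": "ffff", "this_cnt": 1000}]
    print(diff_checksums(old, new))
```

&lt;p&gt;A chunk that exists on only one side shows up with &lt;code&gt;None&lt;/code&gt; on the other, which usually means the two runs chunked the table differently.&lt;/p&gt;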

&lt;h3&gt;
  
  
  pt-table-sync: generate the minimal fix
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pt-table-sync \
  --print \
  h=OLD_HOST,u=USER,p=PASS,D=OLD_DB \
  h=NEW_HOST,u=USER,p=PASS,D=NEW_DB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected Output&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SQL INSERT/UPDATE/DELETE statements (and execution if &lt;code&gt;--execute&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;--print&lt;/code&gt; first, review, then add &lt;code&gt;--execute&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  More Percona tools (worth having on every migration)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;pt-query-digest&lt;/code&gt;: Analyze slow logs/traces to find worst queries (post-migration).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;pt-duplicate-key-checker&lt;/code&gt;: Identify redundant/overlapping indexes before/after migration.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;pt-index-usage&lt;/code&gt;: See which indexes aren’t used (on sampled workload).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;pt-online-schema-change&lt;/code&gt;: Safer online DDL for big tables without long lock times.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Data proven equal, we closed the loop on objects and behavior.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 6 — Don’t Forget the “Invisible” Pieces
&lt;/h2&gt;

&lt;p&gt;Missing triggers or a changed collation can pass unnoticed—until a bug report lands.&lt;/p&gt;

&lt;p&gt;Commands&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SHOW TRIGGERS FROM NEW_DB;
SHOW PROCEDURE STATUS WHERE Db='NEW_DB';
SHOW FUNCTION STATUS WHERE Db='NEW_DB';
SELECT table_name, constraint_name, referenced_table_name
FROM information_schema.key_column_usage
WHERE table_schema='NEW_DB' AND referenced_table_name IS NOT NULL;
SELECT default_character_set_name, default_collation_name
FROM information_schema.schemata
WHERE schema_name='NEW_DB';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected Output&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Counts and names match OLD_DB for triggers/procs/functions.&lt;/li&gt;
&lt;li&gt;Foreign keys listed as expected.&lt;/li&gt;
&lt;li&gt;Charset/collation align with app expectations (e.g., utf8mb4_0900_ai_ci).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why this matters&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A single missing trigger can silently break invariants (e.g., stock, audit, denormalized totals).&lt;/li&gt;
&lt;/ul&gt;
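&lt;p&gt;Comparing object lists by eye is where a missing trigger slips through. The sketch below compares name inventories from both databases and reports anything missing or unexpected. It assumes you have already collected the names per object kind (for example from the queries above); the function name and the sample object names are hypothetical.&lt;/p&gt;

```python
def parity_report(old_objects: dict[str, set[str]],
                  new_objects: dict[str, set[str]]) -> dict[str, dict[str, list[str]]]:
    """For each object kind, list names missing from or unexpected in the new database."""
    report = {}
    for kind in sorted(old_objects.keys() | new_objects.keys()):
        old_names = old_objects.get(kind, set())
        new_names = new_objects.get(kind, set())
        missing = sorted(old_names - new_names)
        extra = sorted(new_names - old_names)
        if missing or extra:
            report[kind] = {"missing": missing, "extra": extra}
    return report


if __name__ == "__main__":
    # Invented inventories: the new database lost one trigger.
    old = {"triggers": {"orders_audit", "stock_decrement"}, "procedures": {"close_day"}}
    new = {"triggers": {"orders_audit"}, "procedures": {"close_day"}}
    print(parity_report(old, new))
```

&lt;p&gt;An empty report means full parity by name; it does not prove the object bodies are identical, so diffing the dumped definitions remains a useful second check.&lt;/p&gt;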

&lt;p&gt;With the database validated, we packaged results for non-DBA stakeholders.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 7 — Report in Plain Language
&lt;/h2&gt;

&lt;p&gt;Engineers love logs; stakeholders love summaries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stakeholder table (example)&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Old DB&lt;/th&gt;
&lt;th&gt;New DB&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Row Count (orders)&lt;/td&gt;
&lt;td&gt;1,000,000&lt;/td&gt;
&lt;td&gt;1,000,000&lt;/td&gt;
&lt;td&gt;✅ Equal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Index Count (orders)&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;✅ Equal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Size (MB)&lt;/td&gt;
&lt;td&gt;350&lt;/td&gt;
&lt;td&gt;348&lt;/td&gt;
&lt;td&gt;✅ 0.6% diff (≤5% = OK)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Checksums (Percona)&lt;/td&gt;
&lt;td&gt;Match&lt;/td&gt;
&lt;td&gt;Match&lt;/td&gt;
&lt;td&gt;✅ All chunks DIFFS=0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Triggers/Procedures&lt;/td&gt;
&lt;td&gt;2 triggers / 3 procedures&lt;/td&gt;
&lt;td&gt;2 triggers / 3 procedures&lt;/td&gt;
&lt;td&gt;✅ Parity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Collation/Charset&lt;/td&gt;
&lt;td&gt;utf8mb4/…&lt;/td&gt;
&lt;td&gt;utf8mb4/…&lt;/td&gt;
&lt;td&gt;✅ Match&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Expected Output&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clear pass/fail with small, explained variances.&lt;/li&gt;
&lt;li&gt;A link to the raw validation logs for auditors (appendix).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Migration doesn’t end at restore; we watch for regressions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 8 — Watch the System Breathe (Post-Migration)
&lt;/h2&gt;

&lt;p&gt;We enabled the slow log to catch new bad plans caused by fresh stats.&lt;/p&gt;

&lt;p&gt;Commands&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
SHOW VARIABLES LIKE 'slow_query_log%';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected Output&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| slow_query_log  | ON    |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why this matters&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ANALYZE&lt;/code&gt; can change plans; 1–2 weeks of vigilance pays off.&lt;/li&gt;
&lt;li&gt;Pair with &lt;code&gt;pt-query-digest&lt;/code&gt; to summarize hotspots quickly.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What We’ll Keep Doing Next Time
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Run &lt;code&gt;ANALYZE&lt;/code&gt; (and histograms when needed) to fix cardinality → good plans.&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;OPTIMIZE&lt;/code&gt; on big movers to defragment and align sizes.&lt;/li&gt;
&lt;li&gt;Use Percona for truth (&lt;code&gt;pt-table-checksum&lt;/code&gt; to detect, &lt;code&gt;pt-table-sync&lt;/code&gt; to fix).&lt;/li&gt;
&lt;li&gt;Validate the “invisibles” (triggers, procs, FKs, collation).&lt;/li&gt;
&lt;li&gt;Report with thresholds (e.g., size diff ≤5% = acceptable).&lt;/li&gt;
&lt;li&gt;Monitor for 1–2 weeks to catch plan regressions early.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;Appendix — Quick “Pass/Fail” Thresholds We Use&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Row count (critical tables): 100% match → PASS&lt;/li&gt;
&lt;li&gt;Percona checksum: DIFFS=0 for all chunks → PASS&lt;/li&gt;
&lt;li&gt;Size variance (post-OPTIMIZE): ≤5% → PASS&lt;/li&gt;
&lt;li&gt;Cardinality sanity (via SHOW INDEX): no suspiciously tiny values on high-cardinality columns → PASS&lt;/li&gt;
&lt;li&gt;Objects (triggers/procs/functions/FKs): full parity → PASS&lt;/li&gt;
&lt;/ul&gt;
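
&lt;p&gt;The thresholds above can be sketched as a small pass/fail check. This is an illustration under assumptions, not the script we ran: the &lt;code&gt;ValidationResult&lt;/code&gt; fields and the &lt;code&gt;evaluate&lt;/code&gt; function are hypothetical names, the inputs are expected to come from your own row counts, &lt;code&gt;pt-table-checksum&lt;/code&gt; output, and size queries, and the cardinality sanity check is left out because it needs human judgment.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class ValidationResult:
    """Hypothetical container for one migration validation run."""
    source_rows: int
    target_rows: int
    checksum_diffs: int      # total DIFFS reported across all chunks
    source_size_mb: float    # measured after OPTIMIZE
    target_size_mb: float
    objects_match: bool      # triggers/procs/functions/FKs at full parity


def size_variance_pct(source_mb, target_mb):
    """Absolute size difference as a percentage of the source size."""
    return abs(target_mb - source_mb) / source_mb * 100


def evaluate(result, max_size_variance_pct=5.0):
    """Map each check name to True (PASS) or False (FAIL)."""
    variance = size_variance_pct(result.source_size_mb, result.target_size_mb)
    return {
        "row_count": result.source_rows == result.target_rows,
        "checksum": result.checksum_diffs == 0,
        "size_variance": max_size_variance_pct >= variance,
        "objects": result.objects_match,
    }
```

&lt;p&gt;Calling &lt;code&gt;all(evaluate(result).values())&lt;/code&gt; gives a single overall verdict, while the per-check dictionary keeps the report explainable for auditors.&lt;/p&gt;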

</description>
      <category>database</category>
      <category>tutorial</category>
      <category>mysql</category>
      <category>devops</category>
    </item>
    <item>
      <title>SMEs vs Owners in Software Engineering: The Coach and the Captain</title>
      <dc:creator>Aditya satrio nugroho</dc:creator>
      <pubDate>Wed, 01 Oct 2025 13:16:36 +0000</pubDate>
      <link>https://dev.to/adityasatrio/smes-vs-owners-in-software-engineering-the-coach-and-the-captain-5j</link>
      <guid>https://dev.to/adityasatrio/smes-vs-owners-in-software-engineering-the-coach-and-the-captain-5j</guid>
      <description>&lt;h2&gt;
  
  
  When the Lights Go Out
&lt;/h2&gt;

&lt;p&gt;It’s 11 PM on payday. Your payments service crashes. Notifications blow up, managers panic, and someone asks the inevitable question: &lt;em&gt;“Who’s fixing this?”&lt;/em&gt;  &lt;/p&gt;

&lt;p&gt;The service owner jumps in to roll back and stabilize. Meanwhile, a payments SME joins the call, explaining why retry logic wasn’t built to handle this traffic spike and suggesting how to prevent it next time.  &lt;/p&gt;

&lt;p&gt;Two people, two very different roles, both essential in that moment. One takes responsibility for restoring the system. The other shapes the long-term fix. This is the difference between an &lt;strong&gt;Owner Expert&lt;/strong&gt; and a &lt;strong&gt;Subject Matter Expert (SME)&lt;/strong&gt;.  &lt;/p&gt;




&lt;h2&gt;
  
  
  The Coach: Subject Matter Expert (SME)
&lt;/h2&gt;

&lt;p&gt;Every team has someone who knows a specific domain inside out. They might not be the one pushing commits at midnight, but when you’re designing a new feature or trying to avoid a costly mistake, they’re the first call.  &lt;/p&gt;

&lt;p&gt;That’s the SME — the coach of the engineering world. They’re the ones who:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Define standards and best practices.
&lt;/li&gt;
&lt;li&gt;Review designs and guide decisions.
&lt;/li&gt;
&lt;li&gt;Train teams so that knowledge spreads, not just stays locked in one head.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;SMEs aren’t measured by the uptime of a single service. Their impact comes from enabling &lt;em&gt;multiple teams&lt;/em&gt; to work smarter and more consistently.  &lt;/p&gt;

&lt;p&gt;But expertise alone isn’t enough. Every team also needs someone who takes the field, owns the system, and carries the accountability.  &lt;/p&gt;




&lt;h2&gt;
  
  
  The Captain: Owner Expert
&lt;/h2&gt;

&lt;p&gt;Where the SME guides, the Owner delivers. This is the engineer who carries the weight of a system every day. If it breaks, they fix it. If it needs scaling, they plan it.  &lt;/p&gt;

&lt;p&gt;Think of the Owner Expert as the captain of the ship. They’re the ones who:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep uptime and reliability on track.
&lt;/li&gt;
&lt;li&gt;Handle incidents, rollbacks, and bug fixes.
&lt;/li&gt;
&lt;li&gt;Own the costs, performance, and stability of their system.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the SME is about &lt;strong&gt;how things should be done&lt;/strong&gt;, the Owner is about &lt;strong&gt;making sure it actually gets done&lt;/strong&gt;. And the difference becomes obvious when you compare them side by side.  &lt;/p&gt;




&lt;h2&gt;
  
  
  Coaches and Captains in Action
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;SME (Coach)&lt;/th&gt;
&lt;th&gt;Owner Expert (Captain)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Security SME (defines authentication guidelines)&lt;/td&gt;
&lt;td&gt;Identity Service Owner (keeps login alive)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database SME (advises schema, migration)&lt;/td&gt;
&lt;td&gt;Product DB Owner (ensures data is consistent and available)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Frontend SME (drives design system adoption)&lt;/td&gt;
&lt;td&gt;Web App Owner (meets Lighthouse score and bug SLA)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The SME spreads knowledge across teams. The Owner goes deep on a single service. Neither role is optional — and nowhere is this more obvious than in payments.  &lt;/p&gt;




&lt;h2&gt;
  
  
  The Payments Story
&lt;/h2&gt;

&lt;p&gt;Picture the same e-commerce platform. The payments SME is the one who wrote the playbook: retry strategies, PCI-DSS compliance, fraud detection libraries. They’ve made sure every squad knows the rules of the game.  &lt;/p&gt;

&lt;p&gt;But when the API slows down at 11 PM on payday, it’s the payments service owner who’s accountable. They’re the one watching latency, applying fixes, and making sure money keeps flowing.  &lt;/p&gt;

&lt;p&gt;The SME sets the strategy. The Owner executes under pressure. To formalize this relationship, many companies use RACI.  &lt;/p&gt;




&lt;h2&gt;
  
  
  Making It Clear with RACI
&lt;/h2&gt;

&lt;p&gt;RACI — Responsible, Accountable, Consulted, Informed — helps untangle who does what.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SMEs fit best as &lt;strong&gt;Consulted&lt;/strong&gt;: they guide, review, and teach.
&lt;/li&gt;
&lt;li&gt;Owners are both &lt;strong&gt;Responsible and Accountable&lt;/strong&gt;: they’re on the hook for results.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s what that looks like in practice:  &lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Responsible&lt;/th&gt;
&lt;th&gt;Accountable&lt;/th&gt;
&lt;th&gt;Consulted&lt;/th&gt;
&lt;th&gt;Informed&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Checkout outage&lt;/td&gt;
&lt;td&gt;Checkout service owner&lt;/td&gt;
&lt;td&gt;Checkout service owner&lt;/td&gt;
&lt;td&gt;Infra SME (incident review)&lt;/td&gt;
&lt;td&gt;PM, leadership&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database migration&lt;/td&gt;
&lt;td&gt;Product DB owner&lt;/td&gt;
&lt;td&gt;Product DB owner&lt;/td&gt;
&lt;td&gt;Database SME&lt;/td&gt;
&lt;td&gt;Affected squads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Design system rollout&lt;/td&gt;
&lt;td&gt;Web app owners&lt;/td&gt;
&lt;td&gt;Web app owners&lt;/td&gt;
&lt;td&gt;Frontend SME&lt;/td&gt;
&lt;td&gt;UX team, PM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CI/CD deployment strategy&lt;/td&gt;
&lt;td&gt;Pipeline owner&lt;/td&gt;
&lt;td&gt;Pipeline owner&lt;/td&gt;
&lt;td&gt;DevOps SME&lt;/td&gt;
&lt;td&gt;All squads&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Another Story: The Migration Gone Wrong
&lt;/h3&gt;

&lt;p&gt;Last year, a squad ran a database migration without consulting the database SME. On paper, the product DB owner was both responsible and accountable. They executed the migration, but a subtle indexing issue caused queries to crawl, impacting three other services.  &lt;/p&gt;

&lt;p&gt;It took hours of firefighting before the SME jumped in, identified the missing partitioning strategy, and guided the fix. The owner restored the system, but the SME made sure the same mistake would never happen again.  &lt;/p&gt;

&lt;p&gt;This is RACI in action: owners get systems back online, SMEs make sure the org learns and doesn’t repeat mistakes.  &lt;/p&gt;

&lt;p&gt;But RACI only clarifies &lt;em&gt;roles&lt;/em&gt;. To really measure success, you need OKRs.  &lt;/p&gt;




&lt;h2&gt;
  
  
  From Roles to Results: OKRs
&lt;/h2&gt;

&lt;p&gt;RACI defines responsibilities. OKRs define outcomes.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An &lt;strong&gt;Owner Expert OKR&lt;/strong&gt; might be: “Reduce failed deployments to fewer than 3 per quarter.”
&lt;/li&gt;
&lt;li&gt;An &lt;strong&gt;SME OKR&lt;/strong&gt; might be: “Ensure 100% of squads adopt canary deployment guidelines by end of quarter.”
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Owners are measured by the health of their system. SMEs are measured by the adoption of their expertise. And when you make them SMART, they become even sharper.  &lt;/p&gt;




&lt;h2&gt;
  
  
  SMART OKRs for Coaches and Captains
&lt;/h2&gt;

&lt;p&gt;SMART (Specific, Measurable, Achievable, Relevant, Time-bound) highlights the contrast.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Owners&lt;/strong&gt;: SMART OKRs are outcome-driven.&lt;br&gt;&lt;br&gt;
Example: &lt;em&gt;“Maintain 99.95% uptime for Checkout API through the end of Q2.”&lt;/em&gt;  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SMEs&lt;/strong&gt;: SMART OKRs are adoption-driven.&lt;br&gt;&lt;br&gt;
Example: &lt;em&gt;“Train 4 squads on retry logic best practices by end of Q2.”&lt;/em&gt;  &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
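
&lt;p&gt;To make the “Measurable” part concrete, an uptime target can be converted into a downtime budget. The sketch below is plain arithmetic; the function name and the 91-day quarter are assumptions for illustration, so adjust the period to your actual reporting window.&lt;/p&gt;

```python
def allowed_downtime_minutes(uptime_target_pct, period_days):
    """Minutes of downtime permitted by an uptime target over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - uptime_target_pct) / 100


# Assumption: a 91-day quarter for the 99.95% Checkout API target.
budget = allowed_downtime_minutes(99.95, 91)
print(round(budget, 2))  # prints 65.52
```

&lt;p&gt;Roughly an hour of downtime per quarter is a number an owner can plan incidents and maintenance windows against.&lt;/p&gt;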

&lt;p&gt;One ensures delivery. The other ensures consistency. Together, they keep the org moving forward. But how does this cascade in a real workflow?  &lt;/p&gt;




&lt;h2&gt;
  
  
  Cascading OKRs in Practice
&lt;/h2&gt;

&lt;p&gt;Imagine the chain from manager to IC.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Engineering Manager (SME role)&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Objective: Improve backend reliability across the org&lt;br&gt;&lt;br&gt;
Key Result: 99.95% uptime across critical services  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tech Lead (Owner Expert for Checkout Service)&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Objective: Improve Checkout API reliability&lt;br&gt;&lt;br&gt;
Key Result: Reduce p95 latency from 300ms to 200ms  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IC (Backend Engineer in Checkout Squad)&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Objective: Contribute to Checkout reliability&lt;br&gt;&lt;br&gt;
Key Result: Refactor retry logic to cut failures by 15%  &lt;/p&gt;

&lt;p&gt;Each level connects. The IC’s change drives the TL’s service reliability, which rolls up to the manager’s org-wide outcome. When done right, cascading OKRs create alignment instead of silos.  &lt;/p&gt;




&lt;h2&gt;
  
  
  Closing the Loop
&lt;/h2&gt;

&lt;p&gt;SMEs and Owners are not competing roles. They’re complementary. The SME ensures everyone knows the right way to play the game. The Owner ensures the game is actually won.  &lt;/p&gt;

&lt;p&gt;Without SMEs, teams scatter into inconsistency. Without Owners, accountability disappears. Together, they bring both breadth and depth.  &lt;/p&gt;

&lt;p&gt;If your org doesn’t know who the SMEs and Owners are, your OKRs will drift and responsibility will blur. Define them early, connect them with RACI, and cascade their OKRs. That’s how you get teams that stay aligned at 2 PM in a planning meeting — and steady at 2 AM during a production fire.&lt;/p&gt;

</description>
      <category>leadership</category>
      <category>softwareengineering</category>
      <category>management</category>
      <category>career</category>
    </item>
    <item>
      <title>Sales Talks Impact, Engineers Talk Process. Bridging the Language Gap</title>
      <dc:creator>Aditya satrio nugroho</dc:creator>
      <pubDate>Thu, 28 Aug 2025 01:42:59 +0000</pubDate>
      <link>https://dev.to/adityasatrio/sales-talks-impact-engineers-talk-process-bridging-the-language-gap-4on</link>
      <guid>https://dev.to/adityasatrio/sales-talks-impact-engineers-talk-process-bridging-the-language-gap-4on</guid>
      <description>&lt;h2&gt;
  
  
  The Meeting Room Divide
&lt;/h2&gt;

&lt;p&gt;Picture this: you’re in a leadership meeting.  &lt;/p&gt;

&lt;p&gt;The Head of Sales is energized, pacing at the front of the room:  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“There’s a huge opportunity in this new vertical. If we move fast, we could add 20% revenue this quarter!”  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;On the other side of the table, the Head of Engineering calmly counters:  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“We need to improve our deployment process. Change failure rates are too high, and our lead time is slowing down.”  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Both are speaking passionately, both are right — yet they sound like they’re on different planets.  &lt;/p&gt;

&lt;p&gt;Here’s the truth: &lt;strong&gt;business and sales-oriented people talk about opportunities for sales impact, while software engineers talk about processes for quality impact&lt;/strong&gt;.  &lt;/p&gt;

&lt;p&gt;It’s not that one is short-term and the other long-term. It’s simply two different lenses on the same mission: building a business that grows, scales, and lasts.  &lt;/p&gt;




&lt;h2&gt;
  
  
  Two Languages, One Goal
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sales / Business Lens&lt;/strong&gt; → “Opportunities” → revenue, market share, customer acquisition.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Engineering Lens&lt;/strong&gt; → “Processes” → code quality, deployment stability, defect rates, developer productivity.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both are obsessed with impact — just measured differently. Sales sees &lt;em&gt;impact on top-line revenue&lt;/em&gt;. Engineering sees &lt;em&gt;impact on system reliability and delivery velocity&lt;/em&gt;.  &lt;/p&gt;

&lt;p&gt;If these two perspectives remain disconnected, companies end up with mismatched expectations: Sales closes big deals the system can’t support, or Engineering optimizes processes with no clear link to business growth.  &lt;/p&gt;

&lt;p&gt;The bridge lies in &lt;strong&gt;translation&lt;/strong&gt;. And here’s where research and experience back this up.  &lt;/p&gt;




&lt;h2&gt;
  
  
  The Engineering Side: Process = Impact
&lt;/h2&gt;

&lt;p&gt;Software engineering has long been treated as an internal cost center. But modern research shows otherwise.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In &lt;em&gt;Accelerate&lt;/em&gt; (Forsgren, Humble, Kim), the authors found that &lt;strong&gt;elite software teams deploy 46x more frequently and recover from incidents 96x faster than low performers&lt;/strong&gt;. These process improvements directly correlate with business performance: profitability, market share, and customer satisfaction.
&lt;/li&gt;
&lt;li&gt;The DORA metrics (Deployment Frequency, Lead Time, Mean Time to Restore, Change Failure Rate) are now industry standards precisely because they prove that process quality isn’t “nice-to-have” — it drives competitiveness.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;The Phoenix Project&lt;/em&gt; (Gene Kim et al.) illustrates this in story form: organizations that ignore engineering bottlenecks see their business grind to a halt, no matter how strong their sales pipeline is.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words: &lt;strong&gt;better processes lead to higher-quality software, which leads to faster time-to-market, fewer outages, and ultimately happier customers who stay and spend more&lt;/strong&gt;.  &lt;/p&gt;




&lt;h2&gt;
  
  
  The Sales Side: Opportunity = Impact
&lt;/h2&gt;

&lt;p&gt;Sales, on the other hand, has always been outcome-obsessed — but the best sales thinking also emphasizes &lt;em&gt;process discipline&lt;/em&gt;.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;The Challenger Sale&lt;/em&gt; (Dixon &amp;amp; Adamson) showed that top-performing sales reps succeed not by chasing every lead, but by following a repeatable approach: teach, tailor, and take control.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;SPIN Selling&lt;/em&gt; (Rackham) provides a structured framework: Situation → Problem → Implication → Need-payoff. This isn’t freewheeling persuasion; it’s process that scales.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Crossing the Chasm&lt;/em&gt; (Geoffrey Moore) demonstrates that capturing new markets requires systematic go-to-market strategies, not opportunistic wins.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words: &lt;strong&gt;opportunities convert into real impact only if sales organizations follow repeatable, quality-driven processes&lt;/strong&gt;.  &lt;/p&gt;




&lt;h2&gt;
  
  
  Side-by-Side Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Sales / Business Lens&lt;/th&gt;
&lt;th&gt;Engineering Lens&lt;/th&gt;
&lt;th&gt;Common Ground&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Focus&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Opportunities → Revenue Impact&lt;/td&gt;
&lt;td&gt;Processes → Quality Impact&lt;/td&gt;
&lt;td&gt;Both seek predictable growth&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Metrics&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pipeline health, win rate, ARR uplift&lt;/td&gt;
&lt;td&gt;Deployment frequency, MTTR, defect ratio&lt;/td&gt;
&lt;td&gt;Predictability and trust&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Risk if ignored&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Missed deals, poor market capture&lt;/td&gt;
&lt;td&gt;System failure, poor velocity, high churn&lt;/td&gt;
&lt;td&gt;Lost credibility and revenue&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Key References&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Challenger Sale, SPIN Selling, Crossing the Chasm&lt;/td&gt;
&lt;td&gt;Accelerate, Phoenix Project, DORA metrics&lt;/td&gt;
&lt;td&gt;Discipline creates impact&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Notice the symmetry: both sides care about impact, but &lt;strong&gt;impact without process is fragile, and process without impact is meaningless&lt;/strong&gt;.  &lt;/p&gt;




&lt;h2&gt;
  
  
  A Real-World Use Case
&lt;/h2&gt;

&lt;p&gt;Let’s make this real.  &lt;/p&gt;

&lt;p&gt;A SaaS startup landed a major enterprise client — a global bank. The sales team celebrated: millions in potential ARR, new market credibility, and a case study to unlock future deals.  &lt;/p&gt;

&lt;p&gt;But Engineering had concerns. Their deployment pipeline had a 20% change failure rate. Incidents occurred weekly. Monitoring was minimal. The bank expected a 99.9% SLA.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sales was talking &lt;strong&gt;opportunity → impact&lt;/strong&gt;: “This deal could double our revenue.”
&lt;/li&gt;
&lt;li&gt;Engineering was talking &lt;strong&gt;process → quality&lt;/strong&gt;: “Without fixing deployment, we risk outages that break our SLA.”
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Initially, leadership saw this as friction. But once translated, it became synergy. Engineering tied their improvements directly to business outcomes:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Before:&lt;/strong&gt; 12 incidents per month, 4-hour average recovery, risk of SLA penalties.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;After CI/CD &amp;amp; automated testing improvements:&lt;/strong&gt; incidents down to 5/month, recovery time cut to under 1 hour.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That reliability gave Sales the confidence to close two more enterprise clients, together worth $3M ARR.  &lt;/p&gt;

&lt;p&gt;The result: &lt;strong&gt;process improvements in engineering enabled opportunity capture in sales&lt;/strong&gt;.  &lt;/p&gt;




&lt;h2&gt;
  
  
  Insight
&lt;/h2&gt;

&lt;p&gt;Business leaders and engineers don’t need to “speak the same language.” What they need is &lt;strong&gt;translation and alignment&lt;/strong&gt;.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sales translates opportunities into revenue impact.
&lt;/li&gt;
&lt;li&gt;Engineering translates processes into quality impact.
&lt;/li&gt;
&lt;li&gt;Together, they drive sustainable, scalable growth.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As &lt;em&gt;Accelerate&lt;/em&gt; shows, quality in engineering fuels business performance. As &lt;em&gt;The Challenger Sale&lt;/em&gt; and &lt;em&gt;SPIN Selling&lt;/em&gt; show, process discipline in sales fuels revenue impact.  &lt;/p&gt;

&lt;p&gt;The companies that win are those that see beyond the divide and recognize the truth: &lt;strong&gt;sales talks about what is possible, engineering ensures it is sustainable&lt;/strong&gt;.  &lt;/p&gt;

</description>
    </item>
    <item>
      <title>Tabular vs Columnar Databases</title>
      <dc:creator>Aditya satrio nugroho</dc:creator>
      <pubDate>Mon, 11 Aug 2025 12:30:52 +0000</pubDate>
      <link>https://dev.to/adityasatrio/tabular-vs-columnar-databases-32lj</link>
      <guid>https://dev.to/adityasatrio/tabular-vs-columnar-databases-32lj</guid>
      <description>&lt;p&gt;When you first hear “tabular” vs “columnar” databases, it might sound like an abstract storage concept. But if we put it into a &lt;strong&gt;grocery shopping&lt;/strong&gt; analogy, it suddenly becomes a lot easier to grasp.&lt;/p&gt;




&lt;h2&gt;
  
  
  🛒 The Grocery Store Analogy
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Tabular (Row-Oriented) — Shopping by Recipe&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;In a &lt;strong&gt;row-oriented&lt;/strong&gt; (tabular) database, data is stored &lt;strong&gt;row by row&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
Imagine a grocery store where each aisle contains &lt;strong&gt;everything you need for a single recipe&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Aisle 1&lt;/strong&gt; → Spaghetti Bolognese kit (pasta, sauce, beef, spices)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aisle 2&lt;/strong&gt; → Chicken Curry kit (chicken, curry paste, coconut milk, rice)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aisle 3&lt;/strong&gt; → Salad kit (lettuce, tomato, dressing, croutons)
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’re cooking one recipe, you simply go to that aisle and grab all the ingredients in one go.&lt;/p&gt;

&lt;p&gt;💡 &lt;strong&gt;Best for:&lt;/strong&gt; Tasks where you often need &lt;em&gt;all data for a single record&lt;/em&gt;, like retrieving a full customer profile or processing a transaction.&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Columnar (Column-Oriented) — Shopping by Ingredient&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;In a &lt;strong&gt;column-oriented&lt;/strong&gt; database, data is stored &lt;strong&gt;column by column&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
Imagine a grocery store organized by ingredient type:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Aisle 1&lt;/strong&gt; → All pasta types
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aisle 2&lt;/strong&gt; → All sauces
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aisle 3&lt;/strong&gt; → All meats
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aisle 4&lt;/strong&gt; → All vegetables
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to find &lt;em&gt;all tomatoes&lt;/em&gt; in the store, you only go to the vegetable aisle — you don’t waste time walking through every recipe aisle.&lt;/p&gt;

&lt;p&gt;💡 &lt;strong&gt;Best for:&lt;/strong&gt; Analytical tasks where you scan specific columns over large datasets — like calculating the average age of all customers or the total sales per region.&lt;/p&gt;
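
&lt;p&gt;The two layouts can be sketched in a few lines of Python. This is a toy model with made-up sample data and hypothetical function names; real engines add pages, indexes, and compression on top, but the access pattern is the same idea.&lt;/p&gt;

```python
# Three customer records, used for both layouts (made-up sample data).
rows = [
    {"id": 1, "name": "Ana", "age": 31, "region": "West"},
    {"id": 2, "name": "Budi", "age": 45, "region": "East"},
    {"id": 3, "name": "Citra", "age": 28, "region": "West"},
]


def get_customer(row_store, customer_id):
    """Row-oriented: one lookup returns every field of a single record."""
    for record in row_store:
        if record["id"] == customer_id:
            return record
    return None


# Column-oriented: each column is stored together as its own list.
columns = {key: [record[key] for record in rows] for key in rows[0]}


def average_age(column_store):
    """An aggregate reads only the one column it needs."""
    ages = column_store["age"]
    return sum(ages) / len(ages)
```

&lt;p&gt;&lt;code&gt;get_customer(rows, 2)&lt;/code&gt; hands back the whole record in one step, while &lt;code&gt;average_age(columns)&lt;/code&gt; never touches names or regions. Rebuilding a full record from &lt;code&gt;columns&lt;/code&gt; would mean reading every list, which is the recipe-versus-ingredient trade-off in miniature.&lt;/p&gt;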




&lt;h2&gt;
  
  
  ⚖️ Pros &amp;amp; Cons
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Tabular (Row-Oriented)&lt;/th&gt;
&lt;th&gt;Columnar (Column-Oriented)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Optimized for&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;OLTP (transactions)&lt;/td&gt;
&lt;td&gt;OLAP (analytics)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Read pattern&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;All columns for a few rows&lt;/td&gt;
&lt;td&gt;A few columns for many rows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Insert/Update speed&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;Slower&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Aggregate queries&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Slower&lt;/td&gt;
&lt;td&gt;Very fast&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Compression&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Lower&lt;/td&gt;
&lt;td&gt;Higher&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Examples&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MySQL, PostgreSQL, SQL Server&lt;/td&gt;
&lt;td&gt;ClickHouse, BigQuery, Redshift, Snowflake&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  📌 Best Use Cases
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Tabular (Row-Oriented)&lt;/strong&gt; is ideal when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You’re handling &lt;strong&gt;real-time transactions&lt;/strong&gt; (banking, e-commerce orders, POS systems).&lt;/li&gt;
&lt;li&gt;You frequently insert, update, and delete individual rows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Columnar (Column-Oriented)&lt;/strong&gt; is ideal when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You’re running &lt;strong&gt;heavy analytics&lt;/strong&gt; on large datasets.&lt;/li&gt;
&lt;li&gt;You often aggregate or filter by specific columns.&lt;/li&gt;
&lt;li&gt;Your queries typically touch a small subset of columns but many rows.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  ⚠️ Pitfalls to Watch Out For
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Tabular
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Inefficient for analytical queries on large datasets.&lt;/li&gt;
&lt;li&gt;Higher storage I/O when only a few columns are needed.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Columnar
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Poor performance for frequent single-row updates.&lt;/li&gt;
&lt;li&gt;More complex transactional handling — often not the best choice as a primary OLTP store.&lt;/li&gt;
&lt;li&gt;Can be overkill for small datasets or systems with simple queries.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔧 Popular Tools
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Row-Oriented (Tabular):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MySQL&lt;/li&gt;
&lt;li&gt;PostgreSQL&lt;/li&gt;
&lt;li&gt;Oracle Database&lt;/li&gt;
&lt;li&gt;Microsoft SQL Server&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Column-Oriented:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Google BigQuery&lt;/li&gt;
&lt;li&gt;Amazon Redshift&lt;/li&gt;
&lt;li&gt;Snowflake&lt;/li&gt;
&lt;li&gt;ClickHouse&lt;/li&gt;
&lt;li&gt;Apache Parquet (file format)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🏁 Quick Takeaway
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tabular (Row)&lt;/strong&gt; → “Give me &lt;strong&gt;everything&lt;/strong&gt; about one thing.”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Columnar&lt;/strong&gt; → “Give me &lt;strong&gt;one thing&lt;/strong&gt; about everything.”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Choosing the right one depends on your workload — transactional systems thrive on tabular, while analytics shines on columnar.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>📘 Build a Tech Performance Framework for Engineering OKRs That Actually Drive Impact</title>
      <dc:creator>Aditya satrio nugroho</dc:creator>
      <pubDate>Mon, 23 Jun 2025 23:32:48 +0000</pubDate>
      <link>https://dev.to/adityasatrio/-build-a-tech-performance-framework-for-engineering-okrs-that-actually-drive-impact-12d6</link>
      <guid>https://dev.to/adityasatrio/-build-a-tech-performance-framework-for-engineering-okrs-that-actually-drive-impact-12d6</guid>
      <description>&lt;p&gt;In my experience leading engineering teams, I’ve found that the hardest part of OKRs isn’t setting them — it’s making sure they &lt;strong&gt;actually mean something&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Too many teams set OKRs like "refactor the admin panel" or "increase test coverage" without asking the bigger question:&lt;br&gt;&lt;br&gt;
&lt;strong&gt;What business outcome are we trying to enable?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This post introduces a simple, powerful framework I use to ensure every engineering OKR ladders up to something that &lt;strong&gt;matters&lt;/strong&gt; — whether that’s profitability, product reliability, user experience, or operational efficiency.&lt;/p&gt;




&lt;h2&gt;
  
  
  🎯 Why This Framework Exists
&lt;/h2&gt;

&lt;p&gt;Most engineering leaders know we should align our work to business goals. But how do you translate something like “reduce churn” or “increase CVR” into backend initiatives or platform improvements?&lt;/p&gt;

&lt;p&gt;The answer: start with &lt;strong&gt;engineering fundamentals&lt;/strong&gt; that map cleanly to business impact, not just project deliverables.&lt;/p&gt;

&lt;p&gt;This framework helps me align tech metrics with business metrics, so that I can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prioritize what to build and what to cut&lt;/li&gt;
&lt;li&gt;Make trade-offs explicit (not accidental)&lt;/li&gt;
&lt;li&gt;Hold teams accountable with metrics that matter&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔺 The “Project Management Triangle” and Why It Still Matters
&lt;/h2&gt;

&lt;p&gt;You’ve probably heard the saying:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“You can have it fast, cheap, or good — pick two.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This idea is rooted in what’s known academically as the &lt;strong&gt;Project Management Triangle&lt;/strong&gt;, sometimes called the &lt;strong&gt;Iron Triangle&lt;/strong&gt; or informally the &lt;strong&gt;Golden Triangle&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It describes the fundamental trade-offs in any technical decision:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Constraint&lt;/th&gt;
&lt;th&gt;Engineering Focus&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Speed&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Delivery &amp;amp; time-to-market&lt;/td&gt;
&lt;td&gt;Lead time, sprint velocity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Quality&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bug prevention, testability, stability&lt;/td&gt;
&lt;td&gt;Defect rates, incident count&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Infra and labor efficiency&lt;/td&gt;
&lt;td&gt;Infra cost, developer utilization&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These tensions exist no matter the size of the org, and they are usually what the C-level cares about most. The best engineering OKRs don’t ignore them; they &lt;strong&gt;make them visible and intentional&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧠 What the Experts Say (And Why I Take It Seriously)
&lt;/h2&gt;

&lt;p&gt;This framework is inspired by some of the best minds in software engineering and DevOps.&lt;/p&gt;

&lt;h3&gt;
  
  
  📘 &lt;em&gt;The Mythical Man-Month&lt;/em&gt; — Frederick Brooks
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Adding manpower to a late software project makes it later.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Brooks explains how adding people to a late project adds ramp-up time and coordination overhead, which delays it even further. A powerful reminder that &lt;strong&gt;quality and speed do not scale linearly with headcount&lt;/strong&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  ⚙️ &lt;em&gt;Continuous Delivery&lt;/em&gt; — Jez Humble &amp;amp; David Farley
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“If it hurts, do it more often.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This quote refers to things like testing, deployment, and integration. The more painful a process is, the more it needs to be automated, so quality doesn’t degrade as you scale speed.&lt;/p&gt;




&lt;h3&gt;
  
  
  📈 &lt;em&gt;Accelerate&lt;/em&gt; — Nicole Forsgren, Jez Humble, Gene Kim
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“High performers deploy more frequently, recover faster, and are more stable.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This book backs everything with data. The takeaway? &lt;strong&gt;You don’t have to trade speed for quality&lt;/strong&gt; — high-performing teams achieve both.&lt;/p&gt;




&lt;h3&gt;
  
  
  ✅ ISO/IEC 25010:2011
&lt;/h3&gt;

&lt;p&gt;This global standard defines what "software quality" actually means, beyond just bugs. It includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reliability&lt;/li&gt;
&lt;li&gt;Maintainability&lt;/li&gt;
&lt;li&gt;Performance efficiency&lt;/li&gt;
&lt;li&gt;Functional suitability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These ideas directly inspired the seven dimensions below.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧱 The 7 Dimensions of Tech Performance
&lt;/h2&gt;

&lt;p&gt;Every good engineering OKR I’ve seen (or set) can be mapped to one or more of the following &lt;strong&gt;seven dimensions&lt;/strong&gt;. These are the &lt;strong&gt;technical levers that actually move the business&lt;/strong&gt; — across speed, reliability, cost, and growth-readiness.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;What It Measures&lt;/th&gt;
&lt;th&gt;Example Metrics&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1. Delivery&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;How fast and predictably we ship value&lt;/td&gt;
&lt;td&gt;Lead time, deployment frequency, sprint velocity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2. Quality&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;How well we avoid defects and rework&lt;/td&gt;
&lt;td&gt;Defect rate, escaped bugs, test coverage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3. Availability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Whether the system is up when users need it&lt;/td&gt;
&lt;td&gt;Uptime %, MTTR, alerting coverage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;4. Reliability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Whether the system behaves as expected under normal use&lt;/td&gt;
&lt;td&gt;API P95 latency, crash-free sessions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;5. Maintainability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;How easily the system can evolve without breaking&lt;/td&gt;
&lt;td&gt;PR cycle time, SonarQube score, legacy deprecation progress&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;6. Cost Efficiency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;How efficiently we use compute and human resources&lt;/td&gt;
&lt;td&gt;Infra cost/session, cloud bill reduction, manual hour savings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;7. Scalability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;How well the system performs as usage or data grows&lt;/td&gt;
&lt;td&gt;Throughput under load, autoscaling behavior, resource saturation thresholds&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;🧠 Pro tip: Every OKR should align to &lt;strong&gt;at least two&lt;/strong&gt; of these dimensions. One is not enough.&lt;/p&gt;
&lt;/blockquote&gt;
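
&lt;p&gt;The "at least two dimensions" rule is easy to check mechanically. Below is a minimal Python sketch, assuming each draft OKR is kept as a small record with a set of dimension tags. The &lt;code&gt;OKR&lt;/code&gt; class and the &lt;code&gt;check_okr&lt;/code&gt; helper are hypothetical names made up for illustration, not part of any tool.&lt;/p&gt;

```python
from dataclasses import dataclass, field

# The seven dimension names come from the table above.
DIMENSIONS = {
    "Delivery", "Quality", "Availability", "Reliability",
    "Maintainability", "Cost Efficiency", "Scalability",
}


@dataclass
class OKR:
    """Hypothetical record for a draft OKR; the field names are illustrative."""
    objective: str
    dimensions: set[str] = field(default_factory=set)


def check_okr(okr: OKR, minimum: int = 2) -> list[str]:
    """Return a list of problems; an empty list means the OKR passes."""
    problems = []
    unknown = okr.dimensions - DIMENSIONS
    if unknown:
        problems.append(f"unknown dimensions: {sorted(unknown)}")
    valid = okr.dimensions.intersection(DIMENSIONS)
    if minimum > len(valid):
        problems.append(f"targets {len(valid)} dimension(s), needs at least {minimum}")
    return problems


draft = OKR("Reduce infrastructure cost and API latency", {"Cost Efficiency", "Reliability"})
weak = OKR("Refactor the billing module", {"Maintainability"})
print(check_okr(draft))  # passes: two known dimensions
print(check_okr(weak))   # flagged: only one dimension
```

&lt;p&gt;Returning a list of problems instead of a boolean keeps the feedback usable in a planning review, where the point is to discuss why an OKR is too narrow.&lt;/p&gt;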




&lt;h2&gt;
  
  
  🧭 Aligning to Business Impact (Without Internal Jargon)
&lt;/h2&gt;

&lt;p&gt;Instead of exposing internal OKRs, I prefer to frame impact areas like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔄 Improving system stability for user-facing products&lt;/li&gt;
&lt;li&gt;📈 Supporting growth experiments by speeding up delivery&lt;/li&gt;
&lt;li&gt;💰 Reducing cloud infrastructure and operational costs&lt;/li&gt;
&lt;li&gt;🔧 Eliminating manual work through better tooling&lt;/li&gt;
&lt;li&gt;🧪 Improving data quality to make analytics more trustworthy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These themes are universally valuable, whether you’re in a startup or scaling enterprise.&lt;/p&gt;

&lt;p&gt;So, when I review OKRs, I ask:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Does this actually improve one of those outcomes?&lt;/em&gt;&lt;br&gt;&lt;br&gt;
If not, it's probably technical debt disguised as a priority.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🧠 Example OKRs Using This Framework
&lt;/h2&gt;

&lt;p&gt;Here’s what this looks like in practice:&lt;/p&gt;

&lt;h3&gt;
  
  
  🚀 Improve Admin Dashboard Quality
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Objective&lt;/th&gt;
&lt;th&gt;Sunset legacy platform and reduce manual issues&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;KR 1&lt;/td&gt;
&lt;td&gt;Migrate 100% of the legacy admin dashboard to the new codebase to avoid security issues&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KR 2&lt;/td&gt;
&lt;td&gt;Improve the Sentry performance score of page XXX in the admin dashboard by 90%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;📌 Dimensions: &lt;strong&gt;Maintainability&lt;/strong&gt;, &lt;strong&gt;Reliability&lt;/strong&gt;, &lt;strong&gt;Quality&lt;/strong&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  💸 Infra Cost Optimization
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Objective&lt;/th&gt;
&lt;th&gt;Reduce infrastructure cost and API latency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;KR 1&lt;/td&gt;
&lt;td&gt;Reduce Database reads by 60%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KR 2&lt;/td&gt;
&lt;td&gt;Keep P95 check-in latency ≤ 500ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;📌 Dimensions: &lt;strong&gt;Cost Efficiency&lt;/strong&gt;, &lt;strong&gt;Reliability&lt;/strong&gt;&lt;/p&gt;
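
&lt;p&gt;A key result like "P95 latency ≤ 500ms" only works if everyone computes P95 the same way. Here is a minimal Python sketch using the nearest-rank method, which is one common definition (APM tools may interpolate differently). The sample values are made up for illustration.&lt;/p&gt;

```python
def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile: the smallest sample with at least 95% of samples at or below it."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    # Integer form of ceil(0.95 * n), which avoids floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def kr_met(samples_ms: list[float], threshold_ms: float = 500.0) -> bool:
    """True when the P95 latency is at or below the threshold from KR 2."""
    return threshold_ms >= p95(samples_ms)


# Made-up latency samples in milliseconds, for illustration only.
latencies = [120, 180, 210, 250, 300, 320, 350, 400, 450, 900]
print(p95(latencies), kr_met(latencies))
```

&lt;p&gt;With only ten samples, a single slow request decides the P95, so in practice the check should run over a large window of real traffic.&lt;/p&gt;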




&lt;h2&gt;
  
  
  🔁 How I Operationalize This
&lt;/h2&gt;

&lt;p&gt;Use this framework not just for OKR &lt;strong&gt;planning&lt;/strong&gt;, but for &lt;strong&gt;ongoing decision-making&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;During planning&lt;/strong&gt;: Tag each draft OKR with the dimensions it targets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;During reviews&lt;/strong&gt;: Check if any key business outcomes are neglected&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;During sprints&lt;/strong&gt;: Map Jira stories to the OKRs and dimensions&lt;/li&gt;
&lt;/ul&gt;
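
&lt;p&gt;The review step can be sketched the same way: count how many OKRs touch each dimension and list the ones nobody is targeting. This is a rough Python illustration; the &lt;code&gt;coverage&lt;/code&gt; and &lt;code&gt;neglected&lt;/code&gt; helpers are hypothetical names, and the tags are taken from the two example OKRs above.&lt;/p&gt;

```python
from collections import Counter

DIMENSIONS = [
    "Delivery", "Quality", "Availability", "Reliability",
    "Maintainability", "Cost Efficiency", "Scalability",
]


def coverage(tagged_okrs: dict[str, list[str]]) -> Counter:
    """Count how many OKRs target each of the seven dimensions."""
    counts = Counter({name: 0 for name in DIMENSIONS})
    for tags in tagged_okrs.values():
        # Ignore tags that are not one of the seven dimensions.
        counts.update(tag for tag in tags if tag in counts)
    return counts


def neglected(tagged_okrs: dict[str, list[str]]) -> list[str]:
    """Dimensions that no OKR in the set targets."""
    counts = coverage(tagged_okrs)
    return [name for name in DIMENSIONS if counts[name] == 0]


quarter = {
    "Improve Admin Dashboard Quality": ["Maintainability", "Reliability", "Quality"],
    "Infra Cost Optimization": ["Cost Efficiency", "Reliability"],
}
print(neglected(quarter))
```

&lt;p&gt;An empty dimension is not automatically a problem. It is a prompt to confirm that the gap is a deliberate choice for the quarter.&lt;/p&gt;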

&lt;p&gt;&lt;strong&gt;Tools that you can use&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Jira, Google Sheets (delivery &amp;amp; velocity)&lt;/li&gt;
&lt;li&gt;APM like Sentry or New Relic (monitoring and error tracking)&lt;/li&gt;
&lt;li&gt;Static code analysis, SonarQube (maintainability)&lt;/li&gt;
&lt;li&gt;GCP/AWS billing menu for cost reports&lt;/li&gt;
&lt;li&gt;A team wiki such as Confluence or Notion (shared visibility)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔚 Final Thoughts
&lt;/h2&gt;

&lt;p&gt;You don’t need 20 OKRs to show impact. You need &lt;strong&gt;fewer, smarter, well-targeted ones&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This framework, based on engineering theory, real-world use cases, and business alignment, helps me set OKRs that do more than just tick boxes.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;They guide teams.
&lt;/li&gt;
&lt;li&gt;They inform trade-offs.
&lt;/li&gt;
&lt;li&gt;They create leverage.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Let’s stop writing OKRs that “sound good” or are not correlated with business impact. Let’s start writing ones that &lt;strong&gt;move the needle&lt;/strong&gt;, for real.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
    </item>
    <item>
      <title>Comparing Software Architecture Documentation Models and When to Use Them</title>
      <dc:creator>Aditya satrio nugroho</dc:creator>
      <pubDate>Mon, 23 Jun 2025 23:04:10 +0000</pubDate>
      <link>https://dev.to/adityasatrio/comparing-software-architecture-documentation-models-and-when-to-use-them-495n</link>
      <guid>https://dev.to/adityasatrio/comparing-software-architecture-documentation-models-and-when-to-use-them-495n</guid>
      <description>&lt;p&gt;Documenting software architecture isn’t just a formality—it’s a critical tool for communication, onboarding, and decision-making. While the &lt;strong&gt;C4 Model&lt;/strong&gt; has become popular for its simplicity and developer focus, there are several other frameworks and templates, each with strengths for specific contexts.&lt;/p&gt;

&lt;p&gt;This post breaks down the most widely used architecture documentation models, compares them, highlights real-world use cases, and provides concrete examples to help you choose the right approach for your team and project.&lt;/p&gt;




&lt;h3&gt;
  
  
  1. &lt;strong&gt;C4 Model&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What is it?&lt;/strong&gt;&lt;br&gt;
A hierarchical model (Context, Container, Component, Code) for visualizing software architecture at different levels of detail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Agile teams&lt;/li&gt;
&lt;li&gt;Developer-centric communication&lt;/li&gt;
&lt;li&gt;Fast onboarding&lt;/li&gt;
&lt;li&gt;Modern cloud-native applications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to Use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When your audience includes developers, product owners, or external stakeholders who need a visual “big picture” down to component level.&lt;/li&gt;
&lt;li&gt;Projects where diagrams need to stay in sync with code (C4 can be generated from code in some tools).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SaaS web app:&lt;/strong&gt; Context diagram shows users, payment gateways, and your platform; container diagram shows API, frontend, and database; component diagram details API modules.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Drawbacks:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Doesn’t prescribe document structure, just diagrams.&lt;/li&gt;
&lt;li&gt;Less focus on non-visual documentation (rationale, cross-cutting concerns).&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  2. &lt;strong&gt;4+1 View Model&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What is it?&lt;/strong&gt;&lt;br&gt;
Introduced by Philippe Kruchten, 4+1 organizes architecture into Logical, Development, Process, Physical views, plus Scenarios.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Large enterprise systems&lt;/li&gt;
&lt;li&gt;Projects with multiple stakeholder groups&lt;/li&gt;
&lt;li&gt;Situations where hardware, deployment, and runtime concerns matter&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to Use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When you need to separate “what the system does” (Logical) from “how it is deployed” (Physical), “how it is built” (Development), and “how it runs” (Process).&lt;/li&gt;
&lt;li&gt;Projects with non-technical and technical audiences.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Telecom system:&lt;/strong&gt; Logical view shows services, Development view shows microservices repos, Process view shows runtime processes/threads, Physical view maps containers to servers, Scenarios walk through call setup.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Drawbacks:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;More effort and overhead than C4.&lt;/li&gt;
&lt;li&gt;Can be overkill for simple systems.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  3. &lt;strong&gt;Views and Beyond (V&amp;amp;B)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What is it?&lt;/strong&gt;&lt;br&gt;
A framework from the Software Engineering Institute (SEI) that focuses on describing a system from different views (Module, Component &amp;amp; Connector, Allocation), each tailored to stakeholder concerns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Complex systems with many stakeholders (ops, QA, business, dev)&lt;/li&gt;
&lt;li&gt;Organizations with a culture of detailed documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to Use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When you need to ensure every stakeholder’s concern is addressed.&lt;/li&gt;
&lt;li&gt;For compliance or formal architecture review processes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Banking platform:&lt;/strong&gt; Module view for code structure, Component &amp;amp; Connector view for service integrations, Allocation view for cloud vs. on-prem deployment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Drawbacks:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Heavyweight, can be too formal for agile/startup environments.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  4. &lt;strong&gt;Arc42&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What is it?&lt;/strong&gt;&lt;br&gt;
A template for comprehensive architecture documentation, combining structure (what to write) with flexibility (how to visualize).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Teams looking for a complete architecture documentation template&lt;/li&gt;
&lt;li&gt;Projects requiring thorough coverage (context, quality scenarios, cross-cutting concerns)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to Use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When you want to document not only structure, but also decisions, quality attributes, and concepts.&lt;/li&gt;
&lt;li&gt;Good for regulated environments or projects with high turnover.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Healthcare platform:&lt;/strong&gt; Use Arc42 template to document system context, business goals, architecture decisions, data flow, deployment, and risks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Drawbacks:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can seem overwhelming at first.&lt;/li&gt;
&lt;li&gt;Not diagram-focused; you must choose your own diagram styles (often used with C4).&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  5. &lt;strong&gt;ISO/IEC/IEEE 42010&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What is it?&lt;/strong&gt;&lt;br&gt;
An international standard for describing architecture using viewpoints and views.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Organizations needing compliance with international standards&lt;/li&gt;
&lt;li&gt;Very large, mission-critical projects&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to Use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When documentation must satisfy formal regulatory, client, or industry requirements.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Aerospace control system:&lt;/strong&gt; Architecture documentation split into safety, security, and deployment viewpoints as per ISO 42010 guidelines.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Drawbacks:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Very formal and generic; doesn’t provide concrete diagram or template recommendations.&lt;/li&gt;
&lt;li&gt;Usually implemented through other frameworks (Arc42, 4+1, etc.).&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  6. &lt;strong&gt;ADR (Architecture Decision Records)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What is it?&lt;/strong&gt;&lt;br&gt;
A lightweight way to document individual architectural or technical decisions as short markdown/text files.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Agile teams&lt;/li&gt;
&lt;li&gt;Projects where decisions evolve rapidly&lt;/li&gt;
&lt;li&gt;Complementing high-level documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to Use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When you want to record the “why” behind important choices (tech stack, database, patterns).&lt;/li&gt;
&lt;li&gt;When you need an auditable trail of decisions for future maintainers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Microservices platform:&lt;/strong&gt; Each decision (e.g., "Use Postgres instead of MySQL") gets a 1-pager with context, options, decision, consequences.&lt;/li&gt;
&lt;/ul&gt;
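
&lt;p&gt;To show how small such a one-pager can be, here is a rough Python sketch that renders an ADR as Markdown. The &lt;code&gt;ADR&lt;/code&gt; class, its field names, and the section headings are assumptions chosen to mirror "context, options, decision, consequences", not an official ADR format, and the bracketed texts are placeholders.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class ADR:
    """One architecture decision record; field names mirror the one-pager sections."""
    number: int
    title: str
    status: str
    context: str
    options: list[str]
    decision: str
    consequences: str

    def to_markdown(self) -> str:
        """Render the record as a short Markdown document."""
        option_lines = "\n".join(f"- {option}" for option in self.options)
        return (
            f"# ADR {self.number:04d}: {self.title}\n\n"
            f"## Status\n{self.status}\n\n"
            f"## Context\n{self.context}\n\n"
            f"## Options\n{option_lines}\n\n"
            f"## Decision\n{self.decision}\n\n"
            f"## Consequences\n{self.consequences}\n"
        )


# Placeholder text only; a real ADR would carry the team's actual reasoning.
adr = ADR(
    number=1,
    title="Use Postgres instead of MySQL",
    status="Accepted",
    context="(why a database choice is needed)",
    options=["Postgres", "MySQL"],
    decision="(what was chosen and why)",
    consequences="(what becomes easier or harder)",
)
print(adr.to_markdown())
```

&lt;p&gt;Keeping each record as a plain text file next to the code means decisions are reviewed and versioned the same way as the code they affect.&lt;/p&gt;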

&lt;p&gt;&lt;strong&gt;Drawbacks:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Not a full documentation framework, but a supplement.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  7. &lt;strong&gt;UML (Unified Modeling Language)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What is it?&lt;/strong&gt;&lt;br&gt;
A standard visual language with diagram types (class, sequence, deployment, etc.).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Teams needing detailed object-level diagrams&lt;/li&gt;
&lt;li&gt;Generating code from diagrams (and vice versa)&lt;/li&gt;
&lt;li&gt;Modeling at various levels of abstraction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to Use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When low-level relationships, interactions, or deployment details are needed.&lt;/li&gt;
&lt;li&gt;When standard visual notations are required (e.g., for handover).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Library management system:&lt;/strong&gt; UML class diagram for book, member, loan objects; sequence diagram for the checkout process.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Drawbacks:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can get too detailed (“diagram for the sake of diagram”).&lt;/li&gt;
&lt;li&gt;Not a documentation methodology—just diagrams.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Comparison Table&lt;/strong&gt;
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;th&gt;Use Case Example&lt;/th&gt;
&lt;th&gt;Overhead&lt;/th&gt;
&lt;th&gt;Visual/Template&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;C4&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Modern, code-centric teams&lt;/td&gt;
&lt;td&gt;SaaS/web app&lt;/td&gt;
&lt;td&gt;Low-Med&lt;/td&gt;
&lt;td&gt;Visual&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;4+1 View&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Multi-stakeholder, enterprise&lt;/td&gt;
&lt;td&gt;Telecom, ERP&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Visual&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;V&amp;amp;B (SEI)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Formal, many stakeholders&lt;/td&gt;
&lt;td&gt;Banking, critical infra&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Both&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Arc42&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Thorough, template-driven&lt;/td&gt;
&lt;td&gt;Healthcare, gov&lt;/td&gt;
&lt;td&gt;Medium-High&lt;/td&gt;
&lt;td&gt;Template&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ISO 42010&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Compliance, formal review&lt;/td&gt;
&lt;td&gt;Aerospace, defense&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Template&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ADR&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Decision history, agility&lt;/td&gt;
&lt;td&gt;All&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Text&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;UML&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Object, behavior diagrams&lt;/td&gt;
&lt;td&gt;Component, flow models&lt;/td&gt;
&lt;td&gt;Variable&lt;/td&gt;
&lt;td&gt;Visual&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;How to Choose?&lt;/strong&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;For startups and fast-moving teams:&lt;/strong&gt; Start with &lt;strong&gt;C4&lt;/strong&gt; for high-level clarity + &lt;strong&gt;ADRs&lt;/strong&gt; for decision history.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For large enterprises:&lt;/strong&gt; Use &lt;strong&gt;4+1 View&lt;/strong&gt; or &lt;strong&gt;V&amp;amp;B&lt;/strong&gt; if many stakeholders and complex concerns. &lt;strong&gt;Arc42&lt;/strong&gt; works great if you need a thorough template.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For regulated industries:&lt;/strong&gt; Consider &lt;strong&gt;ISO 42010&lt;/strong&gt; compliance, usually with &lt;strong&gt;Arc42&lt;/strong&gt; or &lt;strong&gt;V&amp;amp;B&lt;/strong&gt; as the documentation base.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For legacy systems or heavy object-oriented designs:&lt;/strong&gt; Use &lt;strong&gt;UML&lt;/strong&gt; diagrams as needed, but avoid over-documenting.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;There is no one-size-fits-all model. Mix and match based on project complexity, team size, compliance needs, and audience. For most modern SaaS products, a combination of C4 diagrams, a few ADRs, and lightweight templates (even Notion or Markdown) gives the right balance of clarity and speed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;References:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://c4model.com/" rel="noopener noreferrer"&gt;C4 Model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.cs.ubc.ca/~gregor/teaching/papers/4+1view-architecture.pdf" rel="noopener noreferrer"&gt;4+1 View Model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arc42.org/" rel="noopener noreferrer"&gt;arc42.org&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.iso.org/standard/50508.html" rel="noopener noreferrer"&gt;ISO 42010 Standard&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/joelparkerhenderson/architecture_decision_record" rel="noopener noreferrer"&gt;ADR GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.uml.org/" rel="noopener noreferrer"&gt;UML&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>My Leadership Playbook</title>
      <dc:creator>Aditya satrio nugroho</dc:creator>
      <pubDate>Fri, 02 May 2025 05:49:36 +0000</pubDate>
      <link>https://dev.to/adityasatrio/my-leadership-playbook-imn</link>
      <guid>https://dev.to/adityasatrio/my-leadership-playbook-imn</guid>
      <description>&lt;p&gt;This playbook is based on my experience leading small teams in bootstrapped startups, navigating growing pains in scale-ups, and coaching managers in legacy corporations trying to stay relevant.&lt;/p&gt;

&lt;p&gt;Indonesian leadership isn't like Silicon Valley's and shouldn't try to be. We have our rhythms, cultural codes, expectations, and blind spots. We lead with respect, context, and emotions, but we sometimes struggle with clarity, consistency, and execution.&lt;/p&gt;

&lt;p&gt;This book is not a list of hacks. It is a mirror. A way to see what is working, what is not, and what might be worth rethinking, and trigger discussion.&lt;/p&gt;

&lt;p&gt;Each chapter starts with a story because leadership does not happen in slides. It happens in rooms, message threads, crisis calls, and awkward 1:1s. It happens in human moments.&lt;/p&gt;

&lt;p&gt;And that is where this playbook belongs: in real conversations, in real companies, with real people.&lt;/p&gt;

&lt;p&gt;Let's lead better. Together.&lt;/p&gt;




&lt;h2&gt;
  
  
  PLAY 1: Leader Selection "Beyond IQ &amp;amp; Degrees"
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Scene&lt;/strong&gt;: Imagine you are hiring a new team lead. One candidate graduated from a prestigious university with top marks. Another did not, but they have led teams through tough pivots, learned from failure, and kept their team intact through two reorgs. The first impresses the board. The second earns silent respect from peers. Who do you pick?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Lesson&lt;/strong&gt;: Credentials may open the door, but leadership walks through experience, resilience, and the ability to make others better.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Evaluate three core dimensions: Cognitive Skill, Character/Temperament, and Cultural/Team Fit.&lt;/li&gt;
&lt;li&gt;Use real-world simulations: present them with live team or delivery problems to solve collaboratively.&lt;/li&gt;
&lt;li&gt;Ask for past examples of failure and rebound, not just success.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What to avoid&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Overvaluing academic pedigree. Intelligence is not the same as maturity.&lt;/li&gt;
&lt;li&gt;Assuming confidence equals competence. Some great leaders are calm, not loud.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cultural Frame (Indonesia)&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Many local teams still respect authority from title or education. Your job is to model the shift toward credibility through action.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Adage&lt;/strong&gt;: Degrees do not lead teams. People do.&lt;/p&gt;




&lt;h2&gt;
  
  
  PLAY 2: Show Up "Presence as Leadership"
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Scene&lt;/strong&gt;: A product launch failed. Everyone's frustrated, but the team has not seen the engineering lead in days. No acknowledgment, no regroup, no face. Slack goes quiet. Then the frustrated engineers start looking for other jobs. The narrative becomes: "We are on our own."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Lesson&lt;/strong&gt;: Leadership is not just about solving the problem; it is about being there when the team needs a steady face, a decision, or just acknowledgment. Presence creates psychological safety. Absence creates drift.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Be visible during high-stress or high-uncertainty moments (failures, pivots, conflicts).&lt;/li&gt;
&lt;li&gt;Create rituals of visibility: daily standups, open office hours, and walkarounds.&lt;/li&gt;
&lt;li&gt;Practice emotional presence: listen fully, even when you do not have the answer yet.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What to avoid&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Becoming a ghost leader during hard times.&lt;/li&gt;
&lt;li&gt;Confusing "trust" with "total hands-off".&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cultural Frame (Indonesia)&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Teams are used to hierarchy, but what they remember is who stood with them in difficult situations. In a high-context culture, silence often gets interpreted as ignorance or abandonment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Adage&lt;/strong&gt;: You do not need to have the answer. You need to be the one who stays in the room. Act as a team and make a plan to find the solutions together.&lt;/p&gt;




&lt;h2&gt;
  
  
  PLAY 3: Frugal Signals "Kill the Rolex Myth"
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Scene&lt;/strong&gt;: A new VP arrives at a startup still struggling with burn rate. He parks a luxury car at the office, wears a 150 million IDR watch, and casually mentions his Bali villa. That week, the team quietly stops putting in extra hours. They think, "Why bleed for a leader who has already cashed out?"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Lesson&lt;/strong&gt;: Symbolism matters. In leadership, your choices signal values. Frugality is not about being cheap; it is about being credible.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Model intentional modesty, especially during resource-constrained phases.&lt;/li&gt;
&lt;li&gt;Make financial decisions that reinforce the collective mission.&lt;/li&gt;
&lt;li&gt;Talk openly about company finances, so your modesty is seen as aligned with reality, not performative.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What to avoid&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Flaunting wealth when the team is grinding.&lt;/li&gt;
&lt;li&gt;Using frugality to guilt employees.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cultural Frame (Indonesia)&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We have a culture of quiet comparison. People notice, even if they do not say it. Your lifestyle choices impact morale, especially in collectivist teams.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Adage&lt;/strong&gt;: The more power you have, the more your choices echo.&lt;/p&gt;




&lt;h2&gt;
  
  
  PLAY 4: Mentor &amp;gt; Patron
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Scene&lt;/strong&gt;: A senior leader tells a junior PM, "Just follow my instructions." The PM does but never grows. After six months, the leader complains, "Why is this person so dependent?" The answer: because you trained them to be.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Lesson&lt;/strong&gt;: Paternalistic leadership might bring speed early, but it kills scalability. Great leaders don't just solve problems; they teach people how to think.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Shift from command to coaching: Ask guiding questions instead of giving answers.&lt;/li&gt;
&lt;li&gt;Empower decisions with boundaries: "You decide X, I will review Y."&lt;/li&gt;
&lt;li&gt;Share your thought process, not just your conclusions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What to avoid&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Being the bottleneck.&lt;/li&gt;
&lt;li&gt;Criticizing people for not thinking when you never gave them space to.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cultural Frame (Indonesia)&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Many teams are trained to defer. That is okay. Start from where they are, then lead them out of dependency with trust and teaching.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Adage&lt;/strong&gt;: A leader's job is to make themselves less needed, not more feared.&lt;/p&gt;




&lt;h2&gt;
  
  
  PLAY 5: Avoid Vacuums "Build Leadership Safety Nets"
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Scene&lt;/strong&gt;: A team is thriving. The founder takes a two-week vacation, proudly stating, "Let them figure it out. It will build muscle." When he returns, three key decisions have been delayed, two high performers are disengaged, and nobody knows who is accountable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Lesson&lt;/strong&gt;: Autonomy does not mean ambiguity. Absence without structure is abandonment. Great leaders step back with design.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create clear fallback roles: Who decides when you are out?&lt;/li&gt;
&lt;li&gt;Write "if-then" escalation plans for product, people, and conflict.&lt;/li&gt;
&lt;li&gt;Train people ahead of time; do not surprise them with sudden empowerment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What to avoid&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using absence as a test. It is not a test if they are not prepared.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cultural Frame (Indonesia)&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Teams still look upward for direction. Without clarity, they default to inaction. Build soft guardrails, not total freedom.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Adage&lt;/strong&gt;: Real delegation is not letting go; it is setting up first.&lt;/p&gt;




&lt;h2&gt;
  
  
  PLAY 6: Ritualize Execution
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Scene&lt;/strong&gt;: A startup team constantly misses deadlines. Everyone's busy, no one's aligned. When asked who is doing what, responses are vague: "We will discuss in the next sync." The next sync? Canceled.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Lesson&lt;/strong&gt;: Without structure, energy scatters. Rituals are the heartbeat of execution. Not for bureaucracy but for rhythm.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set non-negotiable cadences: daily standups, weekly planning, fortnight retros.&lt;/li&gt;
&lt;li&gt;Keep it sharp: focused agendas, rotating leads, timed updates.&lt;/li&gt;
&lt;li&gt;Add rhythm to recognition: praise is not a quarterly event.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What to avoid&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Treating rituals like status updates. Make them decision moments.&lt;/li&gt;
&lt;li&gt;Letting meetings become autopilot or optional.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cultural Frame (Indonesia)&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Teams respond well to predictable rhythm, especially in hybrid or distributed setups. It creates psychological security.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Adage&lt;/strong&gt;: Speed doesn't come from chaos. It comes from choreography.&lt;/p&gt;




&lt;h2&gt;
  
  
  PLAY 7: Filter Thought Leadership "BS"
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Scene&lt;/strong&gt;: A new leader joins and starts quoting buzzwords: "Let's be agile, build tribes, and practice radical candor." Everyone nods. No one understands. Execution drops, alignment fades, and engineers roll their eyes whenever a new slogan drops.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Lesson&lt;/strong&gt;: Not all thought leadership is useful. A lot of it is packaged noise. Great leaders translate ideas into tools, not vibes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Validate frameworks before adoption: pilot in one team.&lt;/li&gt;
&lt;li&gt;Choose ideas based on problem relevance, not trendiness.&lt;/li&gt;
&lt;li&gt;Favor materials used in actual business schools over viral LinkedIn content.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What to avoid&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Blindly importing Western models into local teams.&lt;/li&gt;
&lt;li&gt;Using jargon as a proxy for insight.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cultural Frame (Indonesia)&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Teams value clarity. Simpler language works better. Over-complex methods often get quietly ignored.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Adage&lt;/strong&gt;: Good leadership is not found in quotes, but forged in context.&lt;/p&gt;




&lt;h3&gt;
  
  
  Final Reflection
&lt;/h3&gt;

&lt;p&gt;Leadership is rarely about knowing what to do. It is about doing what you know consistently. In Indonesian companies, where culture is high-context and hierarchies still shape interactions, leading well requires both strength and subtlety.&lt;/p&gt;

&lt;p&gt;You do not need to be loud to be powerful. You do not need to be perfect to be trusted. But you must be present. You must be clear. And you must be willing to grow while helping others grow.&lt;/p&gt;

&lt;p&gt;Use this playbook as your check-in guide, not your checklist. Come back to it when you feel stuck. Or when your team feels stuck.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Because in the end, leadership is not the spotlight. It is the structure that helps others shine.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This playbook is alive. Revisit it. Rewrite it. And apply it.&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Back-of-the-Envelope Thinking for Scalable System Design</title>
      <dc:creator>Aditya satrio nugroho</dc:creator>
      <pubDate>Mon, 21 Apr 2025 10:02:59 +0000</pubDate>
      <link>https://dev.to/adityasatrio/back-of-the-envelope-thinking-for-scalable-system-design-258g</link>
      <guid>https://dev.to/adityasatrio/back-of-the-envelope-thinking-for-scalable-system-design-258g</guid>
      <description>&lt;p&gt;Have you ever been assigned a project where you designed an architecture using all the latest state-of-the-art tools — sharded databases, message queues, event buses, and more? At first glance, the architecture looks impressive. It sounds cool. But does it really solve the core problem you're facing?&lt;/p&gt;

&lt;p&gt;Even if your CTO gives you the green light to build it, can you be sure the system will perform as expected? Once you start implementing it, doubt often creeps in. You begin questioning the performance, wondering how to validate your assumptions.&lt;/p&gt;

&lt;p&gt;One simple but powerful method to validate your system design early is through &lt;strong&gt;back-of-the-envelope calculations&lt;/strong&gt;. It helps you estimate, reason, and catch potential issues long before they become expensive mistakes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Back-of-the-envelope calculations&lt;/strong&gt; will help you create estimations using a combination of &lt;a href="http://en.wikipedia.org/wiki/Thought_experiment?ref=highscalability.com" rel="noopener noreferrer"&gt;thought experiments&lt;/a&gt; and common performance numbers to get a good feel for which designs will meet your requirements.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧮 Operation Latency Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;No&lt;/th&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Activity Category&lt;/th&gt;
&lt;th&gt;Component Category&lt;/th&gt;
&lt;th&gt;Time (ns)&lt;/th&gt;
&lt;th&gt;Time (ms)&lt;/th&gt;
&lt;th&gt;Time (min)&lt;/th&gt;
&lt;th&gt;Time (hr)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;L1 cache reference&lt;/td&gt;
&lt;td&gt;read&lt;/td&gt;
&lt;td&gt;cache&lt;/td&gt;
&lt;td&gt;0.5&lt;/td&gt;
&lt;td&gt;0.0000005&lt;/td&gt;
&lt;td&gt;0.00000000000833&lt;/td&gt;
&lt;td&gt;0.000000000000139&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Branch mispredict&lt;/td&gt;
&lt;td&gt;misc&lt;/td&gt;
&lt;td&gt;cpu&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0.000005&lt;/td&gt;
&lt;td&gt;0.00000000008333&lt;/td&gt;
&lt;td&gt;0.000000000001389&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;L2 cache reference&lt;/td&gt;
&lt;td&gt;read&lt;/td&gt;
&lt;td&gt;cache&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;0.000007&lt;/td&gt;
&lt;td&gt;0.00000000011667&lt;/td&gt;
&lt;td&gt;0.000000000001944&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Mutex lock/unlock&lt;/td&gt;
&lt;td&gt;sync&lt;/td&gt;
&lt;td&gt;cpu&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;0.0001&lt;/td&gt;
&lt;td&gt;0.00000000166667&lt;/td&gt;
&lt;td&gt;0.00000000002778&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Main memory reference&lt;/td&gt;
&lt;td&gt;read&lt;/td&gt;
&lt;td&gt;memory&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;0.0001&lt;/td&gt;
&lt;td&gt;0.00000000166667&lt;/td&gt;
&lt;td&gt;0.00000000002778&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Compress 1K bytes with Zippy&lt;/td&gt;
&lt;td&gt;compute&lt;/td&gt;
&lt;td&gt;cpu&lt;/td&gt;
&lt;td&gt;10000&lt;/td&gt;
&lt;td&gt;0.01&lt;/td&gt;
&lt;td&gt;0.000000166667&lt;/td&gt;
&lt;td&gt;0.000000002778&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Send 2K bytes over 1 Gbps network&lt;/td&gt;
&lt;td&gt;write&lt;/td&gt;
&lt;td&gt;network&lt;/td&gt;
&lt;td&gt;20000&lt;/td&gt;
&lt;td&gt;0.02&lt;/td&gt;
&lt;td&gt;0.000000333333&lt;/td&gt;
&lt;td&gt;0.000000005556&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Read 1 MB sequentially from memory&lt;/td&gt;
&lt;td&gt;read&lt;/td&gt;
&lt;td&gt;memory&lt;/td&gt;
&lt;td&gt;250000&lt;/td&gt;
&lt;td&gt;0.25&lt;/td&gt;
&lt;td&gt;0.000004166667&lt;/td&gt;
&lt;td&gt;0.000000069444&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Round trip within same datacenter&lt;/td&gt;
&lt;td&gt;network&lt;/td&gt;
&lt;td&gt;network&lt;/td&gt;
&lt;td&gt;500000&lt;/td&gt;
&lt;td&gt;0.5&lt;/td&gt;
&lt;td&gt;0.000008333333&lt;/td&gt;
&lt;td&gt;0.000000138889&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Disk seek&lt;/td&gt;
&lt;td&gt;read&lt;/td&gt;
&lt;td&gt;disk&lt;/td&gt;
&lt;td&gt;10000000&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;0.000166667&lt;/td&gt;
&lt;td&gt;0.000002778&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Read 1 MB sequentially from network&lt;/td&gt;
&lt;td&gt;read&lt;/td&gt;
&lt;td&gt;network&lt;/td&gt;
&lt;td&gt;10000000&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;0.000166667&lt;/td&gt;
&lt;td&gt;0.000002778&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Read 1 MB sequentially from disk&lt;/td&gt;
&lt;td&gt;read&lt;/td&gt;
&lt;td&gt;disk&lt;/td&gt;
&lt;td&gt;30000000&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;0.0005&lt;/td&gt;
&lt;td&gt;0.000008333&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Send packet CA→Netherlands→CA&lt;/td&gt;
&lt;td&gt;network&lt;/td&gt;
&lt;td&gt;network&lt;/td&gt;
&lt;td&gt;150000000&lt;/td&gt;
&lt;td&gt;150&lt;/td&gt;
&lt;td&gt;0.0025&lt;/td&gt;
&lt;td&gt;0.000041667&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
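&lt;p&gt;The derived columns in the table are plain unit conversions from nanoseconds, so they are easy to recompute when in doubt. The snippet below is a minimal sketch of that conversion; the function name &lt;code&gt;convert&lt;/code&gt; and the selection of rows are my own choices for illustration, not part of the original source.&lt;/p&gt;

```python
# Convert a latency in nanoseconds into the other units used in the table.
NS_PER_MS = 1_000_000
NS_PER_MIN = 60 * 1_000_000_000
NS_PER_HR = 60 * NS_PER_MIN


def convert(ns):
    """Return (milliseconds, minutes, hours) for a latency given in ns."""
    return ns / NS_PER_MS, ns / NS_PER_MIN, ns / NS_PER_HR


# A few rows from the table above (operation, latency in ns).
rows = [
    ("L1 cache reference", 0.5),
    ("Main memory reference", 100),
    ("Disk seek", 10_000_000),
    ("Send packet CA to Netherlands to CA", 150_000_000),
]

for name, ns in rows:
    ms, minutes, hours = convert(ns)
    print(f"{name}: {ms:g} ms, {minutes:.3e} min, {hours:.3e} hr")

# How many main-memory references fit into the time of one disk seek?
memory_refs_per_seek = 10_000_000 / 100
print(memory_refs_per_seek)  # 100000.0
```

&lt;p&gt;The ratio at the end is the kind of number worth remembering: one disk seek costs roughly as much as 100,000 main-memory references.&lt;/p&gt;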




&lt;h2&gt;
  
  
  💡 The Lessons
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Writes are 40 times more expensive than reads.&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Frequent writes/updates will have high contention.&lt;/li&gt;
&lt;li&gt;To scale writes, you need to &lt;strong&gt;partition&lt;/strong&gt;, and once you do that, it becomes difficult to maintain shared state like counters.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Global shared data is expensive.&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;This is a &lt;strong&gt;fundamental limitation of distributed systems&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Lock contention on heavily written shared objects kills performance as transactions become serialized and slow.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;strong&gt;Architect for scaling writes.&lt;/strong&gt;&lt;/li&gt;

&lt;li&gt;&lt;strong&gt;Optimize for low write contention.&lt;/strong&gt;&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Optimize wide.&lt;/strong&gt; Make writes as parallel as you can.&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔥 Writes Are Expensive!
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Datastores are &lt;strong&gt;transactional&lt;/strong&gt;: writes require &lt;strong&gt;disk access&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Disk access means &lt;strong&gt;disk seeks&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;🧠 Rule of thumb:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1 disk seek = ~10ms
→ 1s / 10ms = 100 seeks/second (max per disk)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Throughput depends on:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The size and shape of your data&lt;/li&gt;
&lt;li&gt;Doing work in &lt;strong&gt;batches&lt;/strong&gt; (batch puts/gets)&lt;/li&gt;
&lt;/ul&gt;
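&lt;p&gt;To see why batching matters, here is a rough sketch of the write ceiling for a single disk. It assumes that one seek can commit a whole batch and that transfer time is negligible next to the seek, which is a simplification. The batch size of 50 is a hypothetical value chosen only for illustration.&lt;/p&gt;

```python
SEEK_MS = 10  # rule-of-thumb disk seek time from the text


def writes_per_second(writes_per_seek=1, seek_ms=SEEK_MS):
    """Upper bound on writes/sec for one disk when each seek commits a batch."""
    seeks_per_second = 1000 / seek_ms
    return seeks_per_second * writes_per_seek


print(writes_per_second())    # one write per seek: 100.0
print(writes_per_second(50))  # hypothetical batch of 50 writes per seek: 5000.0
```

&lt;p&gt;Real datastores will land below these ceilings, but the estimate is enough to tell whether a design needs batching or partitioning at all.&lt;/p&gt;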




&lt;h2&gt;
  
  
  ⚡ Reads Are Cheap!
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Reads don’t have to be transactional — just consistent.&lt;/li&gt;
&lt;li&gt;After the first disk load, data is cached in memory.&lt;/li&gt;
&lt;li&gt;Subsequent reads are super fast.&lt;/li&gt;
&lt;li&gt;🧠 Rule of thumb:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Read 1MB from memory ≈ 250μs
→ 1s / 250μs = 4000 reads of 1MB ≈ 4GB/sec
→ For 1MB entities: 4000 fetches/sec

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
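&lt;p&gt;Putting the two rules of thumb side by side gives one plausible way to arrive at a figure like "writes are 40 times more expensive than reads". The sketch below assumes 1 MB entities served from memory and one disk seek per write; the function names are illustrative.&lt;/p&gt;

```python
MEMORY_READ_US_PER_MB = 250  # read 1 MB sequentially from memory
DISK_SEEK_MS = 10            # one disk seek


def memory_fetches_per_second(entity_mb):
    """Estimated in-memory fetches/sec for entities of the given size in MB."""
    return 1_000_000 / (entity_mb * MEMORY_READ_US_PER_MB)


def disk_seeks_per_second():
    """Estimated seeks/sec for a single disk."""
    return 1000 / DISK_SEEK_MS


reads = memory_fetches_per_second(1)  # 1 MB entities
writes = disk_seeks_per_second()      # one seek per write
print(reads, writes, reads / writes)  # 4000.0 100.0 40.0
```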






&lt;h2&gt;
  
  
  🧪 Example: Generate Image Results Page of 30 Thumbnails
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ❌ Design 1 – Serial
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Read images one-by-one:&lt;/li&gt;
&lt;li&gt;Each image = disk seek + read 256KB at 30MB/s&lt;/li&gt;
&lt;li&gt;Calculation:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;30 seeks × 10ms = 300ms
30 × (256KB / 30MB/s) = 250ms
→ Total: 300 + 250 = 550ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ✅ Design 2 – Parallel
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Issue reads in parallel.&lt;/li&gt;
&lt;li&gt;Calculation:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1 seek = 10ms
Read 256KB / 30MB/s ≈ 8.3ms
→ Total: 10 + 8.3 = ~18.3ms
- Expect variance in real world: ~30–60ms range
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
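&lt;p&gt;Both designs can be checked with a few lines of arithmetic. This sketch takes 1 MB as 1024 KB, which gives about 8.3 ms per 256 KB read (with 1 MB as 1000 KB it is closer to 8.5 ms). It also assumes the parallel design has enough independent disks that all 30 reads overlap completely, which is the optimistic case.&lt;/p&gt;

```python
SEEK_MS = 10        # disk seek, from the latency table
DISK_MB_PER_S = 30  # sequential disk read rate, from the latency table
IMAGE_KB = 256
IMAGES = 30

# Time to read one image once the disk head is positioned.
read_ms = IMAGE_KB / 1024 / DISK_MB_PER_S * 1000

serial_ms = IMAGES * (SEEK_MS + read_ms)  # one disk, one image after another
parallel_ms = SEEK_MS + read_ms           # all reads overlap on separate disks

print(f"per-image read: {read_ms:.1f} ms")
print(f"serial:   {serial_ms:.0f} ms")
print(f"parallel: {parallel_ms:.1f} ms")
```

&lt;p&gt;The serial design comes out around 550 ms and the parallel one around 18 ms, a gap of roughly 30x before any real-world variance is added.&lt;/p&gt;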






&lt;h2&gt;
  
  
  🧠 Simplified Mental Models
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Insight&lt;/th&gt;
&lt;th&gt;What It Means (Simplified)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;💾 &lt;strong&gt;Disk is super slow&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Like walking to the garage. You don’t want to do this often.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🧠 &lt;strong&gt;RAM is much faster than disk&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Like grabbing from your desk instead of walking to the cabinet.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⚡ &lt;strong&gt;CPU is rarely the bottleneck&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Your processor is fast. If your system is slow, it’s usually not the CPU’s fault.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔁 &lt;strong&gt;Cache is insanely fast&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Think of L1/L2 cache like stuff in your pocket — instant access.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🌐 &lt;strong&gt;Network trips are expensive&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Talking to another datacenter is like mailing a letter to Europe. Avoid it.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔃 &lt;strong&gt;Batching is your friend&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Instead of reading 1 comment at a time, grab 100 at once.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🧵 &lt;strong&gt;Avoid shared locks&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Waiting for someone to unlock the bathroom wastes time.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;📦 &lt;strong&gt;Design for locality&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Keep data close to where it’s processed — like keeping your tools nearby.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"Cache beats RAM. RAM beats disk. Disk is lava. Network is long-distance love."&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🧠 Conclusion
&lt;/h2&gt;

&lt;p&gt;Back-of-the-envelope calculations won’t give you perfect answers, but they give you &lt;em&gt;fast&lt;/em&gt;, &lt;em&gt;approximate&lt;/em&gt; answers. That’s often all you need to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Avoid wasteful engineering&lt;/li&gt;
&lt;li&gt;Identify bottlenecks early&lt;/li&gt;
&lt;li&gt;Make sound architecture decisions without building the wrong thing first&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Before building that real-time dashboard or scaling out another microservice, ask yourself:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Did I run the numbers? Even roughly?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You might just save yourself days of debugging.&lt;/p&gt;




&lt;h2&gt;
  
  
  📚 References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://highscalability.com/google-pro-tip-use-back-of-the-envelope-calculations-to-choo/" rel="noopener noreferrer"&gt;Google Pro Tip: Back-of-the-Envelope Calculations&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>systemdesign</category>
      <category>scalability</category>
      <category>backend</category>
      <category>performance</category>
    </item>
    <item>
      <title>Moving Fast and Safely: Lessons from Scaling Tech Organizations</title>
      <dc:creator>Aditya satrio nugroho</dc:creator>
      <pubDate>Thu, 10 Apr 2025 06:54:18 +0000</pubDate>
      <link>https://dev.to/adityasatrio/moving-fast-and-safely-lessons-from-scaling-tech-organizations-2l16</link>
      <guid>https://dev.to/adityasatrio/moving-fast-and-safely-lessons-from-scaling-tech-organizations-2l16</guid>
      <description>&lt;p&gt;&lt;em&gt;"Why scaling startups need more than just lean practices to survive and thrive."&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Scaling a tech organization in the financial industry, particularly in sensitive domains like stocks and crypto, introduces unique challenges. What works for a 10-person startup no longer holds when the team grows to 80+ engineers. While lean principles foster agility early on, &lt;strong&gt;structure and governance&lt;/strong&gt; become critical for sustainable growth.&lt;/p&gt;

&lt;p&gt;This article draws heavily from the &lt;strong&gt;Team Topologies&lt;/strong&gt; framework by Matthew Skelton and Manuel Pais, supported by &lt;strong&gt;Cognitive Load Theory&lt;/strong&gt;, and validated by real-world scaling practices from &lt;strong&gt;Amazon, Spotify, and Google&lt;/strong&gt;. Together, these references form the academic and practical foundation for our approach.&lt;/p&gt;

&lt;p&gt;We explore the typical scaling pains, diagnose the root causes behind them, and outline a step-by-step guide to solving these issues.&lt;/p&gt;




&lt;h2&gt;
  
  
  Startup Growth Pains: From Lean Beginnings to Structured Necessity
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;You are no longer a "startup" at 80 engineers. You are a mid-sized tech company.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In the early stages, small teams rely on flexibility: informal communication, rapid decisions, and blurred responsibilities. However, as the team size grows, these strengths become liabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Delivery slows due to coordination overhead.&lt;/li&gt;
&lt;li&gt;Infrastructure strains under increasing demand.&lt;/li&gt;
&lt;li&gt;Internal politics and confusion rise.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Lean practices help small teams move fast, but beyond a certain scale, &lt;strong&gt;intentional structure&lt;/strong&gt; must complement speed. Without it, chaos, instability, and organizational mistrust set in.&lt;/p&gt;

&lt;h3&gt;
  
  
  Team Size and Scaling Needs
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Team Size&lt;/th&gt;
&lt;th&gt;Typical State&lt;/th&gt;
&lt;th&gt;Scaling Requirement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1-10 engineers&lt;/td&gt;
&lt;td&gt;Chaos is acceptable&lt;/td&gt;
&lt;td&gt;Maximize flexibility and exploration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10-30 engineers&lt;/td&gt;
&lt;td&gt;Growing pains start&lt;/td&gt;
&lt;td&gt;Light processes, early team ownership&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;30-80 engineers&lt;/td&gt;
&lt;td&gt;Structured chaos&lt;/td&gt;
&lt;td&gt;Formalize team types, start Platform teams&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;80-150 engineers&lt;/td&gt;
&lt;td&gt;Scaling complexity&lt;/td&gt;
&lt;td&gt;Introduce IDP, enforce clear boundaries, governance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;150+ engineers&lt;/td&gt;
&lt;td&gt;Large-scale organization&lt;/td&gt;
&lt;td&gt;Split into Tribes, strong Platform engineering culture&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Our example of &lt;strong&gt;80 engineers&lt;/strong&gt; sits at the boundary between the "Structured chaos" and "Scaling complexity" phases, where clear team structures are already overdue and building an Internal Developer Platform becomes mandatory.&lt;/p&gt;




&lt;p&gt;Big tech companies and academic work have hit exactly the same problem and evolved similar solutions.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Amazon (early 2000s) — “You build it, you run it” with platform guardrails
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt;&lt;br&gt;
Amazon was scaling fast. Developers needed to move fast but infra/security couldn’t let them “touch everything.”&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Split into 2-pizza teams (small, independent Stream-aligned Teams).&lt;/li&gt;
&lt;li&gt;Mandatory self-service platforms (deployment, logging, monitoring).&lt;/li&gt;
&lt;li&gt;No manual infra work: Developers use platforms built by platform teams.&lt;/li&gt;
&lt;li&gt;Teams own their service end-to-end within predefined guardrails.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Quote:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“You build it, you run it. But you run it inside the constraints provided by the central platform.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Impact:&lt;/strong&gt;&lt;br&gt;
Allowed Amazon to scale to thousands of services without losing security, compliance, or control.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Spotify (2012–2014) — “Squads, Tribes, Chapters, Guilds” model
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt;&lt;br&gt;
Spotify was growing fast, and friction between teams was increasing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Squads: Stream-aligned teams (own one part of the product).&lt;/li&gt;
&lt;li&gt;Tribes: Groups of squads working in a related area.&lt;/li&gt;
&lt;li&gt;Chapters: Shared function across squads (e.g., Infra Chapter).&lt;/li&gt;
&lt;li&gt;Guilds: Loose, voluntary knowledge sharing (e.g., Security Guild).&lt;/li&gt;
&lt;li&gt;Platform Teams: Build enabling platforms, not manual ops.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Special Rule:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Infra teams acted as Internal Service Providers.&lt;/li&gt;
&lt;li&gt;Developers self-serve infra through APIs, not by asking infra engineers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Academic Reference:&lt;/strong&gt;&lt;br&gt;
Spotify Engineering Culture, by Henrik Kniberg (official document, referenced globally).&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Google SRE Model — “Error Budgets” and strict production control
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt;&lt;br&gt;
At Google scale, uncoordinated developer changes carry massive risk.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Developers are responsible for code and minor ops.&lt;/li&gt;
&lt;li&gt;SREs own production environment stability.&lt;/li&gt;
&lt;li&gt;Error Budgets: Developers are allowed to break things within acceptable limits. If the budget is exhausted, new releases are paused until reliability is restored.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key idea (paraphrased, in the spirit of the Google SRE book):&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Letting developers deploy freely without accountability is a path to ruin.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Impact:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Developers move fast but inside a mathematically defined safety zone.&lt;/li&gt;
&lt;li&gt;SREs protect core infra and enforce reliability.&lt;/li&gt;
&lt;/ul&gt;
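&lt;p&gt;The "mathematically defined safety zone" can be made concrete with a small calculation. The sketch below assumes the budget is defined as the unavailability allowed by an availability target (an SLO) over a 30-day window. The function names and the deploy rule are hypothetical simplifications, not Google's actual policy.&lt;/p&gt;

```python
def error_budget_minutes(slo, window_days=30):
    """Minutes of allowed unavailability in the window for a given SLO."""
    total_minutes = window_days * 24 * 60
    return (1 - slo) * total_minutes


def can_deploy(slo, downtime_minutes, window_days=30):
    """Hypothetical policy: releases continue only while budget remains."""
    remaining = error_budget_minutes(slo, window_days) - downtime_minutes
    return remaining > 0


budget = error_budget_minutes(0.999)
print(f"99.9% over 30 days allows {budget:.1f} minutes of downtime")
print(can_deploy(0.999, downtime_minutes=20))  # budget left
print(can_deploy(0.999, downtime_minutes=60))  # budget exhausted
```

&lt;p&gt;With a 99.9% target the budget is about 43 minutes per 30 days, so a single hour-long incident is enough to pause releases under this rule.&lt;/p&gt;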

&lt;h2&gt;
  
  
  4. Academic Reference: “Cognitive Load Theory for Software Teams”
&lt;/h2&gt;

&lt;p&gt;(Skelton and Pais, 2019; the same authors as Team Topologies)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thesis:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Developers cannot own too many unrelated concerns at once.&lt;/li&gt;
&lt;li&gt;Infra must be productized into easy-to-use platforms.&lt;/li&gt;
&lt;li&gt;Team boundaries must be designed to optimize flow and minimize handoffs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Their argument is that high cognitive load (developers handling development, infrastructure, and security manually) leads to slower delivery, higher burnout, and a higher incident rate.&lt;/p&gt;




&lt;h2&gt;
  
  
  Diagnosing the Core Problems
&lt;/h2&gt;

&lt;p&gt;The problems faced by growing startups often trace back to fundamental organizational issues:&lt;/p&gt;

&lt;h3&gt;
  
  
  Organizational Maturity Mismatch
&lt;/h3&gt;

&lt;p&gt;Small team behaviors persist even as the organization demands more maturity. Teams lack clear boundaries, and developers are expected to juggle responsibilities across development, infrastructure, operations, and security.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cognitive Load Overload
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Cognitive Load Theory&lt;/strong&gt; teaches that individuals and teams can only handle a limited amount of complexity effectively. When teams handle too many unrelated domains, delivery becomes error-prone and slow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tech Politics: Erosion of Trust
&lt;/h3&gt;

&lt;p&gt;Opaque decision-making processes create mistrust. Engineers begin competing for resources and priorities in an unhealthy way, leading to favoritism and internal alliances.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Root Cause:&lt;/strong&gt; All these issues stem from the absence of deliberate team structures and communication models, a concept central to &lt;strong&gt;Team Topologies&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Principles for Scaling Successfully
&lt;/h2&gt;

&lt;p&gt;Drawing directly from &lt;strong&gt;Team Topologies&lt;/strong&gt; and &lt;strong&gt;Cognitive Load Theory&lt;/strong&gt;, organizations must adopt three core principles:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Design Clear Team Boundaries
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Team Topologies&lt;/strong&gt; prescribes explicit team types to reduce cognitive load and improve flow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stream-aligned Teams:&lt;/strong&gt; Build and run product features end-to-end.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform Teams:&lt;/strong&gt; Create internal platforms that other teams consume.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enabling Teams:&lt;/strong&gt; Help other teams build missing capabilities.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complicated Subsystem Teams:&lt;/strong&gt; Handle highly specialized areas that require deep expertise.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Amazon&lt;/strong&gt; demonstrates this with their internal platform systems. Developers own services completely but operate within strict platform guardrails, minimizing unnecessary complexity.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Build Systems of Trust
&lt;/h3&gt;

&lt;p&gt;To reduce political behavior, decision-making must be transparent and predictable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RFCs (Request for Comments):&lt;/strong&gt; Publicly document and discuss major technical decisions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open Architecture Boards:&lt;/strong&gt; Ensure that decisions are made based on merit, not hierarchy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Public OKRs:&lt;/strong&gt; Make team goals visible and measurable.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Spotify&lt;/strong&gt; applied these principles with their Squads, Tribes, Chapters, and Guilds model. Squads operated independently but within a framework that encouraged transparency and cross-team collaboration.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Empower Developers Inside Guardrails
&lt;/h3&gt;

&lt;p&gt;Developers should have autonomy but within safe, automated boundaries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Infrastructure must be self-service.&lt;/li&gt;
&lt;li&gt;Access must be controlled and audited.&lt;/li&gt;
&lt;li&gt;Guardrails must automate security and compliance.&lt;/li&gt;
&lt;/ul&gt;
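&lt;p&gt;As a rough illustration of what an automated guardrail might look like, the sketch below validates a self-service infrastructure request before it is provisioned. Everything here is hypothetical: the rule set, the field names (&lt;code&gt;region&lt;/code&gt;, &lt;code&gt;tags&lt;/code&gt;, &lt;code&gt;public_access&lt;/code&gt;), and the function &lt;code&gt;check_request&lt;/code&gt; are invented for this example and do not come from any specific platform.&lt;/p&gt;

```python
# Hypothetical guardrail rules; field names and values are illustrative only.
ALLOWED_REGIONS = {"ap-southeast-1", "ap-southeast-3"}
REQUIRED_TAGS = {"owner", "cost-center"}


def check_request(request):
    """Return a list of guardrail violations for a self-service infra request."""
    violations = []
    if request.get("region") not in ALLOWED_REGIONS:
        violations.append("region not allowed")
    missing = REQUIRED_TAGS - set(request.get("tags", {}))
    if missing:
        violations.append("missing tags: " + ", ".join(sorted(missing)))
    if request.get("public_access", False):
        violations.append("public access requires a reviewed exception")
    return violations


good = {"region": "ap-southeast-3", "tags": {"owner": "payments", "cost-center": "42"}}
bad = {"region": "us-east-1", "tags": {"owner": "payments"}, "public_access": True}

print(check_request(good))  # no violations: the request can proceed unattended
print(check_request(bad))   # each violation is reported back to the developer
```

&lt;p&gt;The design choice worth noting is that the check returns every violation at once instead of failing on the first one, so a developer can fix a request in a single pass without asking an infra engineer.&lt;/p&gt;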

&lt;p&gt;&lt;strong&gt;Google&lt;/strong&gt; practices this balance through their Site Reliability Engineering (SRE) model. Developers own their services, but SREs enforce reliability through &lt;strong&gt;Error Budgets&lt;/strong&gt;, aligning freedom with operational excellence.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;If you allow full infra access "for speed" today, you borrow time against massive technical debt and existential risk later.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Allowing unrestricted access for the sake of moving fast might provide short-term gains, but it compromises the long-term stability of the organization. Technical debt accumulates invisibly, security vulnerabilities grow unnoticed, and incident recovery becomes slower. In regulated industries like finance and crypto, these risks aren't just technical — they are existential.&lt;/p&gt;

&lt;p&gt;Building robust guardrails through an Internal Developer Platform protects the organization without throttling developer productivity.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step-by-Step Solution
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Define Proper Team Structures
&lt;/h3&gt;

&lt;p&gt;Clearly establish team types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stream-aligned Teams&lt;/strong&gt; own and deliver complete product features.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform Teams&lt;/strong&gt; abstract and simplify complex infrastructure needs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enabling Teams&lt;/strong&gt; improve capability without owning delivery.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complicated Subsystem Teams&lt;/strong&gt; manage specialized technical areas.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each team has a distinct mission, reducing overlap and conflict. This structure directly follows &lt;strong&gt;Team Topologies&lt;/strong&gt; principles.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Define Developer and Infra Responsibilities
&lt;/h3&gt;

&lt;p&gt;Responsibilities must be split clearly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Developers own application code, deployment pipelines, and monitoring.&lt;/li&gt;
&lt;li&gt;Infra teams provide secured, templatized pipelines, observability tools, and enforced security policies.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This division helps manage cognitive load and supports faster, safer delivery.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Introduce RFCs for Major Changes
&lt;/h3&gt;

&lt;p&gt;Every significant architectural or infrastructural change must go through an RFC process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Written proposals are discussed openly.&lt;/li&gt;
&lt;li&gt;Decisions are transparent and based on technical merit.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This process builds organizational memory and eliminates backchannel decision-making, reinforcing trust systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Leadership Rituals to Maintain Trust
&lt;/h3&gt;

&lt;p&gt;Leadership must reinforce trust continuously:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Weekly Leads Meetings&lt;/strong&gt; ensure alignment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Public OKRs&lt;/strong&gt; make priorities clear.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rotating Architecture Review Boards&lt;/strong&gt; distribute authority and expertise fairly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These rituals align with building transparent, predictable decision-making systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Build an Internal Developer Platform (IDP)
&lt;/h3&gt;

&lt;p&gt;An Internal Developer Platform provides the foundation for developer autonomy without sacrificing safety. It must include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure as Code:&lt;/strong&gt; Tools like Terraform and Pulumi to create pre-approved, self-service modules.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitOps Deployments:&lt;/strong&gt; Tools like ArgoCD or FluxCD automate deployment through Git workflows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-Service Portal:&lt;/strong&gt; Platforms like Backstage allow developers to launch services, view documentation, and manage their environments easily.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secrets Management:&lt;/strong&gt; Vault or AWS Secrets Manager centralizes secret handling and improves security.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability:&lt;/strong&gt; Prometheus and Grafana provide monitoring and alerting out-of-the-box.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Incident Management:&lt;/strong&gt; Slack integrations with Alertmanager or tools like PagerDuty enable professional on-call rotations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Minimal Viable Stack Recommendation:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Tool Choices&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Infrastructure&lt;/td&gt;
&lt;td&gt;Terraform + Atlantis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployments&lt;/td&gt;
&lt;td&gt;GitHub Actions + ArgoCD&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Portal&lt;/td&gt;
&lt;td&gt;Backstage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Secrets&lt;/td&gt;
&lt;td&gt;Vault or AWS Secrets Manager&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Observability&lt;/td&gt;
&lt;td&gt;Prometheus + Grafana&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Incident Management&lt;/td&gt;
&lt;td&gt;Slack + Alertmanager or PagerDuty&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Building this platform aligns with Team Topologies' goal of enabling fast, secure, and independent delivery.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The desire for developers to "own everything end-to-end" is natural but risky when scaling in regulated industries. True ownership must happen &lt;em&gt;within well-designed systems&lt;/em&gt; that balance speed, safety, and organizational trust.&lt;/p&gt;

&lt;p&gt;The principles presented here are deeply grounded in the &lt;strong&gt;Team Topologies&lt;/strong&gt; framework and &lt;strong&gt;Cognitive Load Theory&lt;/strong&gt;, and validated by real-world practices at &lt;strong&gt;Amazon&lt;/strong&gt;, &lt;strong&gt;Spotify&lt;/strong&gt;, and &lt;strong&gt;Google&lt;/strong&gt;. These references provide a solid foundation for any growing tech organization to scale successfully.&lt;/p&gt;

&lt;p&gt;By applying structured team models, building internal platforms, and fostering trust through transparent processes, financial tech startups can achieve sustainable, scalable growth without chaos.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Build systems, not heroes. Move fast, but move safely.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Appendix: References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Team Topologies&lt;/em&gt; by Matthew Skelton and Manuel Pais&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Cognitive Load Theory in Software Engineering&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Google SRE Book&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Spotify Engineering Culture&lt;/em&gt; (Henrik Kniberg)&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Amazon Leadership Principles&lt;/em&gt; and Werner Vogels' "You build it, you run it" (ACM Queue interview, 2006)&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Ruth Malan: Thoughts on Systems and Architecture&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>Change Data Capture (CDC) in Modern Systems: Pros, Cons, and Alternatives</title>
      <dc:creator>Aditya satrio nugroho</dc:creator>
      <pubDate>Sun, 30 Mar 2025 16:08:16 +0000</pubDate>
      <link>https://dev.to/adityasatrio/change-data-capture-cdc-in-modern-systems-pros-cons-and-alternatives-2dee</link>
      <guid>https://dev.to/adityasatrio/change-data-capture-cdc-in-modern-systems-pros-cons-and-alternatives-2dee</guid>
      <description>&lt;p&gt;Change Data Capture (CDC) is a powerful technique used to track and react to data changes in real time. As modern systems lean more heavily into real-time data flows, microservices, and event-driven architectures, CDC has become a key strategy for syncing data across services, feeding analytics pipelines, and enabling responsiveness without overloading source databases.&lt;/p&gt;




&lt;h3&gt;
  
  
  What is CDC?
&lt;/h3&gt;

&lt;p&gt;CDC refers to the process of identifying and capturing changes (INSERT, UPDATE, DELETE) in a data source, typically a relational database, and propagating those changes to downstream consumers like data lakes, caches, search indexes, or microservices.&lt;/p&gt;

&lt;h4&gt;
  
  
  Types of CDC:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Log-based&lt;/strong&gt;: Taps into database transaction logs (e.g., MySQL binlog, PostgreSQL WAL). Tools: Debezium, AWS DMS.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trigger-based&lt;/strong&gt;: Uses SQL triggers to write changes to an audit or events table.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timestamp/version-based&lt;/strong&gt;: Uses columns like &lt;code&gt;updated_at&lt;/code&gt; to query for changes during polling.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;: Debezium listens to PostgreSQL's WAL and emits changes to Kafka topics, which are then consumed by services or streamed to BigQuery.&lt;/p&gt;
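&lt;p&gt;To make the consumer side concrete, here is a minimal sketch of applying change events to a local copy of a table. It assumes a simplified, Debezium-style envelope with only &lt;code&gt;op&lt;/code&gt;, &lt;code&gt;before&lt;/code&gt;, and &lt;code&gt;after&lt;/code&gt; fields; real events carry more metadata and may be wrapped differently depending on converter settings. The &lt;code&gt;inventory&lt;/code&gt; rows and the &lt;code&gt;apply_change_event&lt;/code&gt; helper are hypothetical.&lt;/p&gt;

```python
from typing import Any, Dict, Optional

Row = Dict[str, Any]


def apply_change_event(replica: Dict[int, Row], event: Dict[str, Any]) -> None:
    """Apply one change event to an in-memory replica keyed by primary key."""
    op = event["op"]  # "c" = create, "u" = update, "d" = delete, "r" = snapshot read
    before: Optional[Row] = event.get("before")
    after: Optional[Row] = event.get("after")

    if op in ("c", "u", "r"):
        # Inserts, updates, and snapshot reads all carry the new row state.
        replica[after["id"]] = after
    elif op == "d":
        # Deletes only carry the old row state.
        replica.pop(before["id"], None)
    else:
        raise ValueError(f"unknown op: {op!r}")


# Hand-written sample events for a hypothetical "inventory" table.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "sku": "A-1", "qty": 10}},
    {"op": "u", "before": {"id": 1, "sku": "A-1", "qty": 10},
     "after": {"id": 1, "sku": "A-1", "qty": 7}},
    {"op": "c", "before": None, "after": {"id": 2, "sku": "B-2", "qty": 3}},
    {"op": "d", "before": {"id": 2, "sku": "B-2", "qty": 3}, "after": None},
]

replica: Dict[int, Row] = {}
for event in events:
    apply_change_event(replica, event)

print(replica)  # {1: {'id': 1, 'sku': 'A-1', 'qty': 7}}
```

&lt;p&gt;The same loop works whether the target is a cache, a search index, or a warehouse table; only the write inside each branch changes.&lt;/p&gt;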




&lt;h3&gt;
  
  
  Benefits of Using CDC
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Near real-time updates&lt;/strong&gt;: Data pipelines become reactive, not batch-driven.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decoupling&lt;/strong&gt;: Source systems remain focused on core responsibilities.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event-driven support&lt;/strong&gt;: Downstream systems can respond to events as they happen.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Less DB strain&lt;/strong&gt;: Avoids heavy polling logic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit/history capabilities&lt;/strong&gt;: Replaying and inspecting changes becomes easier.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;: Syncing inventory updates from a MySQL database into Elasticsearch via CDC keeps the search index within seconds of the source, as long as the connector keeps up.&lt;/p&gt;




&lt;h3&gt;
  
  
  Drawbacks of CDC
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Operational complexity&lt;/strong&gt;: Needs connector management, offset handling, and monitoring.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema evolution fragility&lt;/strong&gt;: Renames, drops, and type changes can break consumers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency and ordering challenges&lt;/strong&gt;: Events can arrive out of order or late in high-throughput systems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data loss or duplication&lt;/strong&gt;: Misconfigured offsets or restarts can cause inconsistencies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security/access&lt;/strong&gt;: Log-based CDC often needs elevated DB privileges (e.g., replication access).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance impact&lt;/strong&gt;: Trigger-based CDC increases write latency and can introduce locks.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Common Pitfalls:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Log rotation without connector sync&lt;/strong&gt;: If your database rotates or purges logs before the CDC connector has consumed them, you may lose change events. For example, MySQL binlogs may expire and be deleted before Debezium catches up.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Missing schema registry&lt;/strong&gt;: If you're sending CDC data (especially via Kafka) without a schema registry, changes like renaming fields or adding new ones can break downstream consumers expecting the old structure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Offset mismanagement&lt;/strong&gt;: CDC tools track how far they've read through the change log using offsets. If offsets are lost or incorrectly restored after a restart, the system may reprocess changes (duplicates) or skip them entirely.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backpressure issues&lt;/strong&gt;: In high-throughput systems, if consumers are slow, buffers fill up and connectors fall behind. This can lead to data lag, system crashes, or inconsistent sync.&lt;/li&gt;
&lt;/ul&gt;
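&lt;p&gt;One common defense against replayed events after an offset rewind is an idempotent consumer. The sketch below is an illustration, not a production design: it assumes each record carries a partition and a monotonically increasing offset, and the &lt;code&gt;IdempotentConsumer&lt;/code&gt; class is hypothetical. In a real system the last applied offset would need to be stored durably, ideally in the same transaction as the side effect.&lt;/p&gt;

```python
from typing import Callable, Dict, Iterable, Tuple

# (partition, offset, payload): a simplified stand-in for a log record.
Record = Tuple[int, int, str]


class IdempotentConsumer:
    """Applies each record at most once by tracking the last applied offset."""

    def __init__(self, handler: Callable[[str], None]) -> None:
        self.handler = handler
        self.last_applied: Dict[int, int] = {}  # partition -> highest applied offset

    def consume(self, records: Iterable[Record]) -> int:
        applied = 0
        for partition, offset, payload in records:
            if offset > self.last_applied.get(partition, -1):
                self.handler(payload)
                self.last_applied[partition] = offset
                applied += 1
            # Otherwise the record is a replay of something already applied.
        return applied


seen = []
consumer = IdempotentConsumer(seen.append)

consumer.consume([(0, 0, "insert A"), (0, 1, "update A")])
# Simulate a restart that rewinds the connector by one record.
consumer.consume([(0, 1, "update A"), (0, 2, "delete A")])

print(seen)  # ['insert A', 'update A', 'delete A']
```

&lt;p&gt;This handles duplicates only. Skipped events (a gap in offsets) still need separate detection, for example by alerting when the next offset is not the expected one.&lt;/p&gt;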




&lt;h3&gt;
  
  
  Alternatives to CDC
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. Polling
&lt;/h4&gt;

&lt;p&gt;Querying tables periodically for changes using timestamps.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pros&lt;/strong&gt;: Simple, no DB internals required&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cons&lt;/strong&gt;: High latency, risk of missing updates&lt;/li&gt;
&lt;/ul&gt;
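&lt;p&gt;A small sketch makes the "risk of missing updates" concrete. It uses an in-memory SQLite table with an integer &lt;code&gt;updated_at&lt;/code&gt; column as a stand-in for a real timestamp; the &lt;code&gt;items&lt;/code&gt; table and the &lt;code&gt;poll_changes&lt;/code&gt; helper are assumptions made for illustration.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)")


def poll_changes(conn, last_seen):
    """Return rows changed after last_seen, plus the new watermark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM items "
        "WHERE updated_at > ? ORDER BY updated_at, id",
        (last_seen,),
    ).fetchall()
    new_watermark = rows[-1][2] if rows else last_seen
    return rows, new_watermark


# t=1 and t=2: two writes to the same row before the first poll.
conn.execute("INSERT INTO items VALUES (1, 'draft', 1)")
conn.execute("UPDATE items SET name = 'final', updated_at = 2 WHERE id = 1")
conn.execute("INSERT INTO items VALUES (2, 'temp', 2)")

first_batch, watermark = poll_changes(conn, 0)
print(first_batch)  # [(1, 'final', 2), (2, 'temp', 2)]

# t=3: a hard delete leaves no row behind for the next poll to find.
conn.execute("DELETE FROM items WHERE id = 2")
second_batch, watermark = poll_changes(conn, watermark)
print(second_batch)  # []
```

&lt;p&gt;The poller never sees the intermediate &lt;code&gt;draft&lt;/code&gt; state or the delete. A strict "greater than" watermark can also skip rows that commit later with a timestamp equal to the watermark, which is why polling setups often add soft-delete flags and an overlap window.&lt;/p&gt;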

&lt;h4&gt;
  
  
  2. Database Triggers
&lt;/h4&gt;

&lt;p&gt;Triggers record changes into separate tables.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pros&lt;/strong&gt;: Real-time-ish, customizable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cons&lt;/strong&gt;: Adds DB load, brittle, hard to scale&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  3. Event Sourcing
&lt;/h4&gt;

&lt;p&gt;The application persists domain events as the source of truth and derives current state from them, instead of only updating rows in place.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pros&lt;/strong&gt;: Full audit trail, state can be rebuilt by replaying events&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cons&lt;/strong&gt;: High complexity, requires redesign&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  4. Dual Writes
&lt;/h4&gt;

&lt;p&gt;App writes to DB and queue (e.g., Kafka) at the same time.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pros&lt;/strong&gt;: Simple to start&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cons&lt;/strong&gt;: Prone to inconsistency, needs idempotency&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  5. Transactional Outbox Pattern
&lt;/h4&gt;

&lt;p&gt;The app writes the business change and an outbox row in the same DB transaction; a relay service then reads the outbox table and publishes the events.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pros&lt;/strong&gt;: Reliable, atomic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cons&lt;/strong&gt;: Extra infra, slight delay&lt;/li&gt;
&lt;/ul&gt;
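&lt;p&gt;The pattern can be sketched in a few lines. This version uses SQLite and a plain callback in place of a real message broker; the &lt;code&gt;orders&lt;/code&gt; and &lt;code&gt;outbox&lt;/code&gt; tables, &lt;code&gt;place_order&lt;/code&gt;, and &lt;code&gt;relay_once&lt;/code&gt; are all hypothetical names chosen for the example.&lt;/p&gt;

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "event_type TEXT, payload TEXT, published INTEGER DEFAULT 0)"
)


def place_order(conn, order_id):
    """Write the business row and its event in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT INTO orders VALUES (?, 'placed')", (order_id,))
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("order_placed", json.dumps({"order_id": order_id})),
        )


def relay_once(conn, publish):
    """Publish pending outbox rows in insertion order, then mark them as sent."""
    pending = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in pending:
        publish(event_type, json.loads(payload))  # e.g. a Kafka producer call
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(pending)


sent = []
place_order(conn, 42)
relay_once(conn, lambda event_type, payload: sent.append((event_type, payload)))
print(sent)  # [('order_placed', {'order_id': 42})]
```

&lt;p&gt;If the relay crashes after publishing but before marking the row, the event is sent again on the next run. The outbox therefore gives at-least-once delivery, and consumers still need to tolerate duplicates.&lt;/p&gt;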




&lt;h3&gt;
  
  
  Tooling Comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Tooling Example&lt;/th&gt;
&lt;th&gt;Infra Complexity&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Scalability&lt;/th&gt;
&lt;th&gt;Maturity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Log-based CDC&lt;/td&gt;
&lt;td&gt;Debezium, AWS DMS&lt;/td&gt;
&lt;td&gt;Medium to High&lt;/td&gt;
&lt;td&gt;Medium–High&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Mature&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trigger-based&lt;/td&gt;
&lt;td&gt;Custom SQL Triggers&lt;/td&gt;
&lt;td&gt;Low to Medium&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Polling&lt;/td&gt;
&lt;td&gt;Custom cron/schedulers&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Mature&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Event Sourcing&lt;/td&gt;
&lt;td&gt;Kafka, Axon Framework&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Mature&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Transactional Outbox&lt;/td&gt;
&lt;td&gt;Kafka + relay service&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Proven&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Cloud vs Open-source Considerations:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS DMS and Google Datastream are managed, easy to set up but more expensive.&lt;/li&gt;
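&lt;li&gt;Debezium is free and open source but is typically run on Kafka Connect, which means operating a Kafka cluster (plus ZooKeeper on older, pre-KRaft Kafka versions) and ongoing ops work.&lt;/li&gt;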
&lt;/ul&gt;




&lt;h3&gt;
  
  
  When to Use CDC vs Alternatives
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Recommended Approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Real-time analytics&lt;/td&gt;
&lt;td&gt;CDC (polling only if minute-level staleness is acceptable)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microservices sync&lt;/td&gt;
&lt;td&gt;Outbox or CDC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cache invalidation&lt;/td&gt;
&lt;td&gt;Dual write or CDC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audit/history logging&lt;/td&gt;
&lt;td&gt;Event sourcing or CDC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Event-driven orchestration&lt;/td&gt;
&lt;td&gt;Event sourcing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Choose based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Team maturity&lt;/strong&gt;: Infra, Kafka, observability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data sensitivity&lt;/strong&gt;: Can you tolerate duplicates/loss?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency requirements&lt;/strong&gt;: ms vs seconds vs batch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complexity budget&lt;/strong&gt;: Is the benefit worth the effort?&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Data Consistency and Integrity Considerations
&lt;/h4&gt;

&lt;p&gt;Your choice of strategy has a direct impact on data consistency and integrity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dual writes&lt;/strong&gt; without transactional guarantees can lead to mismatched states between your DB and event consumers if one write succeeds but the other fails.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Polling&lt;/strong&gt; only sees the latest state of each row, so it misses intermediate values when a row is updated multiple times between intervals, and it cannot see hard deletes at all.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trigger-based CDC&lt;/strong&gt; can silently stop capturing events if triggers are disabled or dropped, or if permissions/configurations change.&lt;/li&gt;
&lt;li&gt;
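&lt;strong&gt;CDC with proper offset tracking and delivery guarantees&lt;/strong&gt; (such as Kafka's exactly-once semantics, where configured end to end) offers higher consistency but demands stronger infrastructure. Many log-based pipelines are at-least-once by default, so consumers should be idempotent.&lt;/li&gt;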
&lt;li&gt;
&lt;strong&gt;Transactional Outbox&lt;/strong&gt; ensures atomicity between the DB change and the emitted event, making it one of the most reliable methods when done correctly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Always evaluate the &lt;strong&gt;failure modes&lt;/strong&gt; of your strategy—what happens when a component crashes, restarts, or loses network—and choose tools that give you the right trade-offs between consistency, complexity, and performance.&lt;/p&gt;




&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;CDC is a powerful pattern to enable reactive and event-driven systems with minimal impact on source DBs. However, it's not a one-size-fits-all solution. Consider operational complexity, data criticality, and your system's maturity before choosing it over simpler polling or more robust outbox/event sourcing models. Thoughtful architecture always beats chasing trends.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Tech Pillars vs. Metrics: Foundations of a Technology Engineering Organization</title>
      <dc:creator>Aditya satrio nugroho</dc:creator>
      <pubDate>Sun, 30 Mar 2025 15:51:20 +0000</pubDate>
      <link>https://dev.to/adityasatrio/tech-pillars-vs-metrics-foundations-of-a-technology-engineering-organization-1e1j</link>
      <guid>https://dev.to/adityasatrio/tech-pillars-vs-metrics-foundations-of-a-technology-engineering-organization-1e1j</guid>
      <description>&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;As engineering leaders, we often operate in high-velocity, high-uncertainty environments. Our teams are shipping fast, but are we improving sustainably? Without clearly defining the strategic priorities (pillars) and measuring them effectively (metrics), we risk optimizing the wrong things—and building tech that's brittle, expensive, or misaligned with business goals.&lt;/p&gt;

&lt;p&gt;This article serves as a framework to build a structured, metrics-driven engineering culture rooted in clear, strategic pillars. It's written for tech leadership who want to scale with intention.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. What Are Tech Pillars?
&lt;/h2&gt;

&lt;p&gt;Tech pillars are &lt;strong&gt;non-negotiable strategic truths&lt;/strong&gt; for your engineering organization. They represent &lt;strong&gt;core domains of focus&lt;/strong&gt; that support the long-term viability, performance, and alignment of the tech org.&lt;/p&gt;

&lt;p&gt;They aren't measured directly, but instead &lt;strong&gt;guide decision-making, investment, and cultural behaviors&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example Tech Pillars:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;System Reliability &amp;amp; Observability&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Engineering Productivity &amp;amp; Developer Experience (DX)&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cost Efficiency &amp;amp; Resource Optimization&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Performance Optimization&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Quality, Stability &amp;amp; Security&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Innovation &amp;amp; Technical Growth&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cross-Team Collaboration &amp;amp; Alignment&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Analogy:&lt;/strong&gt; Think of pillars like the foundation of a building. You don’t measure the pillar itself; you check for cracks in the walls, leaks in the ceiling—the symptoms of a failing pillar.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reference:&lt;/strong&gt; Forsgren et al. (2018, &lt;em&gt;Accelerate&lt;/em&gt;) describe these as "capability domains" predictive of software delivery and organizational performance.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  2. What Are Metrics?
&lt;/h2&gt;

&lt;p&gt;Metrics are the &lt;strong&gt;gauges, dials, and warning lights&lt;/strong&gt; of your engineering organization. They provide &lt;strong&gt;quantifiable feedback&lt;/strong&gt; on how well you’re upholding each pillar.&lt;/p&gt;

&lt;p&gt;Metrics are only meaningful &lt;strong&gt;when tied to a strategic context&lt;/strong&gt;. Measuring uptime means little unless it serves your Reliability pillar. Tracking PR merge time without pairing it with review quality might be counterproductive.&lt;/p&gt;

&lt;h3&gt;
  
  
  Examples of Metrics (Grouped by Pillar):
&lt;/h3&gt;

&lt;h4&gt;
  
  
  System Reliability &amp;amp; Observability
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Uptime %&lt;/li&gt;
&lt;li&gt;Mean Time to Detect (MTTD)&lt;/li&gt;
&lt;li&gt;Mean Time to Recovery (MTTR)&lt;/li&gt;
&lt;li&gt;Incident Recurrence Rate&lt;/li&gt;
&lt;/ul&gt;
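&lt;p&gt;As a rough illustration, these reliability metrics can be derived from a simple incident log. The sketch assumes MTTR is measured from the start of the fault to recovery (some teams measure from detection instead), that incidents do not overlap, and that the list is non-empty; the &lt;code&gt;Incident&lt;/code&gt; record and the sample data are hypothetical.&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Incident:
    started: datetime   # when the fault began
    detected: datetime  # when an alert or a person noticed it
    resolved: datetime  # when service was restored


def mean_minutes(durations: List[timedelta]) -> float:
    return sum(d.total_seconds() for d in durations) / len(durations) / 60


def reliability_summary(incidents: List[Incident], window: timedelta) -> dict:
    # Total downtime is the sum of each incident's start-to-recovery duration.
    downtime = sum((i.resolved - i.started for i in incidents), timedelta())
    return {
        "mttd_min": mean_minutes([i.detected - i.started for i in incidents]),
        "mttr_min": mean_minutes([i.resolved - i.started for i in incidents]),
        "uptime_pct": 100 * (1 - downtime / window),
    }


incidents = [
    Incident(datetime(2025, 3, 3, 10, 0), datetime(2025, 3, 3, 10, 5),
             datetime(2025, 3, 3, 10, 35)),
    Incident(datetime(2025, 3, 17, 2, 0), datetime(2025, 3, 17, 2, 15),
             datetime(2025, 3, 17, 3, 25)),
]

summary = reliability_summary(incidents, window=timedelta(days=30))
print({name: round(value, 2) for name, value in summary.items()})
# {'mttd_min': 10.0, 'mttr_min': 60.0, 'uptime_pct': 99.72}
```

&lt;p&gt;Two incidents barely move the uptime figure, while MTTD and MTTR show how long each one actually hurt. That is the reason to track them side by side.&lt;/p&gt;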

&lt;h4&gt;
  
  
  Engineering Productivity &amp;amp; DX
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Cycle Time&lt;/li&gt;
&lt;li&gt;Lead Time for Changes&lt;/li&gt;
&lt;li&gt;PR Review Time&lt;/li&gt;
&lt;li&gt;Deployment Frequency&lt;/li&gt;
&lt;li&gt;Developer Satisfaction Score&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Cost Efficiency &amp;amp; Resource Optimization
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Cost per User Session&lt;/li&gt;
&lt;li&gt;Cost per API Hit&lt;/li&gt;
&lt;li&gt;Infra Utilization %&lt;/li&gt;
&lt;li&gt;Cloud Spend per Feature&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Performance Optimization
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;P95 API Latency&lt;/li&gt;
&lt;li&gt;Backend Build Time&lt;/li&gt;
&lt;li&gt;App Startup Time&lt;/li&gt;
&lt;li&gt;Cache Hit Ratio&lt;/li&gt;
&lt;li&gt;Database Query Latency&lt;/li&gt;
&lt;/ul&gt;
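&lt;p&gt;Percentile metrics such as P95 are worth a quick worked example, because they behave very differently from averages. The sketch below uses the nearest-rank method, which is one of several percentile definitions (monitoring tools often interpolate or estimate from histograms instead); the latency samples are made up.&lt;/p&gt;

```python
import math
from typing import List


def percentile_nearest_rank(samples: List[float], pct: int) -> float:
    """Smallest sample such that at least pct percent of samples are at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# 20 hypothetical request latencies in milliseconds: 18 fast, 2 slow.
latencies_ms = [20.0] * 18 + [400.0, 900.0]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p95_ms = percentile_nearest_rank(latencies_ms, 95)
p99_ms = percentile_nearest_rank(latencies_ms, 99)

print(mean_ms, p95_ms, p99_ms)  # 83.0 400.0 900.0
```

&lt;p&gt;The mean of 83 ms describes no real request here: most users see 20 ms and a few see 400 ms or worse. Tail percentiles surface that second group.&lt;/p&gt;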

&lt;h4&gt;
  
  
  Quality, Stability &amp;amp; Security
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Code Coverage&lt;/li&gt;
&lt;li&gt;Hotfix Rate&lt;/li&gt;
&lt;li&gt;Rollback Rate&lt;/li&gt;
&lt;li&gt;Checkout (CO) Success Rate&lt;/li&gt;
&lt;li&gt;SonarQube Score&lt;/li&gt;
&lt;li&gt;Security Issue Resolution Ratio&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Innovation &amp;amp; Growth
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;R&amp;amp;D Initiative Completion Rate&lt;/li&gt;
&lt;li&gt;New Tech Adoption %&lt;/li&gt;
&lt;li&gt;Internal Tech Talks per Quarter&lt;/li&gt;
&lt;li&gt;Training Hours per Engineer&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Cross-Team Collaboration
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Conflict Resolution Time&lt;/li&gt;
&lt;li&gt;Tech-Biz OKR Alignment Score&lt;/li&gt;
&lt;li&gt;Engineering Contributions to Shared Goals&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  3. Pillars vs. Metrics: A Simple Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Pillar&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Metric Example&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Reliability&lt;/td&gt;
&lt;td&gt;Uptime %, MTTR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Productivity &amp;amp; DX&lt;/td&gt;
&lt;td&gt;Cycle Time, PR Merge Time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost Efficiency&lt;/td&gt;
&lt;td&gt;Cost per Session, Infra Utilization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Performance&lt;/td&gt;
&lt;td&gt;API Latency, App Startup Time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Quality &amp;amp; Security&lt;/td&gt;
&lt;td&gt;Code Coverage, Bug Leakage Rate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Innovation&lt;/td&gt;
&lt;td&gt;R&amp;amp;D Completion Rate, Tech Talks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Collaboration&lt;/td&gt;
&lt;td&gt;OKR Alignment, Conflict Resolution Time&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  4. Anti-Patterns: What Happens When You Confuse the Two
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Example 1:
&lt;/h3&gt;

&lt;p&gt;You obsess over uptime, but ignore MTTR. The result? Systems stay up—until they don’t. When they crash, it takes hours to recover. You’re measuring the wrong thing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: Track both Uptime (proactive) and MTTR (reactive) under the Reliability pillar.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example 2:
&lt;/h3&gt;

&lt;p&gt;You optimize for PR merge speed without context. Review quality plummets, bugs leak to prod, and velocity backfires.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: Balance speed (Cycle Time) with review quality or test coverage.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example 3:
&lt;/h3&gt;

&lt;p&gt;You track too many unaligned metrics. Your dashboard is impressive but meaningless. Engineers feel overwhelmed, not empowered.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: Anchor every metric to a pillar and business goal.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. How to Structure &amp;amp; Operationalize
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Define Your Pillars
&lt;/h3&gt;

&lt;p&gt;Use company objectives, system retros, and org pain points. Don’t copy-paste from others. Your pillars should reflect your context.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Map Metrics to Each Pillar
&lt;/h3&gt;

&lt;p&gt;2–5 meaningful metrics per pillar is a good starting point. Less is more, especially early on.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Assign Ownership
&lt;/h3&gt;

&lt;p&gt;Each pillar should have a driver (e.g., EM, Tech Lead) accountable for continuous improvement and reporting.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Embed in the Process
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Use pillars in OKRs&lt;/li&gt;
&lt;li&gt;Mention them in sprint reviews&lt;/li&gt;
&lt;li&gt;Include pillar health in quarterly business reviews&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 5: Review and Adapt
&lt;/h3&gt;

&lt;p&gt;Pillars rarely change. Metrics do. Track monthly, reflect quarterly, refine yearly.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Practical Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Case A: Performance Regression on Checkout
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pillar&lt;/strong&gt;: Performance Optimization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metric&lt;/strong&gt;: P95 latency, checkout (CO) success rate&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Action&lt;/strong&gt;: Profile APIs, introduce pre-warming or caching&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Case B: Developer Burnout from Delivery Pressure
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pillar&lt;/strong&gt;: Productivity &amp;amp; DX&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metric&lt;/strong&gt;: PR Cycle Time, Dev Satisfaction Survey&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Action&lt;/strong&gt;: Automate boilerplate, reduce context switching, enable focus blocks&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Case C: Infra Costs Spike with Flat Traffic
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pillar&lt;/strong&gt;: Cost Efficiency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metric&lt;/strong&gt;: Cost per Session, Infra Utilization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Action&lt;/strong&gt;: Analyze low-efficiency services, right-size instances, optimize autoscaling&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  7. Templates &amp;amp; Playbooks
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Quarterly Pillar Review Template:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pillar Health (Red/Yellow/Green)&lt;/li&gt;
&lt;li&gt;Key Metrics&lt;/li&gt;
&lt;li&gt;Trends vs. Last Quarter&lt;/li&gt;
&lt;li&gt;Upcoming Initiatives&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Engineering Metric Audit Checklist:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is this metric still relevant?&lt;/li&gt;
&lt;li&gt;Is it tied to a pillar?&lt;/li&gt;
&lt;li&gt;Is someone accountable for it?&lt;/li&gt;
&lt;li&gt;Are we acting on it regularly?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Starter Pillar Set for Scaling Teams:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Reliability&lt;/li&gt;
&lt;li&gt;Productivity&lt;/li&gt;
&lt;li&gt;Quality&lt;/li&gt;
&lt;li&gt;Cost&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Add others as the org matures.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Pillars give purpose. Metrics give feedback.&lt;/p&gt;

&lt;p&gt;Together, they create a system that is strategic, actionable, and resilient. When properly structured, they align engineers, guide investment, and let your team scale with clarity instead of chaos.&lt;/p&gt;

&lt;p&gt;Don’t just track more. Track what matters.&lt;br&gt;&lt;br&gt;
Don’t just optimize metrics. Reinforce your pillars.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Forsgren, N., Humble, J., &amp;amp; Kim, G. (2018). &lt;em&gt;Accelerate: The Science of Lean Software and DevOps&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Treacy, M., &amp;amp; Wiersema, F. (1993). &lt;em&gt;Customer Intimacy and Other Value Disciplines&lt;/em&gt;. HBR.&lt;/li&gt;
&lt;li&gt;Kerzner, H. (2017). &lt;em&gt;Project Management: A Systems Approach to Planning, Scheduling, and Controlling&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Google DORA Research. (2023). &lt;em&gt;State of DevOps Report&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
  </channel>
</rss>
