
Ali Abbas


How I Addressed the Real Pain Behind ADR Enforcement


Technical deep-dive on building a GitHub Action that prevents institutional amnesia by surfacing past architectural decisions on Pull Requests

I've spent three years building systems, and in that time I watched teams waste months re-debating architectural decisions they'd already made. We use ADRs (Architecture Decision Records) at my company, but enforcement is inconsistent: only senior devs remember to update them.

So I built Decision Guardian - a GitHub Action that surfaces past architectural decisions directly on Pull Requests.

Here's how I did it, and the technical challenges I solved along the way.


The Solution: Decision Guardian

Decision Guardian is a GitHub Action that:

  1. Parses architectural decisions from markdown files
  2. Matches them against PR file changes
  3. Surfaces relevant context automatically as a comment

Example decision file (.decispher/decisions.md):

<!-- DECISION-DB-001 -->
## Decision: Database Choice for Billing

**Status**: Active  
**Date**: 2024-03-15  
**Severity**: Critical

**Files**:
- `src/db/pool.ts`
- `config/database.yml`

### Context

We chose Postgres over MongoDB because billing requires 
ACID compliance for financial transactions.

**Alternatives rejected:**
- MongoDB: No ACID guarantees
- Redis: Added unnecessary complexity

**Before modifying:** Consult with @tech-lead

---

When someone opens a PR touching `src/db/pool.ts`, Decision Guardian reacts.

Decision Guardian Demo


Technical Challenge #1: Pattern Matching at Scale

The Naive Approach (O(N×M))

My first implementation was simple:

// For each file in PR
for (const file of prFiles) {
  // Check against every decision
  for (const decision of decisions) {
    if (decision.files.includes(file.path)) {
      matches.push(decision);
    }
  }
}

Problem: with 100 files and 500 decisions, that's 50,000 comparisons.

On a large PR (3000 files), this took 12+ seconds and sometimes hit the GitHub Actions job timeout.

The Solution: Prefix Trie

I built a prefix trie to index decisions by file patterns:

interface TrieNode {
  children: Map<string, TrieNode>;
  decisions: Decision[];
  wildcardDecisions: Decision[];
}

class PatternTrie {
  private root: TrieNode;

  constructor(decisions: Decision[]) {
    this.root = this.createNode();
    for (const decision of decisions) {
      for (const pattern of decision.files) {
        this.insert(pattern, decision);
      }
    }
  }

  private createNode(): TrieNode {
    return {
      children: new Map(),
      decisions: [],
      wildcardDecisions: [],
    };
  }

  private insert(pattern: string, decision: Decision): void {
    const parts = pattern.split('/');
    this.insertRecursive(this.root, parts, decision);
  }

  private insertRecursive(node: TrieNode, parts: string[], decision: Decision): void {
    if (parts.length === 0) {
      node.decisions.push(decision);
      return;
    }

    const part = parts[0];
    const remaining = parts.slice(1);

    // Handle wildcards specially
    if (part === '**') {
      node.wildcardDecisions.push(decision);
      if (remaining.length > 0) {
        this.insertRecursive(node, remaining, decision);
      }
      return;
    }

    // Handle glob patterns
    if (
      part.includes('*') ||
      part.includes('?') ||
      part.includes('{') ||
      part.includes('}') ||
      part.includes('[') ||
      part.includes(']')
    ) {
      node.wildcardDecisions.push(decision);
      return;
    }

    // Exact match - traverse deeper
    let child = node.children.get(part);
    if (!child) {
      child = this.createNode();
      node.children.set(part, child);
    }

    this.insertRecursive(child, remaining, decision);
  }

  /**
   * Returns a set of unique decisions that *might* match the given file path.
   */
  findCandidates(file: string): Set<Decision> {
    const parts = file.split('/');
    const candidates = new Set<Decision>();

    this.collectCandidates(this.root, parts, candidates);

    return candidates;
  }

  private collectCandidates(node: TrieNode, parts: string[], candidates: Set<Decision>): void {
    // Collect wildcard matches at this level
    for (const decision of node.wildcardDecisions) {
      candidates.add(decision);
    }

    if (parts.length === 0) {
      // Reached the end - collect exact matches
      for (const decision of node.decisions) {
        candidates.add(decision);
      }
      return;
    }

    const part = parts[0];
    const child = node.children.get(part);
    if (child) {
      this.collectCandidates(child, parts.slice(1), candidates);
    }
  }
}

Performance improvement:

  • Before: O(N×M) → 12 seconds for 3000 files
  • After: one trie walk per file (cost proportional to path depth, independent of decision count) → 2.8 seconds for the same PR

That's a 4.3x speedup on large PRs.
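To make the pruning concrete, here's a stripped-down, self-contained sketch of the same indexing idea using exact paths only; the trie generalizes this to path segments and wildcards. The `Decision` type and IDs here are minimal stand-ins, not the real types:

```typescript
// Minimal stand-in type for illustration.
type Decision = { id: string; files: string[] };

// Index decisions by exact file path: building is O(total patterns),
// and each PR file becomes a single Map lookup instead of a scan
// over every decision.
function buildIndex(decisions: Decision[]): Map<string, Decision[]> {
  const index = new Map<string, Decision[]>();
  for (const decision of decisions) {
    for (const path of decision.files) {
      const bucket = index.get(path) ?? [];
      bucket.push(decision);
      index.set(path, bucket);
    }
  }
  return index;
}

const index = buildIndex([
  { id: 'DECISION-DB-001', files: ['src/db/pool.ts', 'config/database.yml'] },
  { id: 'DECISION-API-002', files: ['src/api/routes.ts'] },
]);

console.log(index.get('src/db/pool.ts')?.map((d) => d.id)); // [ 'DECISION-DB-001' ]
```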


Technical Challenge #2: Security (ReDoS Protection)

The Problem

Users can define custom patterns using regex:

{
  "type": "file",
  "pattern": "src/**/*.ts",
  "content_rules": [{
    "mode": "regex",
    "pattern": "(a+)+b"  // ⚠️ Evil regex
  }]
}

That pattern (a+)+b is vulnerable to ReDoS (Regular Expression Denial of Service).

When tested against aaaaaaaaaaaaaaaaaaaa (no 'b'), it creates exponential backtracking:

  • 20 'a's: ~1 second
  • 25 'a's: ~30 seconds
  • 30 'a's: freeze forever

This could DOS the entire GitHub Action.

The Solution: Multi-Layer Protection

Layer 1: Safe-Regex Check

import safeRegex from 'safe-regex';

function validatePattern(pattern: string): void {
  if (!safeRegex(pattern)) {
    throw new Error(`Unsafe regex pattern: ${pattern}`);
  }
}
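As a rough intuition for what such a check looks for, here is an illustrative heuristic. To be clear, this is not safe-regex's actual algorithm (which analyzes star height); it only catches the simplest nested-quantifier shape:

```typescript
// Illustrative heuristic only: flag patterns where a quantified group is
// itself quantified, e.g. (a+)+, the classic exponential-backtracking shape.
function looksUnsafe(pattern: string): boolean {
  // A group containing + or * that is immediately followed by +, * or {.
  return /\([^)]*[+*][^)]*\)[+*{]/.test(pattern);
}

console.log(looksUnsafe('(a+)+b')); // true
console.log(looksUnsafe('a+b'));    // false
```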

Layer 2: VM Sandbox with Timeout

Even safe-regex can miss some cases, so I added a VM sandbox:

import vm from 'vm';

function runRegexWithTimeout(pattern: string, flags: string, text: string, timeoutMs: number): boolean {
  const sandbox = Object.create(null);
  sandbox.result = false;
  sandbox.text = String(text);
  sandbox.pattern = String(pattern);
  sandbox.flags = String(flags || '');

  const context = vm.createContext(sandbox, {
    name: 'RegexSandbox',
    codeGeneration: {
      strings: false,
      wasm: false,
    },
  });

  const code = `
    'use strict';
    try {
      const regex = new RegExp(pattern, flags);
      result = regex.test(text);
    } catch (e) {
      result = false;
    }
  `;

  try {
    vm.runInContext(code, context, {
      timeout: timeoutMs,
      displayErrors: false,
    });
    return Boolean(sandbox.result);
  } catch (e) {
    return false;
  }
}

Key security features:

  • Isolated VM context - No access to Node.js globals or filesystem
  • Hard timeout - Kills execution after 5 seconds
  • No code generation - Prevents eval() and WebAssembly escapes
  • String coercion - Prevents prototype pollution

Layer 3: Input Size Limits

const MAX_CONTENT_SIZE = 1_000_000; // 1MB
const MAX_REGEX_LENGTH = 1000;

if (content.length > MAX_CONTENT_SIZE) {
  throw new Error('Content too large for regex matching');
}

if (pattern.length > MAX_REGEX_LENGTH) {
  throw new Error('Regex pattern too long');
}

Layer 4: Result Caching

import crypto from 'crypto';

class ContentMatchers {
  private resultCache = new Map<string, boolean>();
  private readonly MAX_CACHE_SIZE = 500;

  private createCacheKey(pattern: string, flags: string, content: string): string {
    const contentHash = crypto
      .createHash('sha256')
      .update(content)
      .digest('hex')
      .substring(0, 16);
    return `${pattern}:${flags}:${contentHash}`;
  }

  async matchRegex(rule: ContentRule, fileDiff: FileDiff): Promise<{ matched: boolean; matchedPatterns: string[] }> {
    const changedContent = this.getChangedLines(fileDiff.patch).join('\n');
    const cacheKey = this.createCacheKey(rule.pattern!, rule.flags || '', changedContent);

    const cached = this.resultCache.get(cacheKey);
    if (cached !== undefined) {
      return { matched: cached, matchedPatterns: cached ? [rule.pattern!] : [] };
    }

    try {
      const matched = this.runRegexWithTimeout(
        rule.pattern!, 
        rule.flags, 
        changedContent, 
        5000
      );

      this.updateCache(cacheKey, matched);
      return { matched, matchedPatterns: matched ? [rule.pattern!] : [] };
    } catch (error) {
      return { matched: false, matchedPatterns: [] };
    }
  }

  private updateCache(key: string, value: boolean): void {
    if (this.resultCache.size >= this.MAX_CACHE_SIZE) {
      // Evict the oldest entry (FIFO: a Map iterates in insertion order)
      const firstKey = this.resultCache.keys().next().value;
      if (firstKey) this.resultCache.delete(firstKey);
    }
    this.resultCache.set(key, value);
  }
}
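The eviction is worth a note: because a JavaScript Map iterates in insertion order, deleting the first key gives simple oldest-first (FIFO) eviction rather than true LRU, which is good enough for a bounded cache. A self-contained sketch of the behavior:

```typescript
// Bounded cache: once full, evict the oldest entry (insertion order).
const MAX_CACHE_SIZE = 3;
const cache = new Map<string, boolean>();

function put(key: string, value: boolean): void {
  if (cache.size >= MAX_CACHE_SIZE) {
    const oldest = cache.keys().next().value;
    if (oldest !== undefined) cache.delete(oldest);
  }
  cache.set(key, value);
}

for (const key of ['a', 'b', 'c', 'd']) put(key, true);
console.log([...cache.keys()]); // [ 'b', 'c', 'd' ]
```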

Result: Zero ReDoS vulnerabilities in production. ✅


Technical Challenge #3: Handling Massive PRs

The Problem

Some PRs modify 3000+ files (dependency updates, refactors, migrations).

GitHub's API returns all changed files, but:

  • Loading 3000 file diffs into memory → OOM (Out of Memory)
  • Processing them serially → timeout
  • Posting a comment with all matches → exceeds GitHub's 65KB limit

Solution 1: Streaming Processing

Instead of loading all files at once:

async function* streamFileDiffs(
  token: string
): AsyncGenerator<FileDiff[]> {
  const octokit = github.getOctokit(token);
  const { owner, repo, pull_number } = github.context;

  let page = 1;
  const MAX_PAGES = 30;

  while (page <= MAX_PAGES) {
    const { data } = await octokit.rest.pulls.listFiles({
      owner,
      repo,
      pull_number,
      per_page: 100,
      page,
    });

    if (data.length === 0) break;

    yield data.map((f) => ({
      filename: f.filename.replace(/\\/g, '/'),
      status: f.status as FileDiff['status'],
      additions: f.additions,
      deletions: f.deletions,
      changes: f.changes,
      patch: f.patch || '',
      previous_filename: f.previous_filename,
    }));

    if (data.length < 100) break;
    page++;
  }
}

// Usage
const matches: DecisionMatch[] = [];
for await (const batch of streamFileDiffs(token)) {
  const batchMatches = await matcher.findMatchesWithDiffs(batch);
  matches.push(...batchMatches);

  core.info(`Processed ${batch.length} files, found ${matches.length} matches so far`);
}
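The generator pattern is easy to exercise in isolation. A self-contained sketch, with fake pages standing in for `octokit.rest.pulls.listFiles` (names here are illustrative):

```typescript
// Yield one page of filenames at a time; stop on an empty page,
// mirroring the pagination loop's break condition.
async function* streamPages(pages: string[][]): AsyncGenerator<string[]> {
  for (const page of pages) {
    if (page.length === 0) break;
    yield page;
  }
}

async function main(): Promise<void> {
  const seen: string[] = [];
  for await (const batch of streamPages([['a.ts', 'b.ts'], ['c.ts']])) {
    // Each batch is processed and released; memory stays bounded.
    seen.push(...batch);
  }
  console.log(seen); // [ 'a.ts', 'b.ts', 'c.ts' ]
}

main();
```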

Memory usage:

  • Before: High memory usage for 3000 files → OOM risk
  • After: Constant memory (processes 100 files at a time) → No crashes ✅

Solution 2: Progressive Truncation

If the comment exceeds GitHub's limit, truncate intelligently:

function truncateComment(decisions: Decision[], maxLength = 65000): string {
  let comment = formatComment(decisions);

  if (comment.length <= maxLength) {
    return comment;
  }

  // Layers 1-4: progressively shrink how many decisions get full detail
  for (const detailLimit of [20, 10, 5, 2]) {
    comment = formatComment(decisions, { detailLimit });
    if (comment.length <= maxLength) return comment;
  }

  // Layer 5: show counts only
  comment = formatCommentCounts(decisions);
  if (comment.length <= maxLength) return comment;

  // Layer 6: hard truncate as last resort
  return hardTruncate(comment, maxLength);
}
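The `hardTruncate` fallback isn't shown; a minimal sketch might look like this (the notice wording is an assumption, not the action's actual output):

```typescript
// Last-resort truncation: clip to maxLength, reserving room for a notice
// so readers know the comment was cut off. (Notice text is illustrative.)
function hardTruncate(comment: string, maxLength: number): string {
  const notice = '\n\n_Comment truncated: see the decision files for full details._';
  if (comment.length <= maxLength) return comment;
  return comment.slice(0, maxLength - notice.length) + notice;
}

console.log(hardTruncate('short comment', 65000)); // short comment
console.log(hardTruncate('x'.repeat(70_000), 65_000).length); // 65000
```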

Result: Never hit comment size limit, even with 1000+ matched decisions. ✅


Technical Challenge #4: Idempotent Comments

The Problem

GitHub Actions can run multiple times for a single PR:

  • New commit pushed
  • Workflow re-run
  • Manual trigger

Without proper handling:

  • 3 runs = 3 duplicate comments
  • Spams the PR thread
  • Confuses reviewers

The Solution: Content Hash

async function upsertComment(
  prNumber: number,
  content: string
): Promise<void> {
  const hash = crypto
    .createHash('sha256')
    .update(content)
    .digest('hex')
    .substring(0, 16);

  const marker = '<!-- decision-guardian-v1 -->';
  const hashMarker = `<!-- hash:${hash} -->`;
  const fullContent = `${marker}\n${hashMarker}\n\n${content}`;

  // Find existing comment
  const comments = await octokit.issues.listComments({
    owner,
    repo,
    issue_number: prNumber,
  });

  const existing = comments.data.find(c =>
    c.body?.includes('decision-guardian-v1')
  );

  if (existing) {
    const existingHash = existing.body?.match(/hash:([a-f0-9-]+)/)?.[1];

    if (existingHash === hash) {
      console.log('Comment unchanged, skipping update');
      return;
    }

    // Update existing comment
    await octokit.issues.updateComment({
      owner,
      repo,
      comment_id: existing.id,
      body: fullContent,
    });
  } else {
    // Create new comment
    await octokit.issues.createComment({
      owner,
      repo,
      issue_number: prNumber,
      body: fullContent,
    });
  }
}
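The marker-and-hash round trip can be exercised on its own. The HTML-comment format below is reconstructed from the lookup code, which searches comment bodies for `decision-guardian-v1` and a `hash:` prefix:

```typescript
import { createHash } from 'node:crypto';

// Short content hash, matching the substring(0, 16) used above.
function contentHash(content: string): string {
  return createHash('sha256').update(content).digest('hex').substring(0, 16);
}

// Embed invisible markers, then recover the hash from an existing body.
const content = 'Relevant decisions: DECISION-DB-001';
const body = `<!-- decision-guardian-v1 -->\n<!-- hash:${contentHash(content)} -->\n\n${content}`;

const existingHash = body.match(/hash:([a-f0-9]+)/)?.[1];
console.log(existingHash === contentHash(content)); // true
```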

Result:

✅ Single comment per PR

✅ Updates in-place when decisions change

✅ No spam, no duplicates


Architecture Overview

┌────────────────────────────────────────────────────┐
│                  DECISION GUARDIAN                 │
├────────────────────────────────────────────────────┤
│                                                    │
│  ┌──────────────────────────────────────────────┐  │
│  │         DECISION PARSER (AST-based)          │  │
│  │  - Markdown parsing with remark              │  │
│  │  - JSON rule extraction & validation         │  │
│  └──────────────────────────────────────────────┘  │
│                      ↓                             │
│  ┌──────────────────────────────────────────────┐  │
│  │      DECISION INDEX (Prefix Trie)            │  │
│  │  - O(path depth) lookup                      │  │
│  │  - Wildcard pattern optimization             │  │
│  └──────────────────────────────────────────────┘  │
│                      ↓                             │
│  ┌──────────────────────────────────────────────┐  │
│  │         FILE MATCHER (Rule Evaluator)        │  │
│  │  - Glob pattern matching                     │  │
│  │  - Content diff analysis                     │  │
│  │  - ReDoS protection                          │  │
│  └──────────────────────────────────────────────┘  │
│                      ↓                             │
│  ┌──────────────────────────────────────────────┐  │
│  │      COMMENT MANAGER (Idempotent)            │  │
│  │  - Hash-based update detection               │  │
│  │  - Progressive truncation                    │  │
│  │  - Retry with exponential backoff            │  │
│  └──────────────────────────────────────────────┘  │
│                                                    │
└────────────────────────────────────────────────────┘

High-level flow:

PR Created → Parse Decisions → Match Files → Post Comment → Check Status

Key components:

  • Parser (parser.ts): Markdown → structured data
  • Matcher (matcher.ts): Trie-based file matching
  • Rule Evaluator (rule-evaluator.ts): Advanced rules
  • Comment Manager (comment.ts): Idempotent PR comments

Lessons Learned

1. Performance matters from day 1

I could have shipped with the O(N×M) algorithm and optimized later.

But teams with large PRs would have hit timeouts immediately and never come back.

Lesson: Build for scale early, especially in tools that run on every PR.

2. Security is not optional

The ReDoS vulnerability wasn't theoretical - during testing, a user accidentally created a pattern that froze the action for 5 minutes.

Lesson: Validate all user input, especially anything that can loop or recurse.

3. Idempotency prevents pain

Early versions created duplicate comments. Users reported it as "spammy" and disabled the action.

Adding content hashing fixed this and improved adoption.

Lesson: Make actions side-effect-free and repeatable.

4. Documentation > features

I spent 60% of development time on README, examples, and error messages.

Users still ask "how do I use this?" constantly.

Lesson: You can never document enough.


Try It Yourself

Install:

- uses: DecispherHQ/decision-guardian@v1
  with:
    token: ${{ secrets.GITHUB_TOKEN }}
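For context, the step above lives in an ordinary workflow file. A fuller sketch might look like this (the trigger, job name, and permissions are my assumptions, not documented requirements):

```yaml
# Hypothetical .github/workflows/decision-guardian.yml
name: Decision Guardian
on: pull_request

permissions:
  pull-requests: write   # needed to post the PR comment

jobs:
  guard:
    runs-on: ubuntu-latest
    steps:
      - uses: DecispherHQ/decision-guardian@v1
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
```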

Example decision:


## Decision: Database Choice

**Status**: Active  
**Date**: 2024-03-15  
**Files**: `src/db/**`

### Context
We chose Postgres for ACID compliance.
Rejected: MongoDB (no ACID), Redis (complexity)



What's Next?

Short-term:

  • GitLab/Bitbucket support (if demand exists)
  • Decision templates

Long-term:

  • VS Code extension (show decisions inline)
  • Analytics dashboard
  • Cross-repository rules

Want to contribute? Open an issue or start a discussion.


Conclusion

Decision Guardian is free, open source (MIT), and takes 2 minutes to set up.

What architectural decisions does your team repeatedly debate?

Drop a comment - I'd love to hear your stories.

Made with ❤️ by Ali Abbas

