
Ali Abbas


How I Addressed the Real Pain Behind ADR Enforcement


Technical deep-dive on building a GitHub Action that prevents institutional amnesia by surfacing past architectural decisions on Pull Requests

I've spent three years building systems, and in that time I watched teams waste months re-debating architectural decisions they'd already made. We use ADRs (Architecture Decision Records) at my company, but enforcement is inconsistent: only senior devs remember to update them.

So I built Decision Guardian - a GitHub Action that surfaces past architectural decisions directly on Pull Requests.

Here's how I did it, and the technical challenges I solved along the way.


The Solution: Decision Guardian

Decision Guardian is a GitHub Action that:

  1. Parses architectural decisions from markdown files
  2. Matches them against PR file changes
  3. Surfaces relevant context automatically as a comment

Example decision file (.decispher/decisions.md):

<!-- DECISION-DB-001 -->
## Decision: Database Choice for Billing

**Status**: Active  
**Date**: 2024-03-15  
**Severity**: Critical

**Files**:
- `src/db/pool.ts`
- `config/database.yml`

### Context

We chose Postgres over MongoDB because billing requires 
ACID compliance for financial transactions.

**Alternatives rejected:**
- MongoDB: No ACID guarantees
- Redis: Added unnecessary complexity

**Before modifying:** Consult with @tech-lead

---

When someone opens a PR touching `src/db/pool.ts`, Decision Guardian reacts.

Decision Guardian Demo


Technical Challenge #1: Pattern Matching at Scale

The Naive Approach (O(N×M))

My first implementation was simple:

// For each file in PR
for (const file of prFiles) {
  // Check against every decision
  for (const decision of decisions) {
    if (decision.files.includes(file.path)) {
      matches.push(decision);
    }
  }
}

Problem: with 100 files and 500 decisions, that's 50,000 comparisons.

On a large PR (3000 files), this took 12+ seconds and sometimes hit the GitHub Actions job timeout.

The Solution: Prefix Trie

I built a prefix trie to index decisions by file patterns:

interface TrieNode {
  children: Map<string, TrieNode>;
  decisions: Decision[];
  wildcardDecisions: Decision[];
}

class PatternTrie {
  private root: TrieNode;

  constructor(decisions: Decision[]) {
    this.root = this.createNode();
    for (const decision of decisions) {
      for (const pattern of decision.files) {
        this.insert(pattern, decision);
      }
    }
  }

  private createNode(): TrieNode {
    return {
      children: new Map(),
      decisions: [],
      wildcardDecisions: [],
    };
  }

  private insert(pattern: string, decision: Decision): void {
    const parts = pattern.split('/');
    this.insertRecursive(this.root, parts, decision);
  }

  private insertRecursive(node: TrieNode, parts: string[], decision: Decision): void {
    if (parts.length === 0) {
      node.decisions.push(decision);
      return;
    }

    const part = parts[0];
    const remaining = parts.slice(1);

    // Handle wildcards specially
    if (part === '**') {
      node.wildcardDecisions.push(decision);
      if (remaining.length > 0) {
        this.insertRecursive(node, remaining, decision);
      }
      return;
    }

    // Handle glob patterns
    if (
      part.includes('*') ||
      part.includes('?') ||
      part.includes('{') ||
      part.includes('}') ||
      part.includes('[') ||
      part.includes(']')
    ) {
      node.wildcardDecisions.push(decision);
      return;
    }

    // Exact match - traverse deeper
    let child = node.children.get(part);
    if (!child) {
      child = this.createNode();
      node.children.set(part, child);
    }

    this.insertRecursive(child, remaining, decision);
  }

  /**
   * Returns a set of unique decisions that *might* match the given file path.
   */
  findCandidates(file: string): Set<Decision> {
    const parts = file.split('/');
    const candidates = new Set<Decision>();

    this.collectCandidates(this.root, parts, candidates);

    return candidates;
  }

  private collectCandidates(node: TrieNode, parts: string[], candidates: Set<Decision>): void {
    // Collect wildcard matches at this level
    for (const decision of node.wildcardDecisions) {
      candidates.add(decision);
    }

    if (parts.length === 0) {
      // Reached the end - collect exact matches
      for (const decision of node.decisions) {
        candidates.add(decision);
      }
      return;
    }

    const part = parts[0];
    const child = node.children.get(part);
    if (child) {
      this.collectCandidates(child, parts.slice(1), candidates);
    }
  }
}

Performance improvement:

  • Before: O(N×M) → 12 seconds for 3000 files
  • After: one trie walk per file (cost proportional to path depth, independent of decision count) → 2.8 seconds for the same PR

That's a 4.3x speedup on large PRs.
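To make the pruning concrete, here's a stripped-down, self-contained sketch of the same indexing idea using exact paths only; the trie generalizes this to path segments and wildcards. The `Decision` type and IDs here are minimal stand-ins, not the real types:

```typescript
// Minimal stand-in type for illustration.
type Decision = { id: string; files: string[] };

// Index decisions by exact file path: building is O(total patterns),
// and each PR file becomes a single Map lookup instead of a scan
// over every decision.
function buildIndex(decisions: Decision[]): Map<string, Decision[]> {
  const index = new Map<string, Decision[]>();
  for (const decision of decisions) {
    for (const path of decision.files) {
      const bucket = index.get(path) ?? [];
      bucket.push(decision);
      index.set(path, bucket);
    }
  }
  return index;
}

const index = buildIndex([
  { id: 'DECISION-DB-001', files: ['src/db/pool.ts', 'config/database.yml'] },
  { id: 'DECISION-API-002', files: ['src/api/routes.ts'] },
]);

console.log(index.get('src/db/pool.ts')?.map((d) => d.id)); // [ 'DECISION-DB-001' ]
```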


Technical Challenge #2: Security (ReDoS Protection)

The Problem

Users can define custom patterns using regex:

{
  "type": "file",
  "pattern": "src/**/*.ts",
  "content_rules": [{
    "mode": "regex",
    "pattern": "(a+)+b"  // ⚠️ Evil regex
  }]
}

That pattern (a+)+b is vulnerable to ReDoS (Regular Expression Denial of Service).

When tested against aaaaaaaaaaaaaaaaaaaa (no 'b'), it creates exponential backtracking:

  • 20 'a's: ~1 second
  • 25 'a's: ~30 seconds
  • 30 'a's: freeze forever

This could DOS the entire GitHub Action.

The Solution: Multi-Layer Protection

Layer 1: Safe-Regex Check

import safeRegex from 'safe-regex';

function validatePattern(pattern: string): void {
  if (!safeRegex(pattern)) {
    throw new Error(`Unsafe regex pattern: ${pattern}`);
  }
}
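As a rough intuition for what such a check looks for, here is an illustrative heuristic. To be clear, this is not safe-regex's actual algorithm (which analyzes star height); it only catches the simplest nested-quantifier shape:

```typescript
// Illustrative heuristic only: flag patterns where a quantified group is
// itself quantified, e.g. (a+)+, the classic exponential-backtracking shape.
function looksUnsafe(pattern: string): boolean {
  // A group containing + or * that is immediately followed by +, * or {.
  return /\([^)]*[+*][^)]*\)[+*{]/.test(pattern);
}

console.log(looksUnsafe('(a+)+b')); // true
console.log(looksUnsafe('a+b'));    // false
```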

Layer 2: VM Sandbox with Timeout

Even safe-regex can miss some cases, so I added a VM sandbox:

import vm from 'vm';

function runRegexWithTimeout(pattern: string, flags: string, text: string, timeoutMs: number): boolean {
  const sandbox = Object.create(null);
  sandbox.result = false;
  sandbox.text = String(text);
  sandbox.pattern = String(pattern);
  sandbox.flags = String(flags || '');

  const context = vm.createContext(sandbox, {
    name: 'RegexSandbox',
    codeGeneration: {
      strings: false,
      wasm: false,
    },
  });

  const code = `
    'use strict';
    try {
      const regex = new RegExp(pattern, flags);
      result = regex.test(text);
    } catch (e) {
      result = false;
    }
  `;

  try {
    vm.runInContext(code, context, {
      timeout: timeoutMs,
      displayErrors: false,
    });
    return Boolean(sandbox.result);
  } catch (e) {
    return false;
  }
}

Key security features:

  • Isolated VM context - No access to Node.js globals or filesystem
  • Hard timeout - Kills execution after 5 seconds
  • No code generation - Prevents eval() and WebAssembly escapes
  • String coercion - Prevents prototype pollution

Layer 3: Input Size Limits

const MAX_CONTENT_SIZE = 1_000_000; // 1MB
const MAX_REGEX_LENGTH = 1000;

if (content.length > MAX_CONTENT_SIZE) {
  throw new Error('Content too large for regex matching');
}

if (pattern.length > MAX_REGEX_LENGTH) {
  throw new Error('Regex pattern too long');
}

Layer 4: Result Caching

import crypto from 'crypto';

class ContentMatchers {
  private resultCache = new Map<string, boolean>();
  private readonly MAX_CACHE_SIZE = 500;

  private createCacheKey(pattern: string, flags: string, content: string): string {
    const contentHash = crypto
      .createHash('sha256')
      .update(content)
      .digest('hex')
      .substring(0, 16);
    return `${pattern}:${flags}:${contentHash}`;
  }

  async matchRegex(rule: ContentRule, fileDiff: FileDiff): Promise<{ matched: boolean; matchedPatterns: string[] }> {
    const changedContent = this.getChangedLines(fileDiff.patch).join('\n');
    const cacheKey = this.createCacheKey(rule.pattern!, rule.flags || '', changedContent);

    const cached = this.resultCache.get(cacheKey);
    if (cached !== undefined) {
      return { matched: cached, matchedPatterns: cached ? [rule.pattern!] : [] };
    }

    try {
      const matched = this.runRegexWithTimeout(
        rule.pattern!, 
        rule.flags, 
        changedContent, 
        5000
      );

      this.updateCache(cacheKey, matched);
      return { matched, matchedPatterns: matched ? [rule.pattern!] : [] };
    } catch (error) {
      return { matched: false, matchedPatterns: [] };
    }
  }

  private updateCache(key: string, value: boolean): void {
    if (this.resultCache.size >= this.MAX_CACHE_SIZE) {
      // Evict the oldest entry (FIFO: a Map iterates in insertion order)
      const firstKey = this.resultCache.keys().next().value;
      if (firstKey) this.resultCache.delete(firstKey);
    }
    this.resultCache.set(key, value);
  }
}
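The eviction is worth a note: because a JavaScript Map iterates in insertion order, deleting the first key gives simple oldest-first (FIFO) eviction rather than true LRU, which is good enough for a bounded cache. A self-contained sketch of the behavior:

```typescript
// Bounded cache: once full, evict the oldest entry (insertion order).
const MAX_CACHE_SIZE = 3;
const cache = new Map<string, boolean>();

function put(key: string, value: boolean): void {
  if (cache.size >= MAX_CACHE_SIZE) {
    const oldest = cache.keys().next().value;
    if (oldest !== undefined) cache.delete(oldest);
  }
  cache.set(key, value);
}

for (const key of ['a', 'b', 'c', 'd']) put(key, true);
console.log([...cache.keys()]); // [ 'b', 'c', 'd' ]
```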

Result: Zero ReDoS vulnerabilities in production. ✅


Technical Challenge #3: Handling Massive PRs

The Problem

Some PRs modify 3000+ files (dependency updates, refactors, migrations).

GitHub's API returns all changed files, but:

  • Loading 3000 file diffs into memory → OOM (Out of Memory)
  • Processing them serially → timeout
  • Posting a comment with all matches → exceeds GitHub's 65KB limit

Solution 1: Streaming Processing

Instead of loading all files at once:

async function* streamFileDiffs(
  token: string
): AsyncGenerator<FileDiff[]> {
  const octokit = github.getOctokit(token);
  const { owner, repo, pull_number } = github.context;

  let page = 1;
  const MAX_PAGES = 30;

  while (page <= MAX_PAGES) {
    const { data } = await octokit.rest.pulls.listFiles({
      owner,
      repo,
      pull_number,
      per_page: 100,
      page,
    });

    if (data.length === 0) break;

    yield data.map((f) => ({
      filename: f.filename.replace(/\\/g, '/'),
      status: f.status as FileDiff['status'],
      additions: f.additions,
      deletions: f.deletions,
      changes: f.changes,
      patch: f.patch || '',
      previous_filename: f.previous_filename,
    }));

    if (data.length < 100) break;
    page++;
  }
}

// Usage
const matches: DecisionMatch[] = [];
for await (const batch of streamFileDiffs(token)) {
  const batchMatches = await matcher.findMatchesWithDiffs(batch);
  matches.push(...batchMatches);

  core.info(`Processed ${batch.length} files, found ${matches.length} matches so far`);
}
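The generator pattern is easy to exercise in isolation. A self-contained sketch, with fake pages standing in for `octokit.rest.pulls.listFiles` (names here are illustrative):

```typescript
// Yield one page of filenames at a time; stop on an empty page,
// mirroring the pagination loop's break condition.
async function* streamPages(pages: string[][]): AsyncGenerator<string[]> {
  for (const page of pages) {
    if (page.length === 0) break;
    yield page;
  }
}

async function main(): Promise<void> {
  const seen: string[] = [];
  for await (const batch of streamPages([['a.ts', 'b.ts'], ['c.ts']])) {
    // Each batch is processed and released; memory stays bounded.
    seen.push(...batch);
  }
  console.log(seen); // [ 'a.ts', 'b.ts', 'c.ts' ]
}

main();
```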

Memory usage:

  • Before: High memory usage for 3000 files → OOM risk
  • After: Constant memory (processes 100 files at a time) → No crashes ✅

Solution 2: Progressive Truncation

If the comment exceeds GitHub's limit, truncate intelligently:

function truncateComment(decisions: Decision[], maxLength = 65000): string {
  let comment = formatComment(decisions);

  if (comment.length <= maxLength) {
    return comment;
  }

  // Layers 1-4: progressively shrink how many decisions get full detail
  for (const detailLimit of [20, 10, 5, 2]) {
    comment = formatComment(decisions, { detailLimit });
    if (comment.length <= maxLength) return comment;
  }

  // Layer 5: show counts only
  comment = formatCommentCounts(decisions);
  if (comment.length <= maxLength) return comment;

  // Layer 6: hard truncate as last resort
  return hardTruncate(comment, maxLength);
}
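The `hardTruncate` fallback isn't shown; a minimal sketch might look like this (the notice wording is an assumption, not the action's actual output):

```typescript
// Last-resort truncation: clip to maxLength, reserving room for a notice
// so readers know the comment was cut off. (Notice text is illustrative.)
function hardTruncate(comment: string, maxLength: number): string {
  const notice = '\n\n_Comment truncated: see the decision files for full details._';
  if (comment.length <= maxLength) return comment;
  return comment.slice(0, maxLength - notice.length) + notice;
}

console.log(hardTruncate('short comment', 65000)); // short comment
console.log(hardTruncate('x'.repeat(70_000), 65_000).length); // 65000
```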

Result: Never hit comment size limit, even with 1000+ matched decisions. ✅


Technical Challenge #4: Idempotent Comments

The Problem

GitHub Actions can run multiple times for a single PR:

  • New commit pushed
  • Workflow re-run
  • Manual trigger

Without proper handling:

  • 3 runs = 3 duplicate comments
  • Spams the PR thread
  • Confuses reviewers

The Solution: Content Hash

async function upsertComment(
  prNumber: number,
  content: string
): Promise<void> {
  const hash = crypto
    .createHash('sha256')
    .update(content)
    .digest('hex')
    .substring(0, 16);

  const marker = '<!-- decision-guardian-v1 -->';
  const hashMarker = `<!-- hash:${hash} -->`;
  const fullContent = `${marker}\n${hashMarker}\n\n${content}`;

  // Find existing comment
  const comments = await octokit.issues.listComments({
    owner,
    repo,
    issue_number: prNumber,
  });

  const existing = comments.data.find(c =>
    c.body?.includes('decision-guardian-v1')
  );

  if (existing) {
    const existingHash = existing.body?.match(/hash:([a-f0-9-]+)/)?.[1];

    if (existingHash === hash) {
      console.log('Comment unchanged, skipping update');
      return;
    }

    // Update existing comment
    await octokit.issues.updateComment({
      owner,
      repo,
      comment_id: existing.id,
      body: fullContent,
    });
  } else {
    // Create new comment
    await octokit.issues.createComment({
      owner,
      repo,
      issue_number: prNumber,
      body: fullContent,
    });
  }
}
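The marker-and-hash round trip can be exercised on its own. The HTML-comment format below is reconstructed from the lookup code, which searches comment bodies for `decision-guardian-v1` and a `hash:` prefix:

```typescript
import { createHash } from 'node:crypto';

// Short content hash, matching the substring(0, 16) used above.
function contentHash(content: string): string {
  return createHash('sha256').update(content).digest('hex').substring(0, 16);
}

// Embed invisible markers, then recover the hash from an existing body.
const content = 'Relevant decisions: DECISION-DB-001';
const body = `<!-- decision-guardian-v1 -->\n<!-- hash:${contentHash(content)} -->\n\n${content}`;

const existingHash = body.match(/hash:([a-f0-9]+)/)?.[1];
console.log(existingHash === contentHash(content)); // true
```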

Result:

✅ Single comment per PR

✅ Updates in-place when decisions change

✅ No spam, no duplicates


Architecture Overview

┌────────────────────────────────────────────────────┐
│                  DECISION GUARDIAN                 │
├────────────────────────────────────────────────────┤
│                                                    │
│  ┌──────────────────────────────────────────────┐  │
│  │         DECISION PARSER (AST-based)          │  │
│  │  - Markdown parsing with remark              │  │
│  │  - JSON rule extraction & validation         │  │
│  └──────────────────────────────────────────────┘  │
│                      ↓                             │
│  ┌──────────────────────────────────────────────┐  │
│  │      DECISION INDEX (Prefix Trie)            │  │
│  │  - O(path depth) lookup                      │  │
│  │  - Wildcard pattern optimization             │  │
│  └──────────────────────────────────────────────┘  │
│                      ↓                             │
│  ┌──────────────────────────────────────────────┐  │
│  │         FILE MATCHER (Rule Evaluator)        │  │
│  │  - Glob pattern matching                     │  │
│  │  - Content diff analysis                     │  │
│  │  - ReDoS protection                          │  │
│  └──────────────────────────────────────────────┘  │
│                      ↓                             │
│  ┌──────────────────────────────────────────────┐  │
│  │      COMMENT MANAGER (Idempotent)            │  │
│  │  - Hash-based update detection               │  │
│  │  - Progressive truncation                    │  │
│  │  - Retry with exponential backoff            │  │
│  └──────────────────────────────────────────────┘  │
│                                                    │
└────────────────────────────────────────────────────┘

High-level flow:

PR Created → Parse Decisions → Match Files → Post Comment → Check Status

Key components:

  • Parser (parser.ts): Markdown → structured data
  • Matcher (matcher.ts): Trie-based file matching
  • Rule Evaluator (rule-evaluator.ts): Advanced rules
  • Comment Manager (comment.ts): Idempotent PR comments

Lessons Learned

1. Performance matters from day 1

I could have shipped with the O(N×M) algorithm and optimized later.

But teams with large PRs would have hit timeouts immediately and never come back.

Lesson: Build for scale early, especially in tools that run on every PR.

2. Security is not optional

The ReDoS vulnerability wasn't theoretical - during testing, a user accidentally created a pattern that froze the action for 5 minutes.

Lesson: Validate all user input, especially anything that can loop or recurse.

3. Idempotency prevents pain

Early versions created duplicate comments. Users reported it as "spammy" and disabled the action.

Adding content hashing fixed this and improved adoption.

Lesson: Make actions side-effect-free and repeatable.

4. Documentation > features

I spent 60% of development time on README, examples, and error messages.

Users still ask "how do I use this?" constantly.

Lesson: You can never document enough.


Try It Yourself

Install:

- uses: DecispherHQ/decision-guardian@v1
  with:
    token: ${{ secrets.GITHUB_TOKEN }}
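For context, the step above lives in an ordinary workflow file. A fuller sketch might look like this (the trigger, job name, and permissions are my assumptions, not documented requirements):

```yaml
# Hypothetical .github/workflows/decision-guardian.yml
name: Decision Guardian
on: pull_request

permissions:
  pull-requests: write   # needed to post the PR comment

jobs:
  guard:
    runs-on: ubuntu-latest
    steps:
      - uses: DecispherHQ/decision-guardian@v1
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
```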

Example decision:


## Decision: Database Choice

**Status**: Active  
**Date**: 2024-03-15  
**Files**: `src/db/**`

### Context
We chose Postgres for ACID compliance.
Rejected: MongoDB (no ACID), Redis (complexity)



What's Next?

Short-term:

  • GitLab/Bitbucket support (if demand exists)
  • Decision templates

Long-term:

  • VS Code extension (show decisions inline)
  • Analytics dashboard
  • Cross-repository rules

Want to contribute? Open an issue or start a discussion.


Conclusion

Decision Guardian is free, open source (MIT), and takes 2 minutes to set up.

What architectural decisions does your team repeatedly debate?

Drop a comment - I'd love to hear your stories.

Made with ❤️ by Ali Abbas

