Naveensivam S

Posted on Jun 2

PRFlow: From Abandoned Scaffold to a Production-Grade PR Orchestration Engine

#devchallenge #githubchallenge #development #githubcopilot

GitHub “Finish-Up-A-Thon” Challenge Submission

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

PRFlow is an event-driven pull request orchestration and intelligence platform for engineering organizations.

The core idea: GitHub is great at storing code and showing you diffs. But it has no intelligence about who should review what, how overloaded a reviewer already is, or what happens when a review goes stale for 3 days and blocks a release.

PRFlow owns that layer entirely. It sits alongside GitHub as an active workflow orchestrator — modeling developer expertise using decay curves, deterministically routing reviewers based on real familiarity scores, balancing workloads, and enforcing SLA accountability with a self-healing escalation engine that auto-reassigns stalled PRs before a manager even notices.

Tech stack:

Java 21 / Spring Boot 3.x — core orchestration monolith, 4 processing engines
TypeScript / Bun / Express — high-throughput webhook ingress gateway
PostgreSQL + Flyway — transactional schema with 9 tables
Valkey (Redis-compatible) — caching layer

🔗 Repo: github.com/Naveensivam03/PrFlow

| Parameter | Specification | Status |
| --- | --- | --- |
| System Architecture | Modular Monolith / Event-Driven | `Active` |
| Backend Core | Java 21 / Spring Boot 3.x | `Build Passing` |
| Ingress Layer | TypeScript / Bun / Express | `Build Passing` |
| Database Persist | PostgreSQL / Flyway | `Migrated` |
| Cache Layer | Valkey (Redis Compatibility) | `Active` |
| Test Coverage | 18 Tests Ingress-to-Escalation | `100% Success` |

Demo

To reproduce the full smoke test locally:

# 1. Spin up dependencies
docker run -d --name prflow-db -p 5432:5432 \
  -e POSTGRES_DB=prflow_db \
  -e POSTGRES_USER=prflow_app \
  -e POSTGRES_PASSWORD=change_me_in_local_env \
  postgres:16

docker run -d --name prflow-valkey -p 6379:6379 valkey/valkey:8.0

# 2. Run backend (Flyway migrations run automatically); this command stays in the foreground
cd backend/spring-api
mvn spring-boot:run -Dspring-boot.run.arguments="--spring.datasource.password=change_me_in_local_env"

# 3. Run webhook ingress (in a second terminal, from the repo root)
cd integrations/github-webhook-service
bun install && bun run dev

# 4. Run tests (in a third terminal, from the repo root)
cd backend/spring-api
mvn test
# → 18 tests, 0 failures

The Comeback Story

Where the project was

I started PRFlow during a tight sprint. The monorepo scaffold was there. The README had grand architecture diagrams. The database schema was designed. And then I stopped.

What actually existed when I came back:

A folder structure and an ambitious README
Database migrations written, but never proven against real engine output
Expertise model documented in /docs but not implemented in code
Assignment Engine: pseudocode logic in a half-written service class
Escalation Engine: entirely missing
Zero tests
No idempotency — concurrent webhook retries would have caused duplicate assignments
Webhook ingress had no cryptographic signature verification

It was a graveyard of good intentions. The architecture was sound. The execution had never happened.

What I finished during this challenge

Here is the before/after breakdown across every major component:

| Component | Before | After |
| --- | --- | --- |
| Complexity Engine | Partially written, not integrated | Fully implemented, wired to event pipeline |
| Expertise Engine | Design doc only | Four-tier recency decay weights, touch + review scores, DB upserts |
| Assignment Engine | Pseudocode | Deterministic scoring, load balancing, 3-stage fallback |
| Escalation Engine | Did not exist | Full 3-tier SLA scanner with hourly cron, auto-reassignment |
| Idempotency | None | Atomic conditional SQL gates, replay hash checks, audit log boundary |
| Tests | 0 | 18 tests, 100% passing, full ingress-to-escalation coverage |
| Webhook Security | Unverified | HMAC-SHA256 cryptographic signature verification |
| Engine Packaging | Flat service classes | Refactored into `service/event/dto/repository` layers |

Architecture

Core Philosophy

  ┌────────────────────────┐         ┌────────────────────────┐
  │   GitHub App Ingress   │         │    PRFlow Core Engine  │
  ├────────────────────────┤         ├────────────────────────┤
  │ - Git Repositories     │         │ - Workflow State       │
  │ - Source Code Storage  ├────────>│ - Expertise Graphs     │
  │ - PR Visual UI         │         │ - SLA Escalations      │
  │ - Review Sync Hook     │         │ - Deterministic Routing│
  └────────────────────────┘         └────────────────────────┘

GitHub is the system of record for code. PRFlow is the system of record for workflow intelligence. They never overlap.

System Architecture Overview

Core Workflow Lifecycle

End-to-End Sequence Flow

The 4-Engine Pipeline

PRFlow segments all workflow logic into four deterministic, replay-safe engines wired together through an internal Spring event bus.

Engine 1 — Complexity Engine

Responsibility: Calculates structural risk and engineering scope for every ingested PR.

Inputs: changed file counts, additions, deletions, directory hierarchy depth
Output: PullRequest.complexity_score
Role: Entry point of the pipeline. PRs with complexity > 7.0 are gated — only senior engineers are eligible as reviewers.
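
The seniority gate described above could be sketched as a simple filter. Only the 7.0 threshold comes from the text; the `Candidate` record, its boolean `senior` flag, and the class name are assumptions made for illustration, and PRFlow's real model may represent seniority differently.

```java
import java.util.List;

// Hypothetical sketch of the complexity gate; names are illustrative, not PRFlow's actual classes.
final class ComplexityGateSketch {
    // Threshold stated in the post: PRs scoring above 7.0 need senior reviewers.
    static final double SENIOR_ONLY_THRESHOLD = 7.0;

    // Assumed candidate shape: a username plus a boolean seniority flag.
    record Candidate(String username, boolean senior) {}

    // Returns every candidate for ordinary PRs, and only seniors once the threshold is exceeded.
    static List<Candidate> eligibleReviewers(double complexityScore, List<Candidate> candidates) {
        if (complexityScore <= SENIOR_ONLY_THRESHOLD) {
            return candidates;
        }
        return candidates.stream().filter(Candidate::senior).toList();
    }
}
```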

Engine 2 — Expertise Engine

Responsibility: Computes real-time, decayed familiarity scores per developer per file path.

The expertise model uses a four-tier stepped decay: knowledge fades with time since a developer last touched or reviewed a file, and the weight drops by 0.3 at each tier boundary:

Weight = 1.0   (touch < 30 days ago)     → fresh
       = 0.7   (30–90 days ago)          → fading
       = 0.4   (90–180 days ago)         → stale
       = 0.1   (> 180 days ago)          → nearly forgotten

Review actions carry bonus weight — active intellectual contribution matters more than a passive file touch:

ReviewScore = (Participation × 1.0) + (Approval × 2.0)

Outputs: upserts to developer_file_expertise, emits ExpertiseCalculatedEvent
Role: Accumulates organic organizational memory. No manual domain lists. Knowledge maps itself.
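
As a rough illustration, the recency tiers and the review score above might translate to Java as follows. This is a sketch under assumptions, not the engine's actual code: the class and method names are hypothetical, the tier boundaries are read as "under 30 days", "under 90 days", "up to and including 180 days", and `Participation` and `Approval` are treated as counts. The post does not say how the two scores are combined, so they are kept separate here.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of the expertise weights; boundary handling is an assumption.
final class ExpertiseWeightSketch {
    // Weight for a file touch, measured in whole days before the reference time
    // (for example, the moment the pull request was opened).
    static double recencyWeight(Instant lastTouched, Instant referenceTime) {
        long days = Duration.between(lastTouched, referenceTime).toDays();
        if (days < 30) return 1.0;   // fresh
        if (days < 90) return 0.7;   // fading
        if (days <= 180) return 0.4; // stale
        return 0.1;                  // nearly forgotten
    }

    // ReviewScore = (Participation x 1.0) + (Approval x 2.0), with both inputs assumed to be counts.
    static double reviewScore(int participations, int approvals) {
        return participations * 1.0 + approvals * 2.0;
    }
}
```

Under these assumptions, a file last touched 45 days before the reference time gets weight 0.7, and three review participations plus two approvals give a review score of 7.0.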

Engine 3 — Assignment Engine

Responsibility: Deterministically routes PRs to optimal reviewer combinations while balancing workload.

The scoring formula penalizes reviewers who are already overloaded:

Score = ExpertiseScore / (1.0 + (ActiveReviewsCount × 2.0))

ActiveReviewsCount only counts assignments with status ASSIGNED, REMINDER_SENT, or STALE — open and stalled work counts against you.
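
A minimal sketch of the scoring formula, assuming a hypothetical `Candidate` record that already carries an aggregated expertise score and an active review count. The tie-break on username is an assumption added to keep the ordering deterministic; the post does not say how PRFlow breaks ties.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of load-aware reviewer ranking; not PRFlow's actual service class.
final class AssignmentScoringSketch {
    // Assumed inputs: expertise already aggregated per candidate, plus their open review count.
    record Candidate(String username, double expertiseScore, int activeReviewsCount) {}

    // Score = ExpertiseScore / (1.0 + (ActiveReviewsCount x 2.0))
    static double score(Candidate c) {
        return c.expertiseScore() / (1.0 + c.activeReviewsCount() * 2.0);
    }

    // Highest score first; ties broken alphabetically so repeated runs give the same order.
    static List<Candidate> rank(List<Candidate> candidates, int targetReviewerCount) {
        return candidates.stream()
            .sorted(Comparator.<Candidate>comparingDouble(AssignmentScoringSketch::score)
                .reversed()
                .thenComparing(Candidate::username))
            .limit(targetReviewerCount)
            .toList();
    }
}
```

For instance, a reviewer with expertise 9.0 and two active reviews scores 9.0 / 5.0 = 1.8, so they rank below a reviewer with expertise 6.0 and no open reviews, who scores 6.0.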

The engine then applies a 3-stage fallback if the target reviewer count can't be filled.

Engine 4 — Escalation Engine

Responsibility: Monitors review response latency against SLA limits and triggers restorative workflows.

An hourly scheduled scanner evaluates all open reviewer_assignments against three progressive breach tiers.

Transactional Idempotency

This was one of the most important things I finished. Without it, any webhook retry would cause duplicate assignments and duplicate escalation emails.

PRFlow applies three idempotency layers:

1. State machine lockouts — SLA transitions use atomic conditional SQL:

UPDATE reviewer_assignments
SET assignment_status = 'REMINDER_SENT',
    reminder_sent_at  = NOW(),
    escalation_level  = 1
WHERE pull_request_id = ?
  AND developer_id    = ?
  AND escalation_level < 1
  AND assignment_status = 'ASSIGNED';

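If the scheduler fires twice concurrently, exactly one UPDATE succeeds. Under PostgreSQL's default READ COMMITTED isolation, the second UPDATE waits for the first to commit, re-evaluates its WHERE clause against the updated row, fails the escalation_level < 1 and assignment_status = 'ASSIGNED' conditions, and affects 0 rows.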
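
The same gate can be illustrated without a database. The sketch below is an in-memory analogy, not PRFlow's code: a `synchronized` method stands in for the row lock, the method's return value stands in for the affected-row count, and the class and method names are hypothetical. The point it shows is that the side effect (sending the reminder) should run only when the conditional transition reports one affected row.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory analogy of the conditional UPDATE; a real deployment relies on the database row lock.
final class ReminderGateSketch {
    private String status = "ASSIGNED";
    private int escalationLevel = 0;

    // Mirrors the WHERE clause: transition only from ASSIGNED with escalation_level < 1.
    // Returns the "affected row count": 1 if this caller won, 0 otherwise.
    synchronized int markReminderSent() {
        if ("ASSIGNED".equals(status) && escalationLevel < 1) {
            status = "REMINDER_SENT";
            escalationLevel = 1;
            return 1;
        }
        return 0;
    }

    // Two concurrent "scheduler runs" race on one assignment; returns how many reminders were sent.
    static int raceTwoSchedulers() {
        ReminderGateSketch row = new ReminderGateSketch();
        AtomicInteger remindersSent = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Runnable schedulerRun = () -> {
            try {
                start.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // Dispatch the reminder only when the gated transition actually changed the row.
            if (row.markReminderSent() == 1) {
                remindersSent.incrementAndGet();
            }
        };
        Thread a = new Thread(schedulerRun);
        Thread b = new Thread(schedulerRun);
        a.start();
        b.start();
        start.countDown();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for scheduler runs", e);
        }
        return remindersSent.get();
    }
}
```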

2. Review sync replay gating — ReviewSyncService checks state hashes before storing reviews. Identical state = ignored replay.
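
One way such a state-hash check could look is sketched below. The post does not say which fields ReviewSyncService hashes or where the hash is stored, so the chosen fields (review id, state, submission timestamp), the in-memory map, and all names here are assumptions for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of replay gating by state hash; the hashed fields are assumed.
final class ReviewReplayGateSketch {
    // Last seen hash per GitHub review id; stands in for whatever PRFlow actually persists.
    private final Map<Long, String> lastHashByReviewId = new ConcurrentHashMap<>();

    // SHA-256 over a canonical string built from the fields that define a review's state.
    static String stateHash(long reviewId, String state, String submittedAt) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            String canonical = reviewId + "|" + state + "|" + submittedAt;
            return HexFormat.of().formatHex(sha256.digest(canonical.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    // Returns true when the review carries new state and should be stored, false for a replay.
    boolean shouldStore(long reviewId, String state, String submittedAt) {
        String hash = stateHash(reviewId, state, submittedAt);
        String previous = lastHashByReviewId.put(reviewId, hash);
        return !hash.equals(previous);
    }
}
```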

3. Audit log ingress boundary — A dedicated webhook_logs table records every inbound payload with its HMAC signature. Duplicates are caught at the door before entering the engine pipeline.
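
For reference, GitHub signs each delivery with HMAC-SHA256 over the raw request body and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. PRFlow performs this check in its TypeScript ingress; the sketch below shows the same verification in Java purely for consistency with the other examples here, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.HexFormat;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of GitHub webhook signature verification; not PRFlow's ingress code.
final class WebhookSignatureSketch {
    // Computes the header value GitHub would send for this body and secret.
    static String sign(String secret, byte[] rawBody) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            return "sha256=" + HexFormat.of().formatHex(mac.doFinal(rawBody));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is unavailable", e);
        }
    }

    // Compares the expected and provided signatures in constant time to avoid timing leaks.
    static boolean isValid(String secret, byte[] rawBody, String signatureHeader) {
        if (signatureHeader == null) {
            return false;
        }
        byte[] expected = sign(secret, rawBody).getBytes(StandardCharsets.UTF_8);
        byte[] provided = signatureHeader.getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, provided);
    }
}
```

The signature must be computed over the exact bytes received, before any JSON parsing or re-serialization, otherwise valid deliveries can fail verification.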

Database Schema

The persistence model uses 9 tables in a PostgreSQL transactional schema, managed entirely via Flyway migrations.

              ┌─────────────────┐
              │  organizations  │
              └────────┬────────┘
                       │ 1
                       │ *
              ┌────────┴────────┐
              │   developers    │
              └────────┬────────┘
                       │ 1
                       │ *
              ┌────────┴────────┐
              │   assignments   │
              └────────┬────────┘
                       │ *
                       │ 1
              ┌────────┴────────┐
              │  pull_requests  │
              └─────────────────┘

| Table | Core Rationale | Key Indexes |
| --- | --- | --- |
| `organizations` | Defines org boundary matching a GitHub App installation scope | `uq_organizations_name` |
| `developers` | Contributor registry: username, seniority, active review capacity, reliability | `idx_developers_org_github` |
| `repositories` | Code repos mapped to org boundaries | `idx_repos_org_github_repo` |
| `pull_requests` | Persistent PR lifecycle metadata: title, author, complexity, status | `idx_pr_repository_status` |
| `pull_request_files` | Changed files and domain scopes per PR | `idx_pr_files_pr_id` |
| `developer_file_expertise` | Organic knowledge base: touch and review scores per developer per file | `uq_dev_repo_file_path` |
| `repository_developers` | Many-to-many linking active developers to repositories | `uq_repository_developers` |
| `reviewer_assignments` | Workflow state machine: score, status, SLA level, timestamps | `idx_reviewer_assignments_status` |
| `pull_request_reviews` | Historic review memory synced from GitHub | `uq_github_review_id` |

Monorepo Structure

.
├── backend/spring-api/          # Core Java 21 / Spring Boot Orchestration Monolith
│   ├── src/main/java/           # Engines, services, events, db gateways
│   └── src/main/resources/      # Flyway migrations and config
├── integrations/
│   └── github-webhook-service/  # TypeScript / Bun Webhook Ingestion Service
├── docs/
│   ├── engines/                 # Engine design guides
│   └── database/                # Schema design guides
├── infra/                       # Infrastructure configurations
└── README.md

My Experience with GitHub Copilot

GitHub Copilot was genuinely the difference between this project staying abandoned and actually shipping.

Here's exactly how it helped, with no fluff:

Engine package refactoring

My engine classes had grown into a tangled mess — services calling each other directly, no clean event/dto/repository separation. I used Copilot to systematically refactor each engine into proper layered packages. It caught coupling violations I would have shipped as permanent tech debt and suggested the right abstraction boundaries for each layer.

Idempotency SQL patterns

Writing atomic conditional UPDATE statements for SLA state machine transitions is subtle. You need to check both the current status AND the escalation level in the same WHERE clause to prevent duplicate dispatches under concurrent execution. Copilot suggested the exact gating pattern on the first attempt. Without it, I would have spent hours debugging phantom duplicate escalation emails in testing.

Test scaffolding

I described the coverage requirements — full ingress-to-escalation, including edge cases for replay-safe review sync — and Copilot generated the 18-test suite structure. It included edge cases I hadn't thought about: what happens when a review sync arrives with an identical state hash, what happens when the escalation scanner fires on an assignment that was just reassigned by a concurrent webhook.

Decay curve implementation

Translating the 4-tier recency weight formula into correct JPA query logic (with proper TIMESTAMPDIFF comparisons relative to opened_at) was something Copilot nailed cleanly. This kind of time-arithmetic SQL is easy to get subtly wrong.

Logging hygiene refactor

The most recent commit in the repo was a Copilot-assisted refactor. It helped strip verbose debug logging across all four engines while identifying which audit-critical log lines must be preserved. This is surprisingly judgment-heavy — you don't want to remove the log line that traces a missed escalation — and Copilot handled it well by reasoning about which paths were observability-critical.

The bigger shift

The most valuable thing Copilot did wasn't any single code block. It kept me in architecture mode instead of constantly context-switching into syntax lookup. I could describe what I wanted at the system level and get working, idiomatic Java back fast enough that I never lost the thread of what I was building. That's what got a dormant project across the finish line.

What's Next

The core platform is production-complete. The roadmap has three realistic extensions:

Review Latency Insights — historic trend dashboards tracking avg cycle time per reviewer, per complexity tier, per repository
Ownership Heatmaps — map file directories to knowledge concentration indexes, expose bottlenecks and single-points-of-failure
Workload Forecasting — predictive charts estimating future reviewer capacity limits from current branch activity

Running It Yourself

Prerequisites

JDK 21
Bun
Docker

# Bootstrap PostgreSQL and Valkey
docker run -d --name prflow-db -p 5432:5432 \
  -e POSTGRES_DB=prflow_db \
  -e POSTGRES_USER=prflow_app \
  -e POSTGRES_PASSWORD=change_me_in_local_env \
  postgres:16

docker run -d --name prflow-valkey -p 6379:6379 valkey/valkey:8.0

# Start the backend (migrations run automatically via Flyway); this stays in the foreground
cd backend/spring-api
mvn spring-boot:run \
  -Dspring-boot.run.arguments="--spring.datasource.password=change_me_in_local_env"

# Start the webhook ingress gateway (in a second terminal, from the repo root)
cd integrations/github-webhook-service
bun install
bun run dev

To wire up a real GitHub App, expose port 3001 via ngrok http 3001, point your GitHub App webhook URL there, set Pull Requests: Read & Write and Metadata: Read-Only permissions, and drop your generated private key into the config folder.

Built with Java 21 · Spring Boot 3.x · TypeScript · Bun · PostgreSQL · Valkey · GitHub Copilot
