Anand Krishna

Posted on Feb 16

RealityCheck CLI — Turn Legal Contracts into Decision-Grade Risk Intelligence

#githubcopilot #python #cli #githubchallenge

This is a submission for the GitHub Copilot CLI Challenge

What I Built

RealityCheck CLI is a Python command-line tool that transforms legal contract PDFs into structured, actionable risk intelligence — not summaries, but real decision-grade analysis you can act on before signing.

Most people sign contracts they can't fully parse. RealityCheck makes the risk explicit, structured, and actionable.

The Problem

You receive a consulting agreement, employment contract, or NDA. It's 8 pages of dense legal text. You skim it, maybe worry about a clause or two, and sign anyway. Sound familiar?

The gap between "I read it" and "I understand the risk" is where people get burned — unlimited liability exposure, one-sided termination rights, overbroad IP assignments, missing payment protections.

The Solution

RealityCheck CLI takes any contract PDF and produces:

5 quantified risk metrics — Overall Risk Score (1-100), Power Imbalance (0-100), Ambiguity Index (0-100), Protection Coverage (0-100), and an original Leverage Index™ (0-100) showing your negotiation strength
Clause-by-clause classification across 7 legal categories (Non-Compete, IP Transfer, Liability, Termination, Financial Risk, Privacy, Neutral)
Signal detection — flags vague language ("sole discretion", "without notice"), one-sided rights, liability expansion, and missing protections
Missing protections scan — checks for 6 critical protections: payment timeline, termination notice, cure period, liability cap, breach notification window, IP retention
Negotiation-ready outputs — auto-generated email drafts with specific clause rewrites, ready to send to the counterparty
Contract comparison — diff two versions of a contract to catch new risks, expanded liability, or extended non-compete duration between drafts
Optional LLM enrichment — plug in Google Gemini for deeper clause classification alongside the fast heuristic engine

Architecture

PDF → [ingest] → [clauses] → [analysis] → [scoring] → [negotiation] → [output]
                                  ↕                                        ↕
                              [llm_client]                          [comparison]

The tool is modular by design — 9 internal packages wired through a single orchestration pipeline:

Module	Purpose
`ingest/`	PDF extraction via pdfplumber + header/footer removal
`clauses/`	Clause segmentation by heading detection + text normalization
`analysis/`	Heuristic classification engine + optional Gemini LLM enrichment
`scoring/`	Weighted multi-factor risk engine with category-specific weights
`negotiation/`	Email drafts + clause rewrite suggestions
`comparison/`	Smart clause matching + delta analysis with legal-domain flags
`output/`	Rich terminal rendering + JSON artifact export
`config/`	Environment-based settings (API keys, thresholds)
`cli/`	Typer-based CLI with `analyze` and `compare` commands
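To give a feel for what heading-based segmentation in `clauses/` involves, here is a minimal sketch. It is not the code from the repo: the `Clause` dataclass, `segment_clauses`, and the heading regex are hypothetical, and it assumes headings look like "1. Title" on their own line. The `C-001` style IDs mirror the clause IDs that show up in the negotiation draft.

```python
import re
from dataclasses import dataclass

# Assumption: a heading is a line such as "3. Probation".
HEADING_RE = re.compile(r"^[ \t]*(\d+)\.[ \t]+(.+?)[ \t]*$", re.MULTILINE)


@dataclass
class Clause:
    clause_id: str
    title: str
    text: str


def segment_clauses(contract_text: str) -> list[Clause]:
    """Split text into clauses, one per detected heading."""
    matches = list(HEADING_RE.finditer(contract_text))
    clauses = []
    for index, match in enumerate(matches):
        body_start = match.end()
        # The clause body runs until the next heading (or end of text).
        if index + 1 < len(matches):
            body_end = matches[index + 1].start()
        else:
            body_end = len(contract_text)
        # Collapse line breaks and repeated spaces (basic normalization).
        body = " ".join(contract_text[body_start:body_end].split())
        clauses.append(Clause(f"C-{index + 1:03d}", match.group(2), body))
    return clauses


sample = """1. Duties
The Employee shall perform duties assigned
in the sole discretion of the Employer.
2. Probation
The first 90 days are a probation period.
"""

for clause in segment_clauses(sample):
    print(clause.clause_id, clause.title, "->", clause.text)
```

Real contracts use many heading styles (ALL CAPS, "Section 4", lettered sub-clauses), so a single regex like this would only be a starting point.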

Key Design Decisions

Heuristic-first, LLM-optional — Works fully offline with regex pattern matching. No API key needed for the core analysis. LLM only enriches, never replaces.
Weighted multi-factor scoring — Not a single naive score, but 5 complementary metrics with category-specific weights (Liability: 0.22, Financial Risk: 0.20, IP Transfer: 0.17, etc.)
Actionable by default — Doesn't just flag risk — generates a negotiation email draft and clause rewrites you can actually send.
Comparison as a first-class feature — Smart clause matching (70% title similarity + 30% text similarity) with domain-specific flags like non-compete duration parsing and liability expansion detection.
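The heuristic-first idea can be sketched in a few lines. This is an illustration only, not the contents of `analysis/heuristics.py`: the `SIGNAL_PATTERNS` table, the `Signal` dataclass, and `detect_signals` are hypothetical names, and the severity for "without notice" is an assumption (the post only confirms `VAGUE_LANGUAGE` / `HIGH` for "sole discretion").

```python
import re
from dataclasses import dataclass

# Hypothetical pattern table: regex -> (signal type, severity).
SIGNAL_PATTERNS = {
    r"\bsole discretion\b": ("VAGUE_LANGUAGE", "HIGH"),
    # Severity here is an assumption made for the example.
    r"\bwithout (?:prior )?notice\b": ("VAGUE_LANGUAGE", "HIGH"),
}


@dataclass(frozen=True)
class Signal:
    signal_type: str
    severity: str
    phrase: str


def detect_signals(clause_text: str) -> list[Signal]:
    """Return one signal per pattern that appears in the clause."""
    found = []
    for pattern, (signal_type, severity) in SIGNAL_PATTERNS.items():
        match = re.search(pattern, clause_text, flags=re.IGNORECASE)
        if match:
            found.append(Signal(signal_type, severity, match.group(0).lower()))
    return found


text = "Duties may be modified in the sole discretion of the Employer."
print(detect_signals(text))
```

Because everything is plain regex matching, this style of detection runs offline with no API key, which is the point of keeping the LLM optional.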

Demo

GitHub Repository: github.com/Anandqwe/realitycheck-cli

Setup

git clone https://github.com/Anandqwe/realitycheck-cli.git
cd realitycheck-cli
python -m venv venv
.\venv\Scripts\activate
pip install -r requirements.txt

Demo 1: Analyzing a Real Employment Contract (`contract.pdf`)

The repo includes a real employment contract template (contract.pdf) — a multi-page agreement with clauses covering probation, compensation, termination, confidentiality, IP assignment, and more.

python -m realitycheck_cli analyze .\contract.pdf

Terminal Output:

╭──────────────────────── Analysis ────────────────────────╮
│ RealityCheck CLI                                         │
│ Contract: contract.pdf                                   │
│ Clauses analyzed: 19                                     │
╰──────────────────────────────────────────────────────────╯
╭─ Overall Risk Score ─╮ ╭─ Power Imbalance Score ─╮ ╭─ Leverage Index (TM) ─╮
│        40/100        │ │         41/100          │ │        54/100         │
╰──────────────────────╯ ╰─────────────────────────╯ ╰───────────────────────╯

The tool parsed all 19 clauses from the PDF, classified each one, and produced:

Overall Risk: 40/100 — Moderate risk level
Power Imbalance: 41/100 — Slightly favors the employer
Leverage Index: 54/100 — Borderline negotiation position

Category Breakdown:

              Category Risk Summary
┏━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Category       ┃ Score ┃ Weight ┃ Contribution ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━┩
│ IP_TRANSFER    │    57 │   0.17 │         9.69 │
│ TERMINATION    │    55 │   0.12 │         6.60 │
│ PRIVACY        │    52 │   0.09 │         4.68 │
│ NEUTRAL        │    36 │   0.05 │         1.80 │
└────────────────┴───────┴────────┴──────────────┘

IP Transfer and Termination clauses are the primary risk drivers. The tool detected an "Assignment (Transfer of Contract)" clause attempting broad IP assignment, and termination clauses with limited employee protections.
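The Contribution column in the table is simply score × weight, which is easy to check by hand. The snippet below reproduces it from the numbers shown; `category_rows` and `contribution` are names made up for this example. How the contributions roll up into the overall 40/100 is not covered here, so the sketch stops at the column itself.

```python
# Scores and weights copied from the Category Risk Summary table.
category_rows = [
    ("IP_TRANSFER", 57, 0.17),
    ("TERMINATION", 55, 0.12),
    ("PRIVACY", 52, 0.09),
    ("NEUTRAL", 36, 0.05),
]


def contribution(score: int, weight: float) -> float:
    """Weighted contribution of one category, rounded to two decimals."""
    return round(score * weight, 2)


contributions = {
    name: contribution(score, weight) for name, score, weight in category_rows
}

for name, value in contributions.items():
    print(f"{name:<12} {value:.2f}")
```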

Ambiguity Detection:

The tool caught a "sole discretion" clause in the Duties section — the employer can unilaterally modify duties "in the sole discretion of the Employer." This gets flagged as VAGUE_LANGUAGE with HIGH severity.

Missing Protections:

╭──────────────── Missing Protections ─────────────────╮
│ - payment timeline                                    │
│ - cure period                                         │
│ - liability cap                                       │
│ - breach notification window                          │
│ - ip retained                                         │
╰──────────────────────────────────────────────────────╯

5 out of 6 critical protections are missing from this contract — a significant gap.
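A missing-protections scan can be pictured as "one pattern per protection, report whatever never matches". The sketch below is a simplified stand-in, not the tool's real rule set: every regex in `PROTECTION_PATTERNS` is hypothetical, and the `termination_notice` key is my guess at the naming (the other five names match the JSON output).

```python
import re

# Hypothetical patterns, written for this example only.
PROTECTION_PATTERNS = {
    "payment_timeline": r"\bwithin \d+ days of (?:invoice|receipt)\b",
    "termination_notice": r"\b\d+ days'? (?:written |prior )?notice\b",
    "cure_period": r"\b(?:cure period|opportunity to cure)\b",
    "liability_cap": r"\bliability\b.{0,80}\b(?:shall not exceed|limited to)\b",
    "breach_notification_window": (
        r"\bnotify\b.{0,80}\bbreach\b.{0,80}\bwithin \d+ (?:hours|days)\b"
    ),
    "ip_retained": r"\bretains? (?:all )?(?:rights|ownership)\b",
}


def missing_protections(contract_text: str) -> list[str]:
    """List every protection whose pattern never appears in the text."""
    flags = re.IGNORECASE | re.DOTALL
    return [
        name
        for name, pattern in PROTECTION_PATTERNS.items()
        if not re.search(pattern, contract_text, flags=flags)
    ]


sample = "Either party may terminate this agreement with 30 days written notice."
print(missing_protections(sample))
```

With this sample only the termination-notice pattern matches, so the other five protections are reported as missing.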

Auto-Generated Negotiation Email:

╭──────────── Negotiation Draft (Preview) ─────────────╮
│ Subject: Proposed revisions for contract              │
│                                                       │
│ Priority clauses to discuss:                          │
│ - Assignment (Transfer Of Contract Of Employment)     │
│   (C-008, risk 57/100): Narrow IP assignment to       │
│   deliverables created under this agreement.          │
│ - Probation (C-003, risk 55/100): Require written     │
│   notice and a cure period before termination.        │
│                                                       │
│ Additional protections requested:                     │
│ - Add explicit language for: payment timeline         │
│ - Add explicit language for: liability cap            │
│ - Add explicit language for: breach notification      │
╰──────────────────────────────────────────────────────╯

This email draft is ready to copy-paste and send to the counterparty. No more staring at a contract wondering what to push back on.

Demo 2: Full Pipeline with the Demo Script

The project includes a PowerShell demo script (demo.ps1) that runs the complete pipeline — analyze both versions, then compare:

.\demo.ps1 -Baseline .\baseline.pdf -Revised .\revised.pdf

This executes 3 steps automatically:

Step 1: Analyze the baseline contract → produces risk scores, missing protections, negotiation draft
Step 2: Analyze the revised contract → same analysis on the new version
Step 3: Compare both → generates a delta report

Comparison Output:

╭─────────────────── Comparison ───────────────────────╮
│ Baseline: baseline.pdf                                │
│ Revised: revised.pdf                                  │
╰──────────────────────────────────────────────────────╯
╭─ Baseline Risk ─╮ ╭─ Revised Risk ─╮ ╭─ Risk Delta ─╮
│       17        │ │       17       │ │      +0      │
╰─────────────────╯ ╰────────────────╯ ╰──────────────╯
╭─ Baseline Leverage ─╮ ╭─ Revised Leverage ─╮ ╭─ Leverage Delta ─╮
│         60          │ │         60         │ │        +0        │
╰─────────────────────╯ ╰────────────────────╯ ╰──────────────────╯

The comparison engine uses smart clause matching (70% title similarity + 30% text similarity) to pair clauses across versions and flag:

NEW_RISK — new high-risk clauses or risk increases ≥20 points
EXPANDED_LIABILITY — new liability expansion language detected
EXTENDED_NON_COMPETE — duration increases (parses days/months/years)
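Two of these flags are simple enough to sketch. The example below assumes a matched pair of clauses with known risk scores; `duration_in_days`, `comparison_flags`, and the day conversions (30 days per month, 365 per year) are assumptions for illustration, not the comparison module's actual code. `EXPANDED_LIABILITY` is left out because its patterns are not described in enough detail to imitate.

```python
import re
from typing import Optional

# Approximate conversions, assumed for comparison purposes only.
DAYS_PER_UNIT = {"day": 1, "month": 30, "year": 365}
DURATION_RE = re.compile(r"(\d+)\s*(day|month|year)s?", re.IGNORECASE)


def duration_in_days(clause_text: str) -> Optional[int]:
    """Return the first duration found in the text, converted to days."""
    match = DURATION_RE.search(clause_text)
    if match is None:
        return None
    return int(match.group(1)) * DAYS_PER_UNIT[match.group(2).lower()]


def comparison_flags(
    baseline_text: str, revised_text: str, baseline_risk: int, revised_risk: int
) -> list[str]:
    """Flag a matched clause pair for risk jumps and longer durations."""
    flags = []
    if revised_risk - baseline_risk >= 20:  # threshold stated in the post
        flags.append("NEW_RISK")
    old_days = duration_in_days(baseline_text)
    new_days = duration_in_days(revised_text)
    if old_days is not None and new_days is not None and new_days > old_days:
        flags.append("EXTENDED_NON_COMPETE")
    return flags


print(
    comparison_flags(
        "Non-compete applies for 6 months after termination.",
        "Non-compete applies for 2 years after termination.",
        baseline_risk=40,
        revised_risk=65,
    )
)
```

Converting everything to days first is what lets "6 months" and "2 years" be compared directly.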

Demo 3: JSON Artifact Export

Every analysis produces structured JSON artifacts for downstream workflows:

python -m realitycheck_cli analyze .\contract.pdf --json-output .\artifacts\contract.analysis.json

{
  "summary": {
    "overall_risk_score": 40,
    "power_imbalance_score": 41,
    "ambiguity_index": 5,
    "protection_coverage_score": 15,
    "leverage_index": 54,
    "missing_protections": [
      "payment_timeline",
      "cure_period",
      "liability_cap",
      "breach_notification_window",
      "ip_retained"
    ]
  },
  "negotiation_email": "Subject: Proposed revisions for contract..."
}

Each clause includes its category, risk score, risk level, signals, rewrite suggestion, and negotiation points — fully structured for integration into legal tech workflows, dashboards, or review pipelines.
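As an example of a downstream consumer, here is a small sketch that loads the summary and decides whether a contract needs a closer look. The JSON is the trimmed artifact shown earlier, inlined so the snippet runs on its own; `needs_review` and its thresholds are hypothetical and not part of the tool.

```python
import json

# Trimmed copy of the artifact; a real workflow would read the file
# written by --json-output instead of an inline string.
artifact_text = """
{
  "summary": {
    "overall_risk_score": 40,
    "power_imbalance_score": 41,
    "ambiguity_index": 5,
    "protection_coverage_score": 15,
    "leverage_index": 54,
    "missing_protections": [
      "payment_timeline",
      "cure_period",
      "liability_cap",
      "breach_notification_window",
      "ip_retained"
    ]
  },
  "negotiation_email": "Subject: Proposed revisions for contract..."
}
"""

artifact = json.loads(artifact_text)
summary = artifact["summary"]


def needs_review(summary: dict, risk_threshold: int = 60, max_missing: int = 2) -> bool:
    """Hypothetical gate: high risk or too many missing protections."""
    too_risky = summary["overall_risk_score"] >= risk_threshold
    too_exposed = len(summary["missing_protections"]) > max_missing
    return too_risky or too_exposed


print("Needs review:", needs_review(summary))
print("Missing:", ", ".join(summary["missing_protections"]))
```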

Demo 4: LLM-Enriched Analysis (Optional)

For deeper analysis, plug in Google Gemini:

$env:GEMINI_API_KEY = "your-key"
python -m realitycheck_cli analyze .\contract.pdf --use-llm

The LLM enrichment adds structured signals on top of the heuristic baseline — it doesn't replace the pattern engine, it supplements it. Signals from both engines are merged with deduplication.
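Merging with deduplication can be sketched as below. This is one plausible way to do it, not the project's implementation: the dict shape of a signal and the choice of `(type, phrase)` as the dedup key are assumptions. Heuristic signals are inserted first, so they win when both engines report the same thing.

```python
def merge_signals(heuristic: list[dict], llm: list[dict]) -> list[dict]:
    """Combine both signal lists, keeping the first entry per key."""
    merged: dict[tuple[str, str], dict] = {}
    for signal in heuristic + llm:
        # Assumed dedup key: signal type plus the lowercased phrase.
        key = (signal["type"], signal["phrase"].lower())
        merged.setdefault(key, signal)
    return list(merged.values())


heuristic_signals = [
    {"type": "VAGUE_LANGUAGE", "phrase": "sole discretion", "source": "heuristic"},
]
llm_signals = [
    {"type": "VAGUE_LANGUAGE", "phrase": "Sole Discretion", "source": "llm"},
    {"type": "ONE_SIDED_RIGHT", "phrase": "employer may terminate", "source": "llm"},
]

for signal in merge_signals(heuristic_signals, llm_signals):
    print(signal["source"], signal["type"], signal["phrase"])
```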

Commands Quick Reference

Command	What it does
`python -m realitycheck_cli analyze contract.pdf`	Analyze a single contract
`python -m realitycheck_cli analyze contract.pdf --use-llm`	Analyze with Gemini enrichment
`python -m realitycheck_cli analyze contract.pdf -j output.json`	Export JSON artifacts
`python -m realitycheck_cli compare baseline.pdf revised.pdf`	Compare two contract versions
`.\demo.ps1 -Baseline baseline.pdf -Revised revised.pdf`	Run full demo pipeline
`.\demo.ps1 -Baseline baseline.pdf -Revised revised.pdf -UseLLM`	Demo with LLM

My Experience with GitHub Copilot CLI

GitHub Copilot was my co-pilot throughout this entire build — from architecture decisions to implementation details.

Scaffolding the Architecture

When I started, I had the idea but not the structure. I described what I wanted to Copilot:

"A CLI tool that parses legal PDFs, classifies clause risk, detects power imbalance, and generates negotiation outputs."

Copilot helped me design the modular architecture — separating concerns into ingest/, clauses/, analysis/, scoring/, negotiation/, comparison/, and output/ packages. This clean separation made each module independently testable and swappable.

Building the Heuristic Engine

The pattern-based classification engine in analysis/heuristics.py was built iteratively with Copilot. I'd describe a legal concept — "detect clauses that mention sole discretion or unilateral rights" — and Copilot would generate the regex patterns, signal types, and severity mappings. The result is a comprehensive heuristic engine that covers 7 clause categories, 4 signal types, and 6 missing-protection checks — all without any API calls.

The Scoring System

The weighted multi-factor scoring system was where Copilot really shone. I asked it to help design a scoring model where:

Different clause categories have different weights (liability should matter more than neutral clauses)
Vague language and missing protections should add penalty points
There should be a composite "Leverage Index" that tells you your negotiation strength

Copilot helped me implement the weighted average in scoring/risk_engine.py, the power imbalance detector in scoring/power_imbalance.py, and the Leverage Index formula in scoring/leverage.py — each with clear, auditable logic rather than a black-box score.

Rich Terminal Output

The terminal output, rendered with Rich, was built entirely in collaboration with Copilot. It generated the Rich markup for the color-coded score cards (red ≥80, yellow ≥60, green <60), the formatted tables for category breakdowns, and the negotiation draft preview panel, and helped me iterate on the layout until it felt polished and professional.
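The score-card thresholds (red ≥80, yellow ≥60, green <60) boil down to a tiny mapping function. The sketch below uses only the standard library; `score_color` is a hypothetical name, and the returned string would be handed to whatever Rich style the card uses.

```python
def score_color(score: int) -> str:
    """Map a 0-100 score to the card color described in the post."""
    if score >= 80:
        return "red"
    if score >= 60:
        return "yellow"
    return "green"


for value in (40, 60, 85):
    print(value, score_color(value))
```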

Contract Comparison Engine

The comparison module was the most complex feature. Copilot helped me implement:

Clause matching with weighted similarity scoring (70% title + 30% text, 0.55 threshold)
Non-compete duration parsing that converts between days, months, and years for accurate comparison
Liability expansion detection with domain-specific legal patterns
Risk flag generation for new risks, expanded scope, and extended terms
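The clause-matching step can be illustrated with the standard library. This sketch uses `difflib.SequenceMatcher` as a stand-in similarity measure, which may not be what the comparison module uses; only the 70/30 weighting and the 0.55 threshold come from the description above. `clause_similarity` and `match_clauses` are hypothetical names.

```python
from difflib import SequenceMatcher

TITLE_WEIGHT = 0.70
TEXT_WEIGHT = 0.30
MATCH_THRESHOLD = 0.55


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def clause_similarity(base: dict, revised: dict) -> float:
    """Blend title and text similarity with the 70/30 weighting."""
    return TITLE_WEIGHT * similarity(
        base["title"], revised["title"]
    ) + TEXT_WEIGHT * similarity(base["text"], revised["text"])


def match_clauses(baseline: list[dict], revised: list[dict]) -> list[tuple[str, str]]:
    """Pair each baseline clause with its best revised match, if good enough."""
    pairs = []
    for base in baseline:
        best = max(revised, key=lambda c: clause_similarity(base, c), default=None)
        if best is not None and clause_similarity(base, best) >= MATCH_THRESHOLD:
            pairs.append((base["title"], best["title"]))
    return pairs


baseline = [
    {"title": "Termination", "text": "Either party may terminate with 30 days notice."},
]
revised = [
    {"title": "Confidentiality", "text": "Trade secrets must be kept private."},
    {"title": "Termination", "text": "Employer may terminate without notice."},
]

print(match_clauses(baseline, revised))
```

Weighting titles this heavily means a clause whose body was rewritten but whose heading stayed the same still pairs up, which is what you want when the goal is to spot changes. A fuller version would also need to stop two baseline clauses from claiming the same revised clause.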

LLM Integration

Integrating Google Gemini as an optional enrichment layer was straightforward with Copilot's help. It generated the structured JSON system prompt, response parsing, schema validation, and the signal-merging logic that deduplicates heuristic and LLM signals by key.

Testing

Copilot helped scaffold the test suite in tests/ — unit tests for the heuristic engine, scoring calculations, LLM client mocking, and comparison logic. The tests validate that the scoring math is correct and the classification patterns work as expected.

What Copilot Changed

Without Copilot, this project would have been significantly harder to ship as a solo developer. The legal domain knowledge encoding (regex patterns for clause types, signal detection rules, scoring weights) is the kind of tedious, error-prone work that Copilot accelerates dramatically. It turned what could have been weeks of research and implementation into a focused, iterative build process where I could stay in flow and keep shipping.

The biggest impact was on code quality — Copilot consistently suggested Pydantic models for data validation, proper error handling boundaries, and clean separation of concerns. The codebase ended up more maintainable than most solo projects I've built.

Try it: github.com/Anandqwe/realitycheck-cli
