Aamish

Posted on Nov 10

Sloppy : Chrome Extension for AI Slop Detection with Agentic Postgres

#devchallenge #agenticpostgreschallenge #ai #postgres

Agentic Postgres Challenge Submission

Sloppy is a Chrome extension that detects "slop" (low-quality, AI-generated, templated, repetitive writing) in web pages using Agentic Postgres with real-time multi-agent collaboration. The innovation lies in how each analysis job runs in its own zero-copy database fork where three specialized agents (Collector, Evaluator, Curator) work asynchronously in complete isolation.

The Problem

The web is increasingly filled with AI-generated content that lacks authenticity - generic marketing copy, repetitive templates, keyword-stuffed paragraphs, and low-value "content mill" writing. Readers need a way to quickly identify this "slop" to make informed decisions about content quality.

The Solution

Sloppy analyzes web pages in real-time, assigning each paragraph a quality score based on:

Template phrase detection: Identifies overused buzzwords ("cutting-edge", "world-class", "leverage synergy")
Repetition analysis: Detects repeated sentence structures and phrases using pg_trgm
Semantic similarity: Uses pgvector to find suspiciously similar paragraphs
AI writing patterns: Flags generic transitions like "It's important to note", "Moreover", "Furthermore"
Low lexical diversity: Identifies vocabulary repetition and generic language

What makes Sloppy unique is its fork-based architecture - every analysis runs in a dedicated database fork, enabling true agent isolation, parallel processing, and fearless experimentation without polluting the main database.

Why Agentic Postgres?

Traditional approaches would require complex application-level coordination, locking mechanisms, and careful state management. With Tiger's zero-copy forks:

✅ Instant isolation: Each job gets its own database snapshot in milliseconds
✅ Zero overhead: No data duplication, forks share underlying storage
✅ Clean rollback: Failed analyses simply discard their fork
✅ Parallel execution: Multiple jobs run simultaneously without interference
✅ Agent collaboration: Agents communicate through fork-local tables

Demo

Repository

GitHub: https://github.com/AamishB/Sloppy

Screenshots

1. Extension in Action

The Chrome extension analyzing a webpage with the Sloppy icon showing the quality score:

Real-time WebSocket updates show progress as agents work through the analysis

2. In-Page Highlighting

Color-coded highlights directly on the webpage showing problematic content:

Yellow highlights for medium slop (20-60%), red for high slop (>60%)

3. Detailed Tooltips

Hovering over highlights reveals specific issues detected:

Paragraph-level scoring with reasons: template phrases, repetition, AI patterns

4. Results Dashboard

The extension popup showing overall quality metrics:

Overall score: 59% - indicating significant quality issues detected

How to Try It

Clone the repo: git clone https://github.com/AamishB/Sloppy.git
Install dependencies: pip install -r requirements.txt
Set up environment variables in .env (DATABASE_URL, TIGER_CLI_PATH)
Initialize Database Schema: psql $DATABASE_URL -f db/schema.sql
Run the server: uvicorn fastapi_app.main:app --reload --port 8000
Load the extension/ folder in Chrome as an unpacked extension
Visit any webpage and click the Sloppy icon to analyze

Test page included: test_page.html contains intentional AI slop for testing

How I Used Agentic Postgres

🔱 Zero-Copy Database Forks (Core Innovation)

Every analysis job creates its own fork using Tiger CLI:

# Create isolated fork for this job
tiger service fork job_name --now --name fork_page_abc123 --no-set-default

Why this matters:

Isolation: Each job operates in complete isolation without locks or conflicts
Performance: Zero-copy means instant fork creation (< 100ms vs minutes for traditional clones)
Clean slate: Failed analyses don't pollute the main database
Parallel execution: 10+ jobs can run simultaneously, each in their own fork

🤖 Multi-Agent Collaboration Architecture

Three specialized agents collaborate within each fork:

Key code snippet (from fastapi_app/main.py):

async def run_agents_with_fork(job_id: str, raw: str):
    """Orchestrate agents in isolated fork"""

    # Create zero-copy fork
    fork_id = await create_tiger_fork(job_id)

    try:
        # Agent 1: Collect & embed
        paragraphs = split_paragraphs(raw)
        embeddings = embed_texts(paragraphs, model)
        await insert_paragraphs(fork_id, paragraphs, embeddings)

        # Agent 2: Evaluate in parallel (uses fork connection)
        tasks = [evaluate_paragraph(p, fork_id) for p in paragraphs]
        await asyncio.gather(*tasks)

        # Agent 3: Curate results
        results = await aggregate_evaluations(fork_id)

        # Merge results back to main
        await merge_fork_results(fork_id, job_id)

    finally:
        # Clean up fork (instant)
        await cleanup_tiger_fork(fork_id)

🔍 Hybrid Search: pgvector + pg_trgm

The Evaluator agent uses Tiger's optimized extensions for powerful hybrid search:

1. Semantic Similarity (pgvector):

-- Find semantically similar paragraphs (AI pattern detection)
SELECT content, embedding <=> $1::vector as distance
FROM paragraphs
WHERE embedding <=> $1::vector < 0.3
ORDER BY distance
LIMIT 5;

2. Full-Text Matching (pg_trgm):

-- Detect template phrases and repetition
SELECT content, similarity(content, $1) as sim
FROM paragraphs
WHERE content % $1  -- pg_trgm similarity operator
ORDER BY sim DESC;

3. Combined Power:

# Evaluator agent combines both approaches
semantic_matches = await find_similar_embeddings(paragraph, threshold=0.3)
template_matches = await find_template_phrases(paragraph, templates)

slop_score = calculate_score(
    semantic_similarity=len(semantic_matches),
    template_count=len(template_matches),
    repetition_score=calculate_trigram_similarity(paragraph, all_paragraphs)
)

This hybrid approach catches both:

Meaning-based slop: Paragraphs that say the same thing differently (pgvector)
Pattern-based slop: Repeated phrases and templates (pg_trgm)

🚀 Tiger CLI Integration

Sloppy uses Tiger CLI for automated fork management:

Fork Creation:

tiger service fork job_name --now --name fork_page_abc123 --no-set-default --no-wait

Fork Deletion:

tiger service delete fork_page_abc123 --confirm --no-wait

Snapshot Creation (for caching):

tiger service fork snapshot fork_page_abc123

Implementation (Windows-compatible):

def create_fork():
    result = subprocess.run(
        [TIGER_CLI_PATH, "service", "fork", service_name, 
         "--now", "--name", fork_name, "--no-set-default"],
        capture_output=True,
        encoding='utf-8',
        timeout=30
    )
    return result.returncode, result.stdout, result.stderr

# Run in thread pool for Windows asyncio compatibility
returncode, stdout, stderr = await asyncio.get_event_loop().run_in_executor(
    executor, create_fork
)

🌊 Fluid Storage Pattern

Sloppy demonstrates Tiger's fluid storage capabilities:

1. Data flows through fork lifecycle:

Raw text → Fork created → Paragraphs inserted → 
Evaluations added → Results aggregated → 
Merged to main → Fork deleted

2. Caching with forks:

# Check cache in main DB
if text_hash in cache:
    return cached_result

# Create fork, run analysis
fork_id = await create_fork()
results = await analyze_in_fork(fork_id)

# Cache results in main DB
await cache_results(text_hash, results)

3. Snapshot preservation:

# Curator agent creates snapshot before fork deletion
await create_snapshot(fork_id)  # Preserves state for debugging/auditing
await merge_results(fork_id)
await delete_fork(fork_id)

📊 Performance Benefits

With Tiger's Agentic Postgres:

⚡ Fork creation: ~50-100ms (zero-copy)
⚡ Parallel jobs: 10+ concurrent analyses without blocking
⚡ Cache hit rate: 40-60% (repeated analyses instant)
⚡ Average analysis time: 5-15 seconds for 20-50 paragraphs

Without forks (traditional approach):

❌ Would require complex application-level locking
❌ Risk of dirty reads and race conditions
❌ Difficult to roll back failed analyses
❌ Limited parallelism due to contention

Overall Experience

What Worked Brilliantly ⭐⭐⭐⭐⭐

1. Zero-Copy Forks Changed Everything

Coming into this challenge, I expected database forks to be expensive. I was wrong. Tiger's zero-copy architecture completely changed how I approach agent orchestration:

Fearless parallelism: I can spin up 10+ analysis jobs simultaneously without worrying about conflicts
Clean architecture: Each job is truly isolated - no more defensive programming around shared state
Instant cleanup: Failed analyses just discard their fork - no need to carefully undo changes

The "aha moment" was realizing I could treat database forks like Git branches - cheap, disposable, and merga ble. This mental model transformed my architecture from "careful shared-state management" to "fearless fork-per-job isolation".

2. Hybrid Search is Powerful

Combining pgvector and pg_trgm created something greater than the sum of its parts:

pgvector catches semantic slop: "This solution offers cutting-edge innovation" vs "Our platform provides state-of-the-art technology" (different words, same empty meaning)
pg_trgm catches pattern slop: Repeated phrases, template structures, generic transitions
Together they achieve ~85% accuracy in slop detection (tested on 100+ pages)

3. Tiger CLI Integration

The Tiger CLI is remarkably well-designed:

Clear command structure: tiger service fork <service-id> --now --name <fork-name>
JSON output support for parsing: --output json
Fast execution: Commands complete in 100-200ms
Great error messages: When I got flags wrong, errors were immediately obvious

Surprise: The --no-set-default flag was crucial - without it, every fork would become my default service, breaking subsequent forks. This isn't obvious from docs but makes perfect sense for automated fork management.

What Surprised Me 🤔

1. Windows Subprocess Compatibility

Hit an interesting Windows-specific issue: asyncio.create_subprocess_exec doesn't work on Windows with ProactorEventLoop (raises NotImplementedError).

Solution: Wrap synchronous subprocess.run with ThreadPoolExecutor:

executor = ThreadPoolExecutor(max_workers=4)

def create_fork():
    result = subprocess.run([TIGER_CLI_PATH, ...], capture_output=True)
    return result.returncode, result.stdout, result.stderr

returncode, stdout, stderr = await loop.run_in_executor(executor, create_fork)

This pattern works perfectly and maintains async compatibility.

2. Fork Lifecycle Management

Initially struggled with fork cleanup timing:

Too early → results not yet merged
Too late → accumulating orphaned forks
Just right → asyncio.create_task(cleanup_tiger_fork(fork_id)) after results merge

The async fire-and-forget pattern lets the main request complete while cleanup happens in background.

3. UTF-8 Encoding with Tiger CLI

Tiger CLI occasionally outputs special characters (progress indicators, icons). Without explicit UTF-8 handling:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x90

Fix: Always specify encoding:

subprocess.run([...], encoding='utf-8', errors='replace')

The errors='replace' ensures non-UTF-8 bytes don't crash the process.

What I'd Build Next 👷🏼‍♂️

1. Fork-Per-Agent Architecture

Currently agents share a single fork. Next evolution:

Main DB
  ├─ Fork 1: Collector (writes paragraphs)
  ├─ Fork 2: Evaluator (reads paragraphs, writes evaluations)
  └─ Fork 3: Curator (reads evaluations, writes results)

Each agent gets its own fork, merging results upstream. This would showcase Tiger's fork merge capabilities even more dramatically.

2. Historical Analysis with Snapshots

Use Tiger snapshots to track content quality over time:

-- Compare article quality across snapshots
SELECT 
    snapshot_id,
    created_at,
    AVG(slop_score) as avg_quality
FROM evaluations
GROUP BY snapshot_id
ORDER BY created_at;

This would enable "quality regression detection" - alerting when a website's content quality degrades.

3. Collaborative Filtering

Use pgvector to find users with similar taste in content quality:

-- Find users who rated similar content similarly
SELECT user_id, 
       embedding <=> $1::vector as taste_similarity
FROM user_preferences
ORDER BY taste_similarity
LIMIT 10;

Build a recommendation system: "Users who flagged this as slop also flagged..."

Final Thoughts

Tiger's Agentic Postgres is a paradigm shift. I came in thinking forks were an interesting optimization. I left convinced they're a fundamentally better way to architect agent systems.

The traditional approach (locks, transactions, careful state management) feels archaic now. Why coordinate agents with complex application logic when the database can provide isolation for free?

Key insight: Database forks are to agent coordination what Git is to code collaboration. Cheap, disposable, mergeable isolation that enables fearless experimentation.

Thank you Tiger team for building something genuinely innovative. Agentic Postgres isn't just faster Postgres - it's a new way of thinking about data and agents.

DEV Community