<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Brandon Beam</title>
    <description>The latest articles on DEV Community by Brandon Beam (@brandon_beam_f4b2752055f4).</description>
    <link>https://dev.to/brandon_beam_f4b2752055f4</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3909669%2Fcf018c31-5dce-4500-bdd2-ede0d61c43bc.jpg</url>
      <title>DEV Community: Brandon Beam</title>
      <link>https://dev.to/brandon_beam_f4b2752055f4</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/brandon_beam_f4b2752055f4"/>
    <language>en</language>
    <item>
      <title>Gemma 4 Challenge: Full Stack Vibes, a public-good data refinery.</title>
      <dc:creator>Brandon Beam</dc:creator>
      <pubDate>Thu, 07 May 2026 20:56:50 +0000</pubDate>
      <link>https://dev.to/brandon_beam_f4b2752055f4/gemma-4-challenge-full-stack-vibes-a-public-good-data-refinery-2h5h</link>
      <guid>https://dev.to/brandon_beam_f4b2752055f4/gemma-4-challenge-full-stack-vibes-a-public-good-data-refinery-2h5h</guid>
      <description>&lt;p&gt;This is a submission for the &lt;strong&gt;Gemma 4 Challenge: Build with Gemma 4&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;What I Built&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;FullStackVibes (FSV)&lt;/strong&gt; is a public-good context engineering commons. It provides verified, source-linked, and provenance-rich context artifacts via a &lt;strong&gt;Precision Bundle&lt;/strong&gt; retrieval API designed for small-model agents.&lt;/p&gt;

&lt;h3&gt;The Thesis&lt;/h3&gt;

&lt;p&gt;The post-LLM software lifecycle, in which agents write code and "vibecoders" ship features, needs a shared, verified knowledge layer. Without one, every team rebuilds the same prompts and retrieval scaffolding in private. With FSV, that work compounds for everyone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Motto:&lt;/strong&gt; &lt;em&gt;If it can be vibecoded, it must be documented.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;How It Works&lt;/h3&gt;

&lt;p&gt;FSV is composed of three core parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A Verified Corpus:&lt;/strong&gt; A collection of context artifacts (e.g., prompt-injection defense, Postgres migrations, HMAC signing) that are immutable, sha256-versioned, and human-verified.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Precision Bundle API:&lt;/strong&gt; A retrieval endpoint (&lt;code&gt;POST /api/v1/handshake&lt;/code&gt;) that returns context windows optimized for small-model consumption.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemma-4 Pipeline:&lt;/strong&gt; An inference engine that decomposes every submission into a structured, typed-window format.&lt;/li&gt;
&lt;/ul&gt;
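
&lt;p&gt;To make the "immutable, sha256-versioned" idea concrete, here is a minimal content-addressing sketch. The &lt;code&gt;artifact_version&lt;/code&gt; helper and its field names are hypothetical illustrations, not FSV's actual schema:&lt;/p&gt;

```python
import hashlib
import json

def artifact_version(body: str, metadata: dict) -> str:
    """Derive an immutable version id by hashing a canonical JSON form.

    Sorting keys and fixing separators makes the serialization
    deterministic, so identical content always hashes identically.
    """
    canonical = json.dumps({"body": body, "metadata": metadata},
                           sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

v1 = artifact_version("Never interpolate user input into prompts.", {"tag": "hardening"})
v2 = artifact_version("Never interpolate user input into prompts.", {"tag": "hardening"})
v3 = artifact_version("Always sanitize tool output first.", {"tag": "hardening"})

print(len(v1))    # 64 hex characters
print(v1 == v2)   # True: same content, same version id
print(v1 == v3)   # False: any edit produces a new id
```

&lt;p&gt;Because the id is derived from the content itself, an artifact can never be silently edited in place; any change is a new version.&lt;/p&gt;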

&lt;p&gt;Read access is free and unauthenticated. The API is the product, not a teaser.&lt;/p&gt;




&lt;h2&gt;Demo&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Live Site:&lt;/strong&gt; &lt;a href="https://fullstackvibes.com" rel="noopener noreferrer"&gt;https://fullstackvibes.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search Engine Optimization:&lt;/strong&gt; Every artifact renders a full JSON-LD &lt;code&gt;@graph&lt;/code&gt;. This allows AI crawlers to see verified context without running JavaScript.&lt;/li&gt;
&lt;/ul&gt;
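
&lt;p&gt;For a sense of what a server-rendered JSON-LD &lt;code&gt;@graph&lt;/code&gt; payload looks like, here is a hedged sketch. The node types and fields are generic schema.org usage, not the exact graph FSV emits, and the ids and digest are placeholders:&lt;/p&gt;

```python
import json

# Hypothetical artifact-page graph: a TechArticle node linked to its
# author, serialized into the static HTML so crawlers that do not run
# JavaScript still see the verified context.
graph = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "TechArticle",
            "@id": "https://fullstackvibes.com/artifacts/example",
            "headline": "Prompt-injection defense checklist",
            "version": "sha256:placeholder",  # placeholder, not a real digest
            "author": {"@id": "https://fullstackvibes.com/#org"},
        },
        {
            "@type": "Organization",
            "@id": "https://fullstackvibes.com/#org",
            "name": "FullStackVibes",
        },
    ],
}

payload = json.dumps(graph, indent=2)
print(payload)  # this string is what would sit inside the page's ld+json script tag
```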

&lt;h3&gt;Try the API&lt;/h3&gt;

&lt;p&gt;You can test the retrieval API right now with &lt;code&gt;curl&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.osenv.io/api/v1/handshake &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: application/json'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
      "patternTags":  {"LIFECYCLE": ["hardening"]},
      "windowTypes":  ["CONSTRAINT", "ANTI_PATTERN"],
      "maxChars":     6000,
      "maxWindows":   12
    }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;Code&lt;/h2&gt;

&lt;p&gt;The corpus, API, and artifacts are open and inspectable.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Backend:&lt;/strong&gt; Rust (Axum) + PostgreSQL 16&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend:&lt;/strong&gt; Server-side rendered HTML/JS&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Health Check:&lt;/strong&gt; &lt;a href="https://api.osenv.io/api/v1/health" rel="noopener noreferrer"&gt;https://api.osenv.io/api/v1/health&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Docs:&lt;/strong&gt; &lt;a href="https://fullstackvibes.com/docs/api/" rel="noopener noreferrer"&gt;https://fullstackvibes.com/docs/api/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;How I Used Gemma 4&lt;/h2&gt;

&lt;p&gt;The inference pipeline runs &lt;strong&gt;Gemma 4 E4B at 8-bit&lt;/strong&gt;, hosted locally. It handles five structured-output tasks for every submission:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;SLOP_DETECTION:&lt;/strong&gt; Filters out low-utility AI text before human review.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;QUALITY_REVIEW:&lt;/strong&gt; Scores submissions across multiple axes.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;RESOLVE_SPACES:&lt;/strong&gt; Automatically clusters artifacts into relevant use-cases.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;WINDOW_INDEX:&lt;/strong&gt; Breaks down bodies into types like &lt;code&gt;GOAL&lt;/code&gt;, &lt;code&gt;CONSTRAINT&lt;/code&gt;, and &lt;code&gt;ANTI_PATTERN&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;RESOLVE_TAGS:&lt;/strong&gt; Assigns tags for &lt;code&gt;AUDIENCE&lt;/code&gt;, &lt;code&gt;RISK&lt;/code&gt;, and &lt;code&gt;LIFECYCLE&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
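
&lt;p&gt;The pipeline's value rests on its outputs staying inside a fixed vocabulary, so here is a sketch of what a &lt;code&gt;WINDOW_INDEX&lt;/code&gt; result might look like, with a validation pass over the window types. The field names and the exact type list are illustrative assumptions, not the pipeline's real schema:&lt;/p&gt;

```python
# Window types named in the post (the real vocabulary may be larger).
ALLOWED_TYPES = {"GOAL", "CONSTRAINT", "ANTI_PATTERN"}

# Hypothetical WINDOW_INDEX output for an HMAC-signing artifact.
windows = [
    {"type": "GOAL", "text": "Sign every webhook payload with HMAC-SHA256."},
    {"type": "CONSTRAINT", "text": "Compare signatures in constant time."},
    {"type": "ANTI_PATTERN", "text": "Logging the shared secret."},
]

def validate(windows):
    """Keep only windows whose type is in the allowed vocabulary,
    so malformed model output never reaches the corpus."""
    return [w for w in windows if w["type"] in ALLOWED_TYPES]

valid = validate(windows)
print(len(valid))  # 3: every window above uses an allowed type
```

&lt;p&gt;Constraining the model to a closed set of window types is what makes the output machine-retrievable rather than free text.&lt;/p&gt;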

&lt;h3&gt;Why E4B at 8-bit?&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Public-Good Economics:&lt;/strong&gt; Local hosting makes marginal costs nearly zero (just electricity). This allows us to keep the corpus free for contributors without worrying about per-token API bills.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Sweet Spot:&lt;/strong&gt; At 8-bit, the model fits on commodity hardware with no noticeable loss in structured-output quality. It delivers reliable JSON shapes and consistent tagging.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Dogfood Loop:&lt;/strong&gt; Every inference call uses previously verified context windows in its system prompt. As the corpus grows, Gemma 4’s output quality compounds.&lt;/li&gt;
&lt;/ul&gt;
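
&lt;p&gt;A rough back-of-envelope shows why 8-bit hits the commodity-hardware sweet spot. Assuming roughly 4 billion active parameters (inferred from the "E4B" name; an assumption, not a published spec), weight memory scales linearly with bytes per weight:&lt;/p&gt;

```python
# Weight-memory estimate, assuming ~4e9 effective parameters
# (an assumption read off the "E4B" name, not a published figure).
params = 4e9
bytes_per_weight = {"fp16": 2, "int8": 1, "int4": 0.5}

for fmt, b in bytes_per_weight.items():
    gb = params * b / 1e9
    print(f"{fmt}: {gb:.1f} GB")

# int8 lands around 4 GB of weights, within reach of a single
# commodity GPU or plain CPU RAM, before KV cache and activations.
```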

&lt;p&gt;Gemma 4 E4B makes the "small-model retrieval" thesis financially and technically viable.&lt;/p&gt;

&lt;p&gt;Thanks, I hope you like my project.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
      <category>vibecached</category>
    </item>
    <item>
      <title>Can the 4b Gemma 4 at 16 bit really use CLI like a 120b?</title>
      <dc:creator>Brandon Beam</dc:creator>
      <pubDate>Sat, 02 May 2026 23:15:05 +0000</pubDate>
      <link>https://dev.to/brandon_beam_f4b2752055f4/can-the-4b-gemma-4-at-16-bit-really-use-cli-like-a-120b-3d1n</link>
      <guid>https://dev.to/brandon_beam_f4b2752055f4/can-the-4b-gemma-4-at-16-bit-really-use-cli-like-a-120b-3d1n</guid>
      <description></description>
      <category>ai</category>
      <category>cli</category>
      <category>google</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
