divyaprakash D

Posted on May 24

Vitreus: Local-First Spreadsheet Intelligence with Gemma 4

#devchallenge #gemmachallenge #gemma

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

TL;DR: Vitreus is a spreadsheet agent that lets you ask natural-language questions of CSV and XLSX workbooks, then uses Gemma 4 to return an auditable JSON action manifest. The manifest can highlight rows, write values, and add formulas; Vitreus then applies those changes to a CSV snapshot or a real .xlsx file with colors and formulas preserved.

I built it because spreadsheets are where a lot of real business logic lives: budgets, project trackers, sales forecasts, invoices, HR reviews, and messy exported reports. But the moment you ask an AI assistant to "just update the sheet," you run into a trust problem: what exactly did it change, and why?

Vitreus answers that by splitting the job in two:

Gemma 4 reasons about the workbook.
A deterministic driver applies only structured, reviewable actions.

The model never mutates the workbook directly.

What I Built

Vitreus is a local-first spreadsheet intelligence tool for developers, analysts, and teams who want AI help inside sensitive workbook workflows without giving the model unrestricted control.

The current project includes:

Capability	Status	Why it matters
CSV workbook snapshots	Working	Fast, portable test format for spreadsheet data
XLSX input and output	Working	Preserves cell values, formulas, and highlight colors
One-shot `analyze --output` flow	Working	No separate "generate JSON" and "apply JSON" steps
JSON action manifests	Working	Every model action is inspectable before execution
Local Gemma 4 via Ollama	Supported	Private local inference path
Google AI Studio backend	Supported	API-key path for users without local GPU access
Deterministic fallback planner	Working	CI and demos can run without a model
Chart/receipt image payload prep	Working	Foundation for multimodal spreadsheet workflows
Bash and Nushell command references	Working	Easy manual testing on Linux/Nushell systems

At a high level:

CSV/XLSX workbook
    |
    v
WorkbookSnapshot
    |
    v
compact JSON sheet context
    |
    v
Gemma 4 planner
    |
    v
JSON manifest: highlight / write_value / formula
    |
    v
InMemoryCalcDriver
    |
    v
CSV + sidecar JSON, or XLSX with colors/formulas

This is not a chatbot bolted onto a spreadsheet. It is a small, testable agent pipeline where the model is responsible for reasoning and the application is responsible for safe execution.

Demo

Demo 1: Ask Gemma 4 to find budget problems

Input file: examples/sample_workbook.csv

It contains employee/project rows with columns like:

Name,Department,Q1_Target,Q1_Actual,Q2_Target,Q2_Actual,Score,Status,Budget,Spent,Notes
Ada Lovelace,Research,120000,128000,130000,135000,94,On Track,200000,184000,
Alan Turing,Security,90000,87000,95000,91000,72,Under Review,110000,135000,
...

Command:

uv run vitreus analyze examples/sample_workbook.csv \
  "Highlight all rows where Spent exceeds Budget, and for each over-budget row write OVER BUDGET in the Notes column" \
  --backend google

Actual output from the live API-key run:

{
  "model": {
    "primary": "gemma4:31b",
    "drafter": "gemma4:4b",
    "rationale": "Identified rows where Spent exceeds Budget: Alan Turing (Row 3) and Dennis Ritchie (Row 9). Applied highlights to these rows and updated the Notes column to 'OVER BUDGET'."
  },
  "actions": [
    {
      "type": "highlight",
      "range": "Sheet1!A3:K3",
      "color": "#f97316",
      "reason": "Spent (135000) exceeds Budget (110000)"
    },
    {
      "type": "write_value",
      "cell": "Sheet1!K3",
      "value": "OVER BUDGET",
      "reason": "Spent exceeds Budget"
    },
    {
      "type": "highlight",
      "range": "Sheet1!A9:K9",
      "color": "#f97316",
      "reason": "Spent (108000) exceeds Budget (90000)"
    },
    {
      "type": "write_value",
      "cell": "Sheet1!K9",
      "value": "OVER BUDGET",
      "reason": "Spent exceeds Budget"
    }
  ]
}

This is the core Vitreus pattern: Gemma 4 does the semantic reasoning, but the output is still machine-checkable.
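
To make "machine-checkable" concrete, here is a minimal sketch of a manifest check. It assumes only the three action types and the field names visible in the output above; `check_manifest` and its regular expressions are illustrative names and rules, not code from the Vitreus repository.

```python
import re

# Assumed vocabulary: the three action types shown in this post.
ALLOWED_TYPES = {"highlight", "write_value", "formula"}
# Matches references such as Sheet1!K3 or Sheet1!A3:K3.
REF_PATTERN = re.compile(r"^[^!]+![A-Z]+[1-9][0-9]*(:[A-Z]+[1-9][0-9]*)?$")
HEX_COLOR = re.compile(r"^#[0-9a-fA-F]{6}$")


def check_manifest(manifest: dict) -> list[str]:
    """Return a list of problems; an empty list means the manifest passed."""
    actions = manifest.get("actions")
    if not isinstance(actions, list):
        return ["'actions' must be a list"]
    problems = []
    for i, action in enumerate(actions):
        kind = action.get("type")
        if kind not in ALLOWED_TYPES:
            problems.append(f"action {i}: unsupported type {kind!r}")
            continue
        # highlight targets a range; the other two target a single cell.
        target_key = "range" if kind == "highlight" else "cell"
        target = str(action.get(target_key, ""))
        if not REF_PATTERN.match(target):
            problems.append(f"action {i}: bad {target_key} {target!r}")
        if kind == "highlight" and not HEX_COLOR.match(str(action.get("color", ""))):
            problems.append(f"action {i}: bad color")
        if kind == "formula" and not str(action.get("formula", "")).startswith("="):
            problems.append(f"action {i}: formula must start with '='")
    return problems
```

A check like this can run before anything touches the workbook, so a malformed or out-of-vocabulary action is reported instead of applied.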

Demo 2: One command, real XLSX output

CSV is useful, but CSV cannot store background colors or Excel formulas. So Vitreus supports one-shot XLSX export:

uv run vitreus analyze examples/sample_workbook.csv \
  "Highlight rows where Spent exceeds Budget in orange, write OVER BUDGET in the Notes column" \
  --backend google \
  --output /tmp/vitreus_result.xlsx

The result is a real Excel workbook. Open it in LibreOffice Calc or Excel and the highlighted rows are actually colored.
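
As a rough illustration of how a `highlight` action could become a real cell fill, the sketch below uses openpyxl (the library listed in the acknowledgements). The helper names `to_argb` and `highlight_range` are hypothetical, and the actual driver may do this differently.

```python
def to_argb(hex_color: str) -> str:
    """Convert '#f97316' to the 8-digit ARGB string openpyxl expects."""
    return "FF" + hex_color.lstrip("#").upper()


def highlight_range(workbook, ref: str, hex_color: str) -> int:
    """Fill every cell in a 'Sheet!A1:B2' reference; return the cell count."""
    # Imported lazily so the pure helper above works without openpyxl installed.
    from openpyxl.styles import PatternFill
    from openpyxl.utils.cell import range_boundaries

    sheet_name, cell_range = ref.split("!", 1)
    sheet = workbook[sheet_name]
    min_col, min_row, max_col, max_row = range_boundaries(cell_range)
    argb = to_argb(hex_color)
    fill = PatternFill(fill_type="solid", start_color=argb, end_color=argb)
    count = 0
    for row in sheet.iter_rows(min_row=min_row, max_row=max_row,
                               min_col=min_col, max_col=max_col):
        for cell in row:
            cell.fill = fill
            count += 1
    return count
```

With a workbook opened through `openpyxl.load_workbook(...)`, calling `highlight_range(wb, "Sheet1!A3:K3", "#f97316")` and then `wb.save(...)` would produce the kind of colored row described above.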

Demo 3: XLSX in, XLSX out

I also created a larger workbook for testing:

examples/test_workbook.xlsx
├── Sales        25 data rows, quota and commission formulas
├── Expenses     24 data rows, annual budget/actual formulas
└── HR_Reviews   18 data rows, rating/bonus formulas

Command:

uv run vitreus analyze examples/test_workbook.xlsx \
  "In the Expenses sheet, highlight rows where Annual_Actual exceeds Annual_Budget" \
  --backend google \
  --sheet Expenses \
  --output /tmp/vitreus_expenses_result.xlsx

Recent smoke-test result (captured from a run that wrote to a different output path, /tmp/vitreus_xlsx_test.xlsx):

{"applied": 1, "saved": "/tmp/vitreus_xlsx_test.xlsx", "errors": []}

Code

https://github.com/divyaprakash0426/vitreus

Important files:

File	Responsibility
`core/driver.py`	Workbook snapshots, CSV/XLSX loading, XLSX saving, manifest execution
`core/reasoning.py`	Gemma 4 model policy, Ollama backend, Google AI Studio backend, manifest parsing
`core/vision.py`	Image metadata and multimodal prompt payload preparation
`interfaces/cli.py`	Typer CLI commands: `models`, `analyze`, `apply-manifest`, `vision`
`examples/sample_workbook.csv`	Small CSV test dataset
`examples/test_workbook.xlsx`	Larger multi-sheet XLSX test workbook
`examples/test_commands.sh`	Bash demo/test command reference
`examples/test_commands.nu`	Nushell demo/test command reference
`tests/`	37 automated tests covering backends, CLI, driver, output, and vision

Try it:

git clone https://github.com/divyaprakash0426/vitreus.git
cd vitreus

uv sync --extra dev
uv run pytest -q
uv run vitreus models

Run with Google AI Studio:

uv sync --extra integrations
export GEMINI_API_KEY="your_key_here"

uv run vitreus analyze examples/sample_workbook.csv \
  "Flag rows where Spent exceeds Budget and write OVER BUDGET in Notes" \
  --backend google

Run locally with Ollama:

uv sync --extra integrations
ollama pull gemma4:31b

uv run vitreus analyze examples/sample_workbook.csv \
  "Summarise department-level spending risks" \
  --backend ollama

Use the lighter local model:

uv run vitreus analyze examples/sample_workbook.csv \
  "Highlight rows that need review" \
  --backend ollama \
  --model gemma4:4b

How I Used Gemma 4

Vitreus is designed around Gemma 4 31B Dense as the primary model.

Spreadsheet intelligence is a long-context reasoning task. A useful spreadsheet agent needs to:

Read many rows without losing the column semantics.
Understand user intent expressed in plain English.
Compare related fields like Budget and Spent.
Decide which cells or rows need attention.
Generate valid spreadsheet references like Sheet1!A3:K3.
Explain why each action is needed.
Return strict JSON instead of prose.

That is why I chose the 31B Dense model as the default planner. Of the Gemma 4 variants, I consider it the best fit for "read this workbook, understand the pattern, and produce a reliable action plan."

The project still supports smaller Gemma 4 models as a deliberate secondary path:

Model family	Role in Vitreus	Why
Gemma 4 31B Dense	Primary planner	Best fit for long-context workbook reasoning
Gemma 4 4B	Drafter / edge assistant	Lower latency for quick previews and constrained hardware
Gemma 4 26B MoE	Future throughput path	Useful when many independent workbook requests need efficient routing

The model policy is encoded directly in the project:

from dataclasses import dataclass


@dataclass(frozen=True)
class GemmaModelChoice:
    primary: str
    drafter: str
    rationale: str

    @classmethod
    def default(cls) -> "GemmaModelChoice":
        return cls(
            primary="gemma4:31b",
            drafter="gemma4:4b",
            rationale=(
                "Gemma 4 31B Dense is the default because Vitreus needs local, "
                "long-context workbook reasoning and stronger multimodal planning; "
                "Gemma 4 4B remains useful as a low-latency drafter on edge hardware."
            ),
        )

The prompt requires JSON only:

You are Vitreus, a spreadsheet intelligence agent running gemma4:31b.
Analyze the spreadsheet data below and respond with ONLY a valid JSON manifest.

Task: Highlight rows where Spent exceeds Budget.

Required JSON response shape:
{
  "model": {"primary": "gemma4:31b", "drafter": "gemma4:4b", "rationale": "..."},
  "actions": [
    {
      "type": "highlight|write_value|formula",
      "range": "Sheet1!A1:B2",
      "cell": "Sheet1!C2",
      "value": "...",
      "formula": "=SUM(A1:A10)",
      "color": "#f97316",
      "reason": "why this action is needed"
    }
  ]
}

That contract is the heart of the project. Gemma 4 is not asked to "edit a spreadsheet." It is asked to produce a plan that Vitreus can inspect and execute.
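
Even with a JSON-only prompt, a reply can arrive wrapped in a code fence or a stray sentence. Below is a minimal sketch of tolerant parsing, assuming only the top-level `actions` key from the shape above; `parse_manifest` is an illustrative name, and the repository's own parser in `core/reasoning.py` may behave differently.

```python
import json


def parse_manifest(raw: str) -> dict:
    """Extract the JSON manifest from a model reply."""
    # Slicing from the first '{' to the last '}' tolerates code fences
    # and short bits of prose around the object.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model reply")
    manifest = json.loads(raw[start:end + 1])  # raises ValueError on bad JSON
    if not isinstance(manifest.get("actions"), list):
        raise ValueError("manifest must contain an 'actions' list")
    return manifest
```

Failing loudly here keeps a malformed reply from ever reaching the executor.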

How It Works Internally

1. WorkbookSnapshot: turn sheets into model context

Vitreus reads CSV and XLSX files into a simple in-memory representation:

from dataclasses import dataclass
from typing import Any

@dataclass
class WorkbookSnapshot:
    sheets: dict[str, list[list[Any]]]

The snapshot can load:

WorkbookSnapshot.from_csv("scores.csv")
WorkbookSnapshot.from_xlsx("workbook.xlsx", sheet_name="Expenses")
WorkbookSnapshot.from_file("workbook.xlsx", sheet_name="Sales")

Then it exports the requested range as compact JSON:

[
  {"Name": "Alan Turing", "Budget": 110000, "Spent": 135000},
  {"Name": "Dennis Ritchie", "Budget": 90000, "Spent": 108000}
]

This gives Gemma 4 useful semantic structure: headers become keys, rows become records, and the model does not have to infer everything from raw cell coordinates.
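
The headers-to-keys step can be sketched in a few lines. This assumes the first row of a sheet holds the headers; `sheet_to_records` and `compact_context` are hypothetical helper names, not the snapshot's real methods.

```python
import json
from typing import Any, Optional


def sheet_to_records(rows: list[list[Any]]) -> list[dict[str, Any]]:
    """Treat the first row as headers and each later row as one record."""
    if not rows:
        return []
    headers = [str(h) for h in rows[0]]
    return [dict(zip(headers, row)) for row in rows[1:]]


def compact_context(rows: list[list[Any]],
                    columns: Optional[list[str]] = None) -> str:
    """Serialize records without extra whitespace, optionally keeping only some columns."""
    records = sheet_to_records(rows)
    if columns is not None:
        records = [{k: r[k] for k in columns if k in r} for r in records]
    return json.dumps(records, separators=(",", ":"))
```

One detail to keep in mind with this layout: record `i` (zero-based) corresponds to sheet row `i + 2`, because row 1 is the header row.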

2. Backends: local-first, cloud-optional

Vitreus supports three execution modes:

Backend	Command	Use case
Fallback	default	CI, demos, no model required
Ollama	`--backend ollama`	Local Gemma 4 inference
Google AI Studio	`--backend google`	API-key run when local GPU is unavailable

The backends are small adapter classes:

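# Constructors are shown for completeness; the repository versions may differ.
class OllamaBackend:
    def __init__(self, model: str) -> None:
        self.model = model

    def call(self, prompt: str) -> str:
        from ollama import chat  # lazy import: only needed for this backend
        response = chat(model=self.model, messages=[{"role": "user", "content": prompt}])
        return response.message.content

class GoogleAIBackend:
    def __init__(self, model: str, api_key: str) -> None:
        self.model = model
        self.api_key = api_key

    def call(self, prompt: str) -> str:
        from google import genai  # lazy import: only needed for this backend
        client = genai.Client(api_key=self.api_key)
        response = client.models.generate_content(model=self.model, contents=prompt)
        return response.text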

The lazy imports are intentional. The core package can be installed and tested without Ollama, Google GenAI, LibreOffice, or cloud credentials.

3. Manifest execution: structured actions only

The executor supports three action types:

{"type": "highlight", "range": "Sheet1!A3:K3", "color": "#f97316", "reason": "..."}
{"type": "write_value", "cell": "Sheet1!K3", "value": "OVER BUDGET", "reason": "..."}
{"type": "formula", "cell": "Sheet1!J11", "formula": "=SUM(J2:J10)", "reason": "..."}

Unsupported action types are rejected instead of silently ignored. This keeps the model inside a narrow, auditable capability boundary.
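
A simplified picture of that boundary, assuming sheets are held as plain lists of rows: the sketch below applies the three action types to an in-memory grid and raises on anything else. `cell_to_index` and `apply_actions` are illustrative names, not the real `InMemoryCalcDriver` API.

```python
import re

CELL = re.compile(r"^([A-Z]+)([1-9][0-9]*)$")


def cell_to_index(cell: str) -> tuple[int, int]:
    """Convert an A1-style cell such as 'K3' to zero-based (row, col)."""
    match = CELL.match(cell)
    if not match:
        raise ValueError(f"bad cell reference: {cell!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters:
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


def apply_actions(sheets: dict, actions: list[dict]) -> dict:
    """Apply actions to {sheet_name: rows} in place; return {range: color}."""
    highlights = {}
    for action in actions:
        kind = action.get("type")
        if kind == "highlight":
            highlights[action["range"]] = action["color"]
        elif kind in ("write_value", "formula"):
            sheet_name, cell = action["cell"].split("!", 1)
            row, col = cell_to_index(cell)
            rows = sheets[sheet_name]
            # Grow the grid so the target cell exists.
            while len(rows) <= row:
                rows.append([])
            rows[row].extend([""] * (col + 1 - len(rows[row])))
            rows[row][col] = action["value" if kind == "write_value" else "formula"]
        else:
            # Anything outside the known vocabulary is an error, never a no-op.
            raise ValueError(f"unsupported action type: {kind!r}")
    return highlights
```

Highlights are returned separately from cell values here because, as the next section explains, not every output format can store them in the same file.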

4. CSV vs XLSX: the boring detail that mattered

CSV cannot store background colors, and it has no notion of a formula beyond plain text. Early testing made that painfully obvious: a manifest could say "highlight this row," but a CSV output could only store text.

So Vitreus handles both formats explicitly:

Output	Behavior
`.csv`	Saves values and writes highlights to `<name>_highlights.json`
`.xlsx`	Saves values, formulas, and real cell background colors

That means users get a clear warning for CSV:

CSV format cannot store cell colors or formulas.
write_value changes are saved in result.csv
Highlight colors -> result_highlights.json
Tip: use --output result.xlsx to preserve everything in one file.

And they get real spreadsheet formatting when they choose XLSX.
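
The CSV branch can be sketched with the standard library alone. The sidecar file name follows the `<name>_highlights.json` pattern from the table above; the function name and the shape of the `highlights` mapping are assumptions for illustration.

```python
import csv
import json
from pathlib import Path


def save_csv_with_sidecar(rows: list[list], highlights: dict, output_path) -> Path:
    """Write cell values to CSV and highlights to '<name>_highlights.json'."""
    output = Path(output_path)
    # newline="" lets the csv module control line endings itself.
    with output.open("w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
    sidecar = output.with_name(f"{output.stem}_highlights.json")
    sidecar.write_text(json.dumps(highlights, indent=2), encoding="utf-8")
    return sidecar
```

Keeping the highlights in a separate JSON file means nothing the manifest asked for is silently dropped, even though the CSV itself stays plain.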

The Technical Problems That Shaped the Project

Problem 1: "AI changed my spreadsheet" is not good enough

If a spreadsheet agent directly mutates a workbook, the user has to trust a black box.

The fix was the manifest contract. Every action contains:

the action type,
the exact cell or range,
the value/formula/color,
and the reason.

That makes it possible to log, review, diff, test, or reject model output before execution.
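
For example, a reviewer-facing preview could be as small as the sketch below, which turns each action into one line of text before anything is applied. `describe_action` is a hypothetical helper and the line format is arbitrary.

```python
def describe_action(action: dict) -> str:
    """Render one manifest action as a single human-readable line."""
    kind = action.get("type", "?")
    target = action.get("range") or action.get("cell") or "?"
    if kind == "highlight":
        detail = f"fill {action.get('color', '?')}"
    elif kind == "write_value":
        detail = f"set value {action.get('value')!r}"
    elif kind == "formula":
        detail = f"set formula {action.get('formula')!r}"
    else:
        detail = "UNSUPPORTED"
    reason = action.get("reason", "no reason given")
    return f"{kind} {target}: {detail} ({reason})"
```

Printing these lines, or writing them to a log, gives a record of what was proposed and why that can be compared against the workbook afterwards.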

Problem 2: Local-first should not mean "local-only"

My preferred path is Ollama with gemma4:31b, but not every developer has a GPU available. I hit this myself while testing without access to my GPU setup.

So Vitreus supports both:

# Local
uv run vitreus analyze examples/sample_workbook.csv "..." --backend ollama

# API key
uv run vitreus analyze examples/sample_workbook.csv "..." --backend google

The model interface stays the same. Only the backend changes.
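
One way to express "same interface, different backend" is a structural type with a single `call` method, matching the adapter classes shown earlier. The sketch below is an assumption about how that could look: `Backend`, `CannedBackend`, and `plan` are hypothetical names, and the prompt text is abbreviated.

```python
import json
from typing import Protocol


class Backend(Protocol):
    """Anything with call(prompt) -> str can act as a planner backend."""

    def call(self, prompt: str) -> str: ...


class CannedBackend:
    """Hypothetical stand-in that returns a fixed reply, useful in tests."""

    def __init__(self, reply: str) -> None:
        self.reply = reply

    def call(self, prompt: str) -> str:
        return self.reply


def plan(backend: Backend, task: str, context_json: str) -> dict:
    """Build the prompt once and hand it to whichever backend was chosen."""
    prompt = (
        f"Task: {task}\n"
        f"Data: {context_json}\n"
        "Respond with ONLY a valid JSON manifest."
    )
    return json.loads(backend.call(prompt))
```

Because `plan` depends only on the `call` method, swapping Ollama for Google AI Studio (or a canned reply in a test) does not change the calling code.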

Problem 3: Multi-sheet XLSX files are the real spreadsheet format

CSV was useful for early tests, but real workbooks have sheets, formulas, styles, and business structure.

The latest version added:

WorkbookSnapshot.from_xlsx(path, sheet_name="Expenses")
WorkbookSnapshot.from_xlsx(path, all_sheets=True)
WorkbookSnapshot.from_file(path, sheet_name="Sales")

The CLI now accepts .xlsx as input:

uv run vitreus analyze examples/test_workbook.xlsx \
  "In the Sales sheet, highlight reps below quota" \
  --sheet Sales \
  --output /tmp/sales_review.xlsx

Problem 4: Tests need to run without secret keys or local models

The project has 37 automated tests. They cover:

backend adapter construction,
CLI flags,
missing API key behavior,
manifest parsing,
CSV save behavior,
XLSX values,
XLSX formulas,
XLSX cell colors,
XLSX input loading,
and the multi-sheet test workbook shape.

The deterministic fallback planner is not a replacement for Gemma 4. It exists so the execution pipeline can be tested without depending on a network call or a local model.
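
To show what a model-free planner can look like, here is a minimal rule-based sketch that emits the same manifest shape for one hard-coded rule (Spent greater than Budget). It is a hypothetical example, not the repository's fallback planner, and it assumes the sheet has `Budget` and `Spent` header columns.

```python
def column_letter(index: int) -> str:
    """Convert a 1-based column index to letters: 1 -> A, 27 -> AA."""
    letters = ""
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def plan_over_budget(sheet: str, rows: list[list], color: str = "#f97316") -> dict:
    """Rule-based planner: highlight every row where Spent > Budget."""
    headers = rows[0]
    budget_col, spent_col = headers.index("Budget"), headers.index("Spent")
    last_col = column_letter(len(headers))
    actions = []
    # Data starts on sheet row 2 because row 1 holds the headers.
    for sheet_row, row in enumerate(rows[1:], start=2):
        if float(row[spent_col]) > float(row[budget_col]):
            actions.append({
                "type": "highlight",
                "range": f"{sheet}!A{sheet_row}:{last_col}{sheet_row}",
                "color": color,
                "reason": f"Spent ({row[spent_col]}) exceeds Budget ({row[budget_col]})",
            })
    return {"actions": actions}
```

Because the output is an ordinary manifest, the executor and the save paths can be exercised in CI exactly as they would be with a model-generated plan.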

Why This Is a Good Gemma 4 Use Case

Gemma 4 is doing real work here. It is not decorative.

The model is responsible for the part that is hard to encode as rules:

understanding workbook headers,
mapping natural-language requests to spreadsheet operations,
comparing values across columns,
deciding which rows need attention,
generating formulas,
and explaining each action.

The surrounding application does the parts software should do:

loading files,
constraining the action schema,
validating JSON,
applying known operations,
preserving output formats,
and keeping the workflow auditable.

That division is what makes the project useful. Gemma 4 supplies reasoning; Vitreus supplies guardrails.

Current Limitations

Vitreus is already useful for CSV/XLSX workflows, but there are areas I plan to keep improving:

Area	Current state	Next step
LibreOffice live control	Adapter planned from blueprint	Wire PyUNO to a running Calc socket
Multimodal receipts/charts	Payload prep implemented	Feed images into Gemma 4 multimodal backend
Multi-sheet reasoning	Sheet-specific input works	Add whole-workbook summarization
Formula safety	Formula strings are written	Add formula linting and policy controls
Review UI	Terminal-first	Add a small manifest review screen

The design intentionally keeps these as separable layers. The workbook reader, reasoning engine, and manifest executor can evolve independently.

What I Learned

The biggest lesson was that a good spreadsheet agent is less about "letting AI use Excel" and more about designing a trustworthy boundary between reasoning and execution.

Gemma 4 is capable enough to understand messy tabular context and produce useful spreadsheet plans. But the application still needs to say:

here is the allowed action vocabulary,
here is the exact JSON shape,
here is how output will be applied,
and here is what happens when a format cannot represent an action.

That is the difference between an impressive demo and a tool I would trust with a real workbook.

Acknowledgements

Built with:

Gemma 4 31B Dense as the primary reasoning model
Gemma 4 4B as the drafter/edge model path
Ollama for local model execution
Google AI Studio for API-key testing
Typer for the CLI
openpyxl for XLSX input/output
pytest for the test suite
LibreOffice Calc as the target spreadsheet environment
