Hermes Lakehouse Engineer: An Agentic Databricks Data Engineering Teammate

#hermesagentchallenge #devchallenge #agents

Hermes Agent Challenge Submission: Build With Hermes Agent

What I Built

I built Hermes Lakehouse Engineer, an agentic Databricks data engineering experience. It turns a source contract and sample feed into a repeatable lakehouse delivery run: implementation planning, Databricks bundle artifacts, live Delta tables, quality checks, Unity Catalog-style governance metadata, lineage, release notes, and workspace evidence.

The project has two paths: a local-first fallback for demo reliability, and a live Databricks path in which a serverless job creates real bronze, silver, and gold Delta tables. The point of the build is to show Hermes Agent behaving like a practical data engineering teammate: it carries the work from intake to proof instead of stopping at a generated notebook.

Demo

The demo follows a new vendor sales feed as it becomes a governed lakehouse data product.

Demo video: docs/demo_assets/hermes_lakehouse_engineer_demo.mp4

Recommended screenshots:

Databricks job run showing TERMINATED SUCCESS
Live table list under workspace.sales
Local Hermes delivery loop output
Governance documentation packet

The local demo command is:

python scripts/run_agentic_delivery.py

It writes a timestamped local evidence packet under runs/ so reviewers can inspect exactly what the agent validated, documented, and prepared. The generated packet is ignored by git; the curated live proof is in docs/live_databricks_evidence.md.
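
As a rough illustration of what "a timestamped local evidence packet under runs/" could look like, here is a minimal sketch. It is not the code in scripts/run_agentic_delivery.py: the function name write_evidence_packet, the evidence.json file name, and the directory naming scheme are all assumptions made for this example.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_evidence_packet(evidence: dict, root: Path = Path("runs")) -> Path:
    """Write `evidence` as JSON into a timestamped directory under `root`.

    Hypothetical helper: the real script may use a different layout.
    """
    # UTC timestamp with one-second resolution, e.g. 20250101T120000Z.
    # Two runs within the same second would share (and overwrite) a directory.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    run_dir = root / stamp
    run_dir.mkdir(parents=True, exist_ok=True)

    packet_path = run_dir / "evidence.json"
    # sort_keys keeps the file stable so two packets diff cleanly.
    packet_path.write_text(json.dumps(evidence, indent=2, sort_keys=True))
    return packet_path
```

Keeping each run in its own directory means a reviewer can compare two packets side by side without either run overwriting the other.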

The live Databricks demo is:

cd databricks
databricks bundle validate --target dev
databricks bundle deploy --target dev
databricks bundle run vendor_sales_release --target dev
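
If you would rather drive the same three commands from Python, a minimal sketch follows. It assumes the Databricks CLI is installed and authenticated; the function names bundle_commands and run_release are hypothetical and are not part of the repository.

```python
import subprocess
from typing import Callable


def bundle_commands(target: str = "dev",
                    job_key: str = "vendor_sales_release") -> list[list[str]]:
    """Return the validate, deploy, and run commands in the order shown above."""
    return [
        ["databricks", "bundle", "validate", "--target", target],
        ["databricks", "bundle", "deploy", "--target", target],
        ["databricks", "bundle", "run", job_key, "--target", target],
    ]


def run_release(run: Callable[..., object] = subprocess.run,
                cwd: str = "databricks") -> None:
    """Run each command from the bundle directory, stopping at the first failure."""
    for command in bundle_commands():
        # With subprocess.run, check=True raises CalledProcessError on a
        # non-zero exit code, so deploy never runs after a failed validate.
        run(command, cwd=cwd, check=True)
```

The run parameter exists only so the sequence can be exercised with a fake runner, without touching a workspace.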

End to end, the workflow shown in the demo is:

Hermes reads the source contract and sample CSV.
Hermes plans the data engineering work.
Hermes prepares Databricks bundle, job, quality, and governance artifacts.
Hermes deploys a serverless Databricks job.
Hermes creates live bronze, silver, and gold Delta tables.
Hermes validates quality checks against the live tables.
Hermes prints data product documentation evidence for review.

Latest captured live proof:

Job result: TERMINATED SUCCESS
Tables: workspace.sales.bronze_vendor_sales_orders, workspace.sales.silver_sales_orders, workspace.sales.gold_sales_order_metrics
Quality checks: all pass with zero failing rows
Evidence packet: docs/live_databricks_evidence.md
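
To make "all pass with zero failing rows" concrete, here is a minimal sketch of a failing-row count, run locally over plain dictionaries rather than against the live Delta tables. The column names order_id and amount and the two rules are assumptions for illustration; the project's actual checks are defined in the repository.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class QualityCheck:
    """A named rule that returns True when a row is valid."""
    name: str
    is_valid: Callable[[dict], bool]


def failing_row_counts(rows: Iterable[dict],
                       checks: Iterable[QualityCheck]) -> dict[str, int]:
    """Count, per check, how many rows violate the rule."""
    materialized = list(rows)  # allow several passes over a one-shot iterable
    return {
        check.name: sum(1 for row in materialized if not check.is_valid(row))
        for check in checks
    }


def all_checks_pass(counts: dict[str, int]) -> bool:
    """A release gate passes only when every check has zero failing rows."""
    return all(count == 0 for count in counts.values())


# Hypothetical rules for a sales feed; the column names are assumptions.
CHECKS = [
    QualityCheck("order_id_present", lambda row: bool(row.get("order_id"))),
    QualityCheck("amount_non_negative", lambda row: float(row["amount"]) >= 0),
]

if __name__ == "__main__":
    sample = [
        {"order_id": "A-1", "amount": "19.99"},
        {"order_id": "A-2", "amount": "5.00"},
    ]
    counts = failing_row_counts(sample, CHECKS)
    print(counts, all_checks_pass(counts))
```

Reporting a count per check, rather than a single pass/fail flag, tells a reviewer which rule broke and how badly.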

Code

Repository: https://github.com/kenkondo/hermes-lakehouse-engineer

My Tech Stack

Hermes Agent
Databricks Asset Bundles
Databricks serverless jobs
Unity Catalog tables, comments, and governance conventions
Python
SQL
YAML
Markdown

How I Used Hermes Agent

Hermes Agent powers the project as the autonomous data engineering operator. It reads the source contract, reasons through the implementation plan, prepares pipeline and governance artifacts, deploys the Databricks job, validates live tables, and summarizes the release.

The most important Hermes capabilities are planning, tool use, multi-step reasoning, and file-aware execution. Data engineering is not a single code-generation step; it requires understanding the source, designing target tables, enforcing quality gates, documenting governance, and communicating readiness. Hermes is a strong fit because it can carry that workflow across the full local delivery loop.

The Unity Catalog documentation packet is a key part of the project. Hermes acts as both data engineer and data steward by generating table comments, column comments, data classifications, ownership metadata, lineage, and quality expectations before the asset is promoted.
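
One way such a packet could be turned into statements is sketched below. This is an illustration, not the project's generator: the function names, the example column, and the classification tag are assumptions, and the escaping assumes Databricks SQL's default backslash-escaped string literals. The COMMENT ON TABLE, ALTER COLUMN ... COMMENT, and SET TAGS forms are standard Databricks SQL.

```python
def sql_literal(text: str) -> str:
    """Quote text as a SQL string literal, backslash-escaping \\ and '."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def governance_statements(table: str,
                          table_comment: str,
                          column_comments: dict[str, str],
                          tags: dict[str, str]) -> list[str]:
    """Build comment and tag statements for one table (hypothetical helper)."""
    statements = [f"COMMENT ON TABLE {table} IS {sql_literal(table_comment)}"]

    # One statement per documented column.
    for column, comment in column_comments.items():
        statements.append(
            f"ALTER TABLE {table} ALTER COLUMN {column} COMMENT {sql_literal(comment)}"
        )

    # Classification and ownership metadata expressed as table tags.
    if tags:
        pairs = ", ".join(
            f"{sql_literal(key)} = {sql_literal(value)}" for key, value in tags.items()
        )
        statements.append(f"ALTER TABLE {table} SET TAGS ({pairs})")
    return statements
```

Generating the statements as plain strings keeps the packet reviewable: a steward can read exactly what will be applied before the asset is promoted.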

What I liked most about using Hermes here is that the agentic loop matches the real work. A data engineer does not simply write Spark code. They read the contract, make tradeoffs, create deployment artifacts, validate the result, explain what changed, and leave behind evidence that another person can trust. That is the experience this project tries to make concrete.
