What I Built
I built Hermes Lakehouse Engineer, an agentic Databricks data engineering experience. It turns a source contract and sample feed into a repeatable lakehouse delivery run: implementation planning, Databricks bundle artifacts, live Delta tables, quality checks, Unity Catalog-style governance metadata, lineage, release notes, and workspace evidence.
The project has a local-first fallback for demo reliability, and the live Databricks path creates real bronze, silver, and gold Delta tables in a serverless workspace job. The point of the build is to show Hermes Agent behaving like a practical data engineering teammate: it carries the work from intake to proof instead of stopping at a generated notebook.
Demo
The demo walks through a new vendor sales feed becoming a governed Lakehouse data product.
Demo video: docs/demo_assets/hermes_lakehouse_engineer_demo.mp4
Recommended screenshots:
- Databricks job run showing TERMINATED SUCCESS
- Live table list under workspace.sales
- Local Hermes delivery loop output
- Governance documentation packet
The local demo command is:
python scripts/run_agentic_delivery.py
It writes a timestamped local evidence packet under runs/ so reviewers can inspect exactly what the agent validated, documented, and prepared. The generated packet is ignored by git; the curated live proof is in docs/live_databricks_evidence.md.
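As a rough illustration of what a timestamped evidence packet could look like, here is a minimal sketch. It is not the code in scripts/run_agentic_delivery.py: the function name `write_evidence_packet`, the file name `evidence.json`, the JSON fields, and the sample checks are all assumptions made for this example.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_evidence_packet(base_dir, checks):
    """Write one timestamped run directory containing a JSON summary.

    Hypothetical helper: the layout and field names are assumptions,
    not the project's actual packet format.
    """
    # A UTC timestamp keeps run folders unique and sortable by name.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    run_dir = Path(base_dir) / stamp
    run_dir.mkdir(parents=True, exist_ok=True)

    summary = {
        "created_at": stamp,
        "checks": checks,
        # The run counts as passing only if no check reports failing rows.
        "all_passed": all(c["failing_rows"] == 0 for c in checks),
    }
    packet_path = run_dir / "evidence.json"
    packet_path.write_text(json.dumps(summary, indent=2), encoding="utf-8")
    return packet_path


if __name__ == "__main__":
    # Illustrative check results; a temp directory stands in for runs/.
    sample_checks = [
        {"name": "order_id_not_null", "failing_rows": 0},
        {"name": "amount_non_negative", "failing_rows": 0},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = write_evidence_packet(Path(tmp) / "runs", sample_checks)
        print(path.name)
```

Writing each run into its own directory means earlier packets are never overwritten, which is what lets a reviewer compare runs side by side.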
The live Databricks demo commands are:
cd databricks
databricks bundle validate --target dev
databricks bundle deploy --target dev
databricks bundle run vendor_sales_release --target dev
In this run:
- Hermes reads the source contract and sample CSV.
- Hermes plans the data engineering work.
- Hermes prepares Databricks bundle, job, quality, and governance artifacts.
- Hermes deploys a serverless Databricks job.
- Hermes creates live bronze, silver, and gold Delta tables.
- Hermes validates quality checks against the live tables.
- Hermes prints data product documentation evidence for review.
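The bronze, silver, and gold layering in the steps above can be sketched in plain Python. This is a simplified stand-in, not the project's Databricks job: the column names (`order_id`, `vendor`, `amount`), the sample rows, and the cleaning rules are assumptions chosen for illustration, and the real pipeline writes Delta tables rather than Python lists.

```python
import csv
import io
from collections import defaultdict

# Illustrative feed; the columns and values are assumptions for this sketch.
SAMPLE_CSV = """order_id,vendor,amount
A-1,acme,10.00
A-2,acme,25.50
,globex,7.00
A-3,globex,not_a_number
"""


def load_bronze(text):
    """Bronze: keep every source row as raw strings, with no filtering."""
    return list(csv.DictReader(io.StringIO(text)))


def to_silver(bronze_rows):
    """Silver: drop rows with no key or an unparseable amount, then cast types."""
    silver = []
    for row in bronze_rows:
        if not row["order_id"]:
            continue
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        silver.append(
            {"order_id": row["order_id"], "vendor": row["vendor"], "amount": amount}
        )
    return silver


def to_gold(silver_rows):
    """Gold: order count and total amount per vendor."""
    metrics = defaultdict(lambda: {"order_count": 0, "total_amount": 0.0})
    for row in silver_rows:
        vendor_metrics = metrics[row["vendor"]]
        vendor_metrics["order_count"] += 1
        vendor_metrics["total_amount"] += row["amount"]
    return dict(metrics)


if __name__ == "__main__":
    bronze = load_bronze(SAMPLE_CSV)
    silver = to_silver(bronze)
    gold = to_gold(silver)
    print(len(bronze), len(silver), gold)
```

Keeping bronze unfiltered preserves the original feed for audit, while the cleaning rules live only in the silver step, so they can change without re-ingesting the source.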
Latest captured live proof:
- Job result: TERMINATED SUCCESS
- Tables: workspace.sales.bronze_vendor_sales_orders, workspace.sales.silver_sales_orders, workspace.sales.gold_sales_order_metrics
- Quality checks: all pass with zero failing rows
- Evidence packet: docs/live_databricks_evidence.md
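One common way to express "zero failing rows" is to write each quality check as a predicate that matches bad rows, then count the matches. The sketch below shows that pattern under stated assumptions: an in-memory SQLite table stands in for the live Delta table, and the check names and columns (`order_id`, `amount`) are hypothetical rather than taken from the project's actual checks.

```python
import sqlite3

# Each check is a predicate describing a *bad* row; a check passes when
# no rows match. Check names and columns are illustrative assumptions.
CHECKS = {
    "order_id_not_null": "order_id IS NULL",
    "amount_non_negative": "amount < 0",
}


def run_checks(conn, table, checks):
    """Return {check_name: failing_row_count} for one table.

    The table name and predicates come from trusted configuration here;
    they are interpolated into SQL and must never be user input.
    """
    results = {}
    for name, bad_row_predicate in checks.items():
        query = f"SELECT COUNT(*) FROM {table} WHERE {bad_row_predicate}"
        results[name] = conn.execute(query).fetchone()[0]
    return results


def build_sample_table():
    """Create an in-memory stand-in for the silver table with clean rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE silver_sales_orders (order_id TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO silver_sales_orders VALUES (?, ?)",
        [("A-1", 10.0), ("A-2", 25.5)],
    )
    return conn


if __name__ == "__main__":
    conn = build_sample_table()
    results = run_checks(conn, "silver_sales_orders", CHECKS)
    print(results)
    print("PASS" if all(count == 0 for count in results.values()) else "FAIL")
```

Reporting a count per check, rather than a single pass/fail flag, gives reviewers a number they can verify against the table itself.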
Code
Repository: https://github.com/kenkondo/hermes-lakehouse-engineer
My Tech Stack
- Hermes Agent
- Databricks Asset Bundles
- Databricks serverless jobs
- Unity Catalog tables, comments, and governance conventions
- Python
- SQL
- YAML
- Markdown
How I Used Hermes Agent
Hermes Agent powers the project as the autonomous data engineering operator. It reads the source contract, reasons through the implementation plan, prepares pipeline and governance artifacts, deploys the Databricks job, validates live tables, and summarizes the release.
The most important Hermes capabilities are planning, tool use, multi-step reasoning, and file-aware execution. Data engineering is not a single code-generation step; it requires understanding the source, designing target tables, enforcing quality gates, documenting governance, and communicating readiness. Hermes is a strong fit because it can carry that workflow across the full local delivery loop.
The Unity Catalog documentation packet is a key part of the project. Hermes acts as both data engineer and data steward by generating table comments, column comments, data classifications, ownership metadata, lineage, and quality expectations before the asset is promoted.
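To make the documentation packet more concrete, here is a minimal sketch that renders governance metadata as SQL statements a reviewer could read before promotion. It assumes Databricks SQL forms for table comments, column comments, and tags (`COMMENT ON TABLE`, `ALTER TABLE ... ALTER COLUMN ... COMMENT`, `ALTER TABLE ... SET TAGS`). The helper names, the comment text, the column name, and the tag key are hypothetical, and the backslash-based escaping is an assumption that should be checked against the workspace's SQL settings.

```python
def sql_literal(text):
    """Quote text as a SQL string literal using backslash escapes.

    Assumption: the target SQL dialect treats backslash as the escape
    character inside single-quoted strings.
    """
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def governance_statements(table, table_comment, column_comments, tags):
    """Build the SQL statements that document one table.

    Hypothetical helper; identifiers are assumed to be trusted names
    from the source contract, not user input.
    """
    statements = [f"COMMENT ON TABLE {table} IS {sql_literal(table_comment)}"]
    for column, comment in column_comments.items():
        statements.append(
            f"ALTER TABLE {table} ALTER COLUMN {column} COMMENT {sql_literal(comment)}"
        )
    if tags:
        pairs = ", ".join(
            f"{sql_literal(key)} = {sql_literal(value)}" for key, value in tags.items()
        )
        statements.append(f"ALTER TABLE {table} SET TAGS ({pairs})")
    return statements


if __name__ == "__main__":
    # Illustrative metadata; only the table name comes from the live proof.
    for statement in governance_statements(
        table="workspace.sales.silver_sales_orders",
        table_comment="Cleaned vendor sales orders",
        column_comments={"order_id": "Vendor-assigned order key"},
        tags={"classification": "internal"},
    ):
        print(statement + ";")
```

Generating the statements as text first, instead of executing them directly, keeps the steward review step separate from the deployment step.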
What I liked most about using Hermes here is that the agentic loop matches the real work. A data engineer does not simply write Spark code. They read the contract, make tradeoffs, create deployment artifacts, validate the result, explain what changed, and leave behind evidence that another person can trust. That is the experience this project tries to make concrete.