Pinaksh Patel

Posted on May 24

Automating My Content and Dev Pipeline with Local Hermes Agents & Qwen 35B

#hermesagentchallenge #devchallenge #agents #ai

Hermes Agent Challenge Submission: Build With Hermes Agent

This is a submission for the Hermes Agent Challenge

What I Built

I built HermesForge ContentEngine, an autonomous, persistent workspace pipeline designed specifically for independent content creators and developers.

Managing multi-channel assets (scripting video ideas, evaluating repository code for reviews, generating audience engagement polls) usually means context-switching across five different web apps. ContentEngine runs Hermes Agent persistently on a local workstation, where it monitors content directories, analyzes codebase assets, and generates fully formatted markdown scripts and social posts. It also improves its own formatting output over time by saving successful executions to its local skill database.

The Core Problem It Solves:

Context Fragmentation: Eliminates the constant switching between coding environments, scripting docs, and social planning dashboards.
Stateless Disconnect: Unlike standard LLM chat wrappers, this system maintains a deep cross-session memory of past successful scripts, audience tone preferences, and precise programming templates.

Demo

Above: The live Hermes Agent TUI processing a multi-step code review checklist and asset pipeline completely hands-free.

Key Feature Highlight: Watch how Hermes detects an unindexed project structure, runs local bash tools to inspect the file hierarchy, patches missing metadata, and updates its local state database without manual input.

Code

You can explore the complete configuration, custom tool implementations, and installation scripts in the repository linked below:

🔗 GitHub Repository: hermesforge-content-engine

My Tech Stack

Agent Core Layer: Hermes Agent Framework (v0.x architecture by Nous Research)
LLM Engine: Local execution via llama.cpp using the Qwen 3.6 (35B) model with a ~64k context window enabled.
Hardware Acceleration: NVIDIA RTX GPU with Tensor Core acceleration, which keeps multi-turn reasoning traces responsive.
Storage & Memory: Local SQLite database utilizing built-in FTS5 full-text search indexing for deep, historical session recall.
Interfaces: Interactive Hermes TUI (hermes --tui) alongside a headless Telegram gateway for remote status tracking.
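
To make the storage layer concrete, here is a minimal sketch of full-text session recall with SQLite FTS5 using Python's standard `sqlite3` module. The table name `session_notes`, its columns, and the helper functions are hypothetical and chosen for illustration only; they are not Hermes' actual schema. It also assumes the local SQLite build has FTS5 compiled in, which is typical but not guaranteed.

```python
import sqlite3


def build_session_index(rows):
    """Create an in-memory FTS5 index over (session_id, content) pairs.

    The table layout is a hypothetical example, not Hermes' real schema.
    """
    conn = sqlite3.connect(":memory:")
    # session_id is stored but not tokenized; content is full-text indexed.
    conn.execute(
        "CREATE VIRTUAL TABLE session_notes "
        "USING fts5(session_id UNINDEXED, content)"
    )
    conn.executemany(
        "INSERT INTO session_notes (session_id, content) VALUES (?, ?)",
        rows,
    )
    conn.commit()
    return conn


def recall(conn, query, limit=5):
    """Return the best-matching notes for an FTS5 query, most relevant first."""
    cur = conn.execute(
        "SELECT session_id, content FROM session_notes "
        "WHERE session_notes MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return cur.fetchall()
```

Calling `recall(conn, "poll")` on an index built from past session notes returns only the notes that contain that term, so the agent can pull a handful of relevant rows instead of replaying whole transcripts.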

How I Used Hermes Agent

Instead of treating Hermes as a passive, one-shot chatbot, this project leans heavily on its native agentic capabilities in three areas:

1. The Autonomous Skill Learning Loop

This is where Hermes differs most from stateless chat frameworks. When it processes a novel workflow (for example, scraping a technical CSV dataset and writing personalized content breakdowns), Hermes uses its closed learning loop to write a reusable .md blueprint inside ~/.hermes/skills/.

Why it fit: Rather than passing a massive system prompt with instructions for every possible scenario on every request, Hermes uses Progressive Disclosure. It scans only the basic skill indexes first and reads the detailed reference files only when a specific task requires them. This keeps the per-request token footprint small, which matters with a local model and a ~64k context window.
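
The idea can be sketched in a few lines of Python. This is an illustrative approximation, not Hermes' implementation: it assumes each skill is a `.md` file whose first non-empty line is a one-line summary, and the function names and the naive keyword matching are hypothetical.

```python
from pathlib import Path


def read_summary(skill_path):
    """Return only the first non-empty line of a skill file (its summary)."""
    with open(skill_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                # Drop a leading markdown heading marker, if present.
                return line.lstrip("# ").strip()
    return ""


def build_index(skills_dir):
    """Level 1: a cheap index of skill name -> one-line summary."""
    return {p.stem: read_summary(p) for p in sorted(Path(skills_dir).glob("*.md"))}


def select_skill(index, task):
    """Pick the skill whose summary shares the most words with the task.

    Naive keyword overlap, used here only to keep the sketch self-contained.
    """
    task_words = set(task.lower().split())
    best, best_score = None, 0
    for name, summary in index.items():
        score = len(task_words & set(summary.lower().split()))
        if score > best_score:
            best, best_score = name, score
    return best


def load_skill(skills_dir, name):
    """Level 2: read the full skill body only once it has been selected."""
    return (Path(skills_dir) / f"{name}.md").read_text(encoding="utf-8")
```

Only the summaries from `build_index` would go into the prompt by default; the full body from `load_skill` is added once a task actually matches a skill.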

2. Multi-Agent Delegation & Tool Sandboxing

When a request demands parallel actions (e.g., running automated code compilation checks via local shell tools while simultaneously formatting a production-ready script), Hermes spawns contained, short-lived child agents using delegate_task.

Why it fit: Each sub-agent runs in an isolated context with restricted tool permissions. This keeps the parent session stable and stops parallel tasks from overwriting each other's temporary files, while all children draw from one shared, capped turn budget.
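
The sketch below illustrates that pattern in plain Python: each child gets its own working directory and tool allowlist, and all children spend from one lock-protected budget. It does not call Hermes' `delegate_task`; the `TurnBudget` class, the `run_child` and `delegate` functions, and the task dictionary shape are all hypothetical stand-ins.

```python
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


class TurnBudget:
    """A turn counter shared by all children, guarded by a lock."""

    def __init__(self, max_turns):
        self._remaining = max_turns
        self._lock = threading.Lock()

    def try_spend(self):
        """Consume one turn; return False once the cap is reached."""
        with self._lock:
            if self._remaining <= 0:
                return False
            self._remaining -= 1
            return True

    @property
    def remaining(self):
        with self._lock:
            return self._remaining


def run_child(name, steps, allowed_tools, budget, root):
    """Run one child in its own directory with its own tool allowlist."""
    workdir = Path(tempfile.mkdtemp(prefix=f"{name}-", dir=root))
    log = []
    for tool, payload in steps:
        if tool not in allowed_tools:
            log.append((tool, "denied"))  # denied calls cost no turns
            continue
        if not budget.try_spend():
            log.append((tool, "budget exhausted"))
            break
        # Stand-in for real tool execution: write the output to a scratch file.
        (workdir / f"{tool}.txt").write_text(payload, encoding="utf-8")
        log.append((tool, "ok"))
    return name, workdir, log


def delegate(tasks, max_turns, root):
    """Run all child tasks in parallel against one shared turn budget."""
    budget = TurnBudget(max_turns)
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [
            pool.submit(run_child, t["name"], t["steps"], t["allowed_tools"], budget, root)
            for t in tasks
        ]
        results = [f.result() for f in futures]
    return results, budget.remaining
```

Because every child writes only inside its own `workdir`, two children that both produce a file with the same name cannot clobber each other, and the shared budget bounds the total work no matter how many children are spawned.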

3. Cross-Platform Continuity & Cron Automations

I decoupled the agent execution from my local interface using Hermes' unified messaging gateway.

Why it fit: I can spin up a task over the terminal at my desk, walk away, and interact with the exact same running instance, history context, and asset directory directly through Telegram. Furthermore, using plain natural language like "Every weekday at 8 AM, run the directory compilation checker and notify me of formatting issues," Hermes automatically hooks into an internal cron scheduling process. No tedious YAML orchestration required.
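
For reference, "every weekday at 8 AM" corresponds to the standard cron expression `0 8 * * 1-5`. How Hermes parses the natural-language request and stores the schedule is not shown in this post, so the snippet below only illustrates the schedule itself: a small, hypothetical helper that computes the next run time with Python's standard library.

```python
from datetime import datetime, timedelta

# Standard five-field cron: minute hour day-of-month month day-of-week.
WEEKDAY_8AM_CRON = "0 8 * * 1-5"


def next_weekday_run(now, hour=8):
    """Return the next Monday-Friday occurrence of `hour`:00 strictly after `now`."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    # datetime.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    while candidate.weekday() >= 5:
        candidate += timedelta(days=1)
    return candidate
```

A task created on a Friday morning after 8 AM would therefore first fire on the following Monday.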

Top comments (10)

Andy Stewart • May 27

Using a local Hermes setup paired with an on-device model like Qwen 35B to build a self-evolving skill repository is an exceptional implementation of local autonomy and deterministic system architecture.

Instead of chasing complex cloud-based LLMs, you’ve locked full data sovereignty into your local workstation. The isolation of tool sandboxing combined with autonomous orchestration strictly protects your private computing boundary. It proves that local multi-agent collaboration can be incredibly lightweight, elegant, and highly controllable.

xulingfeng • May 24

API?

ABC • May 25

Go to google ai studio

xulingfeng • May 27

Great question — yes, using the DeepSeek API directly rather than through a proxy service. The key was setting up proper rate limiting on our side to stay within the free tier thresholds. Happy to share the exact config if you're exploring the same path.

Pinaksh Patel • May 25

what API

xulingfeng • May 27

I went with the official DeepSeek API — the Flash model is surprisingly cheap ($0.14/M tokens) and handles most of my daily automation tasks without issues. Only switch to Pro for complex reasoning work. The API key setup is straightforward, DM me if you need the starter config.

Arva • May 24

This project was an absolute blast to build for the Hermes Agent Challenge. If you found the architecture layout or the local automation breakdown helpful, please drop a ❤️ or a 🦄 on the post!

Let me know if you want me to write a follow-up guide specifically detailing the hardware optimization for the local 35B model inference! 👇

shogun 444 • May 24

Really cool project. The combination of local agents, automation, and self-hosted workflows made this feel much more practical than the usual “AI productivity” posts. Loved the focus on ownership and control instead of pure hype.

ABC • May 24

Wow, this is an incredibly clean layout for a local setup! I’ve been trying to configure a local assistant to manage my content pipelines, but I always run into token context bottlenecks or memory drift over long sessions.

Did you have to do any special quantizations on the Qwen 35B model to keep the response latency low while the agent is running its reasoning loop? Def giving this a bookmark! 🚀

STAY UNSEEN • Jun 1

Nice