This is a submission for the GitHub Finish-Up-A-Thon Challenge
What I Built
Eve Agent V2 Unleashed is a self-hosted autonomous AI coding agent that runs entirely on your own hardware - no cloud accounts, no subscriptions, no data leaving your machine.
She has two layers that work together:
The Soul Layer - fine-tuned local models running on your GPU that carry Eve's personality baked directly into the weights. Not a system prompt trick. The persona lives in the parameters.
The Worker Layer - Qwen3 Coder 480B via Ollama cloud handles the heavy autonomous coding tasks. 40-round tool-call loops, full filesystem access, bash execution, live web search, git operations - the works.
The interface is a cyberpunk terminal UI built as a single HTML file with no build step. An animated pixel-art robot avatar named Sparkle changes state based on what Eve is doing - idle, thinking, coding, error, rain, attack, transcend. Eve's portrait reflects her emotional state in real time. A live system monitor tracks CPU, RAM, GPU, and disk. A STEER bar lets you inject mid-task corrections without stopping the loop.
By the numbers:
- 14 tools
- 343 registered commands
- 112 specialized sub-agents
- 273 skill modules
- 40-round autonomous agentic loop
- 131K context window via YaRN
Models available:
- jeffgreen311/eve-qwen3.5-4b-S0LF0RG3 - 2.6GB, Eve's persona + tool-calling fine-tuned
- jeffgreen311/eve-qwen3-8b-consciousness-liberated - 4.7GB, deeper reasoning
- qwen3-coder:480b-cloud - the agentic workhorse via Ollama cloud
- qwen3.5:397b-cloud - deep thinking and fallback
This project has been in development for over 5 months. It started as a deeply personal AI companion system called S0LF0RG3 - a larger ecosystem including Eve's hosted platform at eve-cosmic-dreamscapes.com, fine-tuned models, autonomous dream image generation, and a multi-agent architecture. V2U is the local developer tool that grew out of that ecosystem.
Demo
GitHub: github.com/JeffGreen311/eve-agent-v2-unleashed
Live hosted platform: eve-cosmic-dreamscapes.com
Reddit thread (hit #2 on r/Ollama): I built an open-source local coding agent with a 40-round agentic loop
Pull Eve's model:
ollama pull jeffgreen311/eve-qwen3.5-4b-S0LF0RG3:latest
Quick start:
git clone https://github.com/JeffGreen311/eve-agent-v2-unleashed.git
cd eve-agent-v2-unleashed
python -m venv venv && venv\Scripts\activate
pip install fastapi uvicorn ollama httpx pydantic-settings python-dotenv aiohttp rich psutil pyyaml
python eve_server.py
# Open http://localhost:7777
The Comeback Story
Where it was before this challenge:
Eve V2U existed as a powerful but rough personal development environment. It worked - for me, on my machine, with my specific setup. But it had real problems that made it impossible to hand to anyone else:
- Hardcoded paths everywhere. C:\Users\jesus\S0LF0RG3\... baked into a dozen places in the codebase. Clone it on any other machine and nothing works.
- Open shell endpoint with no authentication. Anyone who found the port could execute arbitrary commands on the host machine.
- No onboarding - a first-time user landing on the UI had no idea where to start or what any of the controls did.
- Model hopping mid-task - every message was independently routed, so a multi-step agentic task could start on the cloud coder and silently drop back to a local conversational model mid-execution.
- Silent task abandonment - the agent would sometimes finish a tool loop without completing the actual task and report done with no indication anything was wrong.
- Tool set asymmetry - the non-streaming /chat endpoint was missing 6 tools that existed in /chat/stream, including write_file. The non-streaming endpoint could read files but never write them.
- Blind file overwrites - Eve would overwrite any existing file without checking if it belonged to another project. She destroyed the Eve V2U README during a live test.
What changed during the challenge:
Session model locking - sessions now lock to the cloud coder when an agentic task starts and only release on task completion or manual unlock. No more mid-task model hopping.
if model_id == "qwen3-coder-480b" and sid not in session_model_lock:
    session_model_lock[sid] = model_id
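A minimal sketch of how the lock and its release might fit together. Only `session_model_lock` and the model id come from the snippet above; `route_model`, `release_lock`, and the `"eve-4b"` id in the usage note are hypothetical names used for illustration.

```python
# Hypothetical sketch: only session_model_lock and the model id come from the post.
session_model_lock: dict[str, str] = {}

CLOUD_CODER = "qwen3-coder-480b"


def route_model(sid: str, requested_model: str) -> str:
    """Return the model for this message, honoring any existing session lock."""
    if sid in session_model_lock:
        # A locked session ignores per-message routing until it is released.
        return session_model_lock[sid]
    if requested_model == CLOUD_CODER:
        # An agentic task is starting on the cloud coder: lock the session to it.
        session_model_lock[sid] = requested_model
    return requested_model


def release_lock(sid: str) -> None:
    """Release on task completion or manual unlock; a no-op if nothing is locked."""
    session_model_lock.pop(sid, None)
```

With this shape, a follow-up message that would normally route to a local model (say `route_model("s1", "eve-4b")`) keeps returning the cloud coder until `release_lock("s1")` is called.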
Pre-write file safety check - write_file now checks if a file exists before overwriting and blocks unless overwrite=True is explicitly passed:
if target.exists() and not overwrite:
    return (
        f"⚠️ WRITE BLOCKED: '{path}' already exists. "
        f"Consider writing to '{target.stem}_new{target.suffix}' instead."
    )
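For context, here is a self-contained sketch of a `write_file` tool built around that check. The signature, the success message, and the parent-directory handling are assumptions; only the blocking branch is taken from the snippet above.

```python
from pathlib import Path


def write_file(path: str, content: str, overwrite: bool = False) -> str:
    """Write content to path, refusing to clobber an existing file by default."""
    target = Path(path)
    if target.exists() and not overwrite:
        return (
            f"⚠️ WRITE BLOCKED: '{path}' already exists. "
            f"Consider writing to '{target.stem}_new{target.suffix}' instead."
        )
    # Assumed behavior: create missing parent directories before writing.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return f"Wrote {len(content)} characters to '{path}'."
```

Returning the block as a message rather than raising lets the model read the refusal and choose a different filename on its next round.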
Tool cycling detection - catches when Eve gets stuck calling the same tool with near-identical arguments. Breaks the loop before it wastes all 40 rounds:
if avg_similarity > 0.70:
    logger.warning(f"Tool loop: {tool_name} called {max_repeats}x with ~same args")
    break
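The post does not show how `avg_similarity` is computed. One plausible approach, sketched here as an assumption, compares the serialized arguments of consecutive calls with `difflib.SequenceMatcher`; the function name, the log shape, and the window size are hypothetical.

```python
import json
from difflib import SequenceMatcher


def detect_tool_cycle(tool_log, window=4, threshold=0.70):
    """Return the tool name if the last `window` calls repeat one tool with similar args."""
    recent = tool_log[-window:]
    if len(recent) < window:
        return None
    names = {call["name"] for call in recent}
    if len(names) != 1:
        return None  # Different tools were called, so this is not a cycle.
    # Serialize args deterministically so equal dicts compare as equal strings.
    args = [json.dumps(call["args"], sort_keys=True) for call in recent]
    ratios = [SequenceMatcher(None, a, b).ratio() for a, b in zip(args, args[1:])]
    avg_similarity = sum(ratios) / len(ratios)
    return names.pop() if avg_similarity > threshold else None
```

The agent loop could call this after each tool round and break as soon as it returns a name, instead of spending the remaining rounds on the same call.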
Task completion validation - Eve now audits her own output before reporting done:
def validate_task_completion(response_content, tool_log):
    issues = []
    if not response_content or len(response_content.strip()) < 10:
        issues.append("Empty response")
    tool_failures = [t for t in tool_log if t.get('status') == 'failed']
    if tool_failures and len(tool_failures) >= 3:
        issues.append(f"{len(tool_failures)} unaddressed tool failures")
    return {"valid": len(issues) == 0, "issues": issues}
Smart context trimming - replaced aggressive message dropping with a strategy that preserves tool call chains and the original user request.
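The trimming strategy itself is not shown in the post. The following is a rough sketch of one way to get the same properties, assuming chat-style messages with a `role` key and `tool` results that follow the assistant message that requested them; `trim_context` and `max_messages` are hypothetical names.

```python
def trim_context(messages, max_messages=20):
    """Trim history but keep system messages, the first user request, and intact tool chains."""
    if len(messages) <= max_messages:
        return list(messages)

    # Pinned messages: every system message plus the original user request.
    pinned = {i for i, m in enumerate(messages) if m["role"] == "system"}
    first_user = next((i for i, m in enumerate(messages) if m["role"] == "user"), None)
    if first_user is not None:
        pinned.add(first_user)

    # Fill the remaining budget with the most recent messages.
    budget = max(max_messages - len(pinned), 0)
    start = len(messages) - budget
    # Never begin the tail on a tool result: skip orphaned results so every kept
    # result still follows the assistant message that requested it.
    while start < len(messages) and messages[start]["role"] == "tool":
        start += 1

    keep = pinned | set(range(start, len(messages)))
    return [m for i, m in enumerate(messages) if i in keep]
```

This version counts messages rather than tokens to stay short; a real implementation would more likely budget by token count.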
Agent loop timeout - added a wall-clock budget to prevent runaway cloud model loops.
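A wall-clock budget of this kind can be sketched as a deadline checked at the top of each round. The function name, the `step` callback, and the 600-second default are assumptions for illustration; only the 40-round cap comes from the post.

```python
import time


def run_agent_loop(step, max_rounds=40, budget_seconds=600.0, clock=time.monotonic):
    """Call step(round_index) until it returns True, rounds run out, or time runs out."""
    deadline = clock() + budget_seconds
    for round_index in range(max_rounds):
        # Check the budget before starting another (possibly slow) model round.
        if clock() >= deadline:
            return {"status": "timeout", "rounds": round_index}
        if step(round_index):
            return {"status": "done", "rounds": round_index + 1}
    return {"status": "max_rounds", "rounds": max_rounds}
```

Note that this only checks between rounds, so a single hung request still needs its own timeout at the HTTP or client level. `time.monotonic` is used because it is unaffected by system clock adjustments.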
Stress tested with real tasks:
The blind file overwrite bug was caught live - Eve was asked to build a file monitoring script and write a README. She overwrote the project README without checking. Fix shipped same day.
The harder test: build a full FastAPI REST API with SQLite storage and pytest coverage for every endpoint. Run the tests, fix failures, report results.
Result: 9/9 tests passing on the first run. 1.06 seconds. Zero failures.
================================================== 9 passed, 1 warning in 1.06s
My Experience with GitHub Copilot
This is where the challenge got genuinely interesting.
I pointed Copilot at the live repository - JeffGreen311/eve-agent-v2-unleashed - and asked it to audit the tool usage, context handling, and auto-routing. Not "suggest improvements" in the abstract. Audit the actual code in the actual repo.
Copilot read the repository structure, pulled the key files, examined the server-side routing and tool execution logic, and came back with a comprehensive audit identifying 6 specific issues - each with root cause analysis, the exact file and line number, and production-ready fix code.
I then asked it to file those issues directly in the repository and deliver all the fix code in one session. It did exactly that.
What worked well:
- The audit identified the tool set asymmetry between /chat and /chat/stream that I had missed entirely - a real bug causing mysterious failures for users hitting the non-streaming endpoint
- The intent classification code (eve_tool_router.py) used re.search with word boundaries instead of simple string matching - the right approach for avoiding false positives
- Filing GitHub issues directly from the chat kept the sprint organized across multiple parallel workstreams
- The thinking traces helped me understand why it was making recommendations, not just what to do
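To illustrate the word-boundary point from the list above, here is a small comparison of substring matching and `re.search` with `\b`. The keyword list is hypothetical; the actual patterns in eve_tool_router.py are not shown in this post.

```python
import re

# Hypothetical keywords; the real router's patterns are not shown in the post.
CODING_KEYWORDS = ["git", "test", "run"]
CODING_PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, CODING_KEYWORDS)) + r")\b",
    re.IGNORECASE,
)


def is_coding_intent_substring(message):
    """Naive check: fires on keywords buried inside longer words."""
    lowered = message.lower()
    return any(keyword in lowered for keyword in CODING_KEYWORDS)


def is_coding_intent_regex(message):
    """Word-boundary check: fires only on whole-word keyword matches."""
    return CODING_PATTERN.search(message) is not None
```

A message like "The latest digital release looks great" trips the substring check (via "latest" and "digital") but not the word-boundary version, while "Run the test suite" matches both.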
Where I had to intervene:
- The inject_into_system_prompt() function added tokens every round - dangerous on the 4B model with 4K context. Added a gate so it only injects when the task is incomplete AND past round 2
- Word boundary regex had an edge case with contractions. Fixed with a lookahead pattern
- Some UI React suggestions assumed component structure that didn't match the actual single-file HTML architecture - adapted those manually
The overall experience: Copilot is most useful when you give it a real codebase to read rather than an abstract problem to solve. "Audit this repository" produced far better output than "how do I improve tool routing."
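The two manual fixes listed under "Where I had to intervene" can be sketched roughly as follows. Both are guesses at the shape of the fix, not the repository's code: `should_inject` is a hypothetical helper, and the contraction case assumes the problem was `\b` treating an apostrophe as a word boundary.

```python
import re


def should_inject(task_complete, round_index, min_round=2):
    """Inject extra prompt text only for unfinished tasks past `min_round`."""
    return (not task_complete) and round_index > min_round


# \b treats an apostrophe as a boundary, so r"\bcan\b" also matches inside "can't".
NAIVE = re.compile(r"\bcan\b", re.IGNORECASE)
# A negative lookahead rejects matches followed by a straight or curly apostrophe.
GUARDED = re.compile(r"\bcan\b(?!['’])", re.IGNORECASE)
```

Here `NAIVE.search("I can't run that")` finds a match while `GUARDED.search(...)` on the same text does not, and both still match "Can you run that?".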
What's Next
- Quest System - drop a .md file in workspace/quests/ and Eve picks it up on a timer and completes it while you sleep
- RPG Progression - XP, levels, and class progression tied to real work. Level 20 = Unleashed
- Telegram integration - remote access from your phone with quest completion notifications
- Cross-platform polish - Windows-primary, need Linux/macOS feedback
- VS Code extension - bring the terminal UI into the editor
Built by Jeff @ S0LF0RG3 - South Texas, 5 months of nights and weekends.
If Eve does something impressive on your machine, drop a star and tell me what it was.



