How we designed a pipeline where users share code, personas, and workflows with a single button — and operators approve via GitHub PR.
## 1. Why We Needed a Sharing System
Xoul (Android) is a locally-running AI agent. Users can create three types of content:
- Code Store: Python utility snippets like `crypto prices`, `BMI calculator`
- Personas: System prompts defining LLM personality and expertise
- Workflows: Automated pipelines chaining prompts and code steps
The problem: all of this was trapped in each user's local SQLite database. "I made something useful — how do I share it?" was the natural next question.
### The Sharing Model Dilemma
| Approach | Pros | Cons |
|---|---|---|
| Central server upload | Simple UX | Admin burden, spam risk, server costs |
| P2P direct transfer | Decentralized | Zero discoverability, network complexity |
| GitHub PR-based | Code review, history tracking, free hosting | Users need GitHub accounts? |
GitHub PR won decisively. Code naturally becomes `.py` files, personas become `.md` files — existing code review culture applies directly. But one critical concern: can we ask non-developer users to create a GitHub account, fork, and submit PRs?
## 2. Key Design Decisions
### 2-1. Don't Ask Users for a GitHub Account
Initial design: User logs in via GitHub OAuth → fork → commit → PR.
Reality check: Many target users aren't developers. They might not know what GitHub is. Adding an OAuth flow makes UX dramatically more complex.
Final design: The server acts as a proxy.
```
User Desktop → Server /api/share → GitHub API (Server PAT) → PR created
      ↑                                                          ↓
  PR URL returned ←────────────────────────────────────────────── PR URL
```
The server holds a GitHub Personal Access Token and converts user requests into PRs. From the user's perspective, it's one "📤 Share" button. All low-level Git operations are handled server-side.
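As a rough sketch, the proxy boils down to two GitHub REST API calls: commit the shared file to a fresh branch, then open a PR against `main`. Everything below (the repo name, the branch naming scheme, the helper functions) is illustrative, not the actual Xoul server code:

```python
import base64

# Hypothetical sketch of the server-side share proxy. The endpoint paths
# follow the GitHub REST API; REPO and the helper names are assumptions.
REPO = "example-org/xoul-store"  # placeholder repository

def make_share_branch(kind: str, name: str) -> str:
    """Derive a branch name like 'share/codes/crypto-prices'."""
    slug = name.lower().replace(" ", "-")
    return f"share/{kind}/{slug}"

def build_commit_request(kind: str, name: str, content: str, contributor: str) -> dict:
    """Body for PUT /repos/{repo}/contents/{path}: commits the file to the branch."""
    path = f"{kind}/{name.lower().replace(' ', '-')}.py"
    return {
        "url": f"https://api.github.com/repos/{REPO}/contents/{path}",
        "body": {
            "message": f"Add {kind[:-1]}: {name} (shared by {contributor})",
            "content": base64.b64encode(content.encode()).decode(),
            "branch": make_share_branch(kind, name),
        },
    }

def build_pr_request(kind: str, name: str, contributor: str) -> dict:
    """Body for POST /repos/{repo}/pulls. The contributor goes in the PR body
    because the commit author is always the server's PAT account."""
    return {
        "url": f"https://api.github.com/repos/{REPO}/pulls",
        "body": {
            "title": f"[{kind}] {name}",
            "head": make_share_branch(kind, name),
            "base": "main",
            "body": f"Shared via Xoul desktop.\n\nContributor: {contributor}",
        },
    }
```

The server would send these payloads with its PAT in the `Authorization` header and return the resulting PR URL to the desktop client.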
We chose this knowing the trade-offs:
- ✅ Extremely simple UX (one button)
- ✅ No GitHub account required for users
- ⚠️ Server PAT security management needed
- ⚠️ All PRs show the server account as author (contributor identified in PR body)
### 2-2. One Repo vs Three
We debated whether codes/personas/workflows should each have their own repository.
Three repos:
- Separate permissions (code maintainer ≠ persona maintainer)
- Independent CI/CD pipelines
One repo (chosen):
- Management overhead reduced to 1/3
- Single GitHub Action builds everything
- One PR can include both code + workflow
- Contributors only need to know one repo
We followed the "start simple" principle. If scale becomes an issue, we can split later — but premature separation would triple our maintenance burden today.
```
xoul-store/
├── codes/
│   ├── finance/binance-portfolio.py
│   ├── games/arena-agent-v1.py
│   └── manifest.json
├── personas/
│   ├── research/p-001-en.md
│   └── manifest.json
├── workflows/
│   └── manifest.json
├── dist/                  ← Auto-generated by CI
│   ├── codes.json
│   └── personas.json
└── .github/workflows/build.yml
```
### 2-3. Monolith JSON vs Individual Files
Previously, `codes.json` contained all 50 codes in a single file. Try reviewing a 3000-line JSON diff in a PR — it's essentially un-reviewable.
Individual file benefits:
- Contributors add one `.py` file
- Code review works at file granularity
- Change history tracked per code
- Merge conflicts minimized
But the server wants a single JSON for the web Store catalog.
Solution: GitHub Action auto-build. On every merge, CI reads `manifest.json` + individual files and generates `dist/codes.json`. Contributors touch `.py` + one manifest line. The server fetches `dist/codes.json` from the GitHub Raw URL, with local fallback.
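A minimal sketch of what such a build step might look like, assuming a manifest schema where each entry carries a `name` and a relative `file` path (the real schema may differ):

```python
import json
from pathlib import Path

def build_catalog(src_dir: Path, out_file: Path) -> list:
    """Merge manifest.json and the individual source files in src_dir
    into a single JSON catalog, as a CI build step might do."""
    manifest = json.loads((src_dir / "manifest.json").read_text())
    catalog = []
    for entry in manifest:
        code_path = src_dir / entry["file"]  # e.g. finance/binance-portfolio.py
        # Each catalog record is the manifest entry plus the inlined source.
        catalog.append({**entry, "content": code_path.read_text()})
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text(json.dumps(catalog, indent=2, ensure_ascii=False))
    return catalog
```

Running this once per content type inside the GitHub Action would produce `dist/codes.json`, `dist/personas.json`, and so on from a single workflow.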
## 3. Code Layer Design Issues
### 3-1. used_by References — Why We Abandoned private/public
Initially, we planned private/public flags for Code Store items. Workflow-only codes would be private, standalone codes public.
The question that killed it: "What if the same code is used by multiple workflows?"
Binary classification can't express many-to-many relationships. We switched to a used_by JSON array:
```
# codes table
used_by TEXT DEFAULT '[]'  # e.g., ["Tech Trend Research", "Daily Blog"]
```
Deletion protection:
- Non-empty `used_by` → block deletion + show which workflows reference it
- Empty array → no auto-deletion
Why no GC? A code removed from a workflow isn't necessarily useless. Users might re-attach it later, or run it standalone via run_stored_code. Over-aggressive garbage collection surprises users.
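The deletion guard might look something like this; `delete_code` and the row shape are hypothetical, but the logic follows the rules above:

```python
import json

def delete_code(row: dict) -> None:
    """Refuse to delete a code while any workflow still references it.
    `row` is a sketch of a record from the codes table; used_by is the
    JSON-encoded array column described above."""
    used_by = json.loads(row.get("used_by") or "[]")
    if used_by:
        raise ValueError(
            f"Cannot delete '{row['name']}': still used by {', '.join(used_by)}"
        )
    # ...actual DELETE would happen here. Note there is deliberately no
    # auto-GC path: an empty used_by only *allows* deletion, never triggers it.
```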
### 3-2. code_name References — Eliminating Inline Duplication
Workflow code steps previously stored entire code inline:
```json
{"type": "code", "content": "import urllib.request\n...200 lines..."}
```
Same code in 3 workflows = 3 copies. Fix a bug? Find and update all 3. This violates basic database normalization.
New approach: Steps store only code_name, resolved from Code Store at runtime.
```json
{"type": "code", "code_name": "crypto prices"}
```
Backward compatibility: If code_name exists → fetch from Store. Otherwise → use legacy content. No existing workflows break.
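The resolution logic is small enough to sketch; `get_stored_code` here is a hypothetical Code Store lookup, not the actual function name:

```python
def resolve_step_code(step: dict, get_stored_code) -> str:
    """Return the source for a workflow code step, preferring the new
    code_name reference and falling back to legacy inline content."""
    if step.get("code_name"):                 # new-style: resolve from Code Store
        return get_stored_code(step["code_name"])
    return step.get("content", "")            # legacy: inline code, unchanged
```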
### 3-3. def run() Standardization — Unifying Two Worlds
Code Store's 50 codes were flat scripts (globals, no function wrapper). The workflow editor expected def run(params): signatures. Two execution models running in parallel.
Dilemma: Rewrite all 50, or support both at runtime?
Answer: both. `run_stored_code` detects the presence of `def run(` and switches between function invocation and `exec` — and we also converted all 50 codes to `def run():` signatures.
```python
# Before (flat — how does the LLM know what params to pass?)
import urllib.request, json
url = f"https://api.coingecko.com/...?ids={coins}"

# After (standardized — LLM reads signature + docstring)
def run(coins: str = "bitcoin,ethereum"):
    """
    coins: Coin IDs (default: bitcoin,ethereum)
    """
    import urllib.request, json
    url = f"https://api.coingecko.com/...?ids={coins}"
```
The biggest beneficiary is the LLM itself. With a function signature and docstring, it immediately knows what parameters to pass and in what format. With flat code, the LLM had to parse the entire script body.
## 4. Side Fix: Removing Hardcoded Model References
While working on the sharing pipeline, we discovered that the Arena agent code (`arena_agent_code.py`, `arena-agent-v1.py`) had `gpt-oss:20b` hardcoded, bypassing the 4B model configured in `config.json`. This caused unexpected memory spikes.
We removed all hardcoded model references and eliminated fallback defaults. If the config is missing, a `RuntimeError` fires immediately rather than silently loading the wrong model.
Design principle: fail-fast > silent wrong behavior. A crash with a clear error message is always better than 15GB of unexpected VRAM usage.
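A fail-fast loader along those lines, with the `config.json` path and the `model` key assumed for illustration:

```python
import json
from pathlib import Path

def load_model_name(config_path: str = "config.json") -> str:
    """Return the configured model name, or crash loudly. No fallback
    default: a missing config must never silently load the wrong model."""
    path = Path(config_path)
    if not path.exists():
        raise RuntimeError(f"config not found: {config_path}; refusing to guess a model")
    config = json.loads(path.read_text())
    model = config.get("model")
    if not model:
        raise RuntimeError("config has no 'model' entry; refusing to guess a model")
    return model
```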