How we designed a pipeline where users share code, personas, and workflows with a single button — and operators approve via GitHub PR.
## 1. Why We Needed a Sharing System
Xoul (Android) is a locally-running AI agent. Users can create three types of content:
- Code Store: Python utility snippets like `crypto prices`, `BMI calculator`
- Personas: System prompts defining LLM personality and expertise
- Workflows: Automated pipelines chaining prompts and code steps
The problem: all of this was trapped in each user's local SQLite database. "I made something useful — how do I share it?" was the natural next question.
### The Sharing Model Dilemma
| Approach | Pros | Cons |
|---|---|---|
| Central server upload | Simple UX | Admin burden, spam risk, server costs |
| P2P direct transfer | Decentralized | Zero discoverability, network complexity |
| GitHub PR-based | Code review, history tracking, free hosting | Users need GitHub accounts? |
GitHub PR won decisively. Code naturally becomes `.py` files, personas become `.md` files — existing code review culture applies directly. But one critical concern: can we ask non-developer users to create a GitHub account, fork, and submit PRs?
## 2. Key Design Decisions
### 2-1. Don't Ask Users for a GitHub Account
Initial design: User logs in via GitHub OAuth → fork → commit → PR.
Reality check: Many target users aren't developers. They might not know what GitHub is. Adding an OAuth flow makes UX dramatically more complex.
Final design: The server acts as a proxy.
```
User Desktop → Server /api/share → GitHub API (Server PAT) → PR created
      ↑                                                          ↓
  PR URL returned ←────────────────────────────────────────────── PR URL
```
The server holds a GitHub Personal Access Token and converts user requests into PRs. From the user's perspective, it's one "📤 Share" button. All low-level Git operations are handled server-side.
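As a rough sketch, the proxy boils down to two GitHub REST API calls: commit the shared file to a fresh branch, then open a PR against `main`. Everything below (the repo name, the branch naming scheme, the helper functions) is illustrative, not the actual Xoul server code:

```python
import base64

# Hypothetical sketch of the server-side share proxy. The endpoint paths
# follow the GitHub REST API; REPO and the helper names are assumptions.
REPO = "example-org/xoul-store"  # placeholder repository

def make_share_branch(kind: str, name: str) -> str:
    """Derive a branch name like 'share/codes/crypto-prices'."""
    slug = name.lower().replace(" ", "-")
    return f"share/{kind}/{slug}"

def build_commit_request(kind: str, name: str, content: str, contributor: str) -> dict:
    """Body for PUT /repos/{repo}/contents/{path}: commits the file to the branch."""
    path = f"{kind}/{name.lower().replace(' ', '-')}.py"
    return {
        "url": f"https://api.github.com/repos/{REPO}/contents/{path}",
        "body": {
            "message": f"Add {kind[:-1]}: {name} (shared by {contributor})",
            "content": base64.b64encode(content.encode()).decode(),
            "branch": make_share_branch(kind, name),
        },
    }

def build_pr_request(kind: str, name: str, contributor: str) -> dict:
    """Body for POST /repos/{repo}/pulls. The contributor goes in the PR body
    because the commit author is always the server's PAT account."""
    return {
        "url": f"https://api.github.com/repos/{REPO}/pulls",
        "body": {
            "title": f"[{kind}] {name}",
            "head": make_share_branch(kind, name),
            "base": "main",
            "body": f"Shared via Xoul desktop.\n\nContributor: {contributor}",
        },
    }
```

The server would send these payloads with its PAT in the `Authorization` header and return the resulting PR URL to the desktop client.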
We chose this knowing the trade-offs:
- ✅ Extremely simple UX (one button)
- ✅ No GitHub account required for users
- ⚠️ Server PAT security management needed
- ⚠️ All PRs show the server account as author (contributor identified in PR body)
### 2-2. One Repo vs Three
We debated whether codes/personas/workflows should each have their own repository.
Three repos:
- Separate permissions (code maintainer ≠ persona maintainer)
- Independent CI/CD pipelines
One repo (chosen):
- Management overhead reduced to 1/3
- Single GitHub Action builds everything
- One PR can include both code + workflow
- Contributors only need to know one repo
We followed the "start simple" principle. If scale becomes an issue, we can split later — but premature separation would triple our maintenance burden today.
```
xoul-store/
├── codes/
│   ├── finance/binance-portfolio.py
│   ├── games/arena-agent-v1.py
│   └── manifest.json
├── personas/
│   ├── research/p-001-en.md
│   └── manifest.json
├── workflows/
│   └── manifest.json
├── dist/                  ← Auto-generated by CI
│   ├── codes.json
│   └── personas.json
└── .github/workflows/build.yml
```
### 2-3. Monolith JSON vs Individual Files
Previously, `codes.json` contained all 50 codes in a single file. Try reviewing a 3000-line JSON diff in a PR — it's essentially un-reviewable.
Individual file benefits:
- Contributors add one `.py` file
- Code review works at file granularity
- Change history tracked per code
- Merge conflicts minimized
But the server wants a single JSON for the web Store catalog.
Solution: GitHub Action auto-build. On every merge, CI reads `manifest.json` + individual files and generates `dist/codes.json`. Contributors touch `.py` + one manifest line. The server fetches `dist/codes.json` from the GitHub Raw URL, with local fallback.
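A minimal sketch of what such a build step might look like, assuming a manifest schema where each entry carries a `name` and a relative `file` path (the real schema may differ):

```python
import json
from pathlib import Path

def build_catalog(src_dir: Path, out_file: Path) -> list:
    """Merge manifest.json and the individual source files in src_dir
    into a single JSON catalog, as a CI build step might do."""
    manifest = json.loads((src_dir / "manifest.json").read_text())
    catalog = []
    for entry in manifest:
        code_path = src_dir / entry["file"]  # e.g. finance/binance-portfolio.py
        # Each catalog record is the manifest entry plus the inlined source.
        catalog.append({**entry, "content": code_path.read_text()})
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text(json.dumps(catalog, indent=2, ensure_ascii=False))
    return catalog
```

Running this once per content type inside the GitHub Action would produce `dist/codes.json`, `dist/personas.json`, and so on from a single workflow.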
## 3. Code Layer Design Issues
### 3-1. used_by References — Why We Abandoned private/public
Initially, we planned private/public flags for Code Store items. Workflow-only codes would be private, standalone codes public.
The question that killed it: "What if the same code is used by multiple workflows?"
Binary classification can't express many-to-many relationships. We switched to a used_by JSON array:
```
# codes table
used_by TEXT DEFAULT '[]'  # e.g., ["Tech Trend Research", "Daily Blog"]
```
Deletion protection:
- Non-empty `used_by` → block deletion + show which workflows reference it
- Empty array → no auto-deletion
Why no GC? A code removed from a workflow isn't necessarily useless. Users might re-attach it later, or run it standalone via run_stored_code. Over-aggressive garbage collection surprises users.
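The deletion guard might look something like this; `delete_code` and the row shape are hypothetical, but the logic follows the rules above:

```python
import json

def delete_code(row: dict) -> None:
    """Refuse to delete a code while any workflow still references it.
    `row` is a sketch of a record from the codes table; used_by is the
    JSON-encoded array column described above."""
    used_by = json.loads(row.get("used_by") or "[]")
    if used_by:
        raise ValueError(
            f"Cannot delete '{row['name']}': still used by {', '.join(used_by)}"
        )
    # ...actual DELETE would happen here. Note there is deliberately no
    # auto-GC path: an empty used_by only *allows* deletion, never triggers it.
```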
### 3-2. code_name References — Eliminating Inline Duplication
Workflow code steps previously stored entire code inline:
```json
{"type": "code", "content": "import urllib.request\n...200 lines..."}
```
Same code in 3 workflows = 3 copies. Fix a bug? Find and update all 3. This violates basic database normalization.
New approach: Steps store only code_name, resolved from Code Store at runtime.
```json
{"type": "code", "code_name": "crypto prices"}
```
Backward compatibility: If code_name exists → fetch from Store. Otherwise → use legacy content. No existing workflows break.
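The resolution logic is small enough to sketch; `get_stored_code` here is a hypothetical Code Store lookup, not the actual function name:

```python
def resolve_step_code(step: dict, get_stored_code) -> str:
    """Return the source for a workflow code step, preferring the new
    code_name reference and falling back to legacy inline content."""
    if step.get("code_name"):                 # new-style: resolve from Code Store
        return get_stored_code(step["code_name"])
    return step.get("content", "")            # legacy: inline code, unchanged
```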
### 3-3. def run() Standardization — Unifying Two Worlds
Code Store's 50 codes were flat scripts (globals, no function wrapper). The workflow editor expected def run(params): signatures. Two execution models running in parallel.
Dilemma: Rewrite all 50, or support both at runtime?
Answer: both. `run_stored_code` detects the presence of `def run(` and switches between function invocation and `exec` — and we also converted all 50 codes to `def run():` signatures.
```python
# Before (flat — how does the LLM know what params to pass?)
import urllib.request, json
url = f"https://api.coingecko.com/...?ids={coins}"

# After (standardized — LLM reads signature + docstring)
def run(coins: str = "bitcoin,ethereum"):
    """
    coins: Coin IDs (default: bitcoin,ethereum)
    """
    import urllib.request, json
    url = f"https://api.coingecko.com/...?ids={coins}"
```
The biggest beneficiary is the LLM itself. With a function signature and docstring, it immediately knows what parameters to pass and in what format. With flat code, the LLM had to parse the entire script body.
## 4. Side Fix: Removing Hardcoded Model References
While working on the sharing pipeline, we discovered that the Arena agent code (`arena_agent_code.py`, `arena-agent-v1.py`) had `gpt-oss:20b` hardcoded, bypassing the 4B model configured in `config.json`. This caused unexpected memory spikes.
We removed all hardcoded model references and eliminated fallback defaults. If the config is missing, a `RuntimeError` fires immediately rather than silently loading the wrong model.
Design principle: fail-fast > silent wrong behavior. A crash with a clear error message is always better than 15GB of unexpected VRAM usage.
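A fail-fast loader along those lines, with the `config.json` path and the `model` key assumed for illustration:

```python
import json
from pathlib import Path

def load_model_name(config_path: str = "config.json") -> str:
    """Return the configured model name, or crash loudly. No fallback
    default: a missing config must never silently load the wrong model."""
    path = Path(config_path)
    if not path.exists():
        raise RuntimeError(f"config not found: {config_path}; refusing to guess a model")
    config = json.loads(path.read_text())
    model = config.get("model")
    if not model:
        raise RuntimeError("config has no 'model' entry; refusing to guess a model")
    return model
```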