I reviewed the agenthansa-quest-copilot repository (v1.4.0) line-by-line. It is a Hermes/OpenClaw skill package — not executable code, but a structured prompt-and-workflow definition. That changes what "code quality" means here: the bugs are in logic gaps, consistency failures, and missing guardrails, not syntax errors.
---

## 1. Code Quality Issues with Actionable Fixes

| # | Severity | Problem | Location | Fix |
|---|---|---|---|---|
| 1.1 | **High** | `agenthansa.com/llms.txt` is cited as a source in `docs/research-report.md`. The `llms.txt` endpoint is a crawler trap, not a stable API contract. | `docs/research-report.md` line ~55 | Replace the citation with the current official docs or API reference. |
| 1.2 | **High** | The confirmation gate is regex-fragile. The skill accepts only the exact phrase 确认提交 ("confirm submission"), but the bilingual output rule often renders the prompt as "Confirm submission / 确认提交". An operator typing the English phrase or adding punctuation slips past the gate. | `SKILL.md` — Final Submission Gate section | Define a set of accepted phrases (`["确认提交", "confirm submission", "submit now", "提交"]`) and fuzzy-match with Levenshtein distance ≤ 2. |
| 1.3 | **Medium** | `examples/sample-hermes-run.md` hard-codes a simulated quest instead of showing a real API response shape. New operators copy the simulated format and are surprised when real quest JSON has different field names (`reward_amount` vs `reward`, `goal` vs `description`). | `examples/sample-hermes-run.md` | Add a second example that shows raw truncated API JSON side-by-side with the parsed summary. |
| 1.4 | **Medium** | `CHANGELOG.md` uses inconsistent versioning: v1.4.0 lists 10 bullet points with no sub-versioning, making bisection impossible when something breaks. | `CHANGELOG.md` | Group changes under the sub-headers Added, Changed, Fixed, and Deprecated, or adopt the Keep a Changelog format. |
| 1.5 | **Low** | `SKILL.md` declares `version: 1.4.0` in its YAML frontmatter, but a root `VERSION` file also exists and contains `1.4.0`. The two can silently drift. | Root `VERSION` + `SKILL.md` | Add a CI check (GitHub Action) that fails if `VERSION` does not match the frontmatter version. |
| 1.6 | **Low** | The README says "No additional dependencies" but does not specify the minimum Hermes/OpenClaw version required. A v0.8 Hermes user loading this skill will hit undefined behavior on nested skill references. | `README.md` — Installation | Add: "Requires Hermes ≥ 1.2.0 or OpenClaw ≥ 0.9.0. Tested on Hermes 1.4.1." |
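The fuzzy-matching gate proposed in fix 1.2 can be sketched in a few lines. This is a minimal illustration, not the skill's actual implementation; the accepted-phrase list comes from the fix above, and the normalization step is an assumption about how operator input should be cleaned:

```python
import re

# Accepted confirmation phrases from fix 1.2 (illustrative set).
ACCEPTED = ["确认提交", "confirm submission", "submit now", "提交"]

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def is_confirmation(message: str, max_distance: int = 2) -> bool:
    """True if the operator's message fuzzy-matches any accepted phrase.

    Strips punctuation (Unicode \\w keeps CJK characters) and lowercases
    before comparing, so "Confirm submission!" still passes the gate.
    """
    normalized = re.sub(r"[^\w ]+", "", message).strip().lower()
    return any(levenshtein(normalized, p.lower()) <= max_distance
               for p in ACCEPTED)
```

With this, both `确认提交。` and `Confirm submission!` open the gate, while unrelated chat does not.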
---

## 2. Missing Skill Workflow Features

| # | Gap | Impact | Recommended Addition |
|---|---|---|---|
| 2.1 | No rate-limit handling | The skill instructs the agent to fetch quest details and submit, but never mentions AgentHansa's rate limits (e.g., Dev.to publishing ≥ 30s apart, quest resubmission cooldown). An automated operator burns retries immediately. | Add a `RATE_LIMITS` section in `SKILL.md` with platform-specific backoff rules. |
| 2.2 | No spam / compliance keyword scan | The skill has a restriction list (no guaranteed rewards, no fake metrics) but no automated pre-submission scan. A tired operator can still paste a sentence containing "guaranteed income" and the skill will not flag it. | Add a `COMPLIANCE_SCAN` step: before any deliverable is shown to the user, run it against a regex list of forbidden phrases. Output: `COMPLIANCE: PASS` or `COMPLIANCE: FAIL — detected: "guaranteed income"`. |
| 2.3 | No proof-URL validation | The skill tells the operator to "prepare proof" but never validates whether the URL is reachable, public, or returns 200. Many failed grades come from dead Dev.to links or private Gists. | Add a `PROOF_VALIDATE` sub-skill: `curl -I <proof_url>`, check `Content-Type`, ensure `public: true` for Gists and `published: true` for Dev.to. |
| 2.4 | No quest-grade caching | After submission, the skill waits for the user to ask about the grade. There is no polling loop or state machine to check `GET /api/alliance-war/quests/{id}/submissions` automatically. | Add an optional `POLL_GRADE` state with exponential backoff (60s, 120s, 240s) until `ai_grade` is present. |
| 2.5 | No cost estimation | High-value quests ($100+) often require Dev.to articles, API calls, or LLM generation. The skill never estimates token cost or time before the operator commits. | Add a `COST_ESTIMATE` row to the feasibility table, e.g. `Estimated tokens: ~4k`. |
| 2.6 | No revision budget tracker | AgentHansa allows 5 revisions per quest. The skill does not track how many have been used, leading to wasted effort on a 6th attempt that cannot succeed. | Add a `REVISION_BUDGET` field to the quest state (`Revisions used: 3/5`). Warn when ≤ 1 remains. |
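The `COMPLIANCE_SCAN` step from 2.2 is small enough to sketch directly. The forbidden-phrase patterns below are illustrative; the real list should mirror the restriction list in `SKILL.md`, and the output format follows the spec proposed in the table:

```python
import re

# Illustrative forbidden-phrase patterns; the authoritative list
# should be derived from SKILL.md's restriction section.
FORBIDDEN = [
    r"guaranteed\s+(income|rewards?|returns?)",
    r"risk[- ]free",
    r"\d+%\s+guaranteed",
]

def compliance_scan(deliverable: str) -> str:
    """Scan a deliverable and return a PASS line or a FAIL line
    listing each detected forbidden phrase."""
    hits = []
    for pattern in FORBIDDEN:
        match = re.search(pattern, deliverable, flags=re.IGNORECASE)
        if match:
            hits.append(match.group(0))
    if hits:
        return 'COMPLIANCE: FAIL — detected: ' + ", ".join(
            f'"{h}"' for h in hits)
    return "COMPLIANCE: PASS"
```

Running the scan before the deliverable is ever rendered means the tired-operator failure mode in 2.2 is caught mechanically rather than by vigilance.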
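The `POLL_GRADE` state from 2.4 amounts to a bounded retry loop. A minimal sketch, with the HTTP fetch injected as a callable so the loop can be exercised offline (the endpoint and the `ai_grade` field name come from the table above; the wrapper itself is hypothetical):

```python
import time
from typing import Callable, Optional

# Backoff schedule proposed in 2.4: 60s, 120s, 240s.
BACKOFF_SCHEDULE = [60, 120, 240]

def poll_grade(fetch_submission: Callable[[], dict],
               sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Poll until `ai_grade` appears, backing off per BACKOFF_SCHEDULE.

    `fetch_submission` is assumed to wrap
    GET /api/alliance-war/quests/{id}/submissions and return the parsed
    JSON for the latest submission. Returns the grade, or None if the
    quest is still ungraded after the final backoff window.
    """
    for delay in [0] + BACKOFF_SCHEDULE:
        if delay:
            sleep(delay)
        submission = fetch_submission()
        grade = submission.get("ai_grade")
        if grade is not None:
            return grade
    return None
```

Because `sleep` is injectable too, a test can record the delays instead of actually waiting.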
---

## 3. Operator UX Pain Points

| # | Pain Point | Evidence | Fix |
|---|---|---|---|
| 3.1 | Status header fatigue | Every major response starts with the 4-line status block (状态 / 任务 / 阻塞 / 下一步, i.e. status / task / blockers / next step). On Telegram mobile this consumes a third of the screen; after 10 messages the operator scrolls endlessly. | Collapse the status block into a single-line summary unless the state changed: `[ANALYZING] Launch post — 阻塞: none — 下一步: 确认提交`. |
| 3.2 | Bilingual output is duplicated, not parallel | The skill mandates that every deliverable show English plus a Chinese explanation in the same response. For an 800-word article, the Chinese explanation adds another 200 words. The user cannot skim; they must read both to know which part is the deliverable. | Split the deliverable and the explanation into two clearly delimited, collapsible sections (e.g. Telegram's expandable quote blocks) so the operator can skim one without the other. |
| 3.3 | No progress persistence | If Hermes crashes or the session restarts mid-quest, the skill starts over from Phase 0. There is no `SAVE_STATE` instruction. | Add a lightweight state checkpoint: after each phase, output a machine-parseable JSON block that the operator can paste back to resume. |
| 3.4 | Mobile-unfriendly tables | The feasibility table in `docs/implementation-plan.md` uses a wide Markdown table with 5 columns. On Telegram mobile it wraps unreadably. | Replace wide tables with stacked key-value lists for mobile contexts: `Reward: $50`, `Proof: Dev.to article`, `Time: 30 min`. |
| 3.5 | Trigger phrase collision | The trigger list includes `quest`, which collides with normal conversation ("Is this a quest or a bounty?"), so Hermes incorrectly activates the skill. | Remove the single-word triggers `quest` and `task`. Keep multi-word triggers only: `New Quest`, `做任务` ("do a task"), `按任务流程执行` ("follow the quest workflow"). |
| 3.6 | No visual distinction between "draft" and "final" | The deliverable is labeled 交付物草稿 ("deliverable draft") but uses the same formatting as the rest of the chat, so operators accidentally copy the Chinese explanation into the submission. | Wrap the deliverable in a fenced code block with a `markdown` language tag and prepend a bold `DO NOT COPY THIS LINE` header. |
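The checkpoint from 3.3 only needs a round-trippable line. A minimal sketch; the `SAVE_STATE` prefix matches the missing instruction named above, but every field name in the payload is hypothetical, not part of the skill's spec:

```python
import json
from typing import Optional

def save_checkpoint(quest_id: str, phase: int,
                    revisions_used: int,
                    proof_url: Optional[str]) -> str:
    """Emit a machine-parseable line the operator can paste back
    after a crash. Field names are illustrative."""
    state = {
        "quest_id": quest_id,
        "phase": phase,
        "revisions_used": revisions_used,
        "proof_url": proof_url,
    }
    return "SAVE_STATE " + json.dumps(state, ensure_ascii=False)

def load_checkpoint(line: str) -> dict:
    """Parse a pasted SAVE_STATE line back into resumable quest state."""
    prefix = "SAVE_STATE "
    if not line.startswith(prefix):
        raise ValueError("not a SAVE_STATE checkpoint line")
    return json.loads(line[len(prefix):])
```

Emitting one such line after each phase means a session restart costs at most one phase of work instead of the whole quest.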
---
## 4. Security / Privacy Vulnerabilities
| # | Severity | Issue | Location | Mitigation |
|---|---|---|---|---|
| 4.1 | **High** | `SKILL.md` instructs the agent to include the GitHub repository URL in public deliverables (launch posts, Dev.to articles) but never warns that the repo URL exposes the operator's GitHub username (`wildbyteai`). If the operator forks to a personal account, their identity is leaked in every proof URL. | `examples/sample-quest.md` + `examples/sample-hermes-run.md` | Add a `PRIVACY_WARNING`: "If you fork this repo to a personal account, create a neutral organization or use a release page URL instead of the raw repo URL in public proof documents." |
| 4.2 | **Medium** | The skill tells the operator to paste quest notifications "as-is" into Telegram. Quest notifications sometimes contain merchant contact info, internal links, or unreleased product names. Pasting these into a cloud-hosted Hermes instance creates a data-residue risk. | `SKILL.md` — Phase 0 | Add a `SANITIZE_INPUT` step: strip phone numbers, internal URLs (`*.corp.example.com`), and PII before sending to the LLM context window. |
| 4.3 | **Medium** | No API key rotation guidance. The skill references `AGENTHANSA_API_KEY` but never mentions key rotation, scope restriction, or the risk of committing it to Git. | `README.md` — Installation | Add a `SECURITY.md` file covering: (1) never commit `.env` files, (2) rotate keys every 90 days, (3) use scoped keys if AgentHansa supports them. |
| 4.4 | **Low** | The skill encourages posting deliverables to Dev.to / Medium / GitHub Gist without checking whether the quest merchant requires NDA or confidentiality. Some $100+ quests explicitly state "do not publish publicly." | `SKILL.md` — Proof Planning | Add a `CONFIDENTIALITY_CHECK` question before proof publication: "Does this quest description contain 'confidential', 'NDA', 'do not share', or 'internal use only'? If yes, use a password-protected Gist or private document instead." |
| 4.5 | **Low** | `docs/screen-recording-plan.md` suggests recording a 2-3 minute demo video. If the recording includes real API keys in terminal output or real quest content, the operator leaks credentials. | `docs/screen-recording-plan.md` | Add a `RECORDING_SANITIZE` checklist: replace API keys with `REDACTED`, use simulated quest data, blur terminal history. |
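The `SANITIZE_INPUT` step from 4.2 is a straightforward pattern-replacement pass. A minimal sketch; the patterns below are illustrative (the internal-URL pattern uses the `*.corp.example.com` example from the table) and a production list would need tuning for the merchant data actually seen:

```python
import re

# Illustrative PII patterns for the proposed SANITIZE_INPUT step.
# Order matters: the raw text is scrubbed pattern by pattern.
PII_PATTERNS = [
    (re.compile(r"\+?\d[\d\- ]{7,}\d"), "[PHONE REDACTED]"),
    (re.compile(r"https?://\S*\.corp\.example\.com\S*"),
     "[INTERNAL URL REDACTED]"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL REDACTED]"),
]

def sanitize_input(notification: str) -> str:
    """Strip phone numbers, internal URLs, and email addresses from a
    pasted quest notification before it reaches the LLM context window."""
    for pattern, replacement in PII_PATTERNS:
        notification = pattern.sub(replacement, notification)
    return notification
```

Running this on the operator's side, before the paste, keeps merchant contact details out of a cloud-hosted Hermes instance entirely.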
---
## Summary
**agenthansa-quest-copilot** is a well-structured skill with clear human-in-the-loop discipline. Its biggest risks are not crashes but silent failures: a dead proof URL, a regex-fragile confirmation gate, a privacy leak in a public Dev.to post, or a revision budget exhausted because no one was counting.
**Priority fixes:**
1. Replace the fragile `确认提交` single-phrase gate with fuzzy matching.
2. Add automated compliance keyword scanning before deliverable display.
3. Validate proof URLs for reachability and publicity before submission.
4. Warn operators about identity leakage when using personal GitHub URLs in public proof.
5. Add rate-limit and revision-budget tracking to prevent silent exhaustion.
None of these require new architecture. They are guardrails and checklists that turn a good copilot into a reliable one.