title: "4-Party Collaboration Technical White Paper: Human(Garlic farmer) × Claude Opus 4.5 × Gemini 3.0 × MiniMax 2.5"
published: true
tags: [GarlicLang, MultiLLM, DSL, Termux, AI]
Disclaimer: This article is a personal experiment and log written by a garlic farmer in South Korea.
All numerical values shown here were actually extracted from agents running on a Unihertz Titan 2 mobile phone.
All responsibility lies with the human author.
This document was originally written in Korean and translated into English.
Because the experimental data already resides on the phone's system, no separate processing was needed, and the article was finalized with only minimal review.
This is a personal AI experiment by a garlic farmer in the countryside, so the English translation may be a bit rough — please be understanding.
Date written: February 26, 2026
Authors: Human (Project Director) · Claude Opus 4.5 (Architecture Design/Verification) · Google Gemini 3.0 (Implementation) · MiniMax 2.5 (Implementation/Verification)
Keywords: GarlicLang, Multi-LLM Collaboration, Structured Error Handling, DSL, Termux, Autonomous Agent
1. Executive Summary
This document is a technical white paper recording the GarlicLang error handling system improvement work conducted over the course of one day, February 26, 2026. One Human and 3 LLM models (Claude Opus 4.5, Gemini 3.0, MiniMax 2.5) collaborated to convert the existing string-based error return system into a JSON structured system.
Key achievements:
- Converted error return format from strings to a 6-field JSON structure
- Automatic inclusion of Korean resolution suggestions for 11 error types
- Zero-downtime improvement through 4-step gradual implementation
- 3 files modified (errors.py, interpreter.py, garliclang_bridge.py), AST verification 100% passed
Work duration: Approximately 4 hours (estimated 19:00 ~ 23:00 KST)
2. Background and Problem Definition
2.1 What is GarlicLang
GarlicLang is a Korean-based DSL (Domain Specific Language) that runs in the Android Termux environment. It serves as an intermediate language that converts the user's natural language instructions into safe system commands.
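Basic syntax example ([파일쓰기] writes a file, [실행] runs a command, and [검증] verifies the result; here the 출력포함 check passes if the output contains "hello"):
```
[파일쓰기]
경로: ~/test.py
내용: print("hello")
[실행]
명령어: python3 ~/test.py
[검증]
종류: 출력포함
대상: hello
```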
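To make the block structure concrete, here is a minimal sketch of how a script like the one above could be split into blocks. It assumes only what the example shows: a `[name]` header followed by `key: value` lines. `parse_gl_blocks` is a hypothetical helper for illustration and is not GarlicLang's actual parser.

```python
import re

# Hypothetical helper: splits a GL script into (block_name, fields) pairs.
# Assumes "[name]" headers followed by "key: value" lines, as in the example.
HEADER = re.compile(r"^\[(.+)\]$")

def parse_gl_blocks(script: str) -> list[tuple[str, dict[str, str]]]:
    blocks: list[tuple[str, dict[str, str]]] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            blocks.append((match.group(1), {}))
        elif blocks and ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, value = line.split(":", 1)
            blocks[-1][1][key.strip()] = value.strip()
    return blocks

SAMPLE = """\
[파일쓰기]
경로: ~/test.py
내용: print("hello")
[실행]
명령어: python3 ~/test.py
[검증]
종류: 출력포함
대상: hello
"""

blocks = parse_gl_blocks(SAMPLE)
for name, fields in blocks:
    print(name, fields)
```

Splitting on the first colon keeps values such as `python3 ~/test.py` or URLs intact; a real parser would also need to handle multi-line values, which this sketch ignores.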
2.2 Limitations of the Existing Error Handling
The error handling method of GarlicLang before improvement:
errors.py (before improvement):
```python
def err_kr(key: str, *args) -> str:
    msg = ERRORS_KR.get(key, key)
    if args:
        msg = msg.format(*args)
    return msg  # Simple string return
```
interpreter.py (before improvement):
```python
raise RuntimeError(err_kr('undefined_variable', name))
```
Output: "정의되지 않은 변수입니다: x" (Undefined variable: x)
Problems:
- An LLM has difficulty identifying the cause from the error message alone
- No error location information (line, block)
- No resolution suggestions
- Automatic correction/retry is inefficient
2.3 Causes of LLM GL Script Generation Failures
LLM GL generation failure patterns observed during work:
| Failure Type | Frequency | Cause |
|---|---|---|
| Undefined variable | High | Missing [변수설정] when referencing $result |
| Syntax confusion | Medium | Mixing YAML or Python syntax |
| Path error | Medium | Confusion between ~/ and absolute paths |
| Argument mismatch | Low | Wrong number of arguments in function calls |
3. Participating Models and Roles
3.1 Human (Project Director)
Role: Final decision-making, work direction setting, quality verification
Observed characteristics:
- Directly executed commands and confirmed results in CLI environment
- Distributed and coordinated work between LLMs
- Intervened when unexpected problems occurred
- Demanded an "ultra-objective" approach
3.2 Claude Opus 4.5 (Architecture Design/Verification)
Role: Overall design, step-by-step plan establishment, verification command provision, documentation
Observed characteristics:
- Proposed a 4-step gradual implementation strategy
- Immediately provided AST verification commands after each step completion
- Consistently applied the backup-first principle
- Responsible for writing instructions for MiniMax/Gemini
- Immediately presented alternatives when problems occurred
Strengths:
- Long context retention (grasped full context across the 4-hour session)
- High system architecture understanding
- Safe work sequence design
Weaknesses:
- Cannot execute code directly (dependent on Human)
3.3 Google Gemini 3.0 (Implementation)
Role: Steps 1-2 implementation (errors.py err_json addition, interpreter.py structuring)
Observed characteristics:
- Followed work rules after reading SOUL.md
- Safe modifications through GL script + Python combination
- Followed backup → modification → AST inspection sequence
- Automatically generated detailed work reports
Step 1 work (err_json addition):
Execution result: PASS:1 FAIL:0
Output: RESULT: SUCCESS_ADDED
Time taken: Approximately 2 minutes
Step 2 work (interpreter.py structuring):
Execution result: PASS:1 FAIL:0
Output: RESULT: SUCCESS_UPDATED
Modified lines: 27, 81, 119-120, 254-255
Strengths:
- High GL script syntax accuracy
- Succeeded in complex string replacement work
- Automatic documentation after work completion
Weaknesses:
- Multiple retries when initial tool:read failed
- Intermittent response delays
3.4 MiniMax 2.5 (Implementation/Verification)
Role: Steps 3-4 implementation (bridge.py error_details, SUGGESTIONS addition), verification
Observed characteristics:
- Quick work start after receiving instructions
- Tendency to use tool:exec directly instead of GL scripts
- Proposed designs considering backward compatibility
- Proposed 3-level error classification (Parse/Runtime/Logic)
Step 3 work (bridge.py):
Execution result: PASS:1 FAIL:0
Output: RESULT: SUCCESS_UPDATED
Modified lines: 75-76, 117-118
Step 4 work (SUGGESTIONS):
Execution result: PASS:1 FAIL:0
Output: RESULT: SUCCESS_UPDATED
Added error types: 11
Strengths:
- Fast response speed
- Practical suggestions (Dynamic Suggestion Engine)
- Suitable for verification tasks
Weaknesses:
- Tendency not to follow the GL script rules (prefers tool:exec)
- 1 freeze incident during Step 4
- Wrote the wrong model name when saving an instruction file (saved as Gemini)
4. Collaboration System and Workflow
4.1 Role Distribution Principle
H: Decision-making, Execution, Verification
↓ (CLI commands)
Opus 4.5: Design, Instruction writing, Verification command provision
↓ (Web UI instructions)
Gemini/MiniMax: Implementation, GL script generation, Execution
↓ (Results)
Human: Confirmation → Opus 4.5: Proceed to next step
4.2 Standard Instruction Format
All work instructions followed this format:
Let's work. First read cat ~/garlic-agent/SOUL.md.
[Task Name]
Check current status:
- tool:read [file path]
- tool:exec [command]
Goal:
[Specific goal description]
Work sequence:
- Create [script.py] with tool:write
- Create [script.gl] with tool:write
- Execute with tool:garlic
- AST inspection
Save after completion:
tool:write ~/garlic-agent/prompts/2026-02-26/number[task name].md
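The backup-first rule and the AST inspection step in this format can be combined into one guarded write. The following is a minimal sketch under stated assumptions: `safe_rewrite` and the `.bak` naming are hypothetical, and the scripts actually generated during the session are not reproduced here.

```python
import ast
import shutil
import tempfile
from pathlib import Path

def safe_rewrite(target: Path, new_source: str, backup_dir: Path) -> bool:
    """Back up target, write new_source, and restore the backup if parsing fails."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    backup = backup_dir / (target.name + ".bak")
    shutil.copy2(target, backup)                       # 1. backup first
    target.write_text(new_source, encoding="utf-8")    # 2. modification
    try:
        ast.parse(target.read_text(encoding="utf-8"))  # 3. AST inspection
    except SyntaxError:
        shutil.copy2(backup, target)                   # roll back on a syntax error
        return False
    return True

# Demo in a temporary directory so that no real project file is touched.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    target = work / "errors.py"
    target.write_text("x = 1\n", encoding="utf-8")
    ok_good = safe_rewrite(target, "x = 2\n", work / "bak")
    ok_bad = safe_rewrite(target, "def broken(:\n", work / "bak")
    final_source = target.read_text(encoding="utf-8")
    print(ok_good, ok_bad, repr(final_source))
```

Note that `ast.parse` only confirms that the file is syntactically valid Python; it does not show that the modified code behaves correctly.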
4.3 Prompts Archive System
Instruction storage system introduced during work:
```
~/garlic-agent/prompts/
└── 2026-02-26/
    ├── 001_minimax_garlic_terminal_분석.md
    ├── 002_gemini_garlic_terminal_분석.md
    ├── 003_comparison_minimax_gemini.md
    ├── 004_minimax_에러처리의견.md
    ├── 005_gemini에러처리의견.md
    ├── 006_gemini에러처리1단계.md
    ├── 007_gemini에러처리2단계.md
    ├── 008_minimax에러처리3단계.md
    └── 009_minimax에러처리_4단계.md
```
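A sequential archive like this needs the next free number each time an instruction is saved. Here is a small sketch of one way to compute it, assuming a three-digit prefix followed by an underscore; `next_prompt_path` and the `example_task` name are hypothetical and only illustrate the numbering.

```python
import re
import tempfile
from pathlib import Path

def next_prompt_path(day_dir: Path, model: str, task: str) -> Path:
    """Return the next 'NNN_model_task.md' path in day_dir (hypothetical helper)."""
    numbers = []
    for path in day_dir.glob("*.md"):
        match = re.match(r"(\d{3})_", path.name)
        if match:
            numbers.append(int(match.group(1)))
    next_number = max(numbers, default=0) + 1
    return day_dir / f"{next_number:03d}_{model}_{task}.md"

# Demo with a temporary directory standing in for ~/garlic-agent/prompts/.
with tempfile.TemporaryDirectory() as tmp:
    day_dir = Path(tmp) / "2026-02-26"
    day_dir.mkdir()
    (day_dir / "003_comparison_minimax_gemini.md").touch()
    new_name = next_prompt_path(day_dir, "gemini", "example_task").name
    print(new_name)
```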
5. 4-Step Implementation Details
5.1 Step 1: Adding err_json() Function
Assigned to: Gemini 3.0
File: ~/garliclang_full/garliclang/errors.py
Added code:
```python
def err_json(key: str, line: int = None, block: str = None,
             context: dict = None, *args) -> dict:
    """Structured error return function (v20.4)"""
    return {
        'error_type': key,
        'message': err_kr(key, *args),
        'line': line,
        'block': block,
        'context': context,
        'suggestion': None  # To be changed to SUGGESTIONS.get(key) in Step 4
    }
```
Verification result:
$ grep -n "def err_json" errors.py
41:def err_json(key: str, line: int = None, ...
$ python3 -c "import ast; ast.parse(open('errors.py').read()); print('OK')"
errors.py OK
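One consequence of this signature is worth spelling out: because `line`, `block`, and `context` come before `*args`, every format argument must be preceded by three positional values. The sketch below is a self-contained replica for illustration; the one-entry `ERRORS_KR` template is an assumption, since the real table is not shown in this article.

```python
from typing import Optional

# Assumed template; the real ERRORS_KR table is not shown in this article.
ERRORS_KR = {"undefined_variable": "정의되지 않은 변수입니다: {}"}

def err_kr(key: str, *args) -> str:
    msg = ERRORS_KR.get(key, key)
    return msg.format(*args) if args else msg

def err_json(key: str, line: Optional[int] = None, block: Optional[str] = None,
             context: Optional[dict] = None, *args) -> dict:
    return {
        "error_type": key,
        "message": err_kr(key, *args),
        "line": line,
        "block": block,
        "context": context,
        "suggestion": None,
    }

# Correct: three placeholders first, then the format argument.
good = err_json("undefined_variable", None, None, None, "x")

# Pitfall: "x" is bound to `line`, so the template is returned unformatted.
bad = err_json("undefined_variable", "x")

print(good["message"])
print(bad["line"], bad["message"])
```

This is why the calls in interpreter.py pass `None, None, None` before the variable name. Making `line`, `block`, and `context` keyword-only would remove the pitfall, at the cost of changing the call sites.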
5.2 Step 2: interpreter.py Structuring
Assigned to: Gemini 3.0
File: ~/garliclang_full/garliclang/interpreter.py
Changes:
- Import added (line 27):
```python
from garliclang.errors import ERRORS_KR, err_kr, err_json, BreakException, ContinueException
```
- Initialization added (line 81):
```python
self.last_error: dict = None
```
- undefined_variable structured (lines 119-120):
```python
self.last_error = err_json('undefined_variable', None, None, None, name)
raise RuntimeError(self.last_error['message'])
```
- wrong_arg_count structured (lines 254-255):
```python
self.last_error = err_json('wrong_arg_count', None, None, None, name, len(func.params), len(args))
raise RuntimeError(self.last_error['message'])
```
5.3 Step 3: bridge.py error_details Return
Assigned to: MiniMax 2.5
File: ~/garlic-agent/garliclang_bridge.py
Changes (lines 75-76, 117-118):
```python
if interpreter.last_error:
    result["error_details"] = interpreter.last_error
```
Backward compatibility: Existing result["error"] maintained as string
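On the consumer side, this compatibility rule means a caller can prefer `error_details` when it is present and fall back to the plain string otherwise. The following is a minimal sketch; `describe_failure` is a hypothetical helper, and the two result dictionaries are hand-written samples rather than captured bridge output.

```python
from typing import Any

def describe_failure(result: dict[str, Any]) -> str:
    """Prefer structured error_details, fall back to the legacy error string."""
    details = result.get("error_details")
    if isinstance(details, dict):
        parts = [f"{details.get('error_type')}: {details.get('message')}"]
        if details.get("suggestion"):
            parts.append(f"suggestion: {details['suggestion']}")
        return " | ".join(parts)
    return str(result.get("error", "unknown error"))

# Hand-written sample results (not captured output from the bridge).
legacy = {"error": "정의되지 않은 변수입니다: x"}
structured = {
    "error": "정의되지 않은 변수입니다: x",
    "error_details": {
        "error_type": "undefined_variable",
        "message": "정의되지 않은 변수입니다: x",
        "suggestion": None,
    },
}

legacy_text = describe_failure(legacy)
structured_text = describe_failure(structured)
print(legacy_text)
print(structured_text)
```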
5.4 Step 4: SUGGESTIONS Automatic Inclusion
Assigned to: MiniMax 2.5
File: ~/garliclang_full/garliclang/errors.py
Added SUGGESTIONS dictionary (line 33):
```python
SUGGESTIONS = {
    'undefined_variable': '변수명을 확인하세요. 오타가 있거나 [변수설정]으로 먼저 정의해야 합니다.',
    'undefined_function': '함수 정의를 확인하세요. [함수정의]로 먼저 정의해야 합니다.',
    'wrong_arg_count': '함수 호출 시 인자 개수를 확인하세요.',
    'file_not_found': '파일 경로를 확인하세요. ~/로 시작하는지, 파일이 존재하는지 확인하세요.',
    'division_by_zero': '나누는 값이 0인지 확인하세요.',
    'type_error_math': '숫자끼리만 연산 가능합니다. 변수 타입을 확인하세요.',
    'type_error_compare': '비교할 수 없는 값입니다. 타입을 확인하세요.',
    'index_out_of_range': '인덱스 범위를 확인하세요. 배열 길이보다 작은 값을 사용하세요.',
    'not_an_array': '배열이 아닙니다. []로 감싸거나 리스트를 사용하세요.',
    'while_max_iterations': '반복 조건을 확인하세요. 무한 루프가 발생한 것 같습니다.',
    'import_not_found': 'import 파일 경로를 확인하세요.',
}
```
err_json modified (line 63):
```python
'suggestion': SUGGESTIONS.get(key)
```
6. Verification and Test Results
6.1 Final JSON Error Output
```python
import json
from garliclang.errors import err_json

result = err_json('undefined_variable', 10, '[실행]', {'name': 'x'}, 'x')
print(json.dumps(result, ensure_ascii=False, indent=2))
```
Output:
```json
{
  "error_type": "undefined_variable",
  "message": "정의되지 않은 변수입니다: x",
  "line": 10,
  "block": "[실행]",
  "context": {
    "name": "x"
  },
  "suggestion": "변수명을 확인하세요. 오타가 있거나 [변수설정]으로 먼저 정의해야 합니다."
}
```
6.2 AST Verification Results
| File | Result |
|---|---|
| errors.py | OK |
| interpreter.py | OK |
| garliclang_bridge.py | OK |
6.3 site-packages Synchronization
Modified files were copied to Python site-packages for system-wide application:
cp ~/garliclang_full/garliclang/errors.py /data/data/com.termux/files/usr/lib/python3.12/site-packages/garliclang/
cp ~/garliclang_full/garliclang/interpreter.py /data/data/com.termux/files/usr/lib/python3.12/site-packages/garliclang/
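Because the working tree and site-packages hold separate copies, it can be useful to confirm that they match after copying. The following is a minimal sketch that compares SHA-256 hashes; `out_of_sync` is a hypothetical helper, and the demo uses temporary directories in place of the real paths.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def out_of_sync(src_dir: Path, dst_dir: Path, names: list[str]) -> list[str]:
    """Return file names whose copy in dst_dir is missing or differs from src_dir."""
    stale = []
    for name in names:
        dst = dst_dir / name
        if not dst.exists() or sha256_of(src_dir / name) != sha256_of(dst):
            stale.append(name)
    return stale

# Demo: temporary directories stand in for the source tree and site-packages.
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp) / "src", Path(tmp) / "dst"
    src.mkdir()
    dst.mkdir()
    (src / "errors.py").write_text("A = 1\n", encoding="utf-8")
    (dst / "errors.py").write_text("A = 1\n", encoding="utf-8")
    (src / "interpreter.py").write_text("B = 2\n", encoding="utf-8")
    (dst / "interpreter.py").write_text("B = 1\n", encoding="utf-8")
    stale = out_of_sync(src, dst, ["errors.py", "interpreter.py"])
    print(stale)
```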
7. Gemini 3.0 vs MiniMax 2.5 In-Depth Comparison
7.1 Same Question Response Comparison
Question: "Please give your opinion on GarlicLang error handling improvement"
| Item | Gemini 3.0 | MiniMax 2.5 |
|---|---|---|
| JSON direction agreement | ✅ | ✅ |
| Independent proposal | Contextual Snapshot (variable dump) | Dynamic Suggestion Engine |
| Error classification | Unified structure integration | 3-level classification (Parse/Runtime/Logic) |
| Priority 1 | undefined_variable | undefined_variable |
| Priority 2 | wrong_arg_count | wrong_arg_count |
| Emphasis on caution | Integration of fragmented error handling | Maintaining backward compatibility |
7.2 Code Generation Quality
| Item | Gemini 3.0 | MiniMax 2.5 |
|---|---|---|
| GL script syntax | Accurate | Sometimes prefers tool:exec |
| Python script | Complete structure | Complete structure |
| AST inspection included | ✅ | ✅ |
| Backup logic included | ✅ | ✅ |
| Error handling | Includes try-except | Includes try-except |
7.3 Instruction Compliance
| Item | Gemini 3.0 | MiniMax 2.5 |
|---|---|---|
| SOUL.md reading | Always executed | Always executed |
| Work sequence compliance | High | High |
| Result document saving | Automatically performed | Automatically performed |
| Freeze incidents | 0 times | 1 time (early Step 4) |
7.4 Response Style
Gemini 3.0:
- Detailed analysis report format
- High frequency of table usage
- Closing phrases like "Static analysis complete"
MiniMax 2.5:
- Concise summary format
- Presents immediately executable code
- Conversational closing like "Do you have any additional work?"
8. Additional Achievements
8.1 History Load Improvement
Problem: Only 24 entries were read from session.json, while 268 entries in chat_sessions.db were ignored
Solution: Modified session_manager.py's get_history() to read directly from chat_sessions.db
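The idea of reading history straight from the database can be sketched as follows. The schema of chat_sessions.db is not shown in this article, so the `messages` table, its `id`, `role`, and `content` columns, and the function signature are all assumptions made for illustration.

```python
import sqlite3

def get_history(conn: sqlite3.Connection, limit: int = 300) -> list[dict]:
    """Return the most recent messages in chronological order (schema is assumed)."""
    rows = conn.execute(
        "SELECT role, content FROM messages ORDER BY id DESC LIMIT ?", (limit,)
    ).fetchall()
    # Rows arrive newest-first, so reverse them to get oldest-first.
    return [{"role": role, "content": content} for role, content in reversed(rows)]

# Demo with an in-memory database and a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, role TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO messages (role, content) VALUES (?, ?)",
    [("user", "first"), ("assistant", "second"), ("user", "third")],
)
history = get_history(conn, limit=2)
conn.close()
print(history)
```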
8.2 HANDOVER Document Expansion
| Section | Content |
|---|---|
| 9 | System Architecture |
| 10 | GL Script Usage |
| 11 | Operation Commands Guide |
| 12 | Multi-LLM Collaboration System |
| 13 | GarlicLang v20.0 Syntax Reference |
| 14 | Verification Script Patterns |
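8.3 Verification Script Pattern Establishment
```python
def check(name, condition):
    """Print an O/X marker for one check and return the condition unchanged."""
    status = "O" if condition else "X"
    print(f"[{status}] {name}")
    return condition

# Usage example
ast_ok = True  # placeholder: in practice, the result of an earlier AST check
check("config.json JSON syntax", True)
check("AST inspection passed", ast_ok)
```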
8.4 apifree Verification Script
A tool was created to verify whether GL scripts were executed locally without API billing:
$ apifree
[1] Recently created GL scripts (within 1 hour)
📄 fix_history_load.gl
📄 update_errors.gl
[2] API call code inspection
✅ fix_history_load.py: No API code found
[Conclusion] ✅ Local execution confirmed - No API billing
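The "API call code inspection" step can be approximated with a simple text scan. The sketch below is not the apifree tool itself: the pattern list and the `find_api_code` helper are assumptions chosen for illustration, and the real tool's rules may differ.

```python
import re
import tempfile
from pathlib import Path

# Assumed indicators of network or API usage; the real apifree rules are not shown.
API_PATTERNS = re.compile(r"\b(requests|urllib|httpx|openai|anthropic)\b|https?://")

def find_api_code(path: Path) -> list[int]:
    """Return the 1-based line numbers that match any API indicator."""
    text = path.read_text(encoding="utf-8")
    return [n for n, line in enumerate(text.splitlines(), start=1)
            if API_PATTERNS.search(line)]

# Demo with two throwaway scripts in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    local_script = Path(tmp) / "fix_history_load.py"
    local_script.write_text("import sqlite3\nprint('ok')\n", encoding="utf-8")
    remote_script = Path(tmp) / "calls_api.py"
    remote_script.write_text("import json\nimport requests\n", encoding="utf-8")
    local_hits = find_api_code(local_script)
    remote_hits = find_api_code(remote_script)
    print(local_hits, remote_hits)
```

A keyword scan like this can miss indirect calls (for example, through a subprocess), so it is a quick sanity check rather than proof of local-only execution.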
9. Conclusion
9.1 Achievement Summary
On February 26, 2026, the GarlicLang error handling system was successfully improved through collaboration of 1 Human and 3 LLM models.
Quantitative achievements:
- Modified files: 3
- Added code: Approximately 80 lines
- Error type coverage: 11
- AST verification: 100% passed
- Work time: Approximately 4 hours
Qualitative achievements:
- An LLM can now read the error cause and a suggested fix directly from the structured output
- Expected improvement in automatic correction/retry efficiency
- Multi-LLM collaboration system verified
9.2 Collaboration Effect Analysis
| Role | Contribution |
|---|---|
| Human | Execution, Decision-making, Quality Management |
| Opus 4.5 | Design, Coordination, Documentation |
| Gemini 3.0 | Complex Implementation (Steps 1-2) |
| MiniMax 2.5 | Fast Implementation (Steps 3-4), Verification |
Key Insight: In this session, multi-LLM collaboration with divided roles appeared more effective for complex system improvement than relying on a single LLM.
9.3 Future Plans
- Expand error coverage: Apply err_json to remaining RuntimeError points
- Utilize context field: Automatic capture of variable state at the time of error occurrence
- Line field accuracy: Track line numbers during the AST parsing stage
- agent.py integration: Pass error_details directly to LLM to guide automatic correction
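As a rough illustration of the agent.py integration item, the structured error could be embedded in a retry prompt along with the failed script. This is only a sketch of the idea: `build_retry_prompt` and the prompt wording are hypothetical, and agent.py's real interface is not described in this article.

```python
import json

def build_retry_prompt(script: str, error_details: dict) -> str:
    """Combine the failed GL script and its structured error into one retry prompt."""
    lines = [
        "The previous GL script failed. Fix it and return the full corrected script.",
        "",
        "Error details (JSON):",
        json.dumps(error_details, ensure_ascii=False, indent=2),
        "",
        "Previous script:",
        script,
    ]
    return "\n".join(lines)

# Sample values taken from the error output shown in section 6.1.
failed_script = "[실행]\n명령어: echo $result"
details = {
    "error_type": "undefined_variable",
    "message": "정의되지 않은 변수입니다: x",
    "line": 10,
    "block": "[실행]",
    "context": {"name": "x"},
    "suggestion": "변수명을 확인하세요. 오타가 있거나 [변수설정]으로 먼저 정의해야 합니다.",
}
prompt = build_retry_prompt(failed_script, details)
print(prompt)
```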
10. Appendix
10.1 Full Paths of Modified Files
~/garliclang_full/garliclang/errors.py
~/garliclang_full/garliclang/interpreter.py
~/garlic-agent/garliclang_bridge.py
/data/data/com.termux/files/usr/lib/python3.12/site-packages/garliclang/errors.py
/data/data/com.termux/files/usr/lib/python3.12/site-packages/garliclang/interpreter.py
10.2 Backup Files
~/garlic-agent/archive/bak/errors.py.20260226_2000.bak
~/garlic-agent/archive/bak/interpreter.py.20260226_2000.bak
~/garlic-agent/archive/bak/garliclang_bridge.py.20260226_2000.bak
/storage/emulated/0/Download/garlic-agent-1.5.6_20260226_2000.tar.gz (78MB)
10.3 Reference Documents
~/garlic-agent/SOUL.md
~/garlic-agent/HANDOVER_FINAL_v1.5.6.md
~/garlic-agent/CHANGELOG.md
~/garlic-agent/prompts/2026-02-26/*.md
End of document
This white paper was written through joint collaboration of Human(Garlic farmer), Claude Opus 4.5, Gemini 3.0, and MiniMax 2.5.