c ck

How a Korean Garlic Farmer and 3 LLMs Rebuilt an Error Handling System in 4 Hours on a Phone

4-Party Collaboration Technical White Paper: Human (Garlic farmer) × Claude Opus 4.5 × Gemini 3.0 × MiniMax 2.5

Tags: GarlicLang, MultiLLM, DSL, Termux, AI

Disclaimer: This article is a personal experiment log written by a garlic farmer in South Korea.
All numerical values shown here were extracted from agents running on a Unihertz Titan 2 mobile phone.
All responsibility lies with the human author.
This document was originally written in Korean and translated into English, so the translation may be rough in places; please be understanding.
Because the data was taken directly from the phone's system and needed no separate processing, the article was finalized with only minimal review.

Date written: February 26, 2026

Authors: Human (Project Director) · Claude Opus 4.5 (Architecture Design/Verification) · Google Gemini 3.0 (Implementation) · MiniMax 2.5 (Implementation/Verification)

Keywords: GarlicLang, Multi-LLM Collaboration, Structured Error Handling, DSL, Termux, Autonomous Agent

  1. Executive Summary

This technical white paper records the GarlicLang error handling improvement work carried out on February 26, 2026. One human and three LLMs (Claude Opus 4.5, Gemini 3.0, MiniMax 2.5) collaborated to convert the existing string-based error returns into a structured JSON format.

Key achievements:

  • Converted the error return format from a plain string to a 6-field JSON structure
  • Korean resolution suggestions are now included automatically for 11 error types
  • No downtime during the change, thanks to a 4-step gradual implementation
  • 3 files modified (errors.py, interpreter.py, garliclang_bridge.py), all of which passed an AST syntax check

Work duration: Approximately 4 hours (estimated 19:00 ~ 23:00 KST)

  2. Background and Problem Definition

2.1 What is GarlicLang

GarlicLang is a Korean-based DSL (Domain Specific Language) that runs in the Android Termux environment. It serves as an intermediate language that converts the user's natural language instructions into safe system commands.

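Basic syntax example:

```
[파일쓰기]
경로: ~/test.py
내용: print("hello")

[실행]
명령어: python3 ~/test.py

[검증]
종류: 출력포함
대상: hello
```

Here [파일쓰기] is a file-write block (경로 = path, 내용 = content), [실행] is an execute block (명령어 = command), and [검증] is a verification block (종류 = type, 출력포함 = output contains, 대상 = target).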
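To make the block structure concrete, here is a minimal sketch of how a script in this shape could be split into blocks. It is not GarlicLang's actual parser: `parse_blocks` and the returned dictionary layout are hypothetical, and the sketch assumes one `key: value` pair per line.

```python
def parse_blocks(script: str) -> list[dict]:
    """Split a block-style script into [{'block': name, 'fields': {...}}, ...]."""
    blocks = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            # A bracketed header such as [실행] opens a new block.
            blocks.append({"block": line[1:-1], "fields": {}})
        elif blocks and ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, value = line.split(":", 1)
            blocks[-1]["fields"][key.strip()] = value.strip()
    return blocks


script = """
[파일쓰기]
경로: ~/test.py
내용: print("hello")

[실행]
명령어: python3 ~/test.py
"""
for block in parse_blocks(script):
    print(block["block"], block["fields"])
```

A real interpreter would also need to handle multi-line values and variables, which this sketch deliberately ignores.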

2.2 Limitations of the Existing Error Handling

The error handling method of GarlicLang before improvement:

errors.py (before improvement)

```python
def err_kr(key: str, *args) -> str:
    msg = ERRORS_KR.get(key, key)
    if args:
        msg = msg.format(*args)
    return msg  # Simple string return
```

interpreter.py (before improvement)

```python
raise RuntimeError(err_kr('undefined_variable', name))
```

Output: "정의되지 않은 변수입니다: x" (Undefined variable: x)

Problems:

  1. LLM has difficulty identifying the cause from the error message alone
  2. No error location information (line, block)
  3. No resolution suggestions
  4. Automatic correction/retry is inefficient
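The first problem can be illustrated with a small sketch. With a string-only error, a caller has to recover the error type by pattern-matching the message text, which breaks as soon as the wording changes. The `guess_error_type` helper and its pattern table are hypothetical and exist only for illustration; the second message below is an invented rewording, not a real GarlicLang message.

```python
import re

# Hypothetical pattern table: one regex per known message wording.
MESSAGE_PATTERNS = {
    "undefined_variable": re.compile(r"^정의되지 않은 변수입니다: (?P<name>.+)$"),
}


def guess_error_type(message: str):
    """Recover (error_type, details) from a plain message, or (None, {})."""
    for error_type, pattern in MESSAGE_PATTERNS.items():
        match = pattern.match(message)
        if match:
            return error_type, match.groupdict()
    return None, {}


# The known wording is recognized.
print(guess_error_type("정의되지 않은 변수입니다: x"))
# A small (invented) wording change defeats the pattern.
print(guess_error_type("변수가 정의되지 않았습니다: x"))
```

A structured return value avoids this guessing because the error type travels as its own field.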

2.3 Causes of LLM GL Script Generation Failures

LLM GL generation failure patterns observed during work:

| Failure Type | Frequency | Cause |
|---|---|---|
| Undefined variable | High | Missing [변수설정] when referencing $result |
| Syntax confusion | Medium | Mixing YAML or Python syntax |
| Path error | Medium | Confusion between ~/ and absolute paths |
| Argument mismatch | Low | Wrong number of arguments in function calls |

  3. Participating Models and Roles

3.1 Human (Project Director)

Role: Final decision-making, work direction setting, quality verification

Observed characteristics:

  • Directly executed commands and confirmed results in CLI environment
  • Distributed and coordinated work between LLMs
  • Intervened when unexpected problems occurred
  • Demanded an "ultra-objective" approach

3.2 Claude Opus 4.5 (Architecture Design/Verification)

Role: Overall design, step-by-step plan establishment, verification command provision, documentation

Observed characteristics:

  • Proposed a 4-step gradual implementation strategy
  • Immediately provided AST verification commands after each step completion
  • Consistently applied the backup-first principle
  • Responsible for writing instructions for MiniMax/Gemini
  • Immediately presented alternatives when problems occurred

Strengths:

  • Long context retention (grasped full context across the 4-hour session)
  • High system architecture understanding
  • Safe work sequence design

Weaknesses:

  • Cannot execute code directly (dependent on Human)

3.3 Google Gemini 3.0 (Implementation)

Role: Steps 1-2 implementation (errors.py err_json addition, interpreter.py structuring)

Observed characteristics:

  • Followed work rules after reading SOUL.md
  • Safe modifications through GL script + Python combination
  • Followed backup → modification → AST inspection sequence
  • Automatically generated detailed work reports

Step 1 work (err_json addition):
Execution result: PASS:1 FAIL:0
Output: RESULT: SUCCESS_ADDED
Time taken: Approximately 2 minutes

Step 2 work (interpreter.py structuring):
Execution result: PASS:1 FAIL:0
Output: RESULT: SUCCESS_UPDATED
Modified lines: 27, 81, 119-120, 254-255

Strengths:

  • High GL script syntax accuracy
  • Succeeded in complex string replacement work
  • Automatic documentation after work completion

Weaknesses:

  • Multiple retries when initial tool:read failed
  • Intermittent response delays

3.4 MiniMax 2.5 (Implementation/Verification)

Role: Steps 3-4 implementation (bridge.py error_details, SUGGESTIONS addition), verification

Observed characteristics:

  • Quick work start after receiving instructions
  • Tendency to use tool:exec directly instead of GL scripts
  • Proposed designs considering backward compatibility
  • Proposed 3-level error classification (Parse/Runtime/Logic)

Step 3 work (bridge.py):
Execution result: PASS:1 FAIL:0
Output: RESULT: SUCCESS_UPDATED
Modified lines: 75-76, 117-118

Step 4 work (SUGGESTIONS):
Execution result: PASS:1 FAIL:0
Output: RESULT: SUCCESS_UPDATED
Added error types: 11

Strengths:

  • Fast response speed
  • Practical suggestions (Dynamic Suggestion Engine)
  • Suitable for verification tasks

Weaknesses:

  • Tendency to not follow GL script rules (prefers tool:exec)
  • 1 freeze incident during Step 4
  • Saved an instruction file under the wrong model name (as Gemini)

  4. Collaboration System and Workflow

4.1 Role Distribution Principle

```
Human: Decision-making, Execution, Verification
    ↓ (CLI commands)
Opus 4.5: Design, Instruction writing, Verification command provision
    ↓ (Web UI instructions)
Gemini/MiniMax: Implementation, GL script generation, Execution
    ↓ (Results)
Human: Confirmation → Opus 4.5: Proceed to next step
```

4.2 Standard Instruction Format

All work instructions followed this format:

```
Let's work. First read cat ~/garlic-agent/SOUL.md.

[Task Name]

Check current status:
  1. tool:read [file path]
  2. tool:exec [command]

Goal:
[Specific goal description]

Work sequence:
  1. Create [script.py] with tool:write
  2. Create [script.gl] with tool:write
  3. Execute with tool:garlic
  4. AST inspection

Save after completion:
tool:write ~/garlic-agent/prompts/2026-02-26/number[task name].md
```

4.3 Prompts Archive System

Instruction storage system introduced during work:

```
~/garlic-agent/prompts/
└── 2026-02-26/
    ├── 001_minimax_garlic_terminal_분석.md
    ├── 002_gemini_garlic_terminal_분석.md
    ├── 003_comparison_minimax_gemini.md
    ├── 004_minimax_에러처리의견.md
    ├── 005_gemini_에러처리의견.md
    ├── 006_gemini_에러처리1단계.md
    ├── 007_gemini_에러처리2단계.md
    ├── 008_minimax_에러처리3단계.md
    └── 009_minimax_에러처리_4단계.md
```

In the file names, 분석 means analysis, 에러처리의견 means error handling opinion, and N단계 means step N.
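Most of the file names above follow a `NNN_model_task.md` pattern. A minimal sketch of how the next name could be generated is shown here. `next_prompt_path` is a hypothetical helper, not part of garlic-agent, and it assumes the three-digit prefix alone determines ordering.

```python
import re
import tempfile
from pathlib import Path


def next_prompt_path(day_dir: Path, model: str, task: str) -> Path:
    """Return day_dir/NNN_model_task.md with NNN one above the highest existing prefix."""
    numbers = [
        int(m.group(1))
        for p in day_dir.glob("*.md")
        if (m := re.match(r"(\d{3})_", p.name))
    ]
    return day_dir / f"{max(numbers, default=0) + 1:03d}_{model}_{task}.md"


# Demo in a temporary directory so nothing under ~/garlic-agent is touched.
with tempfile.TemporaryDirectory() as tmp:
    day = Path(tmp) / "2026-02-26"
    day.mkdir()
    (day / "001_minimax_garlic_terminal_분석.md").touch()
    (day / "002_gemini_garlic_terminal_분석.md").touch()
    print(next_prompt_path(day, "minimax", "example_task").name)
```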

  5. 4-Step Implementation Details

5.1 Step 1: Adding err_json() Function

Assigned to: Gemini 3.0
File: ~/garliclang_full/garliclang/errors.py

Added code:

```python
def err_json(key: str, line: int = None, block: str = None,
             context: dict = None, *args) -> dict:
    """Structured error return function (v20.4)"""
    return {
        'error_type': key,
        'message': err_kr(key, *args),
        'line': line,
        'block': block,
        'context': context,
        'suggestion': None  # To be changed to SUGGESTIONS.get(key) in Step 4
    }
```

Verification result:

```
$ grep -n "def err_json" errors.py
41:def err_json(key: str, line: int = None, ...

$ python3 -c "import ast; ast.parse(open('errors.py').read()); print('OK')"
errors.py OK
```
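The backup-first rule and the AST check can be combined in one helper. The sketch below is an assumption about how such a step might look, not the script the agents actually ran: `syntax_ok`, `edit_with_backup`, and the `.bak` suffix are hypothetical. Note that `ast.parse` only confirms that a file is syntactically valid Python, not that it behaves correctly.

```python
import ast
import shutil
import tempfile
from pathlib import Path


def syntax_ok(path: Path) -> bool:
    """Return True if the file parses as Python source."""
    try:
        ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        return True
    except SyntaxError:
        return False


def edit_with_backup(path: Path, new_source: str) -> bool:
    """Back up path, write new_source, and restore the backup if parsing fails."""
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)
    path.write_text(new_source, encoding="utf-8")
    if syntax_ok(path):
        return True
    shutil.copy2(backup, path)  # roll back to the last known-good version
    return False


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "errors.py"
    target.write_text("x = 1\n", encoding="utf-8")
    print(edit_with_backup(target, "def broken(:\n"))  # rejected and rolled back
    print(edit_with_backup(target, "x = 2\n"))         # accepted
```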

5.2 Step 2: interpreter.py Structuring

Assigned to: Gemini 3.0
File: ~/garliclang_full/garliclang/interpreter.py

Changes:

  1. Import added (line 27):

```python
from garliclang.errors import ERRORS_KR, err_kr, err_json, BreakException, ContinueException
```

  2. Initialization added (line 81):

```python
self.last_error: dict = None
```

  3. undefined_variable structured (lines 119-120):

```python
self.last_error = err_json('undefined_variable', None, None, None, name)
raise RuntimeError(self.last_error['message'])
```

  4. wrong_arg_count structured (lines 254-255):

```python
self.last_error = err_json('wrong_arg_count', None, None, None, name, len(func.params), len(args))
raise RuntimeError(self.last_error['message'])
```

5.3 Step 3: bridge.py error_details Return

Assigned to: MiniMax 2.5
File: ~/garlic-agent/garliclang_bridge.py

Changes (lines 75-76, 117-118):

```python
if interpreter.last_error:
    result["error_details"] = interpreter.last_error
```

Backward compatibility: the existing result["error"] is kept as a string.
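How steps 2 and 3 fit together can be sketched with a stand-in interpreter. Everything below is a simplified assumption: `FakeInterpreter`, `run_script`, and the `success` and `value` keys are hypothetical and do not reproduce the real interpreter.py or garliclang_bridge.py. The point is only that `result["error"]` stays a plain string while `result["error_details"]` carries the structured dictionary.

```python
class FakeInterpreter:
    """Stand-in that records a structured error before raising, as in Step 2."""

    def __init__(self):
        self.last_error = None
        self.variables = {}

    def lookup(self, name: str):
        if name not in self.variables:
            self.last_error = {
                "error_type": "undefined_variable",
                "message": f"정의되지 않은 변수입니다: {name}",
                "line": None,
                "block": None,
                "context": None,
                "suggestion": None,
            }
            raise RuntimeError(self.last_error["message"])
        return self.variables[name]


def run_script(interpreter: FakeInterpreter, name: str) -> dict:
    """Bridge-style wrapper: old string field plus new structured field."""
    result = {"success": True}
    try:
        result["value"] = interpreter.lookup(name)
    except RuntimeError as exc:
        result["success"] = False
        result["error"] = str(exc)  # unchanged string for existing callers
        if interpreter.last_error:
            result["error_details"] = interpreter.last_error
    return result


print(run_script(FakeInterpreter(), "x"))
```

Callers that only read `result["error"]` keep working, and newer callers can check for `error_details` first.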

5.4 Step 4: SUGGESTIONS Automatic Inclusion

Assigned to: MiniMax 2.5
File: ~/garliclang_full/garliclang/errors.py

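Added SUGGESTIONS dictionary (line 33):

```python
SUGGESTIONS = {
    # Check the variable name. There may be a typo, or it must be defined first with [변수설정].
    'undefined_variable': '변수명을 확인하세요. 오타가 있거나 [변수설정]으로 먼저 정의해야 합니다.',
    # Check the function definition. It must be defined first with [함수정의].
    'undefined_function': '함수 정의를 확인하세요. [함수정의]로 먼저 정의해야 합니다.',
    # Check the number of arguments in the function call.
    'wrong_arg_count': '함수 호출 시 인자 개수를 확인하세요.',
    # Check the file path: whether it starts with ~/ and whether the file exists.
    'file_not_found': '파일 경로를 확인하세요. ~/로 시작하는지, 파일이 존재하는지 확인하세요.',
    # Check whether the divisor is 0.
    'division_by_zero': '나누는 값이 0인지 확인하세요.',
    # Arithmetic works only between numbers. Check the variable types.
    'type_error_math': '숫자끼리만 연산 가능합니다. 변수 타입을 확인하세요.',
    # These values cannot be compared. Check the types.
    'type_error_compare': '비교할 수 없는 값입니다. 타입을 확인하세요.',
    # Check the index range. Use a value smaller than the array length.
    'index_out_of_range': '인덱스 범위를 확인하세요. 배열 길이보다 작은 값을 사용하세요.',
    # Not an array. Wrap it in [] or use a list.
    'not_an_array': '배열이 아닙니다. []로 감싸거나 리스트를 사용하세요.',
    # Check the loop condition. An infinite loop seems to have occurred.
    'while_max_iterations': '반복 조건을 확인하세요. 무한 루프가 발생한 것 같습니다.',
    # Check the import file path.
    'import_not_found': 'import 파일 경로를 확인하세요.',
}
```

err_json modified (line 63):

```python
'suggestion': SUGGESTIONS.get(key)
```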

  6. Verification and Test Results

6.1 Final JSON Error Output

```python
from garliclang.errors import err_json
result = err_json('undefined_variable', 10, '[실행]', {'name': 'x'}, 'x')
```

Output:

```json
{
  "error_type": "undefined_variable",
  "message": "정의되지 않은 변수입니다: x",
  "line": 10,
  "block": "[실행]",
  "context": {
    "name": "x"
  },
  "suggestion": "변수명을 확인하세요. 오타가 있거나 [변수설정]으로 먼저 정의해야 합니다."
}
```
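A structured error like this can be turned into a correction prompt without parsing any message text. The following sketch is one possible consumer, assuming the six-field dictionary shown above. `build_retry_prompt`, the prompt wording, and the sample script are hypothetical and not part of garlic-agent.

```python
def build_retry_prompt(error_details: dict, script: str) -> str:
    """Format a structured error and the failing script into a retry prompt."""
    lines = [
        "The previous GL script failed.",
        f"Error type: {error_details['error_type']}",
        f"Message: {error_details['message']}",
    ]
    # line, block, and suggestion may be None, so include them only when set.
    if error_details.get("line") is not None:
        lines.append(f"Line: {error_details['line']}")
    if error_details.get("block"):
        lines.append(f"Block: {error_details['block']}")
    if error_details.get("suggestion"):
        lines.append(f"Suggestion: {error_details['suggestion']}")
    lines += ["Fix the script and return the full corrected version:", script]
    return "\n".join(lines)


details = {
    "error_type": "undefined_variable",
    "message": "정의되지 않은 변수입니다: x",
    "line": 10,
    "block": "[실행]",
    "context": {"name": "x"},
    "suggestion": "변수명을 확인하세요.",
}
# The script text here is an invented example of a failing GL script.
print(build_retry_prompt(details, "[실행]\n명령어: echo $x"))
```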

6.2 AST Verification Results

| File | Result |
|---|---|
| errors.py | OK |
| interpreter.py | OK |
| garliclang_bridge.py | OK |

6.3 site-packages Synchronization

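Modified files were copied into the Termux Python site-packages directory so that the installed package picks up the changes (full destination paths as listed in the appendix):

```
cp ~/garliclang_full/garliclang/errors.py /data/data/com.termux/files/usr/lib/python3.12/site-packages/garliclang/
cp ~/garliclang_full/garliclang/interpreter.py /data/data/com.termux/files/usr/lib/python3.12/site-packages/garliclang/
```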
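Hard-coding the interpreter version in the destination path is fragile across Python upgrades. As a sketch, the destination can be looked up with the standard `sysconfig` module instead. `sync_package_files` is a hypothetical helper, and the demo below copies into a temporary directory rather than the real site-packages.

```python
import shutil
import sysconfig
import tempfile
from pathlib import Path
from typing import Optional


def sync_package_files(src_dir: Path, names: list[str],
                       dest_root: Optional[Path] = None) -> list[Path]:
    """Copy the named files from src_dir into <site-packages>/<src_dir.name>/."""
    if dest_root is None:
        # 'purelib' is the site-packages directory of the running interpreter.
        dest_root = Path(sysconfig.get_paths()["purelib"])
    dest_dir = dest_root / src_dir.name
    dest_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(src_dir / name, dest_dir / name)) for name in names]


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "garliclang"
    src.mkdir()
    (src / "errors.py").write_text("# demo\n", encoding="utf-8")
    copied = sync_package_files(src, ["errors.py"], dest_root=Path(tmp) / "site-packages")
    print([p.name for p in copied])
```

Installing the package in editable mode would avoid the copy step entirely, but that is a different workflow from the one described here.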

  7. Gemini 3.0 vs MiniMax 2.5 In-Depth Comparison

7.1 Same Question Response Comparison

Question: "Please give your opinion on GarlicLang error handling improvement"

| Item | Gemini 3.0 | MiniMax 2.5 |
|---|---|---|
| JSON direction agreement | | |
| Independent proposal | Contextual Snapshot (variable dump) | Dynamic Suggestion Engine |
| Error classification | Unified structure integration | 3-level classification (Parse/Runtime/Logic) |
| Priority 1 | undefined_variable | undefined_variable |
| Priority 2 | wrong_arg_count | wrong_arg_count |
| Caution emphasized | Integration of fragmented error handling | Maintaining backward compatibility |

7.2 Code Generation Quality

| Item | Gemini 3.0 | MiniMax 2.5 |
|---|---|---|
| GL script syntax | Accurate | Sometimes prefers tool:exec |
| Python script | Complete structure | Complete structure |
| AST inspection included | | |
| Backup logic included | | |
| Error handling | Includes try-except | Includes try-except |

7.3 Instruction Compliance

| Item | Gemini 3.0 | MiniMax 2.5 |
|---|---|---|
| SOUL.md reading | Always executed | Always executed |
| Work sequence compliance | High | High |
| Result document saving | Automatically performed | Automatically performed |
| Freeze incidents | 0 times | 1 time (early Step 4) |

7.4 Response Style

Gemini 3.0:

  • Detailed analysis report format
  • High frequency of table usage
  • Closing phrases like "Static analysis complete"

MiniMax 2.5:

  • Concise summary format
  • Presents immediately executable code
  • Conversational closing like "Do you have any additional work?"

  8. Additional Achievements

8.1 History Load Improvement

Problem: Only 24 entries were read from session.json, while 268 entries in chat_sessions.db were ignored

Solution: Modified session_manager.py's get_history() to read directly from chat_sessions.db
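A minimal sketch of reading history straight from SQLite follows. The real schema of chat_sessions.db is not shown in this article, so the `messages` table, its `id`, `role`, and `content` columns, and the `limit` parameter are all assumptions. This is not the actual session_manager.py code.

```python
import sqlite3


def get_history(conn: sqlite3.Connection, limit: int = 300) -> list[dict]:
    """Return up to `limit` most recent messages, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM messages ORDER BY id DESC LIMIT ?", (limit,)
    ).fetchall()
    # Rows come back newest first; reverse them into chronological order.
    return [{"role": role, "content": content} for role, content in reversed(rows)]


# Demo against an in-memory database with the assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, role TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO messages (role, content) VALUES (?, ?)",
    [("user", "hello"), ("assistant", "hi"), ("user", "run the script")],
)
print(get_history(conn, limit=2))
```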

8.2 HANDOVER Document Expansion

| Section | Content |
|---|---|
| 9 | System Architecture |
| 10 | GL Script Usage |
| 11 | Operation Commands Guide |
| 12 | Multi-LLM Collaboration System |
| 13 | GarlicLang v20.0 Syntax Reference |
| 14 | Verification Script Patterns |

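8.3 Verification Script Pattern Establishment

```python
def check(name, condition):
    """Print an [O]/[X] status line for a named check and return the condition."""
    status = "O" if condition else "X"
    print(f"[{status}] {name}")
    return condition

# Usage example
check("config.json JSON syntax", True)
check("AST inspection passed", ast_ok)  # ast_ok: result of an earlier ast.parse check
```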

8.4 apifree Verification Script

A tool was created to verify whether GL scripts were executed locally without API billing:

```
$ apifree
[1] Recently created GL scripts (within 1 hour)
📄 fix_history_load.gl
📄 update_errors.gl
[2] API call code inspection
✅ fix_history_load.py: No API code found
[Conclusion] ✅ Local execution confirmed - No API billing
```
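The API-code inspection in step [2] could be approximated by a simple text scan. The sketch below is a guess at the idea, not the actual `apifree` tool: `find_api_markers` and the marker list are hypothetical, and a substring scan can produce both false positives and false negatives.

```python
import tempfile
from pathlib import Path

# Hypothetical markers that suggest a script talks to a remote API.
API_MARKERS = ("import requests", "import openai", "urllib.request", "api_key", "https://")


def find_api_markers(path: Path) -> list[str]:
    """Return the markers that occur in the file's text."""
    text = path.read_text(encoding="utf-8", errors="replace")
    return [marker for marker in API_MARKERS if marker in text]


with tempfile.TemporaryDirectory() as tmp:
    local = Path(tmp) / "fix_history_load.py"
    local.write_text("import sqlite3\nprint('ok')\n", encoding="utf-8")
    remote = Path(tmp) / "call_model.py"
    remote.write_text("import requests\nrequests.post('https://example.com')\n", encoding="utf-8")
    for script in (local, remote):
        hits = find_api_markers(script)
        print(script.name, "->", hits or "No API code found")
```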

  9. Conclusion

9.1 Achievement Summary

On February 26, 2026, the GarlicLang error handling system was successfully improved through collaboration of 1 Human and 3 LLM models.

Quantitative achievements:

  • Modified files: 3
  • Added code: Approximately 80 lines
  • Error type coverage: 11
  • AST verification: 100% passed
  • Work time: Approximately 4 hours

Qualitative achievements:

  • For the structured error sites, an LLM can read the error type and a suggested fix directly from the JSON instead of inferring them from the message text
  • Expected improvement in automatic correction/retry efficiency
  • Multi-LLM collaboration system verified

9.2 Collaboration Effect Analysis

| Role | Contribution |
|---|---|
| Human | Execution, Decision-making, Quality Management |
| Opus 4.5 | Design, Coordination, Documentation |
| Gemini 3.0 | Complex Implementation (Steps 1-2) |
| MiniMax 2.5 | Fast Implementation (Steps 3-4), Verification |

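Key Insight: In this session, dividing roles among multiple LLMs worked well for a multi-step system improvement. The session did not include a single-LLM baseline, so this is an observation rather than a measured comparison.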

9.3 Future Plans

  1. Expand error coverage: Apply err_json to remaining RuntimeError points
  2. Utilize context field: Automatic capture of variable state at the time of error occurrence
  3. Line field accuracy: Track line numbers during the AST parsing stage
  4. agent.py integration: Pass error_details directly to LLM to guide automatic correction
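Plan 2 could be sketched as a small snapshot helper that fills the `context` field. This is only an illustration of the idea: `snapshot_variables`, the truncation limits, and the use of `repr` are assumptions, not a design taken from GarlicLang.

```python
def snapshot_variables(variables: dict, max_items: int = 20, max_len: int = 80) -> dict:
    """Return a JSON-friendly copy of the variable table, with long values truncated."""
    snapshot = {}
    for name in sorted(variables)[:max_items]:
        text = repr(variables[name])
        if len(text) > max_len:
            text = text[: max_len - 3] + "..."
        snapshot[name] = text
    return snapshot


state = {"result": "hello", "items": list(range(100))}
context = snapshot_variables(state, max_len=30)
print(context)
```

Capping both the number of variables and the length of each value keeps the error payload small enough to pass back to an LLM.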

  10. Appendix

10.1 Full Paths of Modified Files

```
~/garliclang_full/garliclang/errors.py
~/garliclang_full/garliclang/interpreter.py
~/garlic-agent/garliclang_bridge.py
/data/data/com.termux/files/usr/lib/python3.12/site-packages/garliclang/errors.py
/data/data/com.termux/files/usr/lib/python3.12/site-packages/garliclang/interpreter.py
```

10.2 Backup Files

```
~/garlic-agent/archive/bak/errors.py.20260226_2000.bak
~/garlic-agent/archive/bak/interpreter.py.20260226_2000.bak
~/garlic-agent/archive/bak/garliclang_bridge.py.20260226_2000.bak
/storage/emulated/0/Download/garlic-agent-1.5.6_20260226_2000.tar.gz (78MB)
```

10.3 Reference Documents

```
~/garlic-agent/SOUL.md
~/garlic-agent/HANDOVER_FINAL_v1.5.6.md
~/garlic-agent/CHANGELOG.md
~/garlic-agent/prompts/2026-02-26/*.md
```

End of document

This white paper was written jointly by Human (Garlic farmer), Claude Opus 4.5, Gemini 3.0, and MiniMax 2.5.
