Most technical ebooks have code that doesn't run. The author wrote it, it looked right, but it was never executed. Readers discover this the hard way.
Here's a two-layer validation system that makes it structurally impossible to ship broken code examples.
Why Two Layers?
A single validator isn't enough:
- Syntax-only (AST): Fast, but doesn't catch import errors, missing files, or runtime failures
- Runtime-only (subprocess): Slow if run on every keystroke, and running code in your main process is dangerous
The solution: run them in order. Syntax first (cheap), then runtime (thorough).
Layer 1: AST Syntax Check
import ast
from typing import tuple
def check_syntax(code: str) -> tuple[bool, str]:
"""
Returns (is_valid, error_message).
Pure static analysis — no code is executed.
"""
try:
ast.parse(code)
return True, ""
except SyntaxError as e:
return False, f"SyntaxError at line {e.lineno}: {e.msg}"
except ValueError as e:
return False, f"ValueError: {e}"
This handles:
- Missing colons, mismatched brackets
- Invalid escape sequences
- f-string syntax errors
- Invalid encoding declarations
It does NOT handle: missing imports, undefined variables, runtime logic errors.
Layer 2: Subprocess Isolation
import subprocess
import tempfile
import os
from typing import tuple
def check_runtime(code: str, timeout: int = 10) -> tuple[bool, str]:
"""
Executes code in an isolated subprocess within a temporary directory.
Returns (is_valid, error_output).
"""
with tempfile.TemporaryDirectory() as tmpdir:
# Write code to a temp file
script_path = os.path.join(tmpdir, "validate_script.py")
with open(script_path, "w", encoding="utf-8") as f:
f.write(code)
result = subprocess.run(
["python3", script_path],
capture_output=True,
text=True,
timeout=timeout,
cwd=tmpdir, # Working directory = temp dir
env={**os.environ}, # Inherit environment
)
if result.returncode == 0:
return True, ""
# Return first 500 chars of stderr for diagnosis
return False, result.stderr[:500].strip()
Key design decisions:
-
tempfile.TemporaryDirectory()— auto-cleaned up when thewithblock exits -
cwd=tmpdir— relative file paths in the script resolve to the temp directory, not your project root -
timeout=10— prevents infinite loops from hanging your pipeline -
capture_output=True— stderr gives you the actual Python traceback
Combining Both Layers
from enum import Enum
import json
import os
class ValidationResult(Enum):
PASS = "PASS"
SYNTAX_FAIL = "SYNTAX_FAIL"
RUNTIME_FAIL = "RUNTIME_FAIL"
def validate_code_snippet(
code: str,
timeout: int = 10
) -> dict:
"""
Full validation pipeline for a code snippet.
Returns structured result dict.
"""
# Layer 1: syntax
syntax_ok, syntax_err = check_syntax(code)
if not syntax_ok:
return {
"result": ValidationResult.SYNTAX_FAIL.value,
"error": syntax_err,
"layer": 1,
}
# Layer 2: runtime
runtime_ok, runtime_err = check_runtime(code, timeout)
if not runtime_ok:
return {
"result": ValidationResult.RUNTIME_FAIL.value,
"error": runtime_err,
"layer": 2,
}
return {
"result": ValidationResult.PASS.value,
"error": None,
"layer": None,
}
Integrating With a Chapter Pipeline
If you're building a content pipeline, hook this into your state machine:
import json
CHECKPOINT_FILE = "checkpoint.json"
def process_chapter(chapter_id: str, code_snippets: list[str]):
"""
Validates all code snippets in a chapter.
Sets chapter status to NEEDS_REVIEW if any fail.
"""
with open(CHECKPOINT_FILE, "r+") as f:
data = json.load(f)
chapter = next(c for c in data["chapters"] if c["id"] == chapter_id)
chapter["status"] = "RUNNING"
f.seek(0)
json.dump(data, f, indent=2)
f.truncate()
# Validate each snippet
failed_snippets = []
for i, snippet in enumerate(code_snippets):
result = validate_code_snippet(snippet)
if result["result"] != "PASS":
failed_snippets.append({
"snippet_index": i,
"result": result,
})
# Update checkpoint
with open(CHECKPOINT_FILE, "r+") as f:
data = json.load(f)
chapter = next(c for c in data["chapters"] if c["id"] == chapter_id)
if failed_snippets:
chapter["status"] = "NEEDS_REVIEW"
chapter["failed_snippets"] = failed_snippets
else:
chapter["status"] = "DONE"
f.seek(0)
json.dump(data, f, indent=2)
f.truncate()
return len(failed_snippets) == 0
What This Catches
Real failures this system blocked during testing:
| Error Type | Layer Caught | Example |
|---|---|---|
| Missing colon | 1 (AST) | def foo() |
| Undefined variable | 2 (subprocess) | print(undefined_var) |
| Import not found | 2 (subprocess) | import nonexistent_pkg |
| Infinite loop | 2 (timeout) | while True: pass |
| File not found | 2 (subprocess) | open("missing.csv") |
| Wrong indentation | 1 (AST) | Misaligned blocks |
What This Doesn't Catch
Be honest about limitations:
- Logic errors — code runs but produces wrong output
- Side effects — scripts that write to disk or network
-
Long-running scripts — adjust
timeoutaccordingly -
Scripts with required user input —
input()calls will hang
For a technical ebook pipeline, these are acceptable tradeoffs. You want to verify the code examples compile and execute cleanly, not full unit test coverage.
Complete Example
# Test the validator
snippets = [
# Valid
"""
def fibonacci(n):
if n <= 1:
return n
return fibonacci(n-1) + fibonacci(n-2)
print(fibonacci(10))
""",
# Syntax error
"""
def broken(
print("missing closing paren")
""",
# Runtime error
"""
import this_package_does_not_exist
""",
]
for i, snippet in enumerate(snippets):
result = validate_code_snippet(snippet.strip())
status = "✅" if result["result"] == "PASS" else "❌"
print(f"Snippet {i+1}: {status} {result['result']}")
if result["error"]:
print(f" Error: {result['error'][:100]}")
Output:
Snippet 1: ✅ PASS
Snippet 2: ❌ SYNTAX_FAIL
Error: SyntaxError at line 2: '(' was never closed
Snippet 3: ❌ RUNTIME_FAIL
Error: ModuleNotFoundError: No module named 'this_package_does_not_exist'
This validation system is part of a larger pipeline I built for producing technical ebooks end-to-end: writing, validation, translation QA, EPUB assembly, and marketing asset generation — all for $20/month.
If you want the complete pipeline with all scripts: germy5.gumroad.com/l/xhxkzz ($19.99, 30-day refund).
Top comments (0)