No prerequisites. If you've used Claude or ChatGPT and you're wondering what separates a one-off script from an agent that actually runs in production, this post is for you.
I wrote my first Python agent in April 2026. It did two things: read a PDF, send a Telegram message. It worked. Once.
The second time, the PDF was poorly scanned. The agent crashed. No trace. No notification. The patient never got their appointment.
That's the day I understood: an agent that works in demo is not an agent. An agent is what holds up when you're not around.
I wrote four words in the docstring of my next agent: Observability, Reliability, Security, Deployment. Since then, I haven't shipped a single agent to production without all four. Today I run about twenty of them, 24/7, on a single 5€/month server.
Here they are, with the Python code that embodies them.
Pillar 1 — Observability
You must be able to know, without asking anyone: what the agent did, when, how long it took, and how much it cost.
A structured logger shared across all your agents, append-only audit logs for critical actions, a cost tracker that logs every API call.
```python
# shared/logger.py
import logging
import os
from logging.handlers import RotatingFileHandler

def get_logger(name: str) -> logging.Logger:
    logger = logging.getLogger(name)
    if logger.handlers:  # already configured: reuse it
        return logger
    os.makedirs('logs', exist_ok=True)  # RotatingFileHandler won't create the directory
    fmt = logging.Formatter('%(asctime)s | %(levelname)-7s | %(name)s | %(message)s')
    fh = RotatingFileHandler(f'logs/{name}.log', maxBytes=10*1024*1024, backupCount=5)
    fh.setFormatter(fmt)
    logger.addHandler(fh)
    logger.addHandler(logging.StreamHandler())  # stdout for journalctl too
    logger.setLevel(logging.INFO)
    return logger
```
Quick test: if someone asks you right now how much your agent cost yesterday, can you answer in under 30 seconds? If yes, Pillar 1 ✓.
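The cost tracker mentioned above can be as simple as an append-only JSONL file. Here is a minimal sketch of that idea; the module path, function names, and record fields are my illustration, not necessarily what runs in production:

```python
# shared/costs.py -- illustrative cost tracker (names are placeholders)
import json
import time
from pathlib import Path

COSTS_FILE = Path("api_costs.jsonl")

def track_cost(agent: str, model: str, input_tokens: int,
               output_tokens: int, usd: float) -> None:
    """Append one API call's cost as a JSON line (append-only audit trail)."""
    record = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "agent": agent,
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "usd": round(usd, 6),
    }
    with COSTS_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def cost_yesterday() -> float:
    """Sum yesterday's costs: the '30-second answer'."""
    day = time.strftime("%Y-%m-%d", time.localtime(time.time() - 86400))
    total = 0.0
    if COSTS_FILE.exists():
        for line in COSTS_FILE.read_text(encoding="utf-8").splitlines():
            rec = json.loads(line)
            if rec["ts"].startswith(day):
                total += rec["usd"]
    return total
```

Because each line is an independent JSON object, you can answer "how much did yesterday cost?" with one function call, and the file doubles as an audit log.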
Pillar 2 — Reliability
The agent must survive errors: failing API call, corrupted file, broken network. Never corrupt state, always leave a trace.
The pattern that changes everything: try/finally at the pipeline level, to guarantee resources are cleaned up even on uncaught crashes.
```python
import os
import shutil

FAILED_DIR = "failed"  # quarantine directory for files that crashed the pipeline

def process_document(pdf_path):
    filename = os.path.basename(pdf_path)
    try:
        return _process_document_impl(pdf_path)
    except Exception as e:
        log.error(f"Unhandled exception: {e}", exc_info=True)
    finally:
        # No matter what, the file doesn't stay in /incoming/
        if os.path.exists(pdf_path):
            os.makedirs(FAILED_DIR, exist_ok=True)
            shutil.move(pdf_path, os.path.join(FAILED_DIR, filename))
            log.warning(f"File moved to /failed: {filename}")
```
Without this wrapper, a mid-pipeline crash leaves the file in /incoming/, which will be reprocessed indefinitely on the next startup. With this wrapper, the final state is always clean.
Plus: exponential retry on API calls, copy-before-action, anti-silent-overwrite for generated files.
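The exponential retry mentioned above can live in one decorator. This is a minimal sketch, not the exact code from my agents; the delays (1s, 2s, 4s, ...) and the attempt count are illustrative defaults:

```python
# Illustrative retry-with-backoff decorator for flaky API calls.
import time
from functools import wraps

def retry(max_attempts: int = 4, base_delay: float = 1.0,
          exceptions: tuple = (Exception,)):
    """Retry a flaky call with exponential backoff: base, 2x, 4x, ..."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts: let the pipeline wrapper log it
                    time.sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator
```

Note that the final `raise` hands the exception to the `try/finally` pipeline wrapper above, so the two patterns compose: retries absorb transient failures, the wrapper cleans up after permanent ones.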
Pillar 3 — Security
No secrets in code. No irreversible decisions without validation. Allowlist over blocklist. The agent never guesses what it doesn't know.
Non-negotiable rules:
- Secrets in `.env` (chmod 600), never hardcoded
- SQL always parameterized
- Explicit allowlist for system services the agent can query
- When there's ambiguity, the agent DOESN'T DECIDE — it notifies the human
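The allowlist rule fits in a few lines. A sketch of the idea; the service names here are placeholders, not my real allowlist:

```python
# Illustrative allowlist check: refuse anything not explicitly permitted.
ALLOWED_SERVICES = frozenset({"my-agent", "postgresql", "nginx"})  # placeholders

def query_service_status(service: str) -> str:
    """Allowlist over blocklist: unknown input is rejected, not sanitized."""
    if service not in ALLOWED_SERVICES:
        raise PermissionError(f"Service not in allowlist: {service!r}")
    # e.g. subprocess.run(["systemctl", "is-active", service], ...)
    return f"would query systemctl is-active {service}"
```

The point of an allowlist is that you never have to enumerate what's dangerous; anything you didn't explicitly approve is refused by default.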
The last point matters most if your agent works with real-world impact data (medical, financial, legal):
```python
def match_patient(last_name: str, first_name: str = "") -> tuple[int, str] | tuple[None, None]:
    candidates = search_in_db(last_name)
    if not candidates:
        return None, None
    if first_name:
        matches = [c for c in candidates if _exact_word_match(first_name, c.full_name)]
        if len(matches) == 1:
            return matches[0].id, matches[0].full_name
        if len(matches) > 1:
            notify_ambiguity(last_name, first_name, matches)  # human decides
            return None, None
    if len(candidates) == 1:
        return candidates[0].id, candidates[0].full_name
    notify_ambiguity(last_name, first_name, candidates)
    return None, None
```
Golden rule, explicit in my methodology: "Records in the database are people. We never guess."
Pillar 4 — Deployment
The agent runs 24/7 unattended. It restarts itself after a crash. You see its state at a glance.
On modern Linux: systemd.
```ini
# /etc/systemd/system/my-agent.service
[Unit]
Description=My watchdog agent
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/root/projects/my-agent
ExecStart=/usr/bin/python3 watchdog.py
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
```

```bash
sudo systemctl daemon-reload
sudo systemctl enable my-agent.service
sudo systemctl start my-agent.service
journalctl -u my-agent -f  # live logs
```
Now your agent starts at boot, restarts within 10s on crash, and you see its logs with journalctl.
Plus: a health_check() tool that pings all your services in one call, a cron every 15 min that pings you on Telegram if something is off.
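That `health_check()` tool can be a thin wrapper over `systemctl is-active`. A sketch under assumptions: the service list is a placeholder, and the Telegram call is left as a stub you'd wire to your own bot:

```python
# Illustrative health check: one call answers "is everything up?"
import subprocess

SERVICES = ["my-agent", "postgresql", "nginx"]  # placeholder names

def health_check() -> dict[str, bool]:
    """Return {service: is_active} for every monitored service."""
    status = {}
    for svc in SERVICES:
        result = subprocess.run(
            ["systemctl", "is-active", "--quiet", svc],
            check=False,  # a down service is data, not an exception
        )
        status[svc] = result.returncode == 0
    return status

def alert_if_down(status: dict[str, bool]) -> list[str]:
    """List the services worth alerting on; wire this to your Telegram bot."""
    return [svc for svc, ok in status.items() if not ok]
```

Run it from the 15-minute cron: if `alert_if_down(health_check())` is non-empty, send the message; otherwise stay silent.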
How the 4 pillars reinforce each other
| Pillar | Without | With |
|---|---|---|
| 1 Observability | You don't know what happened | Full visibility in `logs/` and `api_costs.jsonl` |
| 2 Reliability | A crash loses state, files get stuck | State recovers, files go to `/failed/` |
| 3 Security | API key on GitHub, wrong person notified | `.env` chmod 600, allowlist, human-in-the-loop on ambiguity |
| 4 Deployment | Manual restart after every reboot | `systemctl restart`, comes back up |
Pillar 1 gives you proof that 2/3/4 actually work. Pillar 2 lets you last. Pillar 3 lets you last without blowing up. Pillar 4 lets you last unattended.
Remove any one, and your agent lives until the next real outage — no longer.
Beyond this post
This is the short version. The full one — with the complete Python skeleton that unites all 4 pillars, per-pillar tests you can run, and common mistakes — is in my repo:
👉 Repo agents-in-practice — 9 French-language tutorials, from "how to talk to Claude" to "first MCP server with 4 useful tools". Built for non-IT professionals who want to actually understand agents, not just copy-paste boilerplate. English translations coming.
About me — and how this post got written
I'm a urologist in Fès, Morocco. No prior software training. In a few months with Claude, I built four production Python systems on one 5€/month server: a medical practice automation pipeline (OCR, WhatsApp, automated insurance dossier handling), a stock-valuation platform, a personal finance dashboard, and ongoing R&D.
This blog post — and everything else I publish — is written by my AI. It draws from my own production code, my projects, and months of conversation with it. My role: decide, validate. Its role: execute end-to-end, autonomously.
To my knowledge, no one publicly owns this position today. I do — deliberately. I want to show what a self-taught builder becomes when he delegates everything that can be delegated to an AI that knows him.