Free series. All open-source. The complete operations manual for everything you built.
You built this. Not a vendor. Not a consultant. Not a managed service provider who will send you an invoice next month for the privilege of using what was always supposed to be yours. You opened a terminal, followed a guide, made decisions, fixed the things that broke, and kept going. The system running on your server right now — the eight security layers, the four AI assistants, the private cloud, the monitoring, the backups — exists because you decided it should exist and then made it real.
That matters more than the cost savings. More than the features. More than the security architecture. What you built in these four parts is not just infrastructure. It is proof that the tools once reserved for enterprises with six-figure IT budgets are now accessible to anyone willing to invest two weekends and follow a guide. The promise of personal computing — that individuals would own their own capability — took forty years to arrive. It arrived in the form of open-source software, commodity cloud servers, and a free-tier security layer that would have cost a fortune five years ago.
This final part is the operations manual. It does not add features. It ensures that what you built continues to run, continues to be maintained, and continues to serve you for months and years to come. Building a system is an act of engineering. Maintaining a system is an act of responsibility. Both are yours now.
The Cardinal Rule: The Human Decides
Before anything else in this operations manual, one principle must be stated clearly enough that it cannot be forgotten or rationalized away:
Every initial data entry, every final verification, every irreversible action — a human performs it. Not AI. Not automation. Not a scheduled script running at 3 AM. You.
AI drafts your email. You read it before you send it. OpenClaw organizes your files. You check the result before you archive it. The backup script runs nightly. You verify it monthly. The accounting integration queries your books. You review the answer before you act on it.
This is not a limitation on the technology. It is the condition under which the technology operates safely. AI is an extraordinarily capable instrument. The quality of its output depends entirely on the judgment directing it.
The system you built is powerful. The responsibility for what it does is yours. That responsibility does not transfer to any software, any algorithm, or any automation. It remains with the human who gives the instruction, reviews the output, and authorizes the action. This is not a disclaimer. It is the operating principle that makes everything else in this guide trustworthy.
The Full LLM Proxy Application
The proxy makes all four AI models accessible through one authenticated endpoint.
mkdir -p ~/llm-proxy && cd ~/llm-proxy
nano app.js
Paste the complete application:
const express = require('express');
const axios = require('axios');
require('dotenv').config();

const app = express();
app.use(express.json());

// Shared-secret bearer auth. Every request must present the token from .env.
const authenticate = (req, res, next) => {
  const token = req.headers['authorization']?.replace('Bearer ', '');
  if (token !== process.env.AUTH_TOKEN) {
    return res.status(401).json({ error: 'Unauthorized' });
  }
  next();
};

// Unauthenticated liveness probe, used by the maintenance checklists.
app.get('/health', (req, res) => {
  res.json({
    status: 'ok',
    timestamp: new Date().toISOString(),
    models: ['openai', 'anthropic', 'google', 'perplexity']
  });
});

// One endpoint, four providers. The request body selects the model.
app.post('/v1/chat', authenticate, async (req, res) => {
  const { model = 'openai', messages, max_tokens = 2000, temperature = 0.7 } = req.body;
  try {
    let response;

    if (model === 'openai') {
      response = await axios.post(
        'https://api.openai.com/v1/chat/completions',
        { model: 'gpt-4o', messages, max_tokens, temperature },
        { headers: { 'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`, 'Content-Type': 'application/json' } }
      );
      return res.json({ content: response.data.choices[0].message.content, model: 'gpt-4o', provider: 'openai' });
    }

    if (model === 'anthropic') {
      response = await axios.post(
        'https://api.anthropic.com/v1/messages',
        { model: 'claude-sonnet-4-5', messages, max_tokens },
        { headers: { 'x-api-key': process.env.ANTHROPIC_API_KEY, 'anthropic-version': '2023-06-01', 'Content-Type': 'application/json' } }
      );
      return res.json({ content: response.data.content[0].text, model: 'claude-sonnet-4-5', provider: 'anthropic' });
    }

    if (model === 'google') {
      // Gemini uses a different message schema: 'assistant' becomes 'model'.
      const gm = messages.map(m => ({ role: m.role === 'assistant' ? 'model' : 'user', parts: [{ text: m.content }] }));
      response = await axios.post(
        `https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=${process.env.GOOGLE_API_KEY}`,
        { contents: gm, generationConfig: { maxOutputTokens: max_tokens, temperature } }
      );
      return res.json({ content: response.data.candidates[0].content.parts[0].text, model: 'gemini-1.5-flash', provider: 'google' });
    }

    if (model === 'perplexity') {
      response = await axios.post(
        'https://api.perplexity.ai/chat/completions',
        { model: 'sonar', messages, max_tokens, temperature },
        { headers: { 'Authorization': `Bearer ${process.env.PERPLEXITY_API_KEY}`, 'Content-Type': 'application/json' } }
      );
      return res.json({ content: response.data.choices[0].message.content, model: 'sonar', provider: 'perplexity' });
    }

    return res.status(400).json({ error: `Unknown model: ${model}` });
  } catch (error) {
    // Log upstream failures server-side; never echo API keys back to the client.
    console.error(`[${new Date().toISOString()}] Error with ${model}:`, error.response?.data || error.message);
    return res.status(500).json({ error: 'AI provider error', provider: model, detail: error.response?.data?.error?.message || error.message });
  }
});

// Reports which providers have keys configured, without exposing the keys.
app.get('/usage-report', authenticate, async (req, res) => {
  res.json({
    timestamp: new Date().toISOString(),
    providers: {
      openai: { configured: !!process.env.OPENAI_API_KEY },
      anthropic: { configured: !!process.env.ANTHROPIC_API_KEY },
      google: { configured: !!process.env.GOOGLE_API_KEY },
      perplexity: { configured: !!process.env.PERPLEXITY_API_KEY }
    }
  });
});

// Bind to loopback only: the tunnel fronts this service, never the open internet.
const PORT = process.env.PORT || 8000;
app.listen(PORT, '127.0.0.1', () => {
  console.log(`[${new Date().toISOString()}] LLM Proxy running on port ${PORT}`);
});
Install and start:
npm init -y
npm install express axios dotenv
node app.js &
curl -s http://localhost:8000/health
Test a model:
curl -s -X POST http://localhost:8000/v1/chat \
-H "Authorization: Bearer YOUR_AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{"model":"anthropic","messages":[{"role":"user","content":"Reply with: proxy confirmed."}]}'
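The same call can be made from Node instead of curl. A minimal sketch: buildChatRequest is a hypothetical helper (not part of the proxy) that assembles the request for the /v1/chat endpoint shown above, assuming the proxy runs on localhost:8000 and your token lives in an environment variable:

```javascript
// Hypothetical helper: build a request for the proxy's /v1/chat endpoint.
// Mirrors the body shape the proxy expects: model, messages, max_tokens.
function buildChatRequest(model, prompt, token, maxTokens = 2000) {
  return {
    url: 'http://localhost:8000/v1/chat',
    options: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${token}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: prompt }],
        max_tokens: maxTokens
      })
    }
  };
}

// Usage (Node 18+ for built-in fetch; LLM_PROXY_TOKEN is an assumed env var):
// const { url, options } = buildChatRequest('anthropic', 'Reply with: proxy confirmed.', process.env.LLM_PROXY_TOKEN);
// fetch(url, options).then(r => r.json()).then(j => console.log(j.content));
```

Separating request construction from the network call keeps the body shape testable without a running proxy.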
Systemd service (run permanently):
sudo nano /etc/systemd/system/llm-proxy.service
[Unit]
Description=LLM Proxy Service
After=network.target
[Service]
Type=simple
User=myadmin
WorkingDirectory=/home/myadmin/llm-proxy
ExecStart=/usr/bin/node app.js
Restart=always
RestartSec=5
EnvironmentFile=/home/myadmin/llm-proxy/.env
[Install]
WantedBy=multi-user.target
sudo systemctl enable --now llm-proxy
sudo systemctl status llm-proxy
OpenClaw Configuration Templates
Place in ~/.openclaw/workflows/. Plain text. Read at startup.
morning-briefing.txt
SCHEDULE: weekdays 07:30
DELIVER_TO: primary_chat
Pull the following and format as a briefing:
1. Unread email count from primary Gmail. Flag any from known client domains.
2. Today's events from Nextcloud Calendar (all calendars).
3. Tasks due today or overdue.
4. Count of tasks due this week.
5. One-line server status: curl http://localhost:8000/health.
Format: plain text, no markdown. Under 200 words.
email-to-task.txt
TRIGGER: message contains "create task" OR "add task" OR "log task"
ACTION: extract task details, create in Nextcloud Tasks
REQUIRED_FIELDS: title, due_date
OPTIONAL_FIELDS: list_name, priority, notes
CONFIRM: always confirm before creating
weekly-summary.txt
SCHEDULE: friday 17:00
DELIVER_TO: primary_chat
1. Tasks completed this week.
2. Tasks due this week still open.
3. Tasks due next week (count + titles).
4. One observation about completion rate.
Format: plain text. Under 150 words.
standing-rules.txt
EMAIL_DEFAULT: draft_only
# OpenClaw must never send email autonomously.
# All email actions produce a draft for human review.
# This rule cannot be overridden by any other instruction.
openclaw reload-workflows
openclaw status
The Monthly Maintenance Checklist
First of each month. ~30 minutes. Thirteen items.
# 1. Firewall
sudo ufw status verbose
# 2. Listening ports
sudo ss -tlnp
# 3. Docker health
docker ps --format 'table {{.Names}}\t{{.Status}}'
# 4. Disk usage
df -h / && du -sh ~/backups/ ~/db-backups/ 2>/dev/null
# 5. Backup timestamps
ls -lht ~/backups/ | head -5
# 6. System updates
sudo apt update && sudo apt upgrade -y
# 7. OpenClaw
openclaw status
# 8. LLM proxy
curl -s http://localhost:8000/health
# 9. Grafana: review 30-day trends at https://grafana.yourdomain.com
# 10. Cloudflare: review access logs at dash.cloudflare.com → Zero Trust → Logs
# 11. Docker image updates
cd ~/nextcloud && docker-compose pull && docker-compose up -d
cd ~/guacamole && docker-compose pull && docker-compose up -d
# 12. Post-update verification
docker ps --format 'table {{.Names}}\t{{.Status}}'
curl -s http://localhost:8000/health
# 13. AI spending audit (see dedicated section below)
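Item 8 returns JSON, and the pass/fail judgment can be scripted. A sketch of a pure helper that evaluates the /health response against the four providers the proxy advertises; the all-clear logic here is illustrative, not part of the proxy itself:

```javascript
// Evaluate the JSON returned by the proxy's /health endpoint (item 8).
// Expects the shape emitted by app.js: { status, timestamp, models }.
const EXPECTED_MODELS = ['openai', 'anthropic', 'google', 'perplexity'];

function healthSummary(health) {
  // Any provider missing from the advertised list fails the check.
  const missing = EXPECTED_MODELS.filter(m => !(health.models || []).includes(m));
  return {
    ok: health.status === 'ok' && missing.length === 0,
    missing
  };
}

// Usage (Node 18+): fetch('http://localhost:8000/health')
//   .then(r => r.json())
//   .then(h => console.log(healthSummary(h)));
```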
The Annual Maintenance Checklist
Once per year. ~90 minutes. Eight items.
1. Ubuntu lifecycle
pro security-status
lsb_release -a
24.04 LTS: standard support through April 2029.
2. SSH key rotation
ssh-keygen -t ed25519 -C "annual-rotation-$(date +%Y)" -f ~/.ssh/id_ed25519_new
cat ~/.ssh/id_ed25519_new.pub >> ~/.ssh/authorized_keys
# Test login with new key before removing old
3. cloudflared update
sudo apt update && sudo apt install --only-upgrade cloudflared
sudo systemctl restart cloudflared
sudo systemctl status cloudflared
4. OpenClaw update
npm update -g openclaw
openclaw status
5. Password rotation
Rotate: Nextcloud admin, Guacamole admin, Grafana admin, PostgreSQL, LLM proxy auth token. Update .env, restart affected services.
6. Access policy audit
Cloudflare Zero Trust → Access → Applications. Remove departed users. Confirm remaining access is appropriate.
7. Backup restoration test
CRITICAL: an untested backup is not a backup. This item cannot be deferred.
mkdir -p /tmp/restore-test
gpg --decrypt --batch --passphrase-file ~/.backup-passphrase \
~/backups/$(ls -t ~/backups/*vps-config*.gpg | head -1 | xargs basename) \
> /tmp/restore-test/config-test.tar.gz
tar tzf /tmp/restore-test/config-test.tar.gz | head -20
psql "$SUPABASE_CONNECTION_STRING" -c "\dt" 2>/dev/null | head -10
rm -rf /tmp/restore-test
Archive lists without errors + Supabase tables visible = pass.
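A restore test only proves something if the archive it decrypts is recent. A sketch of a freshness check in Node; the 26-hour window assumes the nightly backup job from Part 4, and the paths in the usage comment are assumptions:

```javascript
// Pure helper: is the newest backup recent enough to trust a restore test?
// mtimesMs: modification times in ms; nowMs: current time in ms.
function newestBackupAgeOk(mtimesMs, nowMs, maxAgeHours = 26) {
  if (mtimesMs.length === 0) return false;   // no backups at all is a failure
  const newest = Math.max(...mtimesMs);
  return (nowMs - newest) / 3600000 <= maxAgeHours;
}

// Usage against the real ~/backups directory (directory name is an assumption):
// const fs = require('fs'), path = require('path'), os = require('os');
// const dir = path.join(os.homedir(), 'backups');
// const mtimes = fs.readdirSync(dir)
//   .filter(f => f.endsWith('.gpg'))
//   .map(f => fs.statSync(path.join(dir, f)).mtimeMs);
// console.log(newestBackupAgeOk(mtimes, Date.now()) ? 'fresh' : 'STALE');
```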
8. Documentation review
Confirm all credentials in Bitwarden are current. Confirm emergency runbook is accessible offline.
Emergency Response Runbook
Keep this accessible offline — printed copy, Bitwarden note, or phone file.
Cloudflare Tunnel down
Symptom: all services unreachable.
sudo systemctl status cloudflared
sudo systemctl restart cloudflared
sudo systemctl status cloudflared
If restart fails:
sudo journalctl -u cloudflared -n 50
Common: update needed or API token expired.
Docker container down
docker ps -a
cd ~/nextcloud && docker-compose restart
cd ~/guacamole && docker-compose restart
sudo systemctl restart prometheus grafana-server prometheus-alertmanager
Disk full
df -h /
sudo journalctl --vacuum-size=200M
docker system prune -f
du -sh ~/backups/* | sort -rh | head -10
API key compromised
Provider dashboard → revoke immediately → generate new key → update .env:
sudo systemctl restart llm-proxy
curl -s http://localhost:8000/health
Device lost or stolen
Cloudflare Zero Trust → Devices → revoke. Change Nextcloud + Guacamole passwords. Review Access logs.
OpenClaw acting unexpectedly
openclaw stop
openclaw logs --tail 100
Do not restart until root cause is understood.
Complete server loss (~2–3 hours)
1. New Vultr server, Ubuntu 24.04 LTS.
2. Rebuild: UFW, fail2ban, Node.js, Docker (Part 2).
3. Restore DB from Supabase:
pg_dump "$SUPABASE_CONNECTION_STRING" > /tmp/nextcloud-db-restore.sql
psql -h localhost -U nextcloud_user -d nextcloud_db < /tmp/nextcloud-db-restore.sql
4. Restore system config:
gpg --decrypt --batch --passphrase-file ~/.backup-passphrase \
latest-vps-config.gpg > /tmp/restore-config.tar.gz
sudo tar xzf /tmp/restore-config.tar.gz -C /
5. Restore Nextcloud files:
gpg --decrypt --batch --passphrase-file ~/.backup-passphrase \
latest-nextcloud-files.gpg > /tmp/restore-files.tar.gz
docker run --rm -v nextcloud_nextcloud_data:/data -v /tmp:/backup \
alpine tar xzf /backup/restore-files.tar.gz -C /data
6. Restart all:
cd ~/nextcloud && docker-compose up -d
cd ~/guacamole && docker-compose up -d
sudo systemctl restart prometheus grafana-server prometheus-alertmanager nginx cloudflared llm-proxy
7. Run Part 4 verification checklist.
The AI Spending Audit
Monthly. Five minutes.
OpenAI: platform.openai.com → Billing → Usage. GPT-4o is highest cost. Consider GPT-4o mini for routine tasks.
Anthropic: console.anthropic.com → Billing. Sonnet handles most business writing at lower cost than Opus.
Google: aistudio.google.com → Dashboard. Flash for routine, Pro for synthesis.
Perplexity: perplexity.ai → Settings → API Usage. Sonar for most factual research.
curl -s -H "Authorization: Bearer YOUR_AUTH_TOKEN" \
http://localhost:8000/usage-report
If any provider approaches cap with 10+ days remaining: raise cap or reduce workflow frequency.
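The /usage-report response can also be checked programmatically. A sketch of a helper that lists any provider the proxy reports as unconfigured, using the response shape defined in app.js; the token env var name in the usage comment is an assumption:

```javascript
// Return the names of providers whose API key is not configured,
// from the { providers: { name: { configured } } } shape in /usage-report.
function unconfiguredProviders(report) {
  return Object.entries(report.providers || {})
    .filter(([, v]) => !v.configured)
    .map(([name]) => name);
}

// Usage (Node 18+):
// fetch('http://localhost:8000/usage-report', {
//   headers: { Authorization: `Bearer ${process.env.LLM_PROXY_TOKEN}` }
// }).then(r => r.json()).then(rep => console.log(unconfiguredProviders(rep)));
```

An empty array means every provider has a key in .env; anything else names the gap.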
What to Build Next
1. Multi-location remote desktop. Add a second Guacamole connection for a different workstation.
2. AI-powered document OCR. Tesseract + OpenClaw to classify scanned receipts and invoices.
3. Multi-user access control. Onboard colleagues with role-specific Nextcloud permissions and Cloudflare Access policies.
4. Static business website. Nginx serving public pages from the same server.
5. Contract and invoice generation. OpenClaw populates Collabora templates from structured data.
Final Cost Statement
| Component | Monthly |
|---|---|
| Vultr VPS (Recommended) | ~$24 |
| Domain (amortized) | ~$1 |
| Cloudflare Zero Trust | $0 |
| Supabase (free tier) | $0 |
| All software (open-source) | $0 |
| AI usage (3 users, moderate) | $15–$35 |
| Total (3–8 person team) | ~$40–$60 |
Equivalent SaaS for 3 users: $240/month AI alone, $400+ total.
This infrastructure costs less, does more, and belongs to you.
The Complete Series
| Part | What It Covers |
|---|---|
| Part 1 — Architecture Overview | Stack, costs, security model |
| Part 2 — Zero-Trust Server | Vultr, Cloudflare, UFW, fail2ban |
| Part 3 — The Intelligence Layer | Docker, Nextcloud, Collabora, AI proxy, OpenClaw |
| Part 4 — Operations & Monitoring | Guacamole, Prometheus, Grafana, backups |
| Part 5 — The Operations Manual (you are here) | Maintenance, audits, runbook, proxy code |
All five parts published and free.
A Final Word: The System Is Yours. The Responsibility Is Yours.
You built a system that monitors itself, backs itself up, and deploys four AI assistants to work on your behalf. That sentence would have been science fiction ten years ago and enterprise-only five years ago. Today it runs on a $24 server and you configured every line of it.
But the system does not think. It does not exercise judgment. It does not know when a draft email will offend a client, when a file has been misclassified in a way that matters legally, or when a financial query has returned a number that is technically correct and practically misleading. Those determinations require a human mind — your mind.
AI does not replace your judgment. It amplifies your capacity. The difference is everything. A tool that amplifies good judgment produces extraordinary results. A tool that amplifies absent judgment produces extraordinary damage.
Consult qualified professionals before acting on AI-generated legal, financial, tax, or medical information. Review every draft before it leaves your control. Verify every automated action. Test your backups. Read your alerts. Maintain your system.
This is self-responsibility. Not as a disclaimer — as a principle. The sovereignty you gained by building this infrastructure comes with the obligation to operate it with care.
What you hold now is not a collection of software. It is a new way of working.
Your data lives on your server. Your security is under your control. Your AI assistants answer to you, not to a subscription tier. SaaS was the bridge. You are no longer renting capability from someone else's servers under someone else's terms. You built your own.
That is not an optimization. That is a revolution.
The age of system sovereignty has arrived. You are already in it. Not because someone sold it to you, but because you built it yourself, one command at a time.
Welcome to the other side.
Legal Disclaimer
The information provided in this series is for educational and informational purposes only. It does not constitute legal, financial, tax, accounting, cybersecurity, or professional advice. All use is at the sole risk of the user. To the maximum extent permitted by applicable law, including the laws of the State of California (Cal. Civ. Code §§1668, 3513) and the State of New York (N.Y. GOL §5-323), the author disclaims all liability for any damages arising from use of this content. References to all third-party products are for informational purposes only. The author has no commercial relationship with any provider mentioned. Non-commercial sharing and attribution are permitted. Commercial reproduction requires explicit written consent.
Questions, corrections, and configuration issues go in the comments. Every one gets read. This is the final installment, but the conversation continues.
Reference Materials
https://www.notion.so/Self-Hosted-AI-Operations-Package-33ca188be27c80099356cdd05cc4d8d3
— Kusunoki
International Tax Specialist & Systems Builder
Sapporo, Japan | @kusunoki
"Fast. Light. Visible."