Four Security Defaults I Baked Into a ₩39K Telegram Bot Kit. Why They Matter More Now After the VSCode Extension Breach

#ai #security #opensource #agents

TL;DR. A malicious VSCode extension breached 3,800 GitHub repositories this week. Hacker News surfaced it at 601 points. The pattern is familiar: a developer tool with broad system access goes rogue and the blast radius is huge. I ship a small Telegram AI bot kit for ₩39,000 on Kmong. It had four security defaults baked in before any of this happened. None of them are clever. None of them are research-grade. They are the four things a hobby AI bot has no excuse to skip. This post walks through each one, what it actually blocks, and what one-person builders should hold the line on.

The breach pattern, in one paragraph

A malicious VSCode extension was published on the marketplace. Developers installed it. The extension had filesystem access (because VSCode extensions can read and write files freely by design) and outbound network access. It exfiltrated source code from 3,800 repositories. The attack worked because the extension surface trusts the extension. There is no per-extension filesystem sandbox, no per-extension network policy, and the user has no way to enforce one without giving up the extension entirely.

Why this is relevant to AI bots: a Telegram AI bot like the one in my kit is structurally similar. It runs on the user's machine. It has filesystem access by default. It has network access by default. It accepts instructions from a chat interface, and by default anyone who finds the bot's @username can send it a message. If you do not bake in defaults, an AI bot is a malicious VSCode extension waiting to happen. Except now the attack vector is "anyone who messages your bot" instead of "a marketplace extension."

The four defaults below are what I bake in before shipping the kit. They are all in bot.py, a single file you can read in 5 minutes.

Default 1. Path traversal block

The bot has a file read/write tool that the AI model can call. Without a guard, the model can be talked into reading ~/.ssh/id_rsa or ~/AppData/Roaming/.../Telegram/secrets.db because the model has no domain knowledge that those paths are dangerous.

The guard:

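from pathlib import Path

# Every file operation is confined to this directory.
WORKSPACE = (Path.home() / "Desktop" / "agent-workspace").resolve()

def _safe_path(user_path: str) -> Path:
    """
    Resolve user_path against WORKSPACE. Reject if it escapes WORKSPACE.
    Covers '..', absolute paths, and symlinks that point elsewhere.
    """
    # resolve() collapses '..' and follows symlinks, so the check below
    # runs against the real target, not the string the model supplied.
    candidate = (WORKSPACE / user_path).resolve()
    try:
        candidate.relative_to(WORKSPACE)
    except ValueError:
        raise PermissionError(
            f"Path '{user_path}' escapes workspace. Refused."
        )
    return candidate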

Three things this blocks:

.. traversal: "../../.ssh/id_rsa" resolves outside WORKSPACE, the relative_to check fails, the call is refused before any read.
Absolute paths: "/etc/passwd" resolves to /etc/passwd which is not under WORKSPACE, refused.
Symlink escape: .resolve() follows symlinks before the check, so a symlink that points outside WORKSPACE gets caught.

The trade-off is small. The model can only read and write inside ~/Desktop/agent-workspace/. Students get a clean sandbox. The file tool cannot reach ~/.ssh/. The same guard runs on every file operation, no exceptions.
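To see the guard refuse each case, here is a minimal sketch that rebuilds the same check against a throwaway directory instead of the real workspace. The names make_safe_path and is_refused are hypothetical helpers for this demo, not part of the kit.

```python
import tempfile
from pathlib import Path


def make_safe_path(workspace: Path):
    """Build a guard bound to `workspace` (hypothetical helper for this demo)."""
    root = workspace.resolve()

    def safe_path(user_path: str) -> Path:
        # Same logic as the kit's guard: resolve first, then check containment.
        candidate = (root / user_path).resolve()
        try:
            candidate.relative_to(root)
        except ValueError:
            raise PermissionError(f"Path '{user_path}' escapes workspace. Refused.")
        return candidate

    return safe_path


def is_refused(guard, user_path: str) -> bool:
    """Return True when the guard rejects user_path."""
    try:
        guard(user_path)
    except PermissionError:
        return True
    return False


# Throwaway directory standing in for ~/Desktop/agent-workspace.
demo_root = Path(tempfile.mkdtemp())
guard = make_safe_path(demo_root)

results = {
    "notes/todo.txt": is_refused(guard, "notes/todo.txt"),
    "../../.ssh/id_rsa": is_refused(guard, "../../.ssh/id_rsa"),
    "/etc/passwd": is_refused(guard, "/etc/passwd"),
}
for path, refused in results.items():
    print(f"{path!r}: {'refused' if refused else 'allowed'}")
```

The relative path inside the workspace is allowed, and the other two are refused before any file is opened.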

Default 2. User ID allowlist

Anyone who finds a Telegram bot's @username can message it; no token is needed. Without an allowlist, the bot will happily respond to strangers, burn the user's API quota, and potentially execute tool calls on their behalf.

The guard:

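import os

from telegram import Update  # python-telegram-bot

# Comma-separated numeric Telegram user IDs, e.g. "111,222".
ALLOWED_USER_IDS = {
    int(uid) for uid in os.environ.get("ALLOWED_USER_IDS", "").split(",") if uid.strip()
}

async def on_message(update: Update, context):
    user = update.effective_user
    # effective_user is None for some update types (e.g. channel posts).
    if user is None or user.id not in ALLOWED_USER_IDS:
        # Silent drop. Do not even acknowledge the bot exists.
        return
    # ... handle message ...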

Three things this blocks:

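Token leak panic: if the token leaks, anything an attacker sends to the bot from their own account still gets a silent drop. No quota burn, no tool calls, no data leak through the chat. The allowlist does not stop the attacker from calling the Bot API with the token, so a leaked token still has to be revoked.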
Username scraping: even if the bot's @username is public, strangers messaging it get nothing.
Cost runaway: the user's Gemini API quota stays scoped to their own usage. No surprise bill from a stranger spamming the bot.

The silent drop matters. If the bot replied with "you are not authorized," it would confirm the bot exists and that the path to bypass is "add yourself to the allowlist." Silent drop gives the attacker zero signal.
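The parsing and the membership check can be pulled out into two small functions that are testable without Telegram. This is a sketch under my own naming (parse_allowed_ids and is_allowed are hypothetical, not functions from the kit); the point it illustrates is failing closed, so an empty or missing allowlist admits nobody.

```python
from typing import FrozenSet, Optional


def parse_allowed_ids(raw: str) -> FrozenSet[int]:
    """Parse a comma-separated list of Telegram user IDs, ignoring blanks."""
    return frozenset(int(part) for part in raw.split(",") if part.strip())


def is_allowed(user_id: Optional[int], allowed: FrozenSet[int]) -> bool:
    """Fail closed: an unknown sender or an empty allowlist means no access."""
    return user_id is not None and user_id in allowed


# Stray spaces and trailing commas are tolerated.
allowed = parse_allowed_ids("111, 222,,")

print(is_allowed(111, allowed))                 # listed user
print(is_allowed(333, allowed))                 # stranger
print(is_allowed(None, allowed))                # update with no sender
print(is_allowed(111, parse_allowed_ids("")))   # env var not set
```

Only the first call returns True. A non-numeric entry raises ValueError at startup, which is arguably what you want: a typo in the allowlist should stop the bot, not silently widen or narrow access.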

Default 3. Bounded retry

If the bot crashes on startup because of a misconfiguration (wrong API key, bad token, missing dependency), the default Windows auto-start script will try to restart it. Without a bound, it loops forever. Every loop hits the Telegram API, the Gemini API, the log file. CPU spikes. Notifications flood the user. The user wakes up to thousands of error messages.

The guard:

:: start_bot.bat: bounded retry loop
@echo off
setlocal enabledelayedexpansion
set MAX_RESTART=5
set restart_count=0

:restart_loop
if !restart_count! geq %MAX_RESTART% (
    echo Bot crashed %MAX_RESTART% times in a row. Stopping.
    exit /b 1
)

python bot.py
set last_exit=%errorlevel%

if !last_exit! equ 0 (
    echo Bot exited cleanly. Stopping.
    exit /b 0
)

set /a restart_count+=1
echo Restart %restart_count% of %MAX_RESTART%...
timeout /t 10 /nobreak
goto restart_loop

Five launch attempts are enough to ride out transient network errors. Five crashes in a row point to a configuration problem the human needs to look at, not something to mask. The script stops, leaves a clear message, and waits for the user.

Three things this blocks:

CPU spike: an infinite loop crashing instantly burns one CPU core at 100%.
API quota burn: every restart calls the LLM, eats tokens, costs money.
Notification flood: every Telegram API call from a crashing bot can trigger reconnect logs the user reads in the morning.

The bound is the cheap part. The part that takes thought is the failure mode it implies: "if you set this up wrong, the bot will not try to heal forever, it will stop and tell you." That is the right default for a hobby bot. A production-grade service might want different behavior (auto-rollback, alerting, etc.), but a hobby bot stopping is correct.
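For anyone not on Windows, the same bounded loop can be sketched in Python. This is not shipped in the kit; run_with_bounded_restarts and its parameters are hypothetical names, and the injectable sleep argument exists only so the loop can be exercised without waiting.

```python
import subprocess
import sys
import time
from typing import Callable, Sequence


def run_with_bounded_restarts(
    command: Sequence[str],
    max_restart: int = 5,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run command until it exits cleanly or has crashed max_restart times in a row.

    Returns 0 on a clean exit, 1 once the crash limit is reached.
    """
    for attempt in range(1, max_restart + 1):
        exit_code = subprocess.run(command).returncode
        if exit_code == 0:
            print("Bot exited cleanly. Stopping.")
            return 0
        print(f"Crash {attempt} of {max_restart} (exit code {exit_code}).")
        if attempt < max_restart:
            sleep(delay_seconds)  # back off before the next launch
    print(f"Bot crashed {max_restart} times in a row. Stopping.")
    return 1


# Demo with a child process that always fails, and no delay between attempts.
crashing = [sys.executable, "-c", "import sys; sys.exit(1)"]
result = run_with_bounded_restarts(crashing, max_restart=3, delay_seconds=0)
print("supervisor exit code:", result)
```

To supervise the actual bot, the command would be something like [sys.executable, "bot.py"] with the default limit and delay.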

Default 4. Secret env isolation

The API key, the bot token, and the user ID list are all secrets. They never appear in the kit's source code. They reach the bot as environment variables, sourced from a .env.local file that is in .gitignore from day one.

The structure:

.gitignore  →  contains .env.local, *.key, secrets/
.env.local  →  contains GEMINI_API_KEY=... TELEGRAM_BOT_TOKEN=... ALLOWED_USER_IDS=...
bot.py      →  reads from os.environ only

When a student forks the kit on GitHub or pushes it back to their own repo, the secrets do not travel. When a student shares a screenshot of their config or asks for help on a forum, the secrets are not in the source. When a student accidentally pushes to a public repo, the .env.local is ignored.

The default that matters most here is the timing: .gitignore exists in the kit from the first commit. There is no window during which someone forks the kit before the gitignore is added. By the time the first user clones it, the protection is already there. This is the same idea as git-secrets, but applied at the kit-distribution level: by default the secrets never existed in source, so no archaeology can recover them from history.
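How the values get from .env.local into the environment is a detail this post does not cover, so the following is only one possible sketch: a small KEY=VALUE parser plus a fail-fast check for missing secrets. The function names and the file format details (one pair per line, '#' comments, optional quotes) are my assumptions for illustration.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Mapping

REQUIRED_KEYS = ("GEMINI_API_KEY", "TELEGRAM_BOT_TOKEN", "ALLOWED_USER_IDS")


def parse_env_file(path: Path) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, '#' comments, and malformed lines."""
    values: Dict[str, str] = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_secrets(env: Mapping[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Demo file with placeholder values, deliberately missing ALLOWED_USER_IDS.
demo_dir = Path(tempfile.mkdtemp())
env_file = demo_dir / ".env.local"
env_file.write_text(
    "# placeholder values, not real secrets\n"
    "GEMINI_API_KEY=example-key\n"
    'TELEGRAM_BOT_TOKEN="example-token"\n',
    encoding="utf-8",
)

loaded = parse_env_file(env_file)
print("missing:", missing_secrets(loaded))
```

In a real startup path the parsed values would be merged into os.environ before anything else runs, and a non-empty result from missing_secrets would stop the bot with a clear message instead of letting it start half-configured.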

What these four defaults are not

They are not defense in depth against a determined attacker. They are not a substitute for an audit. They are not equivalent to a sandboxed VM or a hardened container or a proper capability system. I am not claiming any of that, because none of it would survive a real adversarial review.

What they are: the four cheapest things to do correctly that a hobby AI bot has no excuse to skip. Each one is under 20 lines of code. Each one closes off a class of failure that has been observed in the wild this week alone.

The VSCode extension breach happened because a developer tool had no defaults in the dimensions that mattered. The extension marketplace trusts the publisher. The runtime trusts the extension. The user has no enforcement point. When the publisher goes malicious, there is no layer to catch it.

A hobby AI bot is in the same position. The bot has filesystem access. The bot has network access. The bot accepts instructions from a chat. If the four defaults above are not in the kit by default, the user is one prompt-injection or token-leak away from the same failure mode at smaller scale.

Why the timing matters for one-person builders

Right now, two things are happening at once:

Trust in developer tooling is freshly broken. People who installed a VSCode extension this week are paying attention to defaults in a way they were not last week.
AI agents are spreading. Every solo builder is shipping something that has filesystem and network access, often by Tuesday afternoon, often without thinking about it.

These two trends collide. If you ship an AI agent tool now without spelling out what defaults it has and what they block, you are betting that nothing will go wrong. That bet was already losing. After this week it loses faster.

The cheap move is to spell out the defaults. Four sections in a README. Four blocks of code anyone can read. The kit I ship has these four blocks. I am writing this post so other one-person builders shipping similar tools can copy the structure without me having to be the only person doing it.

What I ship

The full bot kit is on GitHub, MIT licensed: github.com/wildeconforce/agent-starter-kit. The four defaults are in bot.py and start_bot.bat, exactly as shown above, with no obfuscation. The Korean-language packaged version with a tutorial walkthrough ships on Kmong at ₩39,000: kmong.com/gig/688290.

The free repo is the substantive part. The Kmong version is the curated version for users who want the setup walkthrough in Korean without piecing it together themselves. Both contain the same four defaults.

If you ship a similar tool, I would prefer you steal these four blocks and put them in your own kit. The point is not market share. The point is that bots running on people's machines should refuse paths they should not read, refuse messages from people they do not know, refuse to retry crashes forever, and refuse to leave secrets in source. If the community gets that to default-on across the small-tools layer, the next breach will not look like this one.