I built a Telegram bot that automatically detects and handles duplicate messages (spam), promotional messages, and blacklisted keywords; logs every action; tracks infractions in Redis; and provides a clean admin surface (planned) for whitelist and keyword management. It runs on Node + TypeScript, uses Telegraf for Telegram, PostgreSQL for persistent data, and Redis for fast infra counters.
This article is a detailed, step by step breakdown of the problem, architecture, implementation, testing, monitoring, and deployment. It’s written as a technical case study so you and other engineers can reproduce, audit, or extend the system.
Table of contents
- Why I built this
- Goals & constraints
- Tech stack
- High-level architecture (diagram)
- Data model (schema.sql)
- Core modules — explanation + code snippets
- Message flow
- ensureChat / ensureUser (DB helpers)
- Duplicate detector (Redis + normalization)
- Promo detector
- Keyword blacklist
- Audit logging
- Infractions and auto-ban
- How I tested each feature (step-by-step)
- Monitoring & alerts (production-lite approach)
- Admin frontend (concept + API contract)
- Deployment notes (free options)
- Security & hardening notes
- Future improvements & roadmap
- Appendix: useful commands and snippets
1. Why I built this
Telegram groups scale quickly and inevitably attract spam, unwelcome promotions, or repeated posts that lower signal to noise ratio. The goal was a resilient, minimal, low cost system that:
- Protects the group from repeated posts (spam) and ads,
- Logs everything for auditability,
- Lets maintainers whitelist trusted users,
- Remains free-hosting friendly and easy to operate.
2. Goals & constraints
- Functional: detect duplicates, detect promos, block blacklisted keywords, warn/delete/ban, persist logs.
- Operational: work on free or cheap infrastructure, be robust, and recoverable.
- Design: simple DB schema, Redis for ephemeral counters, code in TypeScript, clean structure for future admin UI.
3. Tech stack
- Language: Node.js + TypeScript
- Telegram SDK: Telegraf
- DB: PostgreSQL (persistent storage)
- Cache / counters: Redis (fast infra counters + TTL)
- Dev tooling: ts-node-dev for development, tsc for build
- Optional admin frontend: React + Vite + Tailwind (planned)
- Hosting: Docker locally for dev; free cloud choices: Render / Railway / Fly / Vercel for frontend
Notes:
DB stores persistent state (chats, users, policies, whitelist, keywords, audit logs).
Redis stores ephemeral infra counters with TTL (e.g., 24 hours).
The admin API reads/writes DB and talks to Redis to reset infra keys.
5. Data model (schema.sql)
This is the schema used in my project (run it against your Postgres):
-- chats
CREATE TABLE IF NOT EXISTS chats (
chat_id BIGINT PRIMARY KEY,
title TEXT,
created_at TIMESTAMP DEFAULT NOW()
);
-- users
CREATE TABLE IF NOT EXISTS users (
user_id BIGINT PRIMARY KEY,
first_name TEXT,
last_name TEXT,
username TEXT,
created_at TIMESTAMP DEFAULT NOW()
);
-- policies
CREATE TABLE IF NOT EXISTS policies (
chat_id BIGINT PRIMARY KEY REFERENCES chats(chat_id) ON DELETE CASCADE,
duplicate_threshold INT DEFAULT 3,
created_at TIMESTAMP DEFAULT NOW()
);
-- whitelist
CREATE TABLE IF NOT EXISTS whitelist (
chat_id BIGINT REFERENCES chats(chat_id) ON DELETE CASCADE,
user_id BIGINT REFERENCES users(user_id) ON DELETE CASCADE,
PRIMARY KEY (chat_id, user_id)
);
-- audit logs
CREATE TABLE IF NOT EXISTS audit_logs (
id SERIAL PRIMARY KEY,
chat_id BIGINT REFERENCES chats(chat_id) ON DELETE CASCADE,
user_id BIGINT REFERENCES users(user_id) ON DELETE CASCADE,
message_id BIGINT,
rule_triggered TEXT,
action_taken TEXT,
message_text TEXT,
meta JSONB,
created_at TIMESTAMP DEFAULT NOW()
);
-- keyword blacklist
CREATE TABLE IF NOT EXISTS keyword_blacklist (
chat_id BIGINT REFERENCES chats(chat_id) ON DELETE CASCADE,
keyword TEXT,
PRIMARY KEY(chat_id, keyword)
);
6. Core modules — explanation + code snippets
I’ll walk through the main parts of the code and explain why they were designed that way. (These are extracts adapted to be readable the full project contains types and helpers.)
6.1 Message flow (handleMessage)
Every chat message passes through this handler:
- Persist chat/user
- Skip if whitelisted
- Duplicate detection
- Promo detection
- Keyword blacklist
- Execute action (delete/warn/ban)
- Log action to audit_logs
Key points: always use parameterized SQL queries, protect against missing fields (ctx.chat might be undefined in some update types), and catch errors.
Example (simplified handler):
export async function handleMessage(ctx: Context) {
const chatId = ctx.chat?.id;
const msg = ctx.message;
if (!chatId || !msg || !("text" in msg) || !msg.text) return;
const text = msg.text;
const user = msg.from;
if (!user) return;
await ensureChat(chatId, msg.chat.title);
await ensureUser(user);
// Whitelist check
const wl = await query('SELECT 1 FROM whitelist WHERE chat_id=$1 AND user_id=$2', [chatId, user.id]);
if (wl.rowCount > 0) return;
// Duplicate detection + infra counter
// Promo detection
// Keyword blacklist
// Logging and actions...
}
6.2 ensureChat / ensureUser (db helpers)
These upsert basic metadata so other tables can reference them.
export async function ensureChat(chatId: number, title?: string) {
await query(
`INSERT INTO chats (chat_id, title) VALUES ($1, $2)
ON CONFLICT (chat_id) DO UPDATE SET title = EXCLUDED.title`,
[chatId, title || null]
);
}
export async function ensureUser(user: any) {
if (!user) return;
await query(
`INSERT INTO users (user_id, username, first_name, last_name)
VALUES ($1, $2, $3, $4)
ON CONFLICT (user_id) DO UPDATE SET
username = EXCLUDED.username, first_name = EXCLUDED.first_name, last_name = EXCLUDED.last_name`,
[user.id, user.username || null, user.first_name || null, user.last_name || null]
);
}
6.3 Duplicate detector (Redis + normalizer)
Design:
- Normalize text (lowercase, trim, collapse whitespace/punctuation optionally)
- Use Redis lists or hashes per user+chat or a key pattern like:
- dup:{chatId}:{userId}:{normalizedText} with TTL
- Simpler: push to a Redis list keyed by dup:{chatId}:{userId}, but we used a hash of message text to counts.
Implementation (conceptual):
- pushAndCheckDuplicate(chatId, userId, messageId, text) → returns { duplicateCount }
- If duplicateCount >= DUPLICATE_THRESHOLD → delete+warn and increment infra counter
- Redis usage example:
// pseudo
const key = `dup:${chatId}:${userId}:${hash(normalizedText)}`;
const count = await redis.incr(key);
await redis.expire(key, DUPLICATE_WINDOW); // e.g., 300s
return { duplicateCount: count };
Infra counter:
Key: infra:${chatId}:${userId} → incr on each duplicated incident, TTL 24h
When infra >= 3 → bot attempts ctx.banChatMember(user.id) and logs
6.4 Promo detector
A lightweight regex-based detector is fine to start. Typical checks:
Telegram invites t.me/ or telegram.me/
URLs with suspicious keywords (channel, group, join)
Phone numbers or payment links — policy-dependent
Example:
export function detectPromo(text: string): boolean {
// very simple
return /t\.me\/\w+/.test(text) || /joinchat|telegram\.me/.test(text);
}
You can make this more advanced later (ML / heuristics).
6.5 Keyword blacklist
Per chat table keyword_blacklist. On message:
- Load keywords for chat
- Normalize message text
- Check includes(keyword) on normalized text
- Delete + warn + log
Reason to store in DB: admins can edit keywords and changes are persistent.
6.6 Audit logging
Every action writes a row to audit_logs. Keep meta as JSON for flexible data (duplicate counts, infra counts, matched keyword). Example call:
await logAudit(chatId, user.id, msg.message_id, "duplicate", "deleted+warn", text, { duplicateCount });
7. How I tested each feature (step-by-step)
Testing is crucial. Here’s the exact process I used while developing locally (Docker):
Testing is crucial. Here’s the exact process I used while developing locally (Docker):
Setup (dev)
- Run Postgres + Redis in Docker Compose.
- Start bot in dev (npm run dev).
- Invite bot to a test group and send messages.
Tests
Duplicate detection
- Ensure not whitelisted.
- Send Hi three times.
- Expect:
- On 3rd message, bot deletes the last message and warns the user.
- Redis key infra:: increments after repeated offenses.
- Check audit_logs for rule_triggered = 'duplicate'.
Promo detection
- Send Check this channel: t.me/something.
- Expect deletion, a warning, and audit_logs row rule_triggered = 'promo'. ** Keyword blacklist**
1. Insert keyword into DB:
INSERT INTO keyword_blacklist (chat_id, keyword)
VALUES (<CHAT_ID>, 'spoiler')
ON CONFLICT DO NOTHING;
2. Send a message containing spoiler.
3. Expect deletion + warning + audit log.
Whitelist
- Insert whitelist entry:
INSERT INTO whitelist (chat_id, user_id)
VALUES (<CHAT_ID>, <USER_ID>)
ON CONFLICT DO NOTHING;
- Send spam/promo/keyword messages — expect bot to ignore.
Audit logging
- Trigger each rule
- Query audit_logs to verify message text, rule trigger, action, and meta.
Infra / Ban
To test real banning, use a non-admin test account (bot cannot ban owners/admins). Repeat infra-causing actions to trigger ban and confirm user is removed from group.
8. Monitoring & alerts (production-lite)
I wanted something free, reliable, and simple to operate.
8.1 Logging
- Use winston or pino to write structured logs to files + console.
- Keep logs in JSON for easy parsing.
8.2 Admin alerts
On critical errors (unhandled exceptions, DB connection errors), notify a private Telegram admin chat:
// example on error
await bot.telegram.sendMessage(ADMIN_CHAT_ID, `Bot error: ${err.message}`);
8.3 Health checks
- Expose /health endpoint on the admin API that checks:
- DB connectivity
- Redis connectivity
- Bot can call getMe or similar
- Use a free uptime monitor (UptimeRobot) to ping /health every 5 minutes and notify you by email if it fails.
8.4 Metrics (basic)
- Periodically compute counts (audit logs per day) and send a digest to admin chat (once per day).
- Optional: lightweight Prometheus client + Grafana later (overkill for now).
- Admin frontend (concept + API contract)
I planned a React admin frontend with the following responsibilities:
- CRUD for whitelist and blacklist
- View & filter audit logs
- Edit policies (duplicate threshold, promo action)
- Reset infra counters (calls Redis)
- Dashboard metrics & charts
API contract (examples):
`- GET /api/chats?page=1&size=20 → list chats
- GET /api/chats/:chatId → chat detail with policies, whitelist, keywords
- POST /api/chats/:chatId/whitelist → add whitelist
- DELETE /api/chats/:chatId/whitelist/:userId → remove whitelist
- POST /api/reset-infra → body { chatId, userId } (backend clears Redis key)`
The frontend starts with a mock API so development is frictionless. When ready, swap VITE_API_BASE_URL.
10. Deployment notes (free options)
- Backend (bot + admin API)
- Deploy as Docker containers (Render, Railway, Fly). Railway/Render provide free tiers for small projects — good for prototyping.
- Ensure PG and Redis are persistent volumes or managed services.
Frontend
- Deploy the React app on Vercel or Netlify (both have free tiers and easy integration).
Production considerations
- Use environment variables for BOT_TOKEN, DATABASE_URL, REDIS_URL.
- Use a process manager or container auto-restart to keep bot up.
- Add daily backups for Postgres (use pg_dump or provider snapshots).
11. Security & hardening
- Never commit tokens. Use env variables and secrets manager where possible.
- SQL safety: use parameterized queries — no string concatenation.
- Rate limits: respect Telegram API limits (avoid mass requests).
- Permissions: bot should have only the permissions it needs in the group (delete messages, ban members only if necessary).
- Admin API: protect with a token. Later add OAuth / Telegram login if needed.
- Error reporting: do not leak sensitive info in alerts.
12. Future improvements & roadmap
Short term:
- Add admin UI (React) and wire to real admin API
- Add more flexible penalty steps (mute → temp ban → perm ban)
- Better normalization for duplicates (strip emojis, repeated punctuation)
Mid-term:
- ML based spam classifier to reduce false positives
- Bulk audit exports, per-chat analytics
- Multi-bot management (SaaS style)
Long term:
- SaaS: multi-tenant admin console, billing, onboarding flows
- Appendix — useful commands & snippets
version: "3.8"
services:
postgres:
image: postgres:15
environment:
POSTGRES_USER: postgres
POSTGRES_PASSWORD: postgres
POSTGRES_DB: telegram_bot
ports:
- "5432:5432"
volumes:
- pgdata:/var/lib/postgresql/data
redis:
image: redis:7
ports:
- "6379:6379"
volumes:
pgdata:
Apply DB schema (copy schema.sql into container, then run)
docker cp schema.sql telegram-bot-postgres-1:/schema.sql
docker exec -i telegram-bot-postgres-1 psql -U postgres -d telegram_bot -f /schema.sql
Reset infra in Redis
docker exec -it telegram-bot-redis-1 redis-cli
DEL infra:<chatId>:<userId>
GET infra:<chatId>:<userId>
Run bot in dev
npm install
npm run dev # uses ts-node-dev
Closing notes
This project shows how much you can achieve with a small, well-designed stack: Telegraf + Postgres + Redis + a bit of Node glue. The important design principles I applied were:
- Keep persistent vs ephemeral data separate (Postgres vs Redis)
- Keep logic stateless where possible and log all decisions for auditability
- Keep admin controls outside the bot (admin API + frontend) so operators can change rules without redeploying the bot



Top comments (0)