DEV Community

PersonymAi
Building a Fairer Anti-Spam System: How We Handle Links, Warnings, and New Chats

We just shipped three changes to our Telegram anti-spam bot (ModerAI) that fundamentally change how we handle edge cases. Here's what we built and why.

The Problem With Binary Decisions
Most anti-spam bots make binary decisions: spam or not spam. Ban or allow.

This creates two failure modes:

False positives — legitimate users banned for having a link in their bio
False negatives — spammers who learn the rules and work around them
We needed a middle ground.

Change 1: Contextual Bio Link Analysis

Before:

```python
if "t.me/" in user.bio:
    ban(user)  # crude but effective... and unfair
```

After:

```python
link_target = analyze_link_context(user.bio)
if link_target.category in ["spam_channel", "scam", "adult"]:
    ban(user)
elif link_target.category in ["game_referral", "personal_channel", "community"]:
    allow(user)  # legitimate use case

AI analyzes what the link actually points to. A Hamster Kombat referral? Fine. A channel selling "guaranteed 500% returns"? Ban.

Change 2: Progressive Warning System
Instead of ban-on-first-offense, we implemented a 3-strike system:

Strike 1: delete message + warn ("you have 2 attempts left")
Strike 2: delete message + warn ("you have 1 attempt left")
Strike 3: ban

Exception: edited message → instant ban (no strikes)

The edit detection is key. Spammers who post "Hello everyone!" then edit to a scam link 5 minutes later get zero warnings. This pattern is always intentional.
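The strike logic above can be sketched as a small handler. This is an assumed implementation, not ModerAI's actual code: the in-memory `strikes` dict stands in for whatever persistent store the bot uses, and the function names are illustrative.

```python
from collections import defaultdict

MAX_STRIKES = 3

# Hypothetical in-memory store; a real bot would persist strikes per chat.
strikes: defaultdict[tuple[int, int], int] = defaultdict(int)

def handle_violation(chat_id: int, user_id: int, was_edited: bool) -> str:
    """Return the moderation action for a spam message, per the 3-strike rule."""
    if was_edited:
        # Post-then-edit spam is treated as always intentional: no warnings.
        return "ban"
    strikes[(chat_id, user_id)] += 1
    count = strikes[(chat_id, user_id)]
    if count >= MAX_STRIKES:
        return "ban"
    remaining = MAX_STRIKES - count
    return f"delete + warn (you have {remaining} attempt(s) left)"
```

Note how the edit check short-circuits before the counter is even touched, which matches the "zero warnings" behavior described above.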

Change 3: Fresh Chat Grace Period
When ModerAI connects to a new chat, it has zero context. Every user is "unknown."

Aggressive bio scoring on day 1 would ban half the existing members. So we added a 48-hour grace period:

```python
chat_age = now() - chat.connected_at

if chat_age < timedelta(hours=48):
    # Relaxed mode: skip suspicious-bio scoring,
    # but still ban critical threats (adult, drugs, obvious scam)
    if threat_level == "critical":
        ban(user)
    else:
        allow(user)  # gather data first
else:
    # Normal mode: full scoring pipeline
    run_full_analysis(user)
```

After 48 hours, the bot has enough context to make accurate decisions.

Results
These changes reduced our false-positive rate from ~0.3% to ~0.1% while maintaining a 99.7% spam detection rate.

The key insight: fairness and accuracy aren't opposites. A system that gives legitimate users the benefit of the doubt can still be ruthless with actual spammers — you just need smarter decision-making, not stricter rules.

ModerAI: $9/month per chat. 7-day free trial.

→ personym-ai.com/moderator-ai

Questions about the implementation? Happy to discuss in the comments.
