PersonymAi

Lessons From Processing Millions of Telegram Messages: What We Learned About Spam

We've spent years building an AI anti-spam system for Telegram. After processing millions of messages across hundreds of communities, here are the patterns we discovered.

Spam Has Evolved

Forget the obvious stuff — links, ALL CAPS, "CLICK HERE FOR FREE MONEY."

Modern Telegram spammers are sophisticated:

The Edit Trick
Post a normal message. Wait an hour. Edit it into a scam link. Most bots never re-check edited messages. We learned this the hard way and built edit monitoring into our core pipeline.
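One cheap defense is to diff the set of links between the original and edited text and re-classify whenever an edit introduces new ones. A minimal sketch (the regex and function name are illustrative, not our production code):

```python
import re

# Matches http(s) URLs and t.me deep links; illustrative, not exhaustive.
URL_RE = re.compile(r"(?:https?://|t\.me/)\S+")

def edit_introduced_links(original: str, edited: str) -> bool:
    """Return True if the edited text contains links absent from the original."""
    before = set(URL_RE.findall(original))
    after = set(URL_RE.findall(edited))
    return bool(after - before)
```

In practice you would hook this to edited-message updates (the Telegram Bot API delivers them in the `edited_message` field of an Update) and re-run the full classifier whenever it fires, not just the link check.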

The Trust Builder
Join a group. Post 5-10 normal messages over a few days. Build credibility. Then drop the spam. Keyword filters can't catch this because the spam message itself might look innocent — it's the pattern that's suspicious.

The Avatar Bait
Create accounts with provocative profile photos. Join groups. Post nothing — the avatar itself is the spam (drives clicks to the profile with links in bio). This requires pre-message analysis that most bots don't do.

The Multi-Account Wave
Hit a group with 20 different accounts in 5 minutes. Even if the admin bans them, the damage is done — members saw the spam. Speed of response matters more than accuracy here.
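Detecting a wave like this comes down to counting first-time posters inside a sliding time window. A sketch of that idea, with hypothetical names and thresholds:

```python
from collections import deque

class WaveDetector:
    """Flag a burst of first-time posters arriving within a short window."""

    def __init__(self, max_new_posters: int = 10, window_seconds: float = 300.0):
        self.max_new = max_new_posters
        self.window = window_seconds
        self.events = deque()   # (timestamp, user_id) of first posts
        self.seen = set()       # users who have posted before

    def record(self, user_id: int, now: float) -> bool:
        """Record a message; return True if the wave threshold is crossed."""
        if user_id in self.seen:
            return False        # known poster, not part of a wave
        self.seen.add(user_id)
        self.events.append((now, user_id))
        # Drop first-posts that have aged out of the window.
        while self.events and now - self.events[0][0] > self.window:
            self.events.popleft()
        return len(self.events) >= self.max_new
```

Once the threshold trips, the right response is batch action on every account in the window, since waiting to judge each account individually is exactly what the attack exploits.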

What Keyword Filtering Gets Wrong

We analyzed false positive rates across traditional moderation bots. The results were painful:

The word "investment" triggers bans in 73% of keyword-based bots. But in crypto and trading groups, it's used in normal conversation hundreds of times per day.

"Free" is flagged by 61% of bots. But "free trial", "free tier", and "free update" are perfectly legitimate.

The fundamental problem: context determines whether a message is spam, not individual words.
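The gap is easy to demonstrate. The toy filter below flags bare keywords, while a second pass suppresses the flag when the word sits inside a known-legitimate phrase. A hardcoded phrase list is a stand-in for real context modeling, which is what the AI actually does; the word lists here are illustrative:

```python
SPAM_WORDS = {"free", "investment"}
# Phrases in which flagged words are routinely legitimate (illustrative list).
SAFE_PHRASES = {"free trial", "free tier", "free update", "investment strategy"}

def keyword_flag(text: str) -> bool:
    """Naive filter: any spam word anywhere triggers a flag."""
    return any(word in SPAM_WORDS for word in text.lower().split())

def context_flag(text: str) -> bool:
    """Same filter, suppressed when the word appears in a known-safe phrase."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in SAFE_PHRASES):
        return False
    return keyword_flag(text)
```

Even this crude two-word context cut eliminates a whole class of false positives the bare keyword filter produces.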

The Global Network Effect

Our biggest insight came from connecting multiple chats into a shared ban network.

When Chat A bans a spammer, Chats B through Z know about it instantly. The spammer can't just move to the next group.

After connecting 100+ chats:

  • New spam accounts were blocked on first appearance in 89% of cases
  • The average time to neutralize a spam wave dropped from 15 minutes to under 30 seconds
  • False positive rate decreased as more data flowed through the network

The network gets smarter with every chat added, and the gains compound: each new chat both contributes intelligence to and benefits from every chat already connected, so the network's value grows superlinearly, not linearly.

Trust Is Better Than Rules

Early versions of our system were too aggressive. We caught spam, but we also annoyed legitimate users.

The breakthrough was shifting from "block suspicious behavior" to "build and track trust."

Every user in our system has a trust score based on:

  • Message count and quality
  • Behavior consistency over time
  • Reputation across the network
  • Account age and profile completeness

High-trust users are never bothered. New users get gradually more freedom. Spammers never build enough trust to bypass the system.
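One way to combine those signals is a weighted sum with saturating inputs, so no single signal can max out the score. This is a sketch; the weights and saturation points are illustrative, not our tuned values:

```python
def trust_score(msg_count: int,
                avg_quality: float,         # 0..1, from the message classifier
                consistency: float,         # 0..1, behavioral stability over time
                network_reputation: float,  # 0..1, standing across connected chats
                account_age_days: int,
                profile_complete: bool) -> float:
    """Combine signals into a 0..1 trust score (weights are illustrative)."""
    volume = min(msg_count / 100, 1.0)       # saturates after ~100 messages
    age = min(account_age_days / 365, 1.0)   # saturates after a year
    score = (0.25 * volume +
             0.25 * avg_quality +
             0.20 * consistency +
             0.15 * network_reputation +
             0.10 * age +
             0.05 * (1.0 if profile_complete else 0.0))
    return round(score, 3)
```

The saturation terms are the important design choice: a spammer can inflate message count quickly, but consistency and network reputation only accrue with time, which is exactly what they cannot afford.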

This reduced false positives to near zero while maintaining 99.7% detection accuracy.

Fingerprinting Beyond Accounts

Banning an account is easy. Banning a person is hard.

Spammers create new accounts constantly. But their behavior patterns are remarkably consistent:

  • Message timing intervals
  • Text structure and formatting habits
  • Target selection patterns
  • Time-of-day activity profiles

Our fingerprint system identifies these patterns even across completely new accounts. It doesn't matter if the username and phone number are different — the behavior signature matches.
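The matching step can be sketched as a similarity comparison between per-account behavior vectors, for example cosine similarity over normalized features (timing intervals, message length, link ratio, activity hours). The threshold and feature layout below are illustrative assumptions:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def likely_same_actor(fp_a: list[float], fp_b: list[float],
                      threshold: float = 0.95) -> bool:
    """Two accounts whose normalized behavior vectors nearly coincide
    are treated as the same actor, regardless of username or phone."""
    return cosine_similarity(fp_a, fp_b) >= threshold
```

Features must be normalized to comparable scales before this works, since cosine similarity on raw heterogeneous units would let one large feature dominate.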

Numbers After Years of Production

  • 99.7% spam detection accuracy
  • ~0% false positive rate
  • Sub-second average decision time
  • Millions of messages analyzed
  • Hundreds of active communities protected

What's Next

Spam evolves constantly. We're working on:

  • Voice message spam detection
  • Image and media content analysis
  • Predictive blocking (identifying potential spammers before they act)
  • Cross-platform intelligence sharing

The arms race never ends, but with AI that understands context and a network that shares intelligence, the defenders finally have the advantage.


ModerAI is part of PersonymAI. If you manage Telegram communities and want to test it: personym-ai.com — 7 days free.

Questions about our architecture or approach? Happy to discuss below.
