DEV Community

Ben Halpern for The DEV Team


Spam sucks

If you've been around DEV for the last few days, we apologize for the feed having too much spam.

The spam fight is an ongoing battle for any platform like ours, and it's something we always need to improve on, especially as we propagate our open source code into the universe via Forem more and more.

We have had several spam mitigation efforts in place, but they're only as good as the patterns we're preventing, and this bout with spammers is providing us with opportunities to really close the loop on some of the big outstanding issues.

The biggest problem with our current spam tactics is observability. We have functionality in place, but it's a little buried, and it's hard for us to get a sense of what is going on. So our improvements need to be steeped in allowing us to identify when spammy tactics are occurring so we can adjust our code.

Fighting spam is a multifaceted issue that touches rate limiting, user experience for early users, balancing concerns over false negatives vs false positives, etc.

Patterns

One quality of spam is that its patterns are usually easy to spot. That is because spam needs scale to be effective, and it has a particular outcome in mind. Totally chaotic spam does not have the same incentives as precise spam, though chaos is also worth fighting.

We already fight off certain patterns, but we need to do more of that, and we need the actions to be as observable as possible. Currently we just don't raise the issue enough to the people involved and it's hard to be aggressive when you're treating things as too much of a black box.

Add ability for admin to add anti-spam terms #10615

What type of PR is this? (check all applicable)

  • [ ] Refactor
  • [x] Feature
  • [ ] Bug Fix
  • [ ] Optimization
  • [ ] Documentation Update

Description

This adds the ability for admins to modify a list of terms which may indicate spam. But unlike past "quiet" spam indicators, this automatically creates a vomit reaction which an admin can later manually reverse or at least be aware of. In the past we have modified spam-related scores, but that hasn't really worked its way into our workflows effectively. I think this reversible action should be how we raise spam automatically in general.

With comments I decided to limit it to newer accounts because we're examining the whole comment and not just the title. But this can be modified over time. If a support admin is seeing false positives with a term they should consider removing those. We can alter the logic over time to ensure as few false-positive scenarios as possible.

This is the start of a pattern-based spam prevention approach that raises the issue to human mods. This will get more sophisticated over time. It will pair with more adjustments to rate limiting and onboarding.
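To make the idea concrete, here is a minimal sketch in Python of how an admin-editable term list with a reversible flag could behave. It is not the code from the PR (Forem is a Rails app); `SpamTermChecker`, `Reaction`, and the seven-day `NEW_ACCOUNT_WINDOW` are hypothetical names and values chosen for illustration. It assumes articles are matched on the title only, while comments are matched on the full body but only for newer accounts.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Set

# Hypothetical threshold: the PR only says comments are checked for "newer accounts".
NEW_ACCOUNT_WINDOW = timedelta(days=7)


@dataclass
class Reaction:
    """A reversible, mod-visible flag standing in for the 'vomit' reaction."""
    target_id: int
    category: str = "vomit"
    automated: bool = True
    reversed_by_admin: bool = False


@dataclass
class SpamTermChecker:
    # Admin-editable terms, stored lowercase for case-insensitive matching.
    terms: Set[str] = field(default_factory=set)
    reactions: List[Reaction] = field(default_factory=list)

    def add_term(self, term: str) -> None:
        cleaned = term.strip().lower()
        if cleaned:  # an empty term would match every text
            self.terms.add(cleaned)

    def remove_term(self, term: str) -> None:
        # An admin seeing false positives for a term can drop it.
        self.terms.discard(term.strip().lower())

    def _matches(self, text: str) -> bool:
        lowered = text.lower()
        return any(term in lowered for term in self.terms)

    def _flag(self, target_id: int) -> Reaction:
        reaction = Reaction(target_id)
        self.reactions.append(reaction)
        return reaction

    def check_article(self, article_id: int, title: str) -> Optional[Reaction]:
        # Articles: only the title is examined.
        return self._flag(article_id) if self._matches(title) else None

    def check_comment(self, comment_id: int, body: str,
                      account_created_at: datetime,
                      now: Optional[datetime] = None) -> Optional[Reaction]:
        # Comments: the whole body is examined, so limit this to newer accounts.
        now = now or datetime.now(timezone.utc)
        if now - account_created_at > NEW_ACCOUNT_WINDOW:
            return None
        return self._flag(comment_id) if self._matches(body) else None
```

The point of returning a `Reaction` rather than silently adjusting a score is that the result stays visible: a mod can see every automated flag in `reactions` and reverse the ones that turn out to be false positives.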

Further adjustments to the feed are also forthcoming, to ensure that even if there is spam, it affects fewer users directly.

As an open source company we look forward to squashing these issues in the open and sharing all of our learnings going forward.

Happy coding ❤️

Latest comments (36)

marcellothearcane

@ben , can we get feedback on how we (trusted users - thanks for that by the way) are doing? I'm clicking away downvotes and reports on things that look like spam, but I don't know if it's doing a good job or hindering.

Can I see a list of things that I've reported that were successfully deleted as spam? A bit like how Stack Exchange does it with flags: meta.stackexchange.com/questions/1...

leob

I'm seeing it occasionally, it's pretty rare ... not a big problem by any means. When I do come across it, I'm like "wtf", I chuckle a bit, and I move on (like probably almost everyone is doing).

edA‑qa mort‑ora‑y

The biggest problem in blocking spam isn't blocking spam, it's allowing ham through.

Filters can create often undetectable bubbles of information, where legitimate information is suppressed. This happens frequently from big providers, like say GMail, where certain addresses are shunted to spam for no apparent reason.

You need to have a feedback mechanism to report incorrectly marked spam.

Karl Heinz Marbaise

First, I would like to congratulate the whole dev.to team behind that, because I can imagine it's a cat-and-mouse game ... The speed of the team's response is awesome, and apart from that, a big thank you for making this platform.

Sandor Dargo

Thanks for following up on this and taking it seriously. By the way are you interested in my customer care number? 😂

Yogi • Edited

In our project (taskord.com), we flag users if more than 2 users are associated with the same IP. We also count profanity in posts, and if the profanity count exceeds 10, the system automatically flags the user. In the first place, though, we don't allow disposable emails, to prevent fake accounts.

When a user is flagged, all of their entities are hidden from the public and return 404, which improves the UX for other users.

After flagging, all staff are notified, and we take the necessary action, whether to suspend or un-flag the user.

The next plan is to implement some ML models to find spammy posts and users, and I'm working on rate-limiting-based flagging too!

This is our mini-mod panel, where all the action takes place!
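For readers who want to picture the rules described in this comment, here is a rough Python sketch. It is not Taskord's code: `FlagTracker` and its method names are hypothetical, and it assumes the two rules (shared IP, profanity count) are independent triggers, which the comment does not spell out.

```python
from collections import defaultdict

# Thresholds taken from the comment above.
MAX_ACCOUNTS_PER_IP = 2
MAX_PROFANITY_COUNT = 10


class FlagTracker:
    """Hypothetical tracker for the two flagging rules."""

    def __init__(self):
        self.users_by_ip = defaultdict(set)
        self.profanity_counts = defaultdict(int)
        self.flagged = set()

    def record_login(self, user_id, ip):
        # Flag every account on an IP once more than 2 accounts share it.
        self.users_by_ip[ip].add(user_id)
        if len(self.users_by_ip[ip]) > MAX_ACCOUNTS_PER_IP:
            self.flagged.update(self.users_by_ip[ip])

    def record_profanity(self, user_id, count=1):
        # Flag a user once their running profanity count exceeds 10.
        self.profanity_counts[user_id] += count
        if self.profanity_counts[user_id] > MAX_PROFANITY_COUNT:
            self.flagged.add(user_id)

    def unflag(self, user_id):
        # Staff decision after review.
        self.flagged.discard(user_id)

    def is_visible(self, user_id):
        # Content from flagged users is hidden (the comment mentions a 404).
        return user_id not in self.flagged
```

Keeping `unflag` as an explicit staff action mirrors the review step described above: the automation hides content, but a person decides what happens next.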

Sergey Kislyakov

Community mods do not have to solve the captcha. Just saying... Though I don't even know how one could become a community mod. I guess there are some algorithms behind that.

Pavan Belagatti

Yesterday I reported two threads that were spam. Thanks Ben for making Dev community all great again.

Shannon Crabill

I appreciate the action you all are taking regarding SPAM. I'll keep reporting it when I see it. I will say, reporting SPAM accounts, comments, posts, etc. made for a productive alternative to doom scrolling.

I did have a question. If I have to view a post/comment to confirm a post/comment is spam before reporting it, does that view figure into the algorithm that determines which posts should be more visible? Or does reporting/vomiting/marking as abuse cancel out any views, etc?

Jacob Evans

Is AI/Machine Learning something you're all looking into for this problem? Sorry if it was mentioned, I skim-read most of it.

Vitor Paladini

Now they are spamming forem issues. Seems like the fix made them salty, haha

A screenshot of Forem repo issues

Cal

I really don't understand how that could be effective for the spammer?! Who reads that and thinks, oh I must call that number immediately. 🤔

Vitor Paladini

I'd bet that it is probably an SEO thing. Having those words and number in other places might bump their website a bit.

It's kind of like when WordPress websites get hacked and the abuser creates thousands of pages linking to their websites.

Yash Dave

It is. Not only are they trying to falsely improve their SEO, but it is also a kind of phishing attempt. They make Google display the wrong number in its top results for legit brands (like Google Pay, etc.) and end up scamming unaware folks who think that these are legit customer care phone numbers / websites. This kind of fraud has been doing the rounds in India recently.

Pretty toxic stuff. 😔

Ben Halpern

Those fuckers

manish srivastava

If you find 10 integers in topic.... Most probably it's spam.

Shannon Crabill

That's what I was thinking about this current batch of SPAM.

Robert Lippens

Glad to find this post - I just scrolled through and reported a few and stumbled upon this, glad to see it's being addressed. Thanks to the DEV team!

Karan Gandhi

From what I have seen.

The spam posts have 4 buzz words.
They have a phone number with a random letter/s attached.
I think a regex based spam filter can combat the issue effectively.
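As a rough illustration of the regex idea in this comment, here is a small Python sketch. The buzzword list is hypothetical (the comment does not name the four words), and the pattern and the `min_buzzwords` threshold are assumptions, not a tested production filter. The pattern looks for ten or more digits while tolerating a letter or two, a space, or a hyphen wedged between them.

```python
import re

# Hypothetical buzzwords; the comment does not list the actual four.
BUZZWORDS = ("customer", "care", "helpline", "number")

# Ten or more digits, each optionally followed by up to two letters,
# spaces, or hyphens, so "98765x43210" and "98765 432-10" both match.
PHONE_LIKE = re.compile(r"(?:\d[\sA-Za-z-]{0,2}){10,}")


def looks_like_spam(title: str, min_buzzwords: int = 2) -> bool:
    """Return True when a title has a phone-like digit run plus buzzwords."""
    lowered = title.lower()
    buzz_hits = sum(1 for word in BUZZWORDS if word in lowered)
    has_phone = PHONE_LIKE.search(title) is not None
    return has_phone and buzz_hits >= min_buzzwords
```

Requiring both signals keeps ordinary titles such as "10 Python tips for 2020" out of the net, since a couple of short numbers never add up to a ten-digit run.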

anurag-anand

I too reported 2 posts as spam, but both times it was kind of an infinite captcha, the likes of which you get in Tor Browser. I still did it twice because I love dev.to, but the third time I didn't have that much patience.
