Ben Halpern

for The DEV Team

Posted on Aug 15, 2024

The Fight Against Low-Quality Automation

#discuss #meta

I wanted to open up a discussion about how DEV can do a better job combatting low quality automated posts.

This is an area we are putting a lot of attention on, but it is a hard problem.

I can say that we have been making a lot of improvements in terms of how we catch and deal with low quality content, but the tools for creating low-quality content are also getting better.

We have been shipping on internal tools and processes to do a better job with this stuff, and will be shipping some new initiatives in the near future which should be helpful and exciting in this regard.

But we are curious to hear any input on the matter. We are all ears. We love and respect how much this matters to folks. We want to help people connect with people, and to effectively separate signal from noise as we act as a vital resource for your developer career.

Top comments (26)

Ben Halpern The DEV Team • Aug 15 '24

I hope this acts as both a "Heads up, we're working hard on this and have solutions and upcoming annoucements", as well as "Open floor for input of any kind."

This is an issue across all social media, but the solutions we come up with here are as valuable as any because we are developers effecting this ourselves, and our platform is open source and who knows what the future impact of everything is.

Ingo Steinke, web developer • Oct 1 '24 • Edited

Automated or not, DEV will always attract spammers when even first posts by new users may contain outbound backlinks without a rel=nofollow attribute. This is a nice SEO-side-effect of posting on DEV, but it should be available only for users with a high reputation, or for those willing to pay for it (DEV++ ?).

But basing reputation only on quantitative metrics like number of likes might make the situation worse, as automation could help spammers build networks of user accounts upvoting each other's posts.

Still, more restrictions are probably necessary.

Neurabot • Aug 21 '24

I find topic a bit annoying. I really love contents posted here. Most of time, they are high-quality.

Jess Lee The DEV Team • Aug 15 '24

I want to add that elevating high quality content (or at least ensuring they doesn't fall into the sea of low quality automated posts) is also a big part of the discussions we've had internally, as a complement to all of this.

Best Codes • Aug 15 '24 • Edited

but the tools for creating low-quality content are also getting better.

Are the tools producing better content or getting better at producing junk? 😂

I think it's important to maintain good content on a community like this, but it's good to be careful how you maintain it. Always keep in mind that the root of the issue is the bad content itself, not the tools used for generating it. (Thinking about AI here. AI can be used for great good. Don't ban AI itself, just ban poor content.)

Whatever you guys do, I'm sure it will be great. My journey with DEV has been so far. Keep up the good work!

Eckehard • Aug 15 '24

Always keep in mind that the root of the issue is the bad content itself, not the tools used for generating it.

AI-generated content often looks great, this makes it much harder to distinguish bad from good content. This is even worse if people use AI, who don't even see the nonsense they got. AI-generated content is not necessarily "bad", it is just not reliable.

Over time we will get a second problem with this content: If AI uses this content as a source, we will get an "Echo-Chamber" where the nonsense is used to create even more nonsense. Maybe we should call the model behind this the LNM: The Large Nonsense Model!

Best Codes • Aug 16 '24

As you can see in a recent comment I made here:

Best Codes • Aug 15

Who wants to read something written by a bot? People want to read what you wrote, not what a bot did.

Actually, people usually want to read good content. Writing should be enjoyed for it's quality, not for the attributes of it's authors.

Also, these seemingly helpful AI tools tend to propagate homogeneity, bias, and academic dishonesty.

Humans are just as capable of doing the same (in fact, more capable). AI can even be used to combat this.

The AI tools aren't "seemingly" helpful. They are helpful, but like some other helpful things, there are ethical considerations.

You can view AI like a tool - for instance, a hammer. A hammer has great potential for good; we can use it to make building projects much easier. A hammer also has great potential for bad. We could use the hammer as a harmful weapon, hurting people or destroying property with it.

We shouldn't regulate the tool, but we should regulate the use of the tool. We have laws against violence but no laws against hammers.

In the case of AI, it think that it is good to respect people's preferences about what they choose to read - it would be wise to tag AI written content as AI written, and, if possible, disclose the AI model that wrote it as well.

Side note: There are actually more reasons than people's preferences to tag AI content as AI generated. The tag is also very useful for companies training new AI models. Training AI models on their own output is very unproductive. So when it is easy to avoid AI content in training data, it is easier to train new AI models as well.

The use of artificial intelligence, specifically generative AI, promotes homogeneity, bias, and even academic dishonesty, and thus it should not be used in an educational or professional environment.

That statement is a form of hasty generalization. It makes a broad claim that generative AI will produce negative outcomes in educational or professional environments. This generalization is formed without consideration for the various contexts in which AI could be used positively or the diversity of AI applications.

Well, I could go on. 🤪 But I'm done for now...

P.S. Your writing style is great. I love a lot of your articles, so I followed you. 🙂

I am not at all against labeling AI content as AI!

Humans are capable of making content that looks great as well. AI certainly makes that easier, which is a reason why I'm not against tagging AI content.

I don't mean to paint AI with a broad brush. It can be used for great good and great bad. We should regulate the bad content, not the tool itself. We can, ironically, use AI to detect AI. (Site like GPTZero and similar are pretty good at it, although there are considerations like false positives, etc.)

Anyway, thanks for your input. It's always nice to learn other people's opinions.

Martin Baun • Aug 16 '24

At the end of the day, AI content doesn't feel as fulfilling as human-written content, admittedly. There's something satisfactory about knowing what you're consuming comes from active research and hard work.

Best Codes • Aug 16 '24

Personally, I enjoy human content because humans put effort into what they create and I can appreciate the effort. But when it comes to writing itself, writing is about quality and not effort.
AI is an efficient means of producing quality content (and low-quality content as well). Although I don't completely agree with "Work smarter, not harder" (it leaves too much unsaid and can be twisted to promote laziness), that's basically what AI (as a tool) does for writers.

There are cases, such as generating AI content without checking if it's true, where the hammer stops helping you build the house and instead builds the house for you... which might be OK if the hammer doesn't make any mistakes, but a hammer building a house alone is very likely to make mistakes.
(In the hammer / house analogy the hammer is AI and the house is the content, e.g. an article).

AI content always requires proofreading as a bare minimum.

So, when it comes to reading content that will be beneficial (and often enjoyable as well), I usually prefer good content, whether it's written by AI, humans, or both.

Ingo Steinke, web developer • Aug 15 '24

Link to a related recent discussion: What do you think we should do about AI generated content?

What do you think we should do about AI generated content?

Oscar ・ Aug 14

#ai #discuss #meta

Anmol Baranwal • Aug 15 '24

I was going to reference the same discussion :)

Paulo Henrique • Aug 16 '24

I'll post basically the same thing I wrote on the Discord channel

It's kinda sad, you know? It's not exactly DEV's fault, the internet is mostly AI-generated content and listicles these days. Google is unusable unless you add dozens of filters, Upwork turned into an AI nightmare (clients and freelancers just automated everything), Linkedin offers AI-GENERATED CONTENT AND COMMENTS, and even Reddit is filled with LLM crap. It's tiring to even think about searching for good content to read.

I greatly advocate for AI, but this is out of hand. I hope the bubble pops fast and we can get back to the stupid internet of before.

Meanwhile, maybe an automatic quality checker would be nice on DEV, like an API that checks after an article is posted how are the chances of it being AI-generated?

Rens Jaspers • Aug 16 '24

I wouldn't ban AI completely because it helps non-native speakers like myself write in good English and make their points clear. When I see a blog with poor English, I don't want to read it, which is a shame because the content might be useful.

How do you feel about making the high quality / low quality moderation buttons more prominent for trusted community members? Perhaps placing them next to the regular like buttons? I think I'd be more likely to flag a post as AI junk if it only took me one click.

Paulo Henrique • Aug 16 '24

There's a big difference between using AI to translate and better communicate your ideas (I do, with Grammarly) and using AI to create bad-crafted content for the only purpose of linking to external websites or getting some kind of virtual reputation.

Rens Jaspers • Aug 19 '24

To you and me there is, but probably not to some automated tool that's designed to detect whether text was written by AI.

I'm concerned about getting banned just because my posts might look AI-generated, even though I only use AI to help refine my original draft.

kvetoslavnovak • Aug 16 '24

Yes, this is the real problem. As someone with DEV moderation rights I see all the AI junk coming in. And it is so frustrating that I almost stopped reviewing articles.
Unfortunately, I am starting to feel the same as a reader as well.

Makar • Aug 17 '24

I have been thinking about this issue the moment I went through a few posts in the Moderation dashboard here. I immediately faced a dilemma (and quit moderating...): have all thee lads spent hours on writing an overall interesting post with a lot of extra details -- and thus they should be somewhat encouraged, or they have just catbotted and should be banned or something. It is mostly impossible to tell for sure...

A way to combat this could be to give both mods and readers an opportunity to submit a structured feedback -- not a like or dislike, not a comment but some list of checkboxes like "too broad", "inconclusive", "title not matching the content" etc. At the end of the day, all we care is good content, even if it has been generated by AI, so maybe enabling the community to offer feedback in a straightforward and transparent manner (like Germans do 🍻) could help address this issue -- given that won't come with an increase of toxicity.

Oscar • Aug 17 '24 • Edited

Glad to see that the awesome people at Dev.to have noticed this issue! Y'all are really cool.

I'm tired and frankly don't have the time to go through and look at all the ideas, but here's my two cents: We need to better define the problem we're trying to solve. I've mentioned this in previous posts and comments, but saying "let's get junk off the platform" or "AI content needs to go" isn't sufficient. That just leads to bickering about specifics. While this is just something that popped into my mind, here's a suggestion for a tighter definition:

Content that is nearly entirely generated by LLM's such as ChatGPT (any version), Google Bard, or Claude, is not acceptable on Dev.to. However, usage of AI tools is allowed when it comes to issues like language barriers.

Bradston Henry • Aug 22 '24

This is a topic of personally think about quite often in regard to Dev.to. I am a huge supporter of this space, as I felt it's given me the opportunity to learn and grow as and SWE and to share knowledge that hopefully has helped other devs. I have truly appreciated you (@ben ) and team for creating this space for people like me.

But to respond to your post directly...

Sadly, automated tools will always be limited in combating this issue as the bots will always "automate" to find ways around automated tools.

I think there might need to be a heavier emphasis on human moderation to help combat the problem (at least for now)

Here a few ideas I have:

New Account Post Verification

Add New Poster Moderation system that forces all new accounts posts to be approved by moderators before they can post. When they are rejected they have an opportunity to resubmit after some period of time.
After a certain number of positive verifications, user can post freely and are passed through existing moderation system

Limit Ability of Users to Post who Post Consistently Low-Quality content (For existing accounts)

If moderators flag an account for multiple low-quality posts, accounts are reverted to "New account Status" and they must go through the "New Account Post Verification
Requires "New Poster Verification" Process

Create Verified and Unverified User Designations

After an account creates multiple high-quality posts, designate account as "verified" users. Add a filter to the homepage that allows users to filter out unverified accounts to reduce opportunity to be exposed to low quality content. This will sadly suppress/limit the opportunity for newer talented writers/posters to be found immediately but if the system works properly, they will become verified over time and grow their influence later.

I have some other ideas but they would be more focused on identifying "Trusted Writers" and surfacing them but it would not directly suppress AI/bot content.

Bradston Henry • Aug 22 '24

Also, I think it would be reasonable to create a AI model that could flag posts as AI for special review before they are published to the public.

As I have begun getting back into moderating, there are some very clear patterns that AI created content follows. It is likely possible to create a lightweight AI model that detects the likelihood a post is AI.

View full discussion (26 comments)