DEV Community

Ben Sinclair

How should we handle duplicate content on dev.to?

As more posts appear here I'm noticing more duplicates. People don't always search to see if their ideas have already been covered before writing a new post or putting a new spin on an existing one.

Stack Overflow handles this by shutting the new posts down. Yahoo! Answers handles it by not caring and generally being full of it. In every sense.

It's still early days for this site really, so what are other people's ideas about how to handle it?

We currently have "trending" and "classic posts" as asides to posts, but maybe we could have a "related" section based on keywords or something?
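To make the "related" idea a bit more concrete, here is a minimal sketch of keyword matching on titles. Everything in it is an assumption for illustration: the `STOPWORDS` list, the `related_posts` function, and the 0.3 threshold are made up, and none of this reflects how dev.to actually works. It scores titles by Jaccard similarity (shared keywords divided by total distinct keywords).

```python
import re

# Hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "the", "to", "on", "of", "in", "for", "and",
             "is", "how", "we", "should", "at"}


def keywords(title):
    """Lowercase a title, split it into words, and drop stopwords."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a, b):
    """Shared keywords divided by all distinct keywords (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_posts(title, existing_titles, threshold=0.3):
    """Return existing titles whose keyword overlap meets the threshold, best first."""
    target = keywords(title)
    scored = [(jaccard(target, keywords(t)), t) for t in existing_titles]
    return [t for score, t in sorted(scored, reverse=True) if score >= threshold]


if __name__ == "__main__":
    existing = ["Taking breaks at work", "Handling duplicate posts on dev.to"]
    print(related_posts("How should we handle duplicate content on dev.to?", existing))
```

Matching on exact words is crude ("handle" and "handling" don't count as the same keyword here), so a real version would probably want stemming or tags as well.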

Duplicate isn't always duplicate either - an almost identically-named post could be about the news of a pending event or about how that event has changed the world a year on, for example.

Thoughts?

Top comments (8)

sudiukil profile image
Quentin Sonrel

Well, earlier today I posted something about breaks at work. I did so knowing there were already some posts about that, but still, I felt like asking it again could be useful to gather some new input.

I feel like this site is more about discussion than the classical question/answer or problem/solution process we can find on sites like Stack Overflow. And to me, discussion always benefits from fresh input (especially if the context varies), so previous posts might not always be relevant.

Of course, that doesn't mean we should just post without searching and spawn thousands of duplicates either, but I do think there's a little gray area depending on the nature of posts.

ben profile image
Ben Halpern

This is exactly our policy right now.

I wrote about it here:

I do think we need to keep evolving it around the edges, and also build in product features to make this approach as beneficial as possible.

We probably should have the concept of merging threads when they pop up simultaneously. This happened the other day during the GitHub acquisition. We let it sort itself out naturally. One sank and one fell. In an ideal world we might have a simple concept of officially pointing folks to a canonical thread, or adjusting one so they're not really exact dupes.

But in general I think we can work out the concept of allowed dupes as we all work together to help one another out. This website is much more for being kind and helpful to the person requesting help than Stack Overflow.

darksmile92 profile image
Robin Kretzschmar • Edited

I totally agree with that!
I'd rather see a topic come up again in my feed once in a while, with some fresh input on it, than have it only once on the whole board.

After all, it is the nature of discussion topics and posts to get lost in storage.
If we are honest, we open dev.to, scroll through, and maybe search for one term or click on a tag to dig deeper on a topic we just saw and see what others have to say about it.

Speaking for myself, I love to see some topics again to be reminded of them, because no one opens up the site and remembers "ahhh, there was this one post about how to set up a Docker environment 2 years ago, let's see if someone added a new comment".

This makes dev.to awesome :-)

sudiukil profile image
Quentin Sonrel

"This website is much more for being kind and helpful to the person requesting help than Stack Overflow."

And this is why it's awesome 😀

fnh profile image
Fabian Holzer

It boils down to the question of why people engage here in the first place.

Personally, I don't write for posterity's sake, but to have a conversation with an actual person, albeit asynchronously. I love talking shop. I don't want to be limited to the actual shop I work for to do so.

You and your team created a platform that is versatile and can be many different things to many different people.

The screen real estate is limited and even fresh topics go out of sight very fast, at least if they don't generate a lot of interaction. This "growth pain" with duplicated topics is a symptom of that.

One thing that is not so obvious to me is how the "trending topics" work. When bulletin boards were fashionable, the sorting order often was that the thread with the newest reply came first. I'm under the impression that "trending" takes other things into account as well.

One thing I personally would find very helpful is being able to use the tag box on the main page to just filter the topics on the main page. I don't want to unfollow tags just because I'd temporarily like a narrower view of the topics. Also, I find it unfortunate that the layout of the topic subpage is so different from the main layout, e.g. there is no tag box on the left which would allow me to switch from one tag to another in one step (instead I have to go back to main and select another tag).

I'm certain that if dev.to indeed manages to keep up the spirit of, as you put it, "being kind and helpful to the person requesting help", then such technical concerns will turn out to be but minor matters for which solutions emerge over time.

agusarias__ profile image
Agus Arias

A duplicate article about handling duplicates, nice haha

bgadrian profile image
Adrian B.G.

Make an N-N (many-to-many) connection for "similar posts", because even on Stack Overflow some "duplicates" are not 100% duplicates; one may focus on another part of the same problem.

The largest one (the one that got the most traction, usually the first one) should be first in the list.

A more aggressive way is to alert users before they comment on a "replica" (e.g. "you may want to post here, in the canonical/largest replica").

The platform (dev.to) should show these connections at the top, bottom, or right of the article.

Users should be able to suggest "similar/duplicate" links, and mods can accept them?
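Here is a rough sketch of what I mean, under some assumptions: the `SimilarPosts` class and its method names are hypothetical, "traction" is just a number such as a comment count, and nothing here is based on how dev.to stores posts. Links are symmetric (N-N), a suggestion only becomes a link once a mod approves it, and the list comes back with the largest post first.

```python
from collections import defaultdict


class SimilarPosts:
    """Hypothetical in-memory model of many-to-many "similar post" links."""

    def __init__(self):
        self.links = defaultdict(set)  # post id -> ids of approved similar posts
        self.pending = set()           # suggested pairs waiting for a mod
        self.traction = {}             # post id -> e.g. comment count

    def add_post(self, post_id, traction):
        self.traction[post_id] = traction

    def suggest(self, a, b):
        """A user proposes that posts a and b are similar."""
        if a != b:
            self.pending.add(frozenset((a, b)))

    def approve(self, a, b):
        """A mod accepts a pending suggestion; the link goes both ways."""
        pair = frozenset((a, b))
        if pair not in self.pending:
            raise KeyError("no pending suggestion for this pair")
        self.pending.remove(pair)
        self.links[a].add(b)
        self.links[b].add(a)

    def similar_to(self, post_id):
        """Approved similar posts, the one with the most traction first."""
        linked = self.links.get(post_id, set())
        return sorted(linked, key=lambda p: self.traction.get(p, 0), reverse=True)


if __name__ == "__main__":
    posts = SimilarPosts()
    posts.add_post("breaks-1", 40)
    posts.add_post("breaks-2", 8)
    posts.add_post("breaks-3", 5)
    posts.suggest("breaks-2", "breaks-1")
    posts.suggest("breaks-2", "breaks-3")
    posts.approve("breaks-2", "breaks-1")
    posts.approve("breaks-2", "breaks-3")
    print(posts.similar_to("breaks-2"))
```

The "alert before commenting" idea would then just be a check on whether `similar_to` returns anything with more traction than the current post.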

cjbrooks12 profile image
Casey Brooks

My thoughts, given that I'm fairly new here on dev.to: it seems like many of the posts here are for sharing knowledge with the community so that they can learn, but also so that the developer can learn more by organizing their thoughts around it and engaging with the community.

Posts on a dev's own personal blog tend to feel more formal and polished, while much of what I see here seems to be a bit more informal: people trying to organize their own thoughts while also actively seeking feedback and improvement on the topic from a welcoming community.

And I think this is great; it's not just one person teaching the rest, or one person asking questions without trying hard to solve the problem. It's one person taking a good crack at a problem, sharing what they learned in the process, and also hoping to learn even more. So even if the content of the post itself might have been posted before, the discussions around that topic are always evolving, and for that reason we shouldn't strive to eliminate all duplicate content in the same way that SO does.