Analysis of dev.to comment frequency over the lifetime of a post

Peter Kim Frank

What / Why

I wanted to take a high-level look at when DEV comments are created relative to the thread they're posted on.

DEV members encounter articles from a variety of sources — their home feed, on-site notifications, social media, search engine queries, etc. I was curious about what percentage of comments were on "fresh" threads, vs. the growing long-tail of articles here on the site.

How

I don't have much experience with data analysis or even SQL, but it was pretty quick/easy to prepare this high-level report. To simplify matters, I focused only on comments created in August.

Step 1: Grab the created_at date and article ID for all comments (5400 in total)

Step 2: Grab the published_at date for every article referenced in Step 1

Step 3: Calculate the difference (in days) between the two dates: Comment Date - Article Date.
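
The three steps can be sketched in Python roughly as follows. This is only an illustration, not the query used for this post: the sample timestamps, the function name `comment_age_days`, and the choice to count calendar days (rather than 24-hour periods) are all assumptions.

```python
from datetime import datetime


def comment_age_days(comments, published_at_by_article):
    """For each comment, return how many calendar days old its article was.

    comments: iterable of (created_at, article_id) pairs (Step 1).
    published_at_by_article: dict of article_id -> published_at (Step 2).
    """
    ages = []
    for created_at, article_id in comments:
        published_at = published_at_by_article[article_id]
        # Step 3: comment date minus article date, so the result is
        # 0 for same-day comments and grows as the article gets older.
        ages.append((created_at.date() - published_at.date()).days)
    return ages


# Made-up sample data, for illustration only.
published_at_by_article = {
    1: datetime(2017, 8, 1, 9, 0),
    2: datetime(2017, 6, 15, 12, 0),
}
comments = [
    (datetime(2017, 8, 1, 17, 30), 1),   # same day as article 1
    (datetime(2017, 8, 2, 8, 0), 1),     # the day after article 1
    (datetime(2017, 8, 20, 10, 0), 2),   # long after article 2
]

ages = comment_age_days(comments, published_at_by_article)
print(ages)
```

Comparing dates rather than full timestamps means a comment left at 8am the morning after publication counts as "1 day old" even if fewer than 24 hours have passed. Whether the real analysis did this is not stated here.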

Results

As you might expect, it's heavily front-loaded with a rapid decrease as the article gets older.

30 day result graph

I decided to clean up the data and group the older articles into two "buckets", 5-30 days and 30+ days:

results graph

Highlights:

  • 33% of comments are on articles/threads published that day
  • 29% on an article that’s 1 day old
  • 9% on an article that’s 2 days old
  • 4% for 3 days old
  • 3% for 4 days old
  • 15% for 5-30 days old
  • 7% for 30+ days old
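
A breakdown like the one above can be produced from a list of per-comment ages with a small bucketing helper. This is a sketch under assumptions: the names `bucket_label` and `bucket_shares` are hypothetical, the sample ages are made up, and an article that is exactly 30 days old is placed in the "30+" bucket (the labels overlap at 30, so that boundary is a guess).

```python
from collections import Counter

# Bucket labels in display order, matching the list of highlights.
BUCKETS = ["0", "1", "2", "3", "4", "5-30", "30+"]


def bucket_label(age_days):
    """Map an article age in days (assumed non-negative) to a bucket label."""
    if age_days < 5:
        return str(age_days)
    if age_days < 30:
        return "5-30"
    return "30+"  # assumption: exactly 30 days counts as "30+"


def bucket_shares(ages):
    """Return the percentage of comments per bucket, rounded to whole numbers."""
    total = len(ages)
    if total == 0:
        return {bucket: 0 for bucket in BUCKETS}
    counts = Counter(bucket_label(age) for age in ages)
    return {bucket: round(100 * counts[bucket] / total) for bucket in BUCKETS}


# Made-up ages, for illustration only (not the real August data).
sample_ages = [0, 0, 0, 1, 1, 2, 4, 12, 29, 30]
shares = bucket_shares(sample_ages)
for bucket, share in shares.items():
    print(f"{share}% of comments on articles {bucket} days old")
```

Because each share is rounded independently, the percentages may not add up to exactly 100 on real data.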

Closing Thoughts

I think it's fairly interesting that 7% of comments are posted on articles that were published at least 30 days prior. The "long tail" of evergreen content serves as a great resource for the broader developer community, and it's great to know that the wonderful content contributed here continues to be enjoyed and discussed even beyond the initial burst of exposure.

Hope you found this interesting!

Discussion

Ryan Palo

Yaaas! I've been dreaming of the day I can do data analysis on the Dev.to data :) Also, what's up with the weird bump of comments on day 16? Maybe that's the average number of days it takes before a post is tweeted out by the Twitter account.

Neat!

Ben Halpern

Yeah that is funny. This isn’t the largest data set in the world, could be affected by one big outlier.

Ben Halpern

In the future I imagine we'll probably be able to open source anonymized data sets to go along with the code, so the whole community can play around like this.

Malik Benkirane

This is a very good idea : )
Can't wait for that to happen.
I would really like to get into that project, but I should probably first get more familiar with the existing API, and so with the data itself.

@pkfrank I would love to know the technical details of how you grabbed the data in steps 1 & 2. Any resource you started with would be enough, though. Awesome initiative BTW. It was fun to discover 👍