Peter Kim Frank

Posted on Aug 17, 2018

Analysis of dev.to comment frequency over the lifetime of a post

#meta

What / Why

I wanted to take a high-level look at when DEV comments are created relative to the thread they're posted on.

DEV members encounter articles from a variety of sources — their home feed, on-site notifications, social media, search engine queries, etc. I was curious about what percentage of comments were on "fresh" threads, vs. the growing long-tail of articles here on the site.

How

I don't have much experience with data analysis or even SQL, but it was pretty quick/easy to prepare this high-level report. To simplify matters, I focused only on comments created in August.

Step 1: Grab the created_at date and article ID for all comments (5400 in total)

Step 2: Grab the published_at date for all articles grabbed in Step 1

Step 3: Calculate the difference (in days) between the two dates (Comment Date - Article Date), i.e. how old the article was when the comment was posted.
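The three steps above can be sketched in Python. This is only an illustration, not the query I actually ran: the sample rows, the field names in the dictionaries, and the helper `article_age_in_days` are all made up for the example, and it assumes "days" means whole calendar days rather than 24-hour periods.

```python
from datetime import datetime

# Hypothetical sample rows standing in for Steps 1 and 2;
# the real data and schema are not shown in this post.
comments = [
    {"article_id": 1, "created_at": "2018-08-02T09:30:00"},
    {"article_id": 1, "created_at": "2018-08-03T18:00:00"},
    {"article_id": 2, "created_at": "2018-08-10T12:00:00"},
]
# article ID -> published_at
articles = {
    1: "2018-08-02T08:00:00",
    2: "2018-06-01T08:00:00",
}


def article_age_in_days(comment_created_at: str, article_published_at: str) -> int:
    """Step 3: calendar days from article publication to the comment."""
    commented = datetime.fromisoformat(comment_created_at).date()
    published = datetime.fromisoformat(article_published_at).date()
    return (commented - published).days


ages = [
    article_age_in_days(c["created_at"], articles[c["article_id"]])
    for c in comments
]
print(ages)  # [0, 1, 70]
```

Comparing calendar dates (instead of raw timestamps) means a comment posted late on the publication day still counts as day 0, which matches the "published that day" wording used in the results.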

Results

As you might expect, it's heavily front-loaded with a rapid decrease as the article gets older.

I decided to clean up the data and group the older articles into two "buckets": 5-30 days and 30+ days.

Highlights:

33% of comments are on articles/threads published that day
29% on an article that’s 1 day old
9% on an article that’s 2 days old
4% for 3 days old
3% for 4 days old
15% for 5-30 days old
7% for 30+ days old
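For anyone who wants to reproduce this kind of breakdown, here is a minimal sketch of the bucketing in Python. The list of ages is invented for illustration (it is not the real data), `bucket` and `bucket_percentages` are hypothetical helper names, and putting exactly 30 days into the "30+" bucket is an assumption, since the two bucket labels overlap at 30.

```python
from collections import Counter

# Bucket labels in display order, matching the highlights above.
BUCKETS = ["0", "1", "2", "3", "4", "5-30", "30+"]


def bucket(age_days: int) -> str:
    """Map an article age in days to one of the bucket labels."""
    if age_days < 5:
        return str(age_days)
    if age_days < 30:
        return "5-30"
    return "30+"  # assumption: an article exactly 30 days old lands here


def bucket_percentages(ages: list[int]) -> dict[str, int]:
    """Percentage of comments per bucket, rounded to whole numbers."""
    counts = Counter(bucket(age) for age in ages)
    total = len(ages)
    return {label: round(100 * counts[label] / total) for label in BUCKETS}


# Made-up ages (one per comment), not the actual August data.
sample_ages = [0, 0, 0, 1, 1, 2, 3, 4, 12, 45]
print(bucket_percentages(sample_ages))
```

Because each bucket is rounded independently, the percentages can add up to slightly more or less than 100 on real data.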

Closing Thoughts

I think it's fairly interesting that 7% of comments are posted on articles that were published at least 30 days prior. The "long tail" of evergreen content serves as a great resource for the broader developer community, and it's great to know that the wonderful content contributed here continues to be enjoyed and discussed even beyond the initial burst of exposure.

Hope you found this interesting!

Top comments (4)

Ryan Palo • Aug 17 '18

Yaaas! I've been dreaming of the day I can do data analysis on the Dev.to data :) Also, what's up with the weird bump of comments on day 16? Maybe that's the average number of days it takes before a post is tweeted out by the Twitter account.

Neat!

Ben Halpern • Aug 17 '18

Yeah, that is funny. This isn’t the largest data set in the world, so it could be affected by one big outlier.

Ben Halpern • Aug 17 '18

In the future I imagine we'll probably be able to open source anonymized data sets to go along with the code, so the whole community can play around like this.

Malik Benkirane • Aug 18 '18

This is a very good idea : )
Can't wait for that to happen.
I'd really like to get into that project, but I should probably get closer to the existing API first, and through it to the data.

@pkfrank I would love to know the technical details of how you grabbed the data in steps 1 & 2, though any resource you started with would be enough. Awesome initiative, BTW. It was fun to discover 👍
