DEV Community

Cover image for πŸ“Š A couple of interesting Stack Overflow queries to play with [Query Stack Overflow] πŸ“ˆ
GrahamTheDev
GrahamTheDev

Posted on • Updated on

πŸ“Š A couple of interesting Stack Overflow queries to play with [Query Stack Overflow] πŸ“ˆ

I recently reached a milestone on Stack Overflow. I have answered 300 questions (and most are answered correctly πŸ˜‹) on the [accessibility] tag.

It is part of a personal challenge I set myself nearly two years ago.

Since September 2019 I have tried to give the best answers I can on the [accessibility] and [pagespeed-insights] tags on Stack Overflow, linking with my two passions in web development, load speed and inclusivity.

I find answering questions to be the best way to learn (and reinforce any imposter syndrome or conversly, make sure I don't suffer from the Dunning Kruger effect! hehe.

Anyway, I know I have been very active answering on those tags but the narcissist in me wanted to know how I compared to everyone else.

Luckily Stack Overflow has a great feature that few people seem to use / know about Query Stack Overflow (https://data.stackexchange.com/)

It allows you to analyse the Stack Overflow database and grab some interesting statistics / info.

So here are a couple of queries I put together so I could analyse my progress:

Number of comments, questions and answers ranked by number of answers

This query took me ages to put together, referencing several other queries other people had writeen to piece it together. Put it like this, it really highlighted my poor SQL skills! 🀣

It allows you to grab all of the questions, answers and comments on a given tag, ranked by number of answers given initially.

The query

DECLARE @tagName varchar(255) 
SET @tagname = rtrim(##TagName:string##)

DECLARE @start varchar(255) 
SET @start = rtrim(##Start:string##);

DECLARE @end varchar(255) 
SET @end = rtrim(##End:string##);


WITH questsByTags AS (
    SELECT DISTINCT
                q.Id
                , q.OwnerUserId
    FROM        Posts q
    INNER JOIN  PostTags pt     ON q.Id = pt.PostId
    INNER JOIN  Tags t          ON t.Id = pt.TagId
    WHERE       q.PostTypeId    = 1  -- questions
    AND q.CreationDate > @Start
    AND q.CreationDate < @End
    AND (
            t.TagName   = @TagName
    )
),
answersByTags AS (
    SELECT
                a.Id
                , a.OwnerUserId
    FROM        Posts a
    INNER JOIN  questsByTags qbt  ON qbt.Id = a.ParentId
),
commntsByTags AS (
    SELECT
                c.Id
                , c.UserId  AS [OwnerUserId]
    FROM        Comments c
    INNER JOIN (
        SELECT              Id FROM questsByTags
        UNION ALL SELECT    Id FROM answersByTags
    ) AS allPosts
    ON allPosts.Id = c.PostId
),
allUsers AS (
    SELECT          OwnerUserId FROM questsByTags
    UNION SELECT    OwnerUserId FROM answersByTags
    UNION SELECT    OwnerUserId FROM commntsByTags
)
SELECT      au.OwnerUserId      AS [User Link],
            u.DisplayName
            , (SELECT Count (qbt.Id) FROM questsByTags  qbt WHERE qbt.OwnerUserId = au.OwnerUserId)  AS [Num Qsts]
            , (SELECT Count (abt.Id) FROM answersByTags abt WHERE abt.OwnerUserId = au.OwnerUserId)  AS [Num Ans]
            , (SELECT Count (cbt.Id) FROM commntsByTags cbt WHERE cbt.OwnerUserId = au.OwnerUserId)  AS [Num Cmmnts]
FROM        allUsers au
LEFT JOIN   users u
ON          u.id = au.OwnerUserId
WHERE       au.OwnerUserId IS NOT NULL
ORDER BY    [Num Ans] DESC, [Num Cmmnts] DESC, [Num Qsts] DESC

Enter fullscreen mode Exit fullscreen mode

The permalink so you can try it yourself

https://data.stackexchange.com/stackoverflow/query/1410231/number-of-questions-answers-and-comments-per-user-per-tag-set?TagName=accessibility&Start=2019-12-31&End=2020-12-31

Inputs

  • Tag - The Stack Overflow tag you want to see rankings for
  • Start Date - When to query from
  • End Date - When to query to, can be set to a future date if you want it to be "until today".

Outputs

  • User Link - a link to their profile (when exporting to CSV outputs the userID
  • User Display Name - the user name, purely so it can be exported to CSV
  • Number of questions - how many questions that person asked with that tag
  • Number of Answers - number of answers that person gave with that tag on the question
  • Number of Comments - the total number of comments across questions and answers given by that user.

Results

For the tag [accessibility] running from 2019-12-31 to 2020-21-31 (first 20 rows)

User Link DisplayName Num Qsts Num Ans Num Cmmnts
2702894 Graham Ritchie 1 188 512
1971216 QuentinC 0 32 26
4540141 Adam 0 28 22
4274933 Josh 0 22 32
469491 brennanyoung 0 17 22
3825084 XLE_22 0 10 15
1226227 michaelpuckett 0 5 0
3147711 Alex Walczak 0 5 0
4597840 Phil Weaver 0 4 2
10838693 SAURABH 0 4 1
4873295 ShellZero 2 3 8
14129711 Unbywyd 0 3 3
12252274 Poli97 7 2 12
5587356 Super Jade 0 2 10
6002174 Tsundoku 0 2 6
8318731 Lionel Rowe 0 2 5
7406840 Stefany Newman 1 2 4
4733161 cloned 0 2 4
5186515 nonoandy 2 2 3
608042 Andy 1 2 3

For the tag [pagespeed-insights] running from 2019-12-31 to 2020-21-31 (first 20 rows)

User Link DisplayName Num Qsts Num Ans Num Cmmnts
2702894 Graham Ritchie 1 58 152
14063860 Sham 0 9 1
1710628 Ramesh Elaiyavalli 0 5 0
3151817 addyo 4 4 1
5585371 Ferran Buireu 0 2 5
12242328 Grégoire 0 2 0
8698671 Ozik Jarwo 0 2 0
11303070 oreoorbitz 0 2 0
6331353 Sam 6 1 22
10990737 Ifaruki 0 1 7
14273874 Thomas 1 1 2
2441103 David Lorenzo López 1 1 2
218980 szymond 1 1 2
1742382 stylesuxx 0 1 2
6820056 Abhishek Jain 0 1 1
681548 keul 0 1 1
4425964 hiew1 0 1 1
731631 Erik 0 1 1
1177814 Dipen Shah 0 1 1
1237494 Samar Panda 0 1 1

As you can see the narcissist in me is happy with those results (I am Graham Ritchie, so now you know my real name πŸ˜‹).

Not only top of the board but by a decent margin, I consider that challenge complete!

Every single answer given on a particular tag by a user

The other thing I wanted was a word count for a given tag.

Now because I wanted to make this a useful general query there were a few other things I wanted.

I wanted to know if an answer given was the accepted answer (to work out accepted answer rates).

I also wanted to be able to work out the total number of points given from a particular tag between particular dates (luckily the accepted answer combined with the "score" column is sufficient for this).

I wasn't clever enough to work out how to do a word count within the query (or if it is even possible, which I doubt) so I just wanted to grab all of my answers so I could process them elsewhere.

The query

DECLARE @UserID varchar(255) 
SET @userid = rtrim(##UserID:string##)

DECLARE @tagName varchar(255) 
SET @tagname = rtrim(##TagName:string##)

DECLARE @start varchar(255) 
SET @start = rtrim(##Start:string##);

DECLARE @end varchar(255) 
SET @end = rtrim(##End:string##);


select p.*, q.Title AS QnTitle, q.Body AS QnBody, q.AcceptedAnswerId AS AcceptedID, q.tags AS tags, 
CASE
  WHEN q.AcceptedAnswerId = p.Id THEN '1'
  ELSE '0'
END AS isAccepted
from posts AS p 
INNER JOIN posts AS q
ON p.parentID = q.id
INNER JOIN  PostTags pt     
ON q.Id = pt.PostId
INNER JOIN  Tags t          
ON t.Id = pt.TagId
where p.owneruserid=@UserID 
AND t.TagName   = @TagName
AND q.CreationDate > @Start
AND q.CreationDate < @End
AND p.PostTypeId = 2;
Enter fullscreen mode Exit fullscreen mode

The permalink so you can try it yourself

https://data.stackexchange.com/stackoverflow/query/1410405/select-answers-for-given-tag-for-a-user-between-two-dates-includes-whether-accep?TagName=accessibility&Start=2019-12-31&End=2020-12-31&UserId=2702894

Inputs

  • UserID - the user ID you want to query (can be found by going to a persons profile on SO)
  • Tag - The Stack Overflow tag you want to see all answers for
  • Start Date - When to query from
  • End Date - When to query to, can be set to a future date if you want it to be "until today".

Outputs

  • Score - the score is the cumulative up / down votes
  • Tags - all tags on the original question
  • isAccepted - whether this answer was accepted by the OP (for some strange reason you may have to run the query twice for this to show up!)
  • QnTitle - the original title
  • QnBody - the original question
  • Body - the body of the answer
  • A few others as I was lazy and just grabbed the entire entry for the answer

Results

I am not going to show the results here due to the number of columns. Instead run the query yourself for your own user ID.

Conclusion

The two queries above let you analyse your Stack Overflow contributions in some quite interesting ways.

For example I found out I had written over 250,000 words in total last year on Stack Overflow! That is quite scary really (it is one of the reasons why I am starting to write on Dev.to instead so I get some benefit from writing so much!)!

By running the query month by month you can build up an idea of your contributions to a tag over time (something I am still working on).

Living in a "niche" lets you rank well

Obviously I hang around in the quiet tags so it is easy to be top of the board, but for you you might find you are close to the top of the board on some more active tags and decide to put some extra effort in to reach the top!

Or, on the flip side, the queries above might be useful for trying to find people for advice.

Obviously it isn't a guarantee of getting the best person but being high up the rankings does at least give you an idea of whether they are reasonably knowledgeable on a subject.

Do you have any fun queries to share?

I like analysing data, I am not very good at it though!

So does anyone have any interesting queries on data.stackexchange.com they have created or stumbled across? If you have then let me know in the comments!

Maybe you have a query that works out which tags are most popular? Or which user writes the most comments on the site? Whatever it is be sure to share it!

Final question - are you near the top for any tags?

Is there a tag that you perform particularly well on? If so share a link to the query and your Stack Overflow ID / Display name so we can see where you shine!

Don't be afraid of showing off, as you can see this post is half useful, half a way to brag a little bit! 🀣🀣

For those of you who are awaiting my next mad web experiment, the next one is good fun...."I built a fully functional SPA website using a single DOM element, no CSS and a tonne of JS!"...it took me ages but is nearly finished...coming soonβ„’(It is live)

Top comments (12)

Collapse
 
afif profile image
Temani Afif

Let's see the user with the most answers in the CSS tag ... Oh, it's me πŸ˜‹ ... there is 2000 answers of difference with the second user 😲 (I am spending too much time there ..)
I am also the second most hated user in the site (the second one with the most duplicate closure and the most accurate one: data.stackexchange.com/stackoverfl...)

Collapse
 
grahamthedev profile image
GrahamTheDev

I am not surprised you dominate the CSS tag. Quite an eye opener isn’t it! I will go have a play with that query you linked as looks interesting! ❀️

Collapse
 
grahamthedev profile image
GrahamTheDev

What is more telling is looking at the 2020 results for you - 1400 vs 400 second place, not a wonder you are at the 0.01% club! 🀣

Collapse
 
miguelmj profile image
MiguelMJ

First of all, congratulations! It requires consistency and commitment to do this. Great job!

Don't be afraid of showing off

I don't have that much to show off, but with your permission I'm doing a bit of promotion. In my case, I follow many different tags and I don't have much time to spend on Stack Overflow, so it was frustrating that much of it I was just looking for a question that interested me or that at least I was able to answer.

For that reason, I made Answerable a recommendation system for Stack Overflow unanswered questions. It mainly uses the Stack Exchange API and the Stack Overflow RSS feeds, but I'm sure I'll find how to use the data.stackexchange site you just discovered me!

It is a young project and even though it's already functional (and I use every day) there's plenty of room for improvement and contributions! So if anyone is interested, give it a star, submit some feedback, check the contributing guidelines or upvote it on Stack Apps. πŸ˜„

GitHub logo MiguelMJ / Answerable

Recommendation system for Stack Overflow unanswered questions

Collapse
 
grahamthedev profile image
GrahamTheDev

I like the concept and don't worry about promoting it, it will surely help some important questions get answered and I think it is a great idea!

When I get chance I will have a proper look at the project (it may be weeks with how things are at my side at the moment but it is on my list so I will get to it eventually!!!) 🀣

Collapse
 
miguelmj profile image
MiguelMJ

We all have stuff to do, but fortunately the project won't run anywhere πŸ˜‚ so keep it up πŸ’ͺ
Thanks again!

Collapse
 
madza profile image
Madza

Do you prefer doing SO questions to let's say coding challenge sites like CodeWars?

Collapse
 
ingosteinke profile image
Ingo Steinke • Edited

There is a "Hacker Rank" aspect of StackOverflow:

I have answered [...] questions (and most are answered correctly πŸ˜‹)

... and if you add some social intelligence and borrow marketing / SEO strategies like "long tail" (finding your niche), plus you are lucky enough to find a niche that actually matches your knowledge, you can get a good reputation.

Sadly, accessibility and CSS still seemed to be some kind of niche on StackOverflow, but "programming"-ish enough not to count as off-topic. So we are lucky that gamification works in a positive way here and makes people like Graham provide helpful answers to others.

Collapse
 
grahamthedev profile image
GrahamTheDev

Personally I find that I don't gain much rep considering the number of answers I have given.

Especially in Accessibility. If anything the lack of rep I gain can have the opposite effect (I give a great, well thought out answer and it gets 15 views and no votes...I used to find it quite demoralising until I changed my perspective!)

For Accessibility last year I got (314 upvotes * 10 rep) from upvotes and (115 accepted answers * 15 rep) from accepted answers - that is only 4835 rep, on 188 answers (25ish rep per answer on average).

If I was after rep I would just answer JavaScript questions as there are plenty of ones that are low effort and it is a very active tag. (my average answer in accessibility is 800 words and takes me 15-20 minutes to write, a JS answer can be an adjustment to code that would take 2 minutes so I could do 10 answers and only need 2 of them to be accepted / 3 total upvotes to get more rep than I do now)

Hence why I go on number of answers I have given. The only thing I pay attention to nowadays from a gamification perspective is the "All time" leader board, for example if I answer 120 more questions on accessibility I will have answered more than anyone else ever, that is the only bragging right I care about (I am number 2 at the moment...I don't like being second πŸ˜‹πŸ€£πŸ€£).

As for CSS, I can't consider that a niche at all, for Temani to dominate that takes some serious effort, 1400 answers in a year is no small feat! In fact they would probably be much better placed to talk about reputation and whether it actually motivates as they are in the top 0.01% this year.

I find the gamification on Stack Overflow is counter productive personally, I will never just look at the accepted answer and accept it, I have to look at all the other answers to find the real gems, but they probably answered an hour later than the accepted answer.

It is the "fastest finger first" element that really breaks the rep system.

It also makes people delete answers if they get downvotes (which can sometimes actually be a catalyst for a great answer, or link to an article that is perfect but be downvoted as it is a "link only answer")

I answer, primarily, for my own gain, I learn far more answering questions than any other method. I can't recommend answering questions enough as a way to grow, it shines a light on where your knowledge is weak, it lets you look past people's questions and see what they are actually trying to achieve, people will challenge you on points you are sure about and make you think again about a point etc.

Saying all that, my thought process has changed recently towards writing articles rather than answering questions, mainly because I personally will get more benefit (in terms of exposure and building a small following) and I can still benefit from the research process etc. But I will still pop on to Stack Overflow to see what I don't know!

Collapse
 
grahamthedev profile image
GrahamTheDev • Edited

They cover different disciplines, but I personally prefer answering questions for learning (once you have a good base in a subject).

I find coding challenge sites great for thinking about optimal solutions, answering questions good for my reasoning skills.

For example if someone says "I am doing X and want Y", there is scope there that X might not be the best solution in the first place. But there is also scope that X might be necessary due to company restrictions and so you have to work around the restrictions. There may even be scope that Y is a terrible idea in the first place. It seems to correlate more to "real world" reasoning that you need to apply to building a product or solution.

I have found there are additional benefits to answering questions, it has helped me with structuring my thoughts so that I can write more effectively for example.

I used to just do the "wall of text" answers, now I make sure to use headings to break the article down, consider my language choices so that a range of people find the answer useful, create demonstrations for more complex concepts etc.

I may be (probably am!) biased, but personally I think I learned a lot more answering questions than I ever have on challenge sites. But also consider that I answer on tags that are more subjective, [accessibility] is a "broad but shallow" subject so it may work a lot better for a subject like that than for something like [c#] where there is very likely an optimal / accepted way of doing something that is well documented.

Collapse
 
grahamthedev profile image
GrahamTheDev • Edited

For anyone who is interested, here is Jon Skeet's posts for 2020 on C#, the legend himself!

Collapse
 
stevesims2 profile image
SteveSims2

Glad to hear that you are enjoying your experience with Stack Overflow. I have a reasonable rep over there as well. However, some 6-8 months ago, I replied to a post not really knowing the hidden rules. I gave a better answer than someone with 30 times my reputation. Eek. That's when I discovered those issues. The guy latched onto me, and began down voting every post or reply I would make, no matter how much work I might have done to get the answer.

At first I didn't even know what down votes were until I noticed that strangely my reputation was slowly dropping and I looked into it.

The system automatically corrected some of this guys down votes, but from time to time he is able to "escape the algorithm" and I had to suck it up.

After doing the math, I realized that if he wanted me off, because I embarrassed him, he would eventually get his way. So I started looking and found here.