I recently reached a milestone on Stack Overflow. I have answered 300 questions (and most are answered correctly π) on the [accessibility] tag.
It is part of a personal challenge I set myself nearly two years ago.
Since September 2019 I have tried to give the best answers I can on the [accessibility] and [pagespeed-insights] tags on Stack Overflow, linking with my two passions in web development, load speed and inclusivity.
I find answering questions to be the best way to learn (and reinforce any imposter syndrome or conversly, make sure I don't suffer from the Dunning Kruger effect! hehe.
Anyway, I know I have been very active answering on those tags but the narcissist in me wanted to know how I compared to everyone else.
Luckily Stack Overflow has a great feature that few people seem to use / know about Query Stack Overflow (https://data.stackexchange.com/)
It allows you to analyse the Stack Overflow database and grab some interesting statistics / info.
So here are a couple of queries I put together so I could analyse my progress:
Number of comments, questions and answers ranked by number of answers
This query took me ages to put together, referencing several other queries other people had writeen to piece it together. Put it like this, it really highlighted my poor SQL skills! π€£
It allows you to grab all of the questions, answers and comments on a given tag, ranked by number of answers given initially.
The query
DECLARE @tagName varchar(255)
SET @tagname = rtrim(##TagName:string##)
DECLARE @start varchar(255)
SET @start = rtrim(##Start:string##);
DECLARE @end varchar(255)
SET @end = rtrim(##End:string##);
WITH questsByTags AS (
SELECT DISTINCT
q.Id
, q.OwnerUserId
FROM Posts q
INNER JOIN PostTags pt ON q.Id = pt.PostId
INNER JOIN Tags t ON t.Id = pt.TagId
WHERE q.PostTypeId = 1 -- questions
AND q.CreationDate > @Start
AND q.CreationDate < @End
AND (
t.TagName = @TagName
)
),
answersByTags AS (
SELECT
a.Id
, a.OwnerUserId
FROM Posts a
INNER JOIN questsByTags qbt ON qbt.Id = a.ParentId
),
commntsByTags AS (
SELECT
c.Id
, c.UserId AS [OwnerUserId]
FROM Comments c
INNER JOIN (
SELECT Id FROM questsByTags
UNION ALL SELECT Id FROM answersByTags
) AS allPosts
ON allPosts.Id = c.PostId
),
allUsers AS (
SELECT OwnerUserId FROM questsByTags
UNION SELECT OwnerUserId FROM answersByTags
UNION SELECT OwnerUserId FROM commntsByTags
)
SELECT au.OwnerUserId AS [User Link],
u.DisplayName
, (SELECT Count (qbt.Id) FROM questsByTags qbt WHERE qbt.OwnerUserId = au.OwnerUserId) AS [Num Qsts]
, (SELECT Count (abt.Id) FROM answersByTags abt WHERE abt.OwnerUserId = au.OwnerUserId) AS [Num Ans]
, (SELECT Count (cbt.Id) FROM commntsByTags cbt WHERE cbt.OwnerUserId = au.OwnerUserId) AS [Num Cmmnts]
FROM allUsers au
LEFT JOIN users u
ON u.id = au.OwnerUserId
WHERE au.OwnerUserId IS NOT NULL
ORDER BY [Num Ans] DESC, [Num Cmmnts] DESC, [Num Qsts] DESC
The permalink so you can try it yourself
Inputs
- Tag - The Stack Overflow tag you want to see rankings for
- Start Date - When to query from
- End Date - When to query to, can be set to a future date if you want it to be "until today".
Outputs
- User Link - a link to their profile (when exporting to CSV outputs the userID
- User Display Name - the user name, purely so it can be exported to CSV
- Number of questions - how many questions that person asked with that tag
- Number of Answers - number of answers that person gave with that tag on the question
- Number of Comments - the total number of comments across questions and answers given by that user.
Results
For the tag [accessibility] running from 2019-12-31 to 2020-21-31 (first 20 rows)
User Link | DisplayName | Num Qsts | Num Ans | Num Cmmnts |
---|---|---|---|---|
2702894 | Graham Ritchie | 1 | 188 | 512 |
1971216 | QuentinC | 0 | 32 | 26 |
4540141 | Adam | 0 | 28 | 22 |
4274933 | Josh | 0 | 22 | 32 |
469491 | brennanyoung | 0 | 17 | 22 |
3825084 | XLE_22 | 0 | 10 | 15 |
1226227 | michaelpuckett | 0 | 5 | 0 |
3147711 | Alex Walczak | 0 | 5 | 0 |
4597840 | Phil Weaver | 0 | 4 | 2 |
10838693 | SAURABH | 0 | 4 | 1 |
4873295 | ShellZero | 2 | 3 | 8 |
14129711 | Unbywyd | 0 | 3 | 3 |
12252274 | Poli97 | 7 | 2 | 12 |
5587356 | Super Jade | 0 | 2 | 10 |
6002174 | Tsundoku | 0 | 2 | 6 |
8318731 | Lionel Rowe | 0 | 2 | 5 |
7406840 | Stefany Newman | 1 | 2 | 4 |
4733161 | cloned | 0 | 2 | 4 |
5186515 | nonoandy | 2 | 2 | 3 |
608042 | Andy | 1 | 2 | 3 |
For the tag [pagespeed-insights] running from 2019-12-31 to 2020-21-31 (first 20 rows)
User Link | DisplayName | Num Qsts | Num Ans | Num Cmmnts |
---|---|---|---|---|
2702894 | Graham Ritchie | 1 | 58 | 152 |
14063860 | Sham | 0 | 9 | 1 |
1710628 | Ramesh Elaiyavalli | 0 | 5 | 0 |
3151817 | addyo | 4 | 4 | 1 |
5585371 | Ferran Buireu | 0 | 2 | 5 |
12242328 | GrΓΒ©goire | 0 | 2 | 0 |
8698671 | Ozik Jarwo | 0 | 2 | 0 |
11303070 | oreoorbitz | 0 | 2 | 0 |
6331353 | Sam | 6 | 1 | 22 |
10990737 | Ifaruki | 0 | 1 | 7 |
14273874 | Thomas | 1 | 1 | 2 |
2441103 | David Lorenzo LΓΒ³pez | 1 | 1 | 2 |
218980 | szymond | 1 | 1 | 2 |
1742382 | stylesuxx | 0 | 1 | 2 |
6820056 | Abhishek Jain | 0 | 1 | 1 |
681548 | keul | 0 | 1 | 1 |
4425964 | hiew1 | 0 | 1 | 1 |
731631 | Erik | 0 | 1 | 1 |
1177814 | Dipen Shah | 0 | 1 | 1 |
1237494 | Samar Panda | 0 | 1 | 1 |
As you can see the narcissist in me is happy with those results (I am Graham Ritchie, so now you know my real name π).
Not only top of the board but by a decent margin, I consider that challenge complete!
Every single answer given on a particular tag by a user
The other thing I wanted was a word count for a given tag.
Now because I wanted to make this a useful general query there were a few other things I wanted.
I wanted to know if an answer given was the accepted answer (to work out accepted answer rates).
I also wanted to be able to work out the total number of points given from a particular tag between particular dates (luckily the accepted answer combined with the "score" column is sufficient for this).
I wasn't clever enough to work out how to do a word count within the query (or if it is even possible, which I doubt) so I just wanted to grab all of my answers so I could process them elsewhere.
The query
DECLARE @UserID varchar(255)
SET @userid = rtrim(##UserID:string##)
DECLARE @tagName varchar(255)
SET @tagname = rtrim(##TagName:string##)
DECLARE @start varchar(255)
SET @start = rtrim(##Start:string##);
DECLARE @end varchar(255)
SET @end = rtrim(##End:string##);
select p.*, q.Title AS QnTitle, q.Body AS QnBody, q.AcceptedAnswerId AS AcceptedID, q.tags AS tags,
CASE
WHEN q.AcceptedAnswerId = p.Id THEN '1'
ELSE '0'
END AS isAccepted
from posts AS p
INNER JOIN posts AS q
ON p.parentID = q.id
INNER JOIN PostTags pt
ON q.Id = pt.PostId
INNER JOIN Tags t
ON t.Id = pt.TagId
where p.owneruserid=@UserID
AND t.TagName = @TagName
AND q.CreationDate > @Start
AND q.CreationDate < @End
AND p.PostTypeId = 2;
The permalink so you can try it yourself
Inputs
- UserID - the user ID you want to query (can be found by going to a persons profile on SO)
- Tag - The Stack Overflow tag you want to see all answers for
- Start Date - When to query from
- End Date - When to query to, can be set to a future date if you want it to be "until today".
Outputs
- Score - the score is the cumulative up / down votes
- Tags - all tags on the original question
- isAccepted - whether this answer was accepted by the OP (for some strange reason you may have to run the query twice for this to show up!)
- QnTitle - the original title
- QnBody - the original question
- Body - the body of the answer
- A few others as I was lazy and just grabbed the entire entry for the answer
Results
I am not going to show the results here due to the number of columns. Instead run the query yourself for your own user ID.
Conclusion
The two queries above let you analyse your Stack Overflow contributions in some quite interesting ways.
For example I found out I had written over 250,000 words in total last year on Stack Overflow! That is quite scary really (it is one of the reasons why I am starting to write on Dev.to instead so I get some benefit from writing so much!)!
By running the query month by month you can build up an idea of your contributions to a tag over time (something I am still working on).
Living in a "niche" lets you rank well
Obviously I hang around in the quiet tags so it is easy to be top of the board, but for you you might find you are close to the top of the board on some more active tags and decide to put some extra effort in to reach the top!
Or, on the flip side, the queries above might be useful for trying to find people for advice.
Obviously it isn't a guarantee of getting the best person but being high up the rankings does at least give you an idea of whether they are reasonably knowledgeable on a subject.
Do you have any fun queries to share?
I like analysing data, I am not very good at it though!
So does anyone have any interesting queries on data.stackexchange.com they have created or stumbled across? If you have then let me know in the comments!
Maybe you have a query that works out which tags are most popular? Or which user writes the most comments on the site? Whatever it is be sure to share it!
Final question - are you near the top for any tags?
Is there a tag that you perform particularly well on? If so share a link to the query and your Stack Overflow ID / Display name so we can see where you shine!
Don't be afraid of showing off, as you can see this post is half useful, half a way to brag a little bit! π€£π€£
For those of you who are awaiting my next mad web experiment, the next one is good fun...."I built a fully functional SPA website using a single DOM element, no CSS and a tonne of JS!"...it took me ages but is nearly finished...coming soonβ’(It is live)
Top comments (12)
Let's see the user with the most answers in the CSS tag ... Oh, it's me π ... there is 2000 answers of difference with the second user π² (I am spending too much time there ..)
I am also the second most hated user in the site (the second one with the most duplicate closure and the most accurate one: data.stackexchange.com/stackoverfl...)
I am not surprised you dominate the CSS tag. Quite an eye opener isnβt it! I will go have a play with that query you linked as looks interesting! β€οΈ
What is more telling is looking at the 2020 results for you - 1400 vs 400 second place, not a wonder you are at the 0.01% club! π€£
First of all, congratulations! It requires consistency and commitment to do this. Great job!
I don't have that much to show off, but with your permission I'm doing a bit of promotion. In my case, I follow many different tags and I don't have much time to spend on Stack Overflow, so it was frustrating that much of it I was just looking for a question that interested me or that at least I was able to answer.
For that reason, I made Answerable a recommendation system for Stack Overflow unanswered questions. It mainly uses the Stack Exchange API and the Stack Overflow RSS feeds, but I'm sure I'll find how to use the data.stackexchange site you just discovered me!
It is a young project and even though it's already functional (and I use every day) there's plenty of room for improvement and contributions! So if anyone is interested, give it a star, submit some feedback, check the contributing guidelines or upvote it on Stack Apps. π
MiguelMJ / Answerable
Recommendation system for Stack Overflow unanswered questions
I like the concept and don't worry about promoting it, it will surely help some important questions get answered and I think it is a great idea!
When I get chance I will have a proper look at the project (it may be weeks with how things are at my side at the moment but it is on my list so I will get to it eventually!!!) π€£
We all have stuff to do, but fortunately the project won't run anywhere π so keep it up πͺ
Thanks again!
Do you prefer doing SO questions to let's say coding challenge sites like CodeWars?
There is a "Hacker Rank" aspect of StackOverflow:
... and if you add some social intelligence and borrow marketing / SEO strategies like "long tail" (finding your niche), plus you are lucky enough to find a niche that actually matches your knowledge, you can get a good reputation.
Sadly, accessibility and CSS still seemed to be some kind of niche on StackOverflow, but "programming"-ish enough not to count as off-topic. So we are lucky that gamification works in a positive way here and makes people like Graham provide helpful answers to others.
Personally I find that I don't gain much rep considering the number of answers I have given.
Especially in Accessibility. If anything the lack of rep I gain can have the opposite effect (I give a great, well thought out answer and it gets 15 views and no votes...I used to find it quite demoralising until I changed my perspective!)
For Accessibility last year I got (314 upvotes * 10 rep) from upvotes and (115 accepted answers * 15 rep) from accepted answers - that is only 4835 rep, on 188 answers (25ish rep per answer on average).
If I was after rep I would just answer JavaScript questions as there are plenty of ones that are low effort and it is a very active tag. (my average answer in accessibility is 800 words and takes me 15-20 minutes to write, a JS answer can be an adjustment to code that would take 2 minutes so I could do 10 answers and only need 2 of them to be accepted / 3 total upvotes to get more rep than I do now)
Hence why I go on number of answers I have given. The only thing I pay attention to nowadays from a gamification perspective is the "All time" leader board, for example if I answer 120 more questions on accessibility I will have answered more than anyone else ever, that is the only bragging right I care about (I am number 2 at the moment...I don't like being second ππ€£π€£).
As for CSS, I can't consider that a niche at all, for Temani to dominate that takes some serious effort, 1400 answers in a year is no small feat! In fact they would probably be much better placed to talk about reputation and whether it actually motivates as they are in the top 0.01% this year.
I find the gamification on Stack Overflow is counter productive personally, I will never just look at the accepted answer and accept it, I have to look at all the other answers to find the real gems, but they probably answered an hour later than the accepted answer.
It is the "fastest finger first" element that really breaks the rep system.
It also makes people delete answers if they get downvotes (which can sometimes actually be a catalyst for a great answer, or link to an article that is perfect but be downvoted as it is a "link only answer")
I answer, primarily, for my own gain, I learn far more answering questions than any other method. I can't recommend answering questions enough as a way to grow, it shines a light on where your knowledge is weak, it lets you look past people's questions and see what they are actually trying to achieve, people will challenge you on points you are sure about and make you think again about a point etc.
Saying all that, my thought process has changed recently towards writing articles rather than answering questions, mainly because I personally will get more benefit (in terms of exposure and building a small following) and I can still benefit from the research process etc. But I will still pop on to Stack Overflow to see what I don't know!
They cover different disciplines, but I personally prefer answering questions for learning (once you have a good base in a subject).
I find coding challenge sites great for thinking about optimal solutions, answering questions good for my reasoning skills.
For example if someone says "I am doing X and want Y", there is scope there that X might not be the best solution in the first place. But there is also scope that X might be necessary due to company restrictions and so you have to work around the restrictions. There may even be scope that Y is a terrible idea in the first place. It seems to correlate more to "real world" reasoning that you need to apply to building a product or solution.
I have found there are additional benefits to answering questions, it has helped me with structuring my thoughts so that I can write more effectively for example.
I used to just do the "wall of text" answers, now I make sure to use headings to break the article down, consider my language choices so that a range of people find the answer useful, create demonstrations for more complex concepts etc.
I may be (probably am!) biased, but personally I think I learned a lot more answering questions than I ever have on challenge sites. But also consider that I answer on tags that are more subjective, [accessibility] is a "broad but shallow" subject so it may work a lot better for a subject like that than for something like [c#] where there is very likely an optimal / accepted way of doing something that is well documented.
For anyone who is interested, here is Jon Skeet's posts for 2020 on C#, the legend himself!
Glad to hear that you are enjoying your experience with Stack Overflow. I have a reasonable rep over there as well. However, some 6-8 months ago, I replied to a post not really knowing the hidden rules. I gave a better answer than someone with 30 times my reputation. Eek. That's when I discovered those issues. The guy latched onto me, and began down voting every post or reply I would make, no matter how much work I might have done to get the answer.
At first I didn't even know what down votes were until I noticed that strangely my reputation was slowly dropping and I looked into it.
The system automatically corrected some of this guys down votes, but from time to time he is able to "escape the algorithm" and I had to suck it up.
After doing the math, I realized that if he wanted me off, because I embarrassed him, he would eventually get his way. So I started looking and found here.