One way to look at this is that since programming demographics really are skewed towards men, our survey reflects the reality on the ground (more on that later), at least according to other similar surveys.
Another, equally valid point of view is that many people use our survey to make important technical decisions, and that by under-representing women (along with other minoritized demographics) in the survey, we're essentially denying them a voice in these decisions.
Our way of squaring that circle is by making it easier to highlight those minoritized voices. Currently, you can filter any result by gender, race, etc. using our API, but we hope to expose that feature through the survey results site itself in time for next year's survey. We are also working on charts that highlight any deviation from the norm within specific subgroups (e.g. "female respondents are 13% more likely to have used Svelte compared to the general population").
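To make that kind of "13% more likely" statement concrete, here is a minimal sketch of the arithmetic behind it. This is not our actual charting code: the function name and all the counts below are hypothetical, chosen only so the numbers come out round.

```python
def relative_difference(subgroup_users, subgroup_total, all_users, all_total):
    """How much more (or less) likely a subgroup is to have used a tool,
    relative to all respondents, expressed as a fraction (0.13 means +13%)."""
    if subgroup_total <= 0 or all_total <= 0 or all_users <= 0:
        raise ValueError("totals and overall user count must be positive")
    subgroup_rate = subgroup_users / subgroup_total  # usage rate inside the subgroup
    overall_rate = all_users / all_total             # usage rate across everyone
    return subgroup_rate / overall_rate - 1


# Hypothetical counts, for illustration only (not real survey data):
# 113 of 1,000 respondents in the subgroup used the tool,
# versus 2,000 of 20,000 respondents overall.
lift = relative_difference(
    subgroup_users=113, subgroup_total=1000,
    all_users=2000, all_total=20000,
)
print(f"{lift:+.0%}")
```

Note that this compares rates, not raw counts, which is what lets a small subgroup stand out even when it makes up a tiny share of total respondents.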
I said that our survey demographics probably reflect reality, but I want to amend that a bit.
Something I also address in that State of CSS post is that our data collection methods do carry their own biases. For example:
- We are both white men, so our own personal social networks will have an over-representation of people just like us.
- Social networks inherently have an over-representation of white men to begin with.
- On top of that, re-submitting the survey to the same base of participants year after year only compounds whatever biases existed at the start.
So yes, we are actively trying to counteract those biases.
For example, one thing we've started doing is featuring "picks" submitted by various people from the community. This lets us highlight people who would otherwise not be part of the survey, and hopefully that will translate into more outreach to more diverse online populations eventually.
I've also emailed numerous organizations dedicated to promoting women or minorities who code, but have yet to get a single answer back. This is not surprising, as these organizations are probably deluged by requests from people who want access to their audience to promote their own products.
But this highlights something most people don't really take into consideration: these problems are extremely hard to solve.
The people affected by these biases are already over-solicited, and it's not their responsibility to fix them anyway. So yes, we're taking on that responsibility, but it's just not going to be an overnight process.
Another valid criticism. In order to address it, we spent a lot of time making both the survey questions and the survey results translatable into other languages thanks to a handful of awesome volunteers.
Again, though, network effects play a role here, and it'll take time until the survey is reliably distributed throughout non-English networks.
But let me point something out. According to Stack Overflow, the U.S. is also the most diverse country on the planet when it comes to gender in the programming community.
So as our survey becomes more diverse geographically, it will paradoxically become less diverse gender-wise (at least in appearance).
This is not a reason not to improve things on both fronts in parallel, but just one more example of why things aren't always as simple as they seem.
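Here is a small worked example of that paradox. Everything in it is an assumption made for illustration: the two-region split, the respondent counts, and the per-region shares of women are hypothetical and are not taken from our data or from Stack Overflow's.

```python
def overall_share(regions):
    """Weighted share of women across regions.

    `regions` maps a region name to (respondent_count, share_of_women).
    """
    total = sum(count for count, _ in regions.values())
    women = sum(count * share for count, share in regions.values())
    return women / total


# Hypothetical numbers: the share of women within each region never changes.
before = {"US": (5000, 0.12), "Elsewhere": (5000, 0.06)}
# Better international reach: four times as many respondents from elsewhere.
after = {"US": (5000, 0.12), "Elsewhere": (20000, 0.06)}

share_before = overall_share(before)
share_after = overall_share(after)
print(f"before: {share_before:.1%}, after: {share_after:.1%}")
```

In this sketch nothing gets worse inside any single region, yet the headline figure drops simply because the mix of respondents shifted towards regions with a lower share.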
We consider a few factors when deciding which items will officially be part of a survey:
- How many write-in mentions it got the past year.
- How popular it is on GitHub (as a proxy for overall popularity).
- Community input.
Given that we try to limit each category to 10 items max (otherwise the charts get too big!) you can imagine that this is a tough process.
We also try to "prune" projects that seem on the way out to avoid a negative pile-up. For example, we don't feature Backbone or Knockout in the survey even though they're still widely used in legacy codebases, because that's not what the survey is about.
At the end of the day we do have to make some choices though, and there's definitely some arbitrariness to it. Maybe in the future we'll settle on a better, more objective system, but for now this is the best we've got.
We get this one a lot, and it's true we don't work as statisticians professionally. Nor do we pretend to be.
Our approach is simple: we try to be as transparent as possible throughout the whole process, be open to feedback, and do our best with the time we have (we both have day jobs).
- Our survey creation process is open to feedback.
- All our code is open-source.
- We have a public GraphQL API.
- We make our entire dataset available publicly.
- We have a public Discord.
As I said, we both have day jobs, and while we would love to be able to work on the surveys full time, we're not quite there yet.
But working on this full time means finding a way to monetize the project. While many of us are used to benefiting from free, ad-free stuff online, the truth is that much of that free stuff is subsidized by Facebook, Google, Microsoft, and other large companies. This is great, but it can also create unfair expectations for independent creators.
So yes, we do have ads in the form of "Recommended Resources" links at the bottom of each page. I want to be clear though that we do not accept sponsorships from any of the companies listed in the survey itself, except in the Resources section. We consider that this section is not the core of the survey, and as such this small conflict of interest is an acceptable trade-off.
Update: I'm adding this after seeing Jeremy Wagner's great observations on this issue.
So this criticism is completely valid, and thankfully there are other surveys such as the HTTP Archive Web Almanac which do a great job of addressing this.
Bad impressions are hard to shake, and I don't necessarily expect people who have already made up their minds about the survey to reconsider.
But if you're still on the fence, or maybe would like to help us deal with some of these very real issues, then thanks for taking the time to read this.
It's not your job to help us fix our house, but if you're passing by and would like to pitch in then it'll be greatly appreciated!