Sponsored by
First, the exciting news: I am delighted to share that I started working with the Open Source Collective and Ecosyste.ms teams on this research, and the Open Source Collective is the sponsor of the following updates.
Hopefully more to come on this collaboration, so stay tuned!
Unresolved question: How should open technologies be financed?
When it comes to open technologies, such as open source software, there is a long-standing debate about how to finance them and by whom.
My humble conclusion is straightforward. Expecting thousands of companies to contribute back to the ecosystems they consume without coordination is a hopeless dead end. Open technologies fall under the public goods category; therefore, we should finance them with public money, similar to public roads, bridges, or libraries.
However, the nature of open technologies is pushing us to innovate in our funding models. Even though we should recognize them as a new type of digital public good, they are produced by the private sector, a combination that is happening for the first time on a large and continuous scale.
This unique state brings us a new challenge: developing a scalable public funding structure to finance an entire, ever-changing market.
Last year, I started an Agile Public Fund experiment to dive into the practical side of this challenge: to study and demo a funding algorithm that distributes a certain amount to open source initiatives based on criticality and usage metrics. The journey continues with new allies, some updates, and more challenges to come.
As usual, your feedback is priceless; don't hesitate to get in touch with any comments, questions, or ideas.
Updates
I had a chance to improve the data, process, and algorithms in the last couple of weeks. You can see all the changes on the Open source public fund experiment document on Google Sheets.
TL;DR: I was able to extend the Criticality Score algorithm with usage metrics from the Ecosyste.ms API and apply it to all open source accounts on Open Collective, so we have a new ranking now! I also made it possible to change the weight of each parameter so that you can try the algorithm yourself.
Open Collective data refresh
Refreshed the accounts list from Open Collective API and included the country and yearly budget in the data. There are now 4729 accounts with a code repository.
Criticality Score's latest version
Updated my local Criticality Score repository to the latest version. The new version is written in the Go language instead of Python, and it has a second algorithm that calculates the score by getting the "dependent count" data from the Open Source Insights (deps.dev) API.
Manual score calculations
Recreated all formulas for calculating the scores under the "Criticality Score - Results" sheet. Now it is possible to play around with the weight of each parameter and see the new results directly within the document.
New config with the Ecosyste.ms data
Decoupling the data collection from the score calculation made it easy to extend the data with other custom resources. Thus, I retrieved each repository's "dependent_repos_count" data from the Ecosyste.ms API and created a new algorithm configuration in which it replaces deps.dev's "dependent count" parameter.
Now there are three different algorithms for score calculation, and you can see their parameters under the "Criticality Score - Config" sheet:
- original_pike - Yellow: The default algorithm from the Criticality Score.
- pike_deps.dev - Green: The second algorithm from the Criticality Score, which includes the "dependent count" from deps.dev as an extra parameter.
- ecosyste.ms - Blue: The new algorithm, which uses the "dependent_repos_count" data from the Ecosyste.ms API in place of deps.dev's "dependent count" parameter.
Changing a parameter under the Config sheet updates the scores under the Results. Feel free to copy the document and try it yourself.
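For readers who prefer code to spreadsheet formulas, here is a minimal sketch of how a score of this kind can be computed. It follows the formula published with the Criticality Score project (a weighted sum of log-scaled signals, each capped by a threshold). The parameter names, weights, and thresholds in the example are illustrative assumptions, not the values from the Config sheet.

```python
import math


def criticality_score(signals, config):
    """Compute a Pike-style criticality score between 0 and 1.

    signals: {parameter name: raw value} for one repository.
    config:  {parameter name: (weight, threshold)}, thresholds must be > 0.
    A missing signal is treated as 0.
    """
    total_weight = sum(weight for weight, _ in config.values())
    weighted_sum = 0.0
    for name, (weight, threshold) in config.items():
        value = signals.get(name, 0)
        # Log scaling dampens huge values; the threshold caps each ratio at 1.
        ratio = math.log(1 + value) / math.log(1 + max(value, threshold))
        weighted_sum += weight * ratio
    return weighted_sum / total_weight


# Hypothetical configuration: names, weights, and thresholds are made up.
example_config = {
    "contributor_count": (2.0, 5000),
    "dependent_repos_count": (1.0, 500000),
}
example_score = criticality_score(
    {"contributor_count": 50, "dependent_repos_count": 1000}, example_config
)
```

Changing a weight in `example_config` plays the same role as editing a cell under the Config sheet: the score is recomputed from the same raw signals.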
Stats
Under the new Stats sheet, you can see a quick overview of the data in the other sheets:
- General stats of the Open Collective accounts and repositories
- Budget stats of the accounts
- The languages of the repositories
- The licenses of the repositories
- The countries of the accounts
- The ecosystems of the repositories
- Top 10 repositories of each algorithm
Winners
This sheet shows the history and the details of the three winners I randomly choose each month to test the algorithm.
Budgets
I plan to follow the budget changes of the Open Collective accounts over time under this new sheet.
The process and the helper tools
Below is a brief list of actions to prepare the data and the final results. One critical remark is that currently, the process only works with GitHub repositories, so I exclude non-GitHub ones—hopefully, a detail to improve in the future.
- Retrieve the accounts data from Open Collective API to create Accounts and Budgets sheets,
- Call GitHub API to find the most starred repositories of each GitHub user, which you can find under the "GitHub - Top repos" sheet,
- Run the Criticality Tool to get the data points of each repository, which are stored under the "Criticality Score - Results" sheet,
- Call the Lookup endpoint of the Ecosyste.ms API to get the additional "dependent_repos_count" data of each repository and combine it with the other parameters under the "Criticality Score - Results" sheet,
- Once the data is in place, the existing formulas in the "Criticality Score - Results" sheet calculate the scores.
- Last, I will be updating the data and the scores once a month.
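The data preparation steps above can be sketched roughly as follows. This is not the actual helper tooling, only an illustration under stated assumptions: `lookup` is a hypothetical callable standing in for the Ecosyste.ms Lookup call, and the `repo_url` row key is made up. The `stargazers_count` and `html_url` fields are the ones GitHub returns for repository objects, and `dependent_repos_count` is the field named above.

```python
GITHUB_PREFIX = "https://github.com/"


def is_github(url):
    """The process currently handles GitHub repositories only."""
    return url.startswith(GITHUB_PREFIX)


def top_starred(repos):
    """Pick the most starred repository from a list of GitHub repo objects.

    Returns None when the user has no repositories.
    """
    return max(repos, key=lambda r: r.get("stargazers_count", 0), default=None)


def add_dependent_counts(rows, lookup):
    """Attach 'dependent_repos_count' to each result row.

    rows:   list of dicts with a (hypothetical) 'repo_url' key.
    lookup: hypothetical callable, repo URL -> dict payload or None.
    Repositories without a match get a count of 0.
    """
    merged = []
    for row in rows:
        payload = lookup(row["repo_url"]) or {}
        count = payload.get("dependent_repos_count", 0)
        merged.append({**row, "dependent_repos_count": count})
    return merged
```

Passing the lookup in as a function keeps the data collection decoupled from the merge step, so a cached file or another data source can be swapped in without touching the rest.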
Stats overview
Here are some stats that stand out:
- The total number of Open Collective accounts with a code repository is 4729.
- 96.32% of those accounts use GitHub as their code repository.
- However, 15.38% of the GitHub usernames are not valid or don't exist, which is a massive number. It would be handy if Open Collective could add a code repository/link verification method or ask the users to update their profile details occasionally.
- 81.62% of the accounts use USD as their currency. Their yearly budget is about $15 million, and the average is $3,887 per account, probably not even one percent of the ideal figures.
- The language of almost one-third of the repositories is JavaScript. Python comes second, and PHP is third.
- MIT dominates the list of 27 different licenses with 41.91%. 15.41% of the repositories don't have any license.
- Regarding the accounts' countries, the United States leads the list with 11.4%. China follows with 2.9%, the United Kingdom at 2.8%, Germany at 2.7%, and India at 2.5%.
- The Ecosyste.ms search returns 1453 matches out of 3336 unique repositories. Out of this data, npm is the top ecosystem with 46.94%, go is the second with 13.35%, and PyPI is third with 9.70%.
- And last, here are the top five repositories with the highest criticality score based on Ecosyste.ms config:
You can see the full ranking of each algorithm and more under the Stats sheet.
Winners overview
I randomly choose three open source collectives from the accounts list every month to test the algorithm results and reach out to the open source entrepreneurs. I distribute an amount to each collective based on their scores.
There have been 19 rounds since the start, and using the experiment as my excuse, I have proudly invested $4259 in 57 open source collectives.
As a side experiment, I determine the amount to distribute based on new social media followers. Six months ago, I included Mastodon and LinkedIn next to Twitter, and I have gained 178 followers across all three platforms since then.
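To make the distribution step concrete, here is a minimal sketch of one way to split a round's amount among the chosen collectives. It assumes a simple proportional rule (each share equals score divided by the sum of scores), which is my illustration rather than the exact formula in the sheet. The function name and the use of integer cents are also assumptions.

```python
def split_by_score(total_cents, scores):
    """Split an amount among collectives in proportion to their scores.

    total_cents: amount to distribute, in integer cents.
    scores:      {collective name: non-negative score}.
    Leftover cents from rounding down go to the largest fractional parts,
    so the shares always add up to total_cents.
    """
    score_sum = sum(scores.values())
    if score_sum <= 0:
        raise ValueError("at least one score must be positive")
    exact = {name: total_cents * s / score_sum for name, s in scores.items()}
    shares = {name: int(value) for name, value in exact.items()}
    leftover = total_cents - sum(shares.values())
    by_fraction = sorted(exact, key=lambda n: exact[n] - shares[n], reverse=True)
    for name in by_fraction[:leftover]:
        shares[name] += 1
    return shares
```

Working in cents avoids handing out fractions of a cent, and the leftover step guarantees that the full amount is distributed.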
What's next?
In short, a lot! Besides telling a more compelling story about why we should invest in open technologies and why public money is the best option to achieve that, there are many practical items on the list:
- Most under-appreciated: Find the accounts with the highest score and minimum yearly budget,
- The more permissive, the better: Categorize permissive and copyleft licenses and add a license parameter to the algorithm as an experiment (feedback is welcome),
- Repositories vs. releases: Combine the repository data with their release information and improve the algorithm by including the release metrics.
- National public funds simulation: Categorize the accounts based on each country and simulate fund distribution per country.
Thanks for tuning in, and I hope you enjoy the ride as much as I do. Wish me luck, and see you next time!
Top comments (8)
You write:
The only reason this is true for the moment, is because we've been tricked.
Once upon a time, the OSI was actually honest about all this... web.archive.org/web/20060411080543...
The easy solution for all of this is:
OSI get out of the way of developers and language, and focus on "OSI Approved Open Source"
Once the ideology of the anti-property folks that run the FSF is thus disconnected from the discussion, we simply write new licenses that prohibit big business from using open-source software for free. Allocation of Capital is one of the core functions of big business. They pay every single other vendor they have, they just need to pay for their open-source software.
This is easy to solve.
Ewan, thanks for the input!
I advocate for a permissionless economy to maximize freedom and innovation. The long-term goal should be to make digital goods available to everyone so anyone can freely use, study, and improve them until they start monetizing based on that work.
The businesses can/should finance the open technologies by paying taxes (potentially a special tax the moment they generate revenue, like VAT) instead of paying for every digital good in advance, before consumption.
Then we have a healthy economic structure that still rewards the open source entrepreneurs without introducing frictions like paywalls.
However, we must develop a scalable public funding structure to distribute a certain amount to the open source ecosystem. And that's what my humble attempt is about.
Suppose we create new licenses to restrict businesses from using open source software without paying. In that case, we again introduce paywalls that limit the innovation, freedom, and organic growth we have today.
By this logic, requiring businesses to pay employees is introducing a paywall that limits "innovation, freedom and organic growth" and we could allow much more growth if we didn't require businesses to pay for their employees... We can simply require them to work and avoid this "paywall that limits innovation".
We live in our time, not the past, nor the future. At this time, in the mixed market economies that represent more than 95% of the population of Earth, corporations play a substantial role in capital allocation. Why should they allocate capital to every other vendor that they work with, yet somehow the genius who writes the software they depend on should be exempt from resource allocation? Why? They have the resources. They are responsible for allocating them. They've already proven they will not willingly donate. Why shouldn't they simply pay for use like they do for every single other resource that they consume?
How can you imagine that independent developers subsidizing the largest companies on the planet with free labor is a good thing?
Why don't we extend that to everyone?
When are you going to start volunteering to work for your employer for free so that you can remove the "paywall that limits innovation", your salary is their paywall...
I suggest financing open source initiatives with public money through (agile) public funds (that's what this experiment/study is about). The revenue of the public funds will still come from those businesses through taxes.
In other words, the companies should still pay for what they use, not before consuming the product (paywall), but when they generate value in the economy by using that product (paying taxes).
The money should still flow to open source initiatives so they can financially become sustainable or even profitable. But with the public funding case, we can maximize innovation and freedom (no need to add artificial restrictions).
Why should we establish such an economic structure for digital goods like open source software? Because digital goods' reproduction costs are virtually zero, they are naturally abundant; any company/user can have a copy of any software, and that doesn't add any cost to the producer.
That's not possible with an employee's time, which is a limited resource.
I'd happily schedule a call if you wish to discuss the details further.
Feel free to contact me through one of my social media accounts:
Companies operate and pay tax within a jurisdiction, yet companies use software written by developers that live in ANY jurisdiction
If your vision were implemented in reality, on earth, in the 21st century, there would be many jurisdictions that never subscribe to your extra tax proposal. That means the percentage of business activity taxed will be low, and every country that doesn't participate in your tax scheme benefits disproportionately, over the long term impoverishing the people of whatever nation implements your scheme while rewarding the citizens of the nations that don't.
If your idea were actually implemented, it would just be yet another "Linux Foundation" type organization that's allocating resources to people and projects that they like and they need, rather than anything that produces value and thus makes us wealthier. The thing that got Open-Source this far was great coders writing great code, not bureaucrats.
The time of the developer that made the software that you want to encourage them to give away to big business is not infinite - it is finite.
If you want to attack the monetary value of independent open-source developers' products because they are digital, you should also attack the value of every social media company, search company, media company, information company, app store, and anything else that deals with "digital goods", as in your version of reality those are all unnecessarily "paywalled" anytime someone charges a fee for any of them.
You should get rid of all the buzzword misdirection and write clearly.
"Permissionless" == Open
"Paywall" == "Fee"
"Artificial Restriction" == "Restriction"
"Agile public funds". Public funds are the opposite of agile anything.
You raise some good points; thank you!
You seem pretty passionate about this topic, so I invite you to have a call again. We can understand each other better. Have a great weekend!
Please elaborate: these massive companies pay every other vendor they use, for every watt of electricity and every paper cup at the water fountain. How is it a barrier to "freedom" or the creation of "better products" if they pay what is, for them, a nominal fee to use open-source software that is no longer free of charge?
Most big improvements don't originate at these companies anyways, these companies are responsible for allocating capital, and they acquire the smaller companies that make the big improvements.
Earlier you said "any fee" of any kind will limit these improvements... You like to call a fee a "paywall"...
Anyway, do you not see that a tax is also a "fee" and that putting a bunch of bureaucrats in between the collection of the tax and the disbursement of the tax proceeds must siphon off a substantial fraction of the collected fee? Even the most efficient charities struggle to keep a fundraising efficiency ratio below 30%, and most governments are substantially less efficient.
Let's settle that I'm not proposing to remove the fee but instead change the moment the consumers pay it.
Companies/consumers should pay the fee for digital products like software not before consuming the product (paywall) but the moment they create value in the economy when selling their service/product to their customers (paying a dedicated tax).
Why would delaying the payment from before consumption to value creation improve our overall economy?
Imagine the start-ups (especially in developing countries); the new model would allow them to try any available digital product. They don't need to pay anything upfront. They can take time to build their product/service and find their target audience.
However, the moment they make a sale, that's the moment they (or technically their customers) pay a dedicated tax, thus contributing back to the digital public goods they consume. In other words, no successful business, no payment.
It would reduce the cost of building new businesses and boost the start-up ecosystem in any country, meaning a much more competitive market, which is better for us, the consumers.
I considered visualizing this part, but no time at the moment. If I manage to do it later, I may add it here.
Why can we use this model for digital goods but not for physical goods like electricity, paper cups, employee time, etc.?
As I mentioned earlier, we can consume digital goods indefinitely. It doesn't create additional costs if more people consume digital goods; we can have a copy of a software product for anyone who wants to use it.
That difference allows us to use a different financing model. We can increase the overall utility (freedom & innovation) if we allow companies to use digital goods as much as they want and only charge when successful.
Again, that's not a model we can apply to physical goods since they're naturally limited. In that case, we have to charge the fee before consumption.
Last, I agree that governments are not highly efficient in resource allocation. Hence, the goal is to build a public funding model with minimum or no bureaucracy (primarily by relying on existing data, similar to how YouTube or Spotify distributes revenue to content creators).
Let me know if I'm being clear or if you have other questions, and please feel free to follow the conversations. We're in the early stages, but we will need all the feedback we can get. I appreciate your interest!