How cool would it be to have a logistic regression model with variables like number of developers, average years of experience, language, test coverage, bug density, etc. over all GitHub projects, stratified by stars? One can only dream.

It would be even better to have that data for non-open-source projects :).

I wonder how different the OS vs. non-OS data would look.

