Discussion on: The broken promise of static typing

View post

Many aspects affect the number of unfixed and solved bugs like:

the number of features of the program.
it's maturity.
It's user base.
The impact of bugs for the users. Does a bug mean billions $ losts, hundred or thousand people killed (think aircraft autopilot or you nuclear power plant software) ?
The team/developers/company behind. Are they reliable, serious, experimented ?

What I see in your graph is that the most used languagues (C++/Java) with the biggest codebases and most features under the software built with them have the most bug per repo, but it seems logical.

Seeing that, it is now quite hard to draw any conclusion from that data alone.

What I surely see is that static typing serve as a mandatory documentation that help both the compiler, the IDE and the developper to reason about the code. There less information available on a typical dynamic language meaning that one has to rely more on alternate solutions but in state of the art tooling, the IDE/compiler typically never catch up. More checks are done at run time and the IDE fail to provide the same quality of tooling and context (auto completion, refactoring, code navigation).

Dan Lebrero • Jun 7 '17

Hi Nicolas,

Thanks for the comments.

The data is from Github which means Open Source code and from tens of thousands of repositories.
I state that the approach is very naive but I am still surprised about the results.

I agree that not all bugs are equal and you shouldn't use the same development practices in all projects.

I would include Ruby, Python and JavaScript in the list of most used languages. I do not which codebases are the biggest or with most features, but Steven McDonnell in the "Code Complete" book says: "the number of errors increases dramatically as project size increases, with very large projects having up to four times as many errors per line of code as small projects"

I think that is the reason why monoliths needs to be split into micro services at some point. My personal experience is that language expressiveness matters

Have you look at the comments by Blaine? They are really interesting.

Cheers,

Daniel

Eljay-Adobe • Oct 6 '17

Thanks for the link to Paul Graham’s Beating the Averages and The Blub Paradox. I run into that all the time.

Trying to convince the other developers (who are very bright people) that there are alternative languages that would be more powerful and suitable for our problem domain invariantly meets with deer-in-headlights blank stares.

Even contemplating alternative languages is outside of most developer’s comfort zone. Or moreso, even outside of capability of consideration. Even as a thought experiment.

When I look at the trends, I see object-oriented programming to continue for the foreseeable future. But, I also think there will be two language idioms that overtake object-oriented programming languages: functional programming languages, and domain specific languages.

I consider Lisp to be a programmer's programming language. An "abstract syntax tree oriented" programming language. Paul Graham's secret super-weapon is safe.

Nicolas Bousquet • Jun 8 '17 • Edited

The most visible correlation in your data is that the more stars a repo has, the more bug there is inside. Also the respective ranking of language change significantly with the number of stars, like java being quite good for all repo, but quite bad by your metric on big repos.

It may be possible there a correlation between language and number of bugs or dynamic/static typing but really the data is not refined enough to remove other variables so concluding anything is impossible from the data.

Sure that language expressiveness matters, it is enough to try to develop anything in assembly vs Java or Lisp and you sure see a higher level language work better. But there expressive languages on both sides and different languages may suit different problem categories too.

My impression is also that huge projects are not often done in dynamically typed languages. I feel like a dynamically typed language may be able to leverage more of the individual productivity and on the contrary are not that great when the code base scale (millions lines of codes).

The number of line of code is not a good metric but it is far better than thinkings all repos are equals, so I would consider bug per LOC. After that is done you could always apply a factor between high level language vs basic one (like C typically needs more LOC than Java).

Dan Lebrero • Jun 8 '17

Thanks for the comments!

I neither do think that the data proves anything, I hope I made that clear in the post. Proving is a big word that I rarely use for anything.

I don't know if you noticed by I linked to the best source of studies on the matter that I found.

Reading your comments, something popped to my mind.

When we talk about huge projects, do you think that we plan from the beginning for huge projects or their start small and grow to be huge? Do you know think is common in the second case to switch languages?

Cheers,

Daniel

Nicolas Bousquet • Jun 8 '17 • Edited

About huge projects do we know in advance? Well I guess it is case by case.

Twitter started basically as a Ruby shop and decided quite some time ago already to migrate to the JVM with Java and scala in particular (and javascript for the client). I don't know but I would say twitter started small.

Now I have colleagues and friends working for the french civil aviation and they decided long ago to make a new version of one of their key component. They started thinking big from the start. And by the way, automatic memory management was a no go as not realtime friendly, meaning many language like Java/Clojure/Lisp are instand no go.

There a saying that if you are a startup, you should go for instant productivity and that you'll always have time and money to rewrite everything if you company is to be successful, but if you are not successful, going more slowly to ensure better architecture, easier to maintain code or better performance doesn't make sense at all.

Some other would say you should use what you master. I think that make a lot of sense it save you time and let you concentrate on more important aspects like finding clients, hiring the right people or creating a business plan...

Most of the companies I worked for are big established company and while there often an emphasis on using the best tool for the job, it is also quite important to use standard tools, ensure you can hire easily and also that new people to a project have a chance to get up to speed. They almost always choose the popular statically typed language with Java, C++ and C. Javascript is now widely used but only because there basically no way to avoid it on the web and for year such company tried many way to go around it: JSF, GWT, doing it all on the server... The dislike of javascript by many IT specialist practice made the web losing years before nice reactive websites because the norm.

Theses companies have technical policies and outside of proof of concept, for anything that may go to production, it has to use allowed technology. For my current company that's C/C++ for most legacy, Java for most new things, Scala/Spark for BigData analysis and a bit of python. That last one being restricted to scripting, small projects that do not need to scale.

I do not necessarily says it is the right way to proceed, but the common practice is to use a statically typed language that has widespread adoption in the industry, and a mature echosystem that help on the productivity.

That being said, I quite remember the arguments of Paul Graham about lisp and how it helped him on his startup.

But even if he criticized it, when Yahoo brought his company, one of the first things they did was to migrate the code from lisp to a statically typed language... The decision was criticized, maybe rightly so, but it show that many people are not that found on dynamically typed languages.

Dan Lebrero • Jun 8 '17

I am personally quite torn about standardisation.

I always wonder what I would do if I created my own company. Would I mandate some popular language or would I allow every team choose whatever they wanted?

I can see a lot of good arguments in both sides and I have seen a lot of talks about the subject, and again, nobody agrees.

It is a little bit paradoxical what you say about the best tool for the job. I have similar experience and I see it as "the best tool within these limited and blessed toolset". When and how do you decide to add a new tool? It is really hard to quantify the value and cost, when we keep saying things like "more maintainable" or "easier to use".

Paul Graham essay is a classic, every developer should read it, not because of Lisp but to be aware of the Blub Paradox. It applies to all of us.

Out of curiosity, what language did your friends choose?

Thanks!

Nicolas Bousquet • Jun 9 '17

I guess for developers like you or me that love our craft, we want to get the most of our time, tooling and libraries. As such we like to have the best of the best, whatever it is.

That the promise of languages like lisp where you can easily build new abstractions that fit the best to solve the problem at hand.

But many things require several or many people either at the same time or over the years... maybe for example you'll not want to devote the next 20 year to the maintenance of the project you did in the past 5-10 years. This is where standardization make sense. If you get better productivity for yourself but the overall productivity drop, that a net loss.

So both aspects are to be taken into account. I would say in a big company, small independent team each completely responsible of its area even including the production make sense and help to scale that productivity. In today world, for many case, just saying your are able to provide VMs in the cloud that are able to respond to some kind of network queries should work and let of freedom in how things are done inside.

But even that doesn't solve everything. The interractions between teams will still dictate many things like what protocol data is exchanged. But also how you managed your database, what is overal architecture, what tools will you use for the continuous integration, QA testing, the cloud you'll use and how your application will skrink and scale dynamically...

There no much we can do alone in a big company if we don't cooperate.

I am convinced the language impact the productivity somewhat, but many other things impact it more. The programing language is a tactician choice, while the bigger things are strategist choices. And while you'll want to delegate the details to great tacticians, you'll want to have great strategists when you are in a big company... If just switching teams mean your employee need 6 months or 1 year before he become fluent in the technology stack, that's a real downside because this is only a small part of the job.