The broken promise of static typing

Dan Lebrero on June 05, 2017

This article originally appeared on IG's blog. I was quite surprised at a recent blog post by Uncle Bob Martin, titled "Type Wars", in which he ...

Great topic! I think your data analysis is fatally flawed, though. The only way to really figure this out is to give similarly experienced programmers a task and see how many errors the solutions in statically typed languages contain vs the dynamically typed solutions. Unfortunately, I'm not aware of any research on that exact topic.

I think Uncle Bob's oversimplified things. On the web, where speed is important and bugs usually aren't very costly, dynamic languages will win.

But in other domains where bugs can be expensive or cost lives (avionics, nuclear power plants, pacemakers, etc), we might want to use languages and tools that help us ensure there are no bugs (or essentially no bugs). 100% test coverage of a dynamic language that Uncle Bob's talking about doesn't mean there are no bugs. And it certainly doesn't mean you've covered every execution path of the code.
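
To make the coverage point concrete, here is a minimal Python sketch. The function `unit_price` and its parameters are hypothetical, invented purely for illustration. The two tests execute every line and both outcomes of each branch, yet the one combination of branches they never run together still crashes.

```python
def unit_price(total, quantity, is_sample, want_per_item):
    """Hypothetical pricing helper with two independent branches."""
    if is_sample:
        quantity = 0  # samples are not counted as sold items
    if want_per_item:
        return total / quantity
    return total


def run_covering_tests():
    """Two tests that together touch every line and every branch outcome."""
    assert unit_price(10.0, 2, is_sample=True, want_per_item=False) == 10.0
    assert unit_price(10.0, 2, is_sample=False, want_per_item=True) == 5.0


def untested_path_crashes():
    """The path (is_sample=True, want_per_item=True) is never run by the tests."""
    try:
        unit_price(10.0, 2, is_sample=True, want_per_item=True)
    except ZeroDivisionError:
        return True
    return False


if __name__ == "__main__":
    run_covering_tests()
    print("untested path crashes:", untested_path_crashes())
```

A coverage tool would likely report 100% for `unit_price` here, because coverage counts lines and branches, not combinations of branches.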

If you look at the software written in Spark/Ada, you can see some really low defect rates. These defect rates are well below anything you could hope to achieve in a dynamic language using TDD, and there is data to back that up. But you end up trading speed for correctness.

Hey Blaine,

Thanks a lot for your comments.

I am not aware of any proper study either. In the article I link to the best source that I found, which contains a list of studies and a summary of each. Unfortunately nothing conclusive there either.

You are right about context being important. The same bug in two different contexts can have very different consequences.

I also agree that code coverage is no proof of 0 bugs, but neither is static typing.

I would love to see some proper studies to back up your statement about Spark/Ada ;)

I think you really nail it with "trading speed for correctness", a very important tradeoff.

Hey Dan,

The best source I have off the top of my head is a great talk by Martyn Thomas: [youtu.be/03mUs5NlT6U]

The whole talk is fascinating but I'll point you to the juicy bits in case you're in a hurry.

2 minutes: defect rate of 810 experienced software developers on > 8000 applications

19:22 minutes: defect rates of 5 projects that used 'correct by construction' software development techniques

32:33 minutes: productivity, cost, defects of the Tokeneer project (zero critical failures found after extensive testing by the NSA).

33:51 minutes: The NSA gave interns with no experience with these techniques the job of adding features to the Tokeneer project and they had amazing results (NSA conclusions at 36 minutes)

38:54 minutes: discussion of a few real-world safety-critical projects developed with these techniques (including defect rates which are fractions of the defect rates for typical projects)

I'm mostly a web guy but I'm really interested in this stuff. I've done a bunch of reading and I'm just beyond a "Hello World" example in Spark (the learning curve is pretty steep compared to picking up Java or something like that).

Anyway my inexperience with Spark/Ada prevents me from being able to tell how honest Thomas is about the benefits and drawbacks of this approach but I'm intrigued all the same.

Cheers.

Hi Blaine,

The talk was really fascinating. Thanks a lot for sharing.

Some thoughts after watching it:

  1. The numbers are pretty impressive.
  2. I don't trust anything that comes from the NSA, but I can trust the other examples ;)
  3. I wouldn't recommend an agile methodology for building aircraft controller software either. I don't want to imagine what "iterating" would mean.
  4. I loved the reasons why SPARK is not being adopted (min 48:30). We are so closed-minded!
  5. Interesting that they removed features from Ada to make it simpler and verifiable. It somehow reinforces my belief that simpler is key to more reliable software.
  6. I am really intrigued about "bounded resource (space and time) requirements."
  7. Did you notice on the SHOLIS slide (min 42:20) the bullet point "Demonstrated low value of unit testing when formal methods used"? Interesting!

You just added another TODO in my very long list of things that I do not know about.

I think it is very laudable for you to be interested and to be learning something so unusual. I recently wrote about my experience with Clojure at my personal blog, maybe you can relate to it.

Thanks again,

Daniel

Hey Dan,

I can relate to your Clojure experience. The Pragmatic Programmer (remember that book?) was right. Learning another language or paradigm affects how you program and how you think about solving problems.

Martyn Thomas has a whole series of lectures and they are all interesting. You might want to check out:

Anyway, I really got interested in this stuff because I'm working in a code base that is full of bugs (who isn't, right?) and I just thought there has to be a better way to develop software so I started asking myself how 'they' make software for safety critical applications that doesn't break and isn't full of bugs.

The traditional advice is to turn up the compiler/interpreter warnings. Then you add static analysis. And now in PHP 7.1 you have optional strong typing so you convert your code base to run on PHP 7.1 and you do some of that. And you write unit tests. And once you're good at that you switch to TDD.

And all that stuff is good. It's really good in fact but it doesn't help you if you missed a requirement or a whole class of requirements. It also doesn't help if your requirements are ambiguous or contradictory.

So what we're trying to do is get really fast feedback. If we've got something wrong, we want to fix it as soon as possible because the longer that wrong thing is in your system the more it will cost to fix it. And the next step after everything I mentioned might be formal methods and mathematically verified software. I think of it as an uber-static analyzer in that it automatically verifies certain properties of your code (and annotations).

So you can spend your time writing tests and hoping you catch things or you can spend your time annotating your code in Spark/Ada and let the tools prove it works or you can ship buggy software, which in some cases is the right thing to do.

The real question for me is what if any of these tools and disciplines are appropriate for my role as a web developer?

Most projects spend more than 50% of their budgets testing and fixing defects. Could we spend a fraction of that money up front and do it right the first time by writing software with formal proofs? I don't know the answer yet but I'm working on it.

Cheers.

I will add those two videos to my list for my weekend.

Very interesting thoughts. Let us know how it goes!

Thanks,

Daniel

So it's been a long time since I last coded in SPARK (nearly 10 years now) but it's worth noting a few things:

1) The really low defect rates reported for systems coded in SPARK aren't simply due to the language features, but also down to the "Correctness by Construction" approach, which emphasises getting things right from the high-level requirements all the way through formal specs and into coding, information flow analysis and proofs -- the sooner you find and eliminate the bugs, the less costly removing them is. The language greatly aids this approach due to its static analysis capabilities, but you can improve the defect rates in any language by following a similar approach (not going to get them as low though)

2) By getting rid of the bugs early, you are minimising re-work and (importantly) re-verification when removing them at a later date, so the "speed for correctness" tradeoff isn't as large as you might otherwise expect. Certainly in the domains where you tend to find SPARK (or normal Ada) being used, the cost of testing required for a similar confidence level in other languages can exceed that of the V&V for SPARK.

3) A lot of the applications that demand really low defect rates are in aerospace, defence, etc. You'll see more statically typed languages in this arena because of their amenability to verification, but you are unlikely to see these projects pop up on GitHub. That's an understandable limitation of the approach in the original post.

4) There's some good info in this set of slides from Rod Chapman of Praxis about real-world applications, including defects per KLOC: asq509.org/ht/a/GetDocumentAction/... (NB: Praxis developed SPARK from earlier work at the University of Southampton, and is now part of Altran)

5) Even proof of partial correctness doesn't negate the need for testing. Proof of freedom from run-time exceptions (e.g. demonstrably no buffer overruns) is less time-consuming, but of great value.

Finally, I believe that Tony Hoare's quote was also used in the preface of "High Integrity Software: The SPARK Approach to Safety and Security" which is pretty much the text for SPARK :-)

Thanks a lot for sharing your experience with SPARK!

I agree that bugs are one of the worst cases of wasted time, especially if, by the time they are found, we have already context-switched, which is usually the case, sometimes by many weeks.

Given that you are not using SPARK anymore, may I ask what your experience has been since then? Have you tried to convince your teams to use it?

Cheers,

Dan

I like how you go on to say that the analysis is "fatally flawed", but don't explain how. If the author did some kind of data manipulation that favored one language (or paradigm) over another, that would be a sign of being "fatally flawed", but it seems like there's no other explanation for the observed data other than the author's conclusion. Obviously, it doesn't meet scientific standards for being conclusive, however, why should we prefer an opposite statement (that static types are more bug-free) by default over the author's statement (that more simplistic languages are more bug-free)? Clearly, the data is in favor of the latter statement.

"100% test coverage of a dynamic language that Uncle Bob's talking about doesn't mean there are no bugs. And it certainly doesn't mean you've covered every execution path of the code."
This seems to be very far from what the author was talking about. I think you might have misunderstood the article.

fatally flawed

Here are some alternate explanations of the data, all hypotheticals that should be considered before OP's conclusion is accepted as an accurate interpretation of his data:

  1. Practitioners of different languages have different reporting habits: they call different things "bugs", they report with varying frequencies, they tend to not care about reporting bugs as much as building the next feature, etc.

  2. Bugs are different sizes, so while Haskell and Python might both have "1 bug", the cost of that bug could vary wildly.

  3. There's a ratio of "bugs per feature", so more productive languages show up as more buggy.

  4. Bugs are labeled differently, i.e. perhaps Haskell projects tend to have nice "bug" labels just because static typists are more OCD about it, whereas a Python project might have a million bugs, but no one labeled them as such. (related to my bullet #1)
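
As a rough illustration of points 1 and 4, here is a small Python sketch of how reporting and labelling habits alone could change what a tracker shows. The function name, the rates and the defect counts are all made up for the example; none of them come from the article's data.

```python
def labelled_bug_count(actual_defects, report_rate, label_rate):
    """Bugs visible in the tracker: real defects, times the share that gets
    reported, times the share of reports that gets a "bug" label."""
    return round(actual_defects * report_rate * label_rate)


# Made-up numbers: both projects contain the same number of real defects.
project_a = labelled_bug_count(200, report_rate=0.5, label_rate=0.8)
project_b = labelled_bug_count(200, report_rate=0.25, label_rate=0.2)

if __name__ == "__main__":
    print(project_a, project_b)
```

Under these assumed rates the two projects would look very different in a count of issues labelled "bug", even though their underlying quality is identical.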

I agree with others in the comments, in order to appropriately draw up causal relationships, one would need to construct an appropriate experiment. Double-blind-placebo-controlled-randomized might be a bit tough to construct, although the closer to that one could be, the better.

Perhaps one could construct a randomized crossover though, and that would finally lend some actionable insights into the problem?

I'm not the GP, but I'm guessing that he was thinking of the fact that this could simply mean that more bugs are reported on F# projects, not that more bugs exist.

Many aspects affect the number of unfixed and solved bugs, like:

  • The number of features of the program.
  • Its maturity.
  • Its user base.
  • The impact of bugs on the users. Does a bug mean billions of dollars lost, or hundreds or thousands of people killed (think aircraft autopilot or nuclear power plant software)?
  • The team/developers/company behind it. Are they reliable, serious, experienced?

What I see in your graph is that the most used languages (C++/Java), with the biggest codebases and the most features in the software built with them, have the most bugs per repo, which seems logical.

Seeing that, it is now quite hard to draw any conclusion from that data alone.

What I do see is that static typing serves as mandatory documentation that helps the compiler, the IDE and the developer to reason about the code. There is less information available in a typical dynamic language, meaning that one has to rely more on alternate solutions, but even with state-of-the-art tooling, the IDE/compiler typically never catches up. More checks are done at run time and the IDE fails to provide the same quality of tooling and context (auto completion, refactoring, code navigation).

Hi Nicolas,

Thanks for the comments.

The data is from GitHub, which means Open Source code, and from tens of thousands of repositories.
I state that the approach is very naive but I am still surprised about the results.

I agree that not all bugs are equal and you shouldn't use the same development practices in all projects.

I would include Ruby, Python and JavaScript in the list of most used languages. I do not know which codebases are the biggest or have the most features, but Steve McConnell in the "Code Complete" book says: "the number of errors increases dramatically as project size increases, with very large projects having up to four times as many errors per line of code as small projects"

I think that is the reason why monoliths need to be split into microservices at some point. My personal experience is that language expressiveness matters.

Have you looked at the comments by Blaine? They are really interesting.

Cheers,

Daniel

The most visible correlation in your data is that the more stars a repo has, the more bugs it has. Also, the respective rankings of the languages change significantly with the number of stars, like Java being quite good across all repos, but quite bad by your metric on big repos.

There may be a correlation between language and number of bugs, or between dynamic/static typing and number of bugs, but really the data is not refined enough to remove other variables, so concluding anything from the data is impossible.

Sure, language expressiveness matters. It is enough to try to develop anything in assembly vs Java or Lisp and you will surely see that a higher level language works better. But there are expressive languages on both sides, and different languages may suit different problem categories too.

My impression is also that huge projects are not often done in dynamically typed languages. I feel like a dynamically typed language may be able to leverage more individual productivity but, on the contrary, is not that great when the code base scales (millions of lines of code).

The number of lines of code is not a good metric, but it is far better than thinking all repos are equal, so I would consider bugs per LOC. After that is done you could always apply a factor between high level languages and basic ones (C typically needs more LOC than Java, for example).
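
A minimal Python sketch of that normalisation could look like the following. The `Repo` record, the helper names and especially the verbosity factors are assumptions made up for illustration; they are not measurements and not part of the article's data.

```python
from dataclasses import dataclass

# Hypothetical verbosity factors relative to Java. Placeholder values only.
VERBOSITY = {"C": 2.0, "Java": 1.0}


@dataclass
class Repo:
    name: str
    language: str
    bug_issues: int
    lines_of_code: int


def bugs_per_kloc(repo):
    """Bug-labelled issues per thousand lines of code."""
    return repo.bug_issues / (repo.lines_of_code / 1000)


def adjusted_bugs_per_kloc(repo, verbosity=VERBOSITY):
    """Same ratio, after scaling LOC down by how verbose the language is assumed to be."""
    equivalent_loc = repo.lines_of_code / verbosity[repo.language]
    return repo.bug_issues / (equivalent_loc / 1000)


if __name__ == "__main__":
    c_repo = Repo("c-lib", "C", bug_issues=50, lines_of_code=100_000)
    java_repo = Repo("java-app", "Java", bug_issues=50, lines_of_code=50_000)
    for repo in (c_repo, java_repo):
        print(repo.name, bugs_per_kloc(repo), adjusted_bugs_per_kloc(repo))
```

With these made-up inputs the C repo looks twice as good as the Java one on raw bugs per KLOC, and the gap disappears once the assumed verbosity factor is applied, which shows how sensitive any ranking would be to the choice of factor.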

Thanks for the comments!

I don't think that the data proves anything either; I hope I made that clear in the post. Proving is a big word that I rarely use for anything.

I don't know if you noticed, but I linked to the best source of studies on the matter that I found.

Reading your comments, something popped into my mind.

When we talk about huge projects, do you think that we plan from the beginning for huge projects, or do they start small and grow to be huge? Do you think it is common in the second case to switch languages?

Cheers,

Daniel

About huge projects, do we know in advance? Well, I guess it is case by case.

Twitter started basically as a Ruby shop and decided quite some time ago already to migrate to the JVM, with Java and Scala in particular (and JavaScript for the client). I don't know for sure, but I would say Twitter started small.

Now, I have colleagues and friends working for the French civil aviation and they decided long ago to make a new version of one of their key components. They started thinking big from the start. And by the way, automatic memory management was a no-go as it is not realtime friendly, meaning many languages like Java/Clojure/Lisp were an instant no-go.

There's a saying that if you are a startup, you should go for instant productivity: you'll always have time and money to rewrite everything if your company is successful, but if you are not successful, going more slowly to ensure better architecture, easier to maintain code or better performance doesn't make sense at all.

Some others would say you should use what you have mastered. I think that makes a lot of sense: it saves you time and lets you concentrate on more important aspects like finding clients, hiring the right people or creating a business plan...

Most of the companies I worked for are big established companies and, while there is often an emphasis on using the best tool for the job, it is also quite important to use standard tools, to ensure you can hire easily and also that people new to a project have a chance to get up to speed. They almost always choose a popular statically typed language: Java, C++ or C. JavaScript is now widely used, but only because there is basically no way to avoid it on the web, and for years such companies tried many ways to work around it: JSF, GWT, doing it all on the server... The dislike of JavaScript among many IT specialists cost the web years before nice reactive websites became the norm.

These companies have technical policies and, outside of proofs of concept, anything that may go to production has to use allowed technology. For my current company that's C/C++ for most legacy, Java for most new things, Scala/Spark for BigData analysis and a bit of Python, that last one being restricted to scripting and small projects that do not need to scale.

I am not necessarily saying it is the right way to proceed, but the common practice is to use a statically typed language that has widespread adoption in the industry and a mature ecosystem that helps productivity.

That being said, I quite remember the arguments of Paul Graham about Lisp and how it helped him at his startup.

But even if he criticized it, when Yahoo bought his company, one of the first things they did was to migrate the code from Lisp to a statically typed language... The decision was criticized, maybe rightly so, but it shows that many people are not that fond of dynamically typed languages.

I am personally quite torn about standardisation.

I always wonder what I would do if I created my own company. Would I mandate some popular language or would I allow every team to choose whatever they wanted?

I can see a lot of good arguments on both sides and I have seen a lot of talks about the subject, and again, nobody agrees.

It is a little bit paradoxical what you say about the best tool for the job. I have similar experience and I see it as "the best tool within this limited and blessed toolset". When and how do you decide to add a new tool? It is really hard to quantify the value and cost, when we keep saying things like "more maintainable" or "easier to use".

Paul Graham's essay is a classic; every developer should read it, not because of Lisp but to be aware of the Blub Paradox. It applies to all of us.

Out of curiosity, what language did your friends choose?

Thanks!

I guess for developers like you or me that love our craft, we want to get the most out of our time, tooling and libraries. As such we like to have the best of the best, whatever it is.

That's the promise of languages like Lisp, where you can easily build new abstractions that best fit the problem at hand.

But many things require several or many people, either at the same time or over the years... For example, you may not want to devote the next 20 years to the maintenance of the project you did in the past 5-10 years. This is where standardization makes sense. If you get better productivity for yourself but the overall productivity drops, that's a net loss.

So both aspects are to be taken into account. I would say that in a big company, small independent teams, each completely responsible for its area, even including production, make sense and help to scale that productivity. In today's world, in many cases, just saying you are able to provide VMs in the cloud that respond to some kind of network queries should work and leave a lot of freedom in how things are done inside.

But even that doesn't solve everything. The interactions between teams will still dictate many things, like what protocol is used to exchange data, but also how you manage your database, what the overall architecture is, what tools you will use for continuous integration and QA testing, the cloud you'll use and how your application will shrink and scale dynamically...

There's not much we can do alone in a big company if we don't cooperate.

I am convinced the language impacts productivity somewhat, but many other things impact it more. The programming language is a tactician's choice, while the bigger things are strategists' choices. And while you'll want to delegate the details to great tacticians, you'll want to have great strategists when you are in a big company... If just switching teams means your employee needs 6 months or 1 year to become fluent in the technology stack, that's a real downside, because this is only a small part of the job.

Hi Nicolas,

You are so right. As Gerald Weinberg said:

"The Second Law of Consulting: No matter how it looks at first, it's always a people problem."

Thanks a lot for your thoughts. It has been a pleasure to have a civilized discussion.

I have started following you on Twitter just in case you decide to start blogging.

Cheers,

Dan

Thanks for the link to Paul Graham’s Beating the Averages and The Blub Paradox. I run into that all the time.

Trying to convince the other developers (who are very bright people) that there are alternative languages that would be more powerful and suitable for our problem domain invariably meets with deer-in-headlights blank stares.

Even contemplating alternative languages is outside of most developers' comfort zone. Or more so, even outside of their capability of consideration. Even as a thought experiment.

When I look at the trends, I see object-oriented programming continuing for the foreseeable future. But I also think there will be two language idioms that overtake object-oriented programming languages: functional programming languages, and domain specific languages.

I consider Lisp to be a programmer's programming language. An "abstract syntax tree oriented" programming language. Paul Graham's secret super-weapon is safe.

There are multiple issues with this measure.

1) GitHub issues cannot really be regarded as issues, because many of them are questions, enhancements, etc. Most of the repos lack proper labeling of issues as "issues", as opposed to having nothing. So you have to manually sift through the issues to identify which is actually an issue.

2) How many of these repos do actual development on GitHub? I can see various compilers, such as the V8 JS library, sitting in a separate repo with only the mirrors on GitHub. So there are no issues tracked on GitHub.

3) I would prefer a complex language with simple features at its heart, such as Scala, as opposed to a simple language with complex features, such as PHP. Because once you master the language (which takes time), the number of bugs can significantly go down.

4) Fewer bugs does not mean a stable language. This is a classic case of correlation not meaning causation. In fact, it can mean the opposite. Languages such as Java have a bigger community and hence more bugs, because there are more people to test them.

5) Static typing was never meant to improve the correctness of programs. It evolved naturally from assembly language, where you just allocate bytes. Dynamic typing, meanwhile, claimed better productivity, which has been proved false over the years.

6) No matter how many sophisticated languages come out, they are always going to lose against human stupidity. They are no match. No language is going to keep you from shooting yourself in the foot. It all depends on various levels of safety and choosing the right tool for the job.

This argument is never ending because you can always find people on either side of the court.

I have also written on this: "Static vs Dynamic typing and why you should care". Do give it a read.

Hi Madusudanan,

Thanks for the comments. My comments on the comments :):

1) Issues labelled "bug". Of course it is GitHub data, so you can trust it as much as you want.
2) I assume the majority of the tens of thousands of repos.
3) I prefer a simple language with simple features :). You claim: "the number of bugs can significantly go down", do you have a serious study to prove that?
4) Agree, but I expect the Scala/Haskell/F#/Clojure/Erlang/Go communities to be roughly similar in size.
5) Maybe it wasn't meant to, but it is now "common knowledge".
Productivity claims are yet another can of worms, for which I have never seen a proper study to prove anything. I see that you link to a StackOverflow question, but I don't see any study mentioned in the answers. Where is the proof?
6) Completely agree. Better not to give those idiots a gun to shoot themselves in the foot.

Do you think the argument would end if we had enough data? Would we ever be able to have the data?

Nice read, but I no longer believe in static vs dynamic discussions :)

Hi Dan,

Thanks for taking the time to reply.

My comments below.

I prefer a simple language with simple features :). You claim: "the number of bugs can significantly go down", do you have a serious study to prove that?

I do not have any proof. But I say that from my experience with PHP and Scala. For example, PHP does not have an O(1) hashmap. There are several other examples that I can give.
The fact is, a simple language does not give you all the tools that you need. You can develop any way you want to. This is the problem now with the JS ecosystem. There is a lack of standard frameworks and tooling. Sometimes it is better to be opinionated.

I see that you link to a StackOverflow question, but I don't see any study mentioned in the answers. Where is the proof?

Since we are dealing with opinions and not lemmas, I don't think we can prove either of them. But what we can do is gather collective opinions. I believe Stack Overflow is a very mature community to discuss such issues.

Productivity claims are yet another can of worms, for which I have never seen a proper study to prove anything

I use a strategy called proof by contradiction. Dynamically typed languages claim to be better at programmer productivity but from my experience of coding, they never work out for large teams. Hence I do not see what they are good for. You can neither prove this nor disprove this.

Completely agree. Better not to give those idiots a gun to shoot themselves in the foot.

What I meant by this is that no matter how good a language you have, there is always the need for code reviews. Nothing beats that.

Do you think the argument would end if we had enough data? Would we ever be able to have the data?

I don't think that is the question we need to be asking. We have had several years of discussion around this and there is nothing conclusive. So I would choose whatever language works out well for the team. This is of course based on several factors such as community/library maturity, ease of learning, etc.

One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies.

I would go with the latter any day. When it comes to languages/frameworks, people tend to be overly creative. Instead of creating their own, people can always contribute back to open source/existing stuff, and there is a very strong relation between the maturity of the language and the maturity of its libraries. An example would be the JVM ecosystem vs the Ruby one.

Hi Madusudanan,

I see that when we talk about simplicity, we are not using it in the same way. In the article I mean "simple" as per Rich Hickey's definition, which he explains in this one hour presentation.

Funny that you need one hour to explain what simple means :).

About large projects, I haven't personally worked in any large codebase for the last ten years of my career, as I have been doing SOA/Microservices.

What do you think about microservices? Isn't it a way of avoiding big complex codebases?

We are leaning towards the same definition of simple. But where we differ is how it is implemented in languages. You should take a look at Martin Odersky's talk "Working hard to keep it simple". A language being complex to learn is not necessarily the same as it being complex to use in applications. There are some languages that get this right, such as Scala.

What do you think about microservices? Isn't it a way of avoiding big complex codebases?

I don't think microservices are the only answer. Services can be split if they naturally have a boundary. In my experience, if the application talks to the same database schema, then it would be unnecessary overhead to create everything as a microservice, i.e. instead of database calls we would now be using REST APIs. HTTP, while being a good protocol, is definitely not a replacement for regular method calls within a language. I say this because most microservices discussions end up at splitting everything as small as possible, which is definitely not the goal of microservices.

But it can definitely help to a great extent if done right.

Of course my experience is limited and I am not an expert in any way. Just my 2 cents.

This is interesting. I'm normally of the static typing camp, but I get it. It's one thing to write code that a computer can understand, and it's a whole other thing to write code that another person can understand. More easily understood code = less chance for bugs.

When I first started programming, it was in C++. I then learned Java in college, as well as VB.Net and C#. My first few programming jobs were .Net, and now I'm working in Python. Every once in a while I'll fire up Visual Studio at home, and do a little C#, and the difference between it and Python is quite interesting. I'm definitely not working on the same kind of projects between work (Python) and home (C#), but there's some things that I really wish were easier in .Net. For example, to make an HTTP web request in Python is maybe - maybe - half a dozen lines. The same thing in C# is probably 1.5x to 2x as much code. Same sort of deal for something like reading/writing a file.
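
For a sense of scale, a minimal sketch of such a request using only the Python standard library might look like this. The helper name `fetch_json` is made up, and the URL is just a placeholder endpoint that returns JSON; any real service would need its own error handling.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder endpoint; any URL that returns JSON would do.
    users = fetch_json("http://jsonplaceholder.typicode.com/users")
    print(len(users))
```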

That being said, there's some definite proverbial rat's nest code that I've seen in Python. All the manual type checking in the world won't save you from bugs if you aren't being smart about things.

I wonder how much of this is also the experience of the programmers. I don't really know any Go, Scala, Haskell, or Erlang programmers. Perhaps these languages attract a more mature programmer? I wonder if that Stack Overflow data would show a correlation between languages used and years spent programming...

Hey, regarding calling REST API from C# take a look at DalSoft.RestClient

dynamic client = new RestClient("http://jsonplaceholder.typicode.com");

await client.Users.Get();

or try using HttpClient

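// Assumes `uri` is defined elsewhere and this runs inside an async method.
var httpClient = new HttpClient();
var content = await httpClient.GetStringAsync(uri);
// Parsing is plain CPU work, so the Task.Run wrapper is not needed.
return JsonObject.Parse(content);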

Hope it helps!

Hi Joel,

In Rich Hickey's terms, I would say Python is an easy but not simple language.

Also, a lot of people seem to miss it, but Go is a statically typed language. So maybe it is not about dynamic vs static, maybe it is complex vs simple.

I would recommend watching the Rob Pike and Rich Hickey presentations that I link in the article.

Thanks for reading!

Daniel

I like Python. The main problem with Python, though, is that although it favors simplicity ("Simple is better than complex."), most of the time you end up with anti-patterns and complex code.

It assumes that we are all responsible programmers and we are not messing about.

In practice I've seen this violated many times in favor of convenience.

I can't shake the idea that simplicity is always measured through a lens.
C is simple: from the perspective of the hardware, C is conceptually close.
Haskell is simple: from an equational perspective, Haskell is conceptually close, values are mapped from domains to codomains (even IO and state is modeled this way).

What if Haskell, Scala, and F# developers are super proactive about reporting bugs? What about other labels? In the F# compiler and IDE tools repository, "regression" is also used. What about repositories which could naturally have a high bug count, but can't be measured the way you chose? Clojure, Ruby, and Scala compiler repos don't have issues on GitHub, for example. F# and Golang do. There are so many other questions surrounding methodology that I have.

I don't think that you can begin to draw any conclusions from this.

Are you suggesting that Java and C++ devs are super duper proactive? :)

The approach is naive, but I was really expecting that there was going to be some significant difference between "properly" typed languages and the rest.

The best explanation that I have found so far is from Bartosz Milewski. Excerpt from his book Category Theory for Programmers:

"Strong static typing is often used as an excuse for not testing the code [...] The result of this cavalier attitude is that in several studies Haskell didn’t come as strongly ahead of the pack in code quality as one would expect."

So maybe strong static typing plus proper testing is the answer to less bugs.

What do you think?

I'm not suggesting that some groups are more proactive than others - just offering a question that one could draw from that data. Put differently, what if Clojure developers were less annoyed by bugs than C++ developers? I see that conclusion as just as valid as those you've drawn.

I agree that static typing is not an excuse for skipping tests (even though many folks in the FP community would say otherwise...). Types can certainly eliminate a class of problems if used well, but they're certainly not a silver bullet. Tests guard against change. For any decently-sized project, you need tests to protect your code against yourself :).

"you need tests to protect your code against yourself" -> that made me laugh!

By the way, F# is the only thing that has made me ever consider looking at the Microsoft stack.

Thanks a lot!

I find a troubling back-of-the-envelope correlation between language popularity/adoption level and number of bugs found. I think some of this can be explained by more eyes on the code (and users) and more inexperienced contributors.

Who is mostly writing Go code? Experienced enthusiasts and those with a staked investment in the language. Who is writing C++/Java/Python? Pros, but also new programmers who are flailing around trying to make a mark and learn real lessons after 50 cumulative hours programming.

Popularity may skew the data, but feel free to compare just the "experienced enthusiasts" languages: Scala/Haskell/F#/Clojure/Erlang/Go.

This is, however, a well-known bias in data analysis. There might be a hidden phenomenon that explains most of the correlation.

In geography, for instance, you must always be careful not to simply redraw a population map, because a high number of occurrences often happens in highly populated places.

In your case, the repositories with the most contributions are logically also those that contain the most bug reports. So by calculating your bug density by dividing by the number of repos, what you might actually measure is the number of active contributors.

You could check this by testing whether charting the number of contributors produces a similar graph to your indicator.

And to overcome this you might divide the number of bugs by the number of active contributors to each project. You might also need to filter on a specific timespan, because older projects are expected to have more reported bugs. So only the number of issues and the number of unique contributors from the last year should be taken into account.

By implementing these changes you could have a more robust indicator using the same source.
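As a rough sketch of the indicator suggested above, here is how bugs per repository and bugs per active contributor could be compared in Python. The `RepoStats` fields and the sample numbers are made up for illustration; they are not taken from the article's data, and fetching real counts from GitHub is left out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RepoStats:
    # Hypothetical record: both counts cover the same one-year window.
    language: str
    bugs_last_year: int          # issues labelled "bug" opened in the window
    contributors_last_year: int  # unique contributors active in the window


def bugs_per_repo(repos):
    """Average number of bugs per repository, grouped by language."""
    bugs = defaultdict(int)
    count = defaultdict(int)
    for r in repos:
        bugs[r.language] += r.bugs_last_year
        count[r.language] += 1
    return {lang: bugs[lang] / count[lang] for lang in bugs}


def bugs_per_contributor(repos):
    """Bugs divided by active contributors, grouped by language.

    Simplification: someone active in two repos is counted twice.
    """
    bugs = defaultdict(int)
    people = defaultdict(int)
    for r in repos:
        bugs[r.language] += r.bugs_last_year
        people[r.language] += r.contributors_last_year
    return {lang: bugs[lang] / people[lang] for lang in bugs if people[lang] > 0}


# Invented sample: language X has busier repos than language Y.
sample = [
    RepoStats("X", bugs_last_year=30, contributors_last_year=10),
    RepoStats("X", bugs_last_year=10, contributors_last_year=10),
    RepoStats("Y", bugs_last_year=6, contributors_last_year=2),
]

per_repo = bugs_per_repo(sample)
per_contributor = bugs_per_contributor(sample)
print(per_repo)         # {'X': 20.0, 'Y': 6.0}
print(per_contributor)  # {'X': 2.0, 'Y': 3.0}
```

With these invented numbers, X looks worse per repository but better per contributor, which is the kind of flip the normalisation is meant to expose.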

Interesting points. I will see if I can get that data.

Thanks for the idea!

Another thing you might look at along these lines is to subtract bug reports submitted by contributors to the project, so as to try to distinguish (if imperfectly) between bugs discovered by users and bugs logged by those who are developing the project. For example, in a Haskell project it may be considered a bug if an invalid state is representable given the type signature even if that bug is never encountered as a runtime error, whereas in a Clojure project this isn't even a concept. However, this sort of "bug" is unlikely to be reported by someone who's simply a consumer of a library, so maybe excluding contributors (perhaps over some threshold?) can help to filter out issues that may not affect end-users.

Have you considered that you may have actually measured known bug-count vs unknown? To me this is like comparing pennies in a jar vs missed pay-checks... It's the unknown long-term problems with systems (rounding errors, off-by-one's, partial API regressions, and design flaws) that lead to the biggest problems.

It's for sure interesting, I'd love for there to be an answer, but I've been making the transition from dynamic -> static yo-yo'ing without any evidence for or against either for the general-case since the 90's.

Thanks for the article

Hi Lewis,

Not sure if I understood you about the known vs unknown. Do you mean that for dynamically typed languages, there are bugs that have not been reported or found, while those same bugs would have been reported in a statically typed lang?

I love the pennies vs paychecks analogy. I will steal it for a future blog post ;).

I am with you in the static vs dynamic debate, that is why I wanted to propose a different one: simple vs complex. On this one, I would position myself in the "simple-by-default" camp, where doing complex things is painful and non-idiomatic. What about you?

Thanks!

Dan

Hi Dan,

You got the known vs unknown in one. Not knowing about a bug (it not being in issues) doesn't mean it doesn't exist as we found with the OpenSSL bugs a few years back.

I'm glad you enjoy my analogies I love using them as they generally help ;)

On simple vs Complex. I'm sure it's a false dichotomy overall but I definitely love the idea I keep being sold RE: simplicity.

How cool would it be to have a logistic regression model with variables like number of developers, average years experience, language, test coverage, bug density, etc. over all github projects, with stars strata? One can only dream.

It would be even better to have that data for non open source ones :).

I wonder how different the OS vs non-OS data would look like

Articles like this based on real data are great to read. I like your analysis and share your thoughts that bug density is not significantly affected by static vs dynamic typing or any one feature for that matter. There are many reasons for bugs, many reasons why a developer reports a bug, different debugging tools and troubleshooting abilities, and many programming language features all of which factor in. One feature will never cause approximately 5 to 6 times more bugs or that feature will be quickly replaced and deprecated.

Complexity is definitely a significant factor. Another big factor is the number of intermittent errors that cannot be reproduced. Multithreaded programming and languages that do not automatically manage memory, like C, are more prone to intermittent bugs, race conditions, etc. and have errors that are very difficult to reproduce. This leads to bugs that never get fixed and eventually add up. Fixing bugs is easy if you remember one thing... You can't kill what you can't catch.

Funny how C/C++, which has the highest bug density in Round 3, is also used in our most important systems, Linux and Windows. Remember Windows 95, 98, Me? They had problems. Windows 2000, XP, 7, and 8? Much better. Both written in C. Why was one better than the other? I assert architecture.

Thanks for the comments. Very insightful and very good question regarding C. I have always assumed that C was mandatory for performance reasons. What do you think?

I may be missing something, but I always thought that the value of static languages was in API and framework discoverability (aka strong autocomplete), not in avoiding bugs. When a variable type is known by the compiler, it can more easily figure out what you can do with it and avoid trips to the documentation. This is why I like static type system that doesn't get in the way (more C#, less Java).

Also, community plays a part. The languages with higher bug densities with some exceptions seem to attract more beginner programmers.

Hi Supe,

Thanks a lot for the comments.

I agree that autocomplete is a nice way of discovering what a library API provides.

There is a reddit discussion about this specific thing. Still autocomplete is so handy...

It is interesting what you point out about "beginner" languages, but if you remove them from the picture, the data still doesn't show that advanced static languages do any better than "simpler" languages.

Thanks again!

Daniel

I've been noticing similar things. Statically typed compiled languages do eliminate the possibility of certain classes of errors getting through to production simply because the compiler will catch the errors. However, a lot of the time, these compiled languages introduce complexities that raise the likelihood of other classes of errors that may actually be more advanced to troubleshoot (looking at you, Haskell, with your lazy evaluation and unpredictable performance). Another language that makes me question the perceived value of static typing is Scala. Until I had worked with Scala almost exclusively for the better part of a year, I actually spent more time fighting with the compiler over code that was logically correct (but due to type-erasure and other things, the compiler couldn't verify) than actually working on real bugs in the application.

I wonder, if all of this is true, what its impact is going to be on the success or demise of Rust?

I think that for a programming language, the marketing around it is far more important than its technical qualities.

I've used static and dynamic languages and I agree with the hypothesis that static languages, when used well, help you reduce the probability of bugs. In many cases, people write poor code. If you use a static language, like F# or Haskell, but use it like it is JavaScript or the old C, C++ it is normal that bugs will arise. Most programmers are "Primitive obsessed" which is a source for some bugs. Many like to cast all over the code too. Programming in a way which makes invalid states impossible to represent helps a lot and also saves you a lot of testing. Usually when I get a code to compile I have very few bugs and most of them are caused by a bad communication of the requirements.

fsharpforfunandprofit.com/series/d...

Go/Erlang/Clojure come from the industry as a reaction to specific pain points from "real production systems".

And you are holding that erlang's immutability or functional paradigms for ex are not linked to academia because they are supposed to address "real production systems" issues?
I think it is an artificial distinction. For example elm is heavily based on all you would categorize as "academic" but its intent is to address real issues in client side development.
How can one separate these 2?

Sorry, I do not mean that.

Academia is extremely important and should be a source of inspiration to the industry.

In fact Curry On is one of my favourite conferences: "Academia and industry need to have a talk."

Experimentation is key to advance the state of the art, but do you want experimental programming features in your production code? Brian Goetz, one of the Java Language architects explains it better here

I always had the impression that in an ideal world programming would mean some kind of 1-to-1 relationship with discovered principles of math and nature rather than invented languages based on invented principles. Something closer to ideals that are inherently perfect from logic rather than inherently flawed human constructs (not that they are not pragmatic).
In any case, thanks for the resource!

That is the most beautiful thought that I have read in a long time.

Thanks for sharing!

What's "density"? Issues labelled "bug" per line of source code? If so, what counts as a line of source code? Whitespace lines? Lines consisting only of trailing or opening parentheses?

Maybe you should add the meaning to the article? I also thought it would be bugs per line of code, a measurement that is useless by itself.

But up-so-far I think pretty much every measurement I have seen is useless.

Just to get it right. A project with 1 file and 100 lines of code in language X with 1 bug has technically a smaller "bug density" than a 50 million lines of code project in language Y with 3 bugs? If "yes", do you think that this is a useful measurement?

Thanks for the feedback. I will try to make it more clear!

To your question, yes, that is what I meant by bug density. Language X will have 1 bug per repo and language Y will have 3.

To your particular example, it is useful if the 100 lines of code provide the same amount of functionality as the 50 million lines.

Of course, I don't know of any language that is 500k times more succinct than another, but neither do I know of any 50-million-line codebase with 3 bugs.

What I do expect is that roughly all developers, no matter the language, have to solve the same problems, so the open source libraries available have roughly the same functionality.

Thanks!

Dan

Sure, you don't find a 50 million line code project with just 3 bugs. It will have a lot more. That's the point, the bigger the code size the more bugs you usually have.

Usually a comparison of bugs per line of code is "better". But "better" still doesn't mean useful. Some languages are 2-3 times more succinct for the same functionality. So a more succinct language with the exact same amount of bugs will automatically have a larger "bug density" (considering bugs per line of code).
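To make the arithmetic concrete, here is a tiny Python sketch with invented numbers: two hypothetical projects with the same functionality and the same 10 bugs, one written in a language that needs three times as many lines.

```python
def bugs_per_kloc(bugs, lines):
    """Bugs per thousand lines of code."""
    return bugs / (lines / 1000)


# Invented figures: same functionality, same bug count, different verbosity.
verbose = bugs_per_kloc(bugs=10, lines=30_000)
succinct = bugs_per_kloc(bugs=10, lines=10_000)

print(round(verbose, 2))   # 0.33
print(round(succinct, 2))  # 1.0
```

The succinct project ends up with three times the per-line density even though a user would hit exactly the same number of bugs in both.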

The assumption that every language somehow solves the same problems is also not really correct. A lot of languages like PHP, Python, Ruby, Perl and so on are primarily used for web development. And a lot of stuff is only solved by using C libraries. Or in other words, not really solved at all.

A binding to a GUI framework like GTK or Qt, or to a game engine (what you see in Python and so on), will surely never have the code size or complexity of a whole library in C (it's just a binding).

I am indeed generalising and assuming that most of us either build websites or do ETL from source A to source B. Probably because that is what I have done for the whole of my career (boring!).

I think you are right to point out that all languages rely on C or C++, but I think that is true for all of them, not just the ones that you mention.

Thanks for the comments!

Daniel

Hi Dan!

Nice analysis.

When you calculate bug density, is it # of bugs divided by # of lines of code?

If so, this is really surprising! I read somewhere that bug density was pretty much a constant. So more concise languages had an advantage by being shorter.

However, your analysis shows that Java and C++ have more bugs per line of code!! So more code * more bugs = more more bugs! Ouch!

It would be cool to see the distributions of these languages. How wide are the curves? Does bug density vary widely in Java projects? What about Haskell?

Rock on!
Eric

Hi Eric,

I have updated the post to make it clear: "By bug density I mean the average number of issues labelled "bug" per repository in GitHub"

The assumption is "I do expect is that roughly all developers, no matter the language, have to solve the same problems, so the open source libraries available have roughly the same functionality.". David seems to disagree on this assumption. What are your thoughts?

I also remember reading somewhere that bugs are constant per lines of code, but maybe what was constant was the number of lines produced or the number of lines that you can keep in your head. I am unable to find the reference right now.

Steve McConnell in the "Code Complete" book says: "the number of errors increases dramatically as project size increases, with very large projects having up to four times as many errors per line of code as small projects"

Great idea for another pet project. Maybe one for PurelyFunctional.tv? ;)

Thanks a lot!

Dan

This analysis is flawed. The input variables are not controlled, nor do the conclusions logically follow from the data. I could make an almost opposite conclusion with the same data:

Based on bug density we clearly see that static typed languages are the best for identifying bugs.

The fact that the "data" can be used to draw very opposing conclusions would indicate a fatal flaw in the analysis. This is a sensational piece with no merit as research.

Thanks for the comment!

I would completely agree with your conclusion if the bugs from statically typed languages were all compilation errors. I suspect they are not.

Also, I could agree if the bug density of statically typed languages weren't all over the place. Note that Go is statically typed and one of the languages that I would call simple.

The post is not research. It doesn't say so anywhere, the post says "naïve", "not conclusive" and "opinion".

For research, read the link near the word naïve.

I would also suggest reading the comments and watching the videos from Blaine. They are very cool and probably closer to your taste.

Thanks again!

Dan

Have you recalculated the data by dividing the total number of bugs by the age of the projects?

This is necessary to remove the age variable as obviously the older the project, the more bugs it will have.

But surely you have found a very valuable source of information to provide more insight in the static vs dynamic typing debate...

I see no C on that list, does that mean C produces no bugs?

You can't produce bugs if you become frustrated and never ship anything.

It'd likely be dumped in with C++.

Thanks for this post Dan. It generated a lot of good discussion on this topic!

Is the bottom line "simplicity appears to be important, and static-vs-dynamic typed languages less important"?

That is what I would like to have a conversation around! What is your feeling about it?

We have been fighting over the static-vs-dynamic thing for too long.

Thanks for reading!

Dan

My feelings -- just my feelings, not backed by any hard data -- is that the most important thing is both simplicity and writing the source code for maintainability and legibility. What Uncle Bob wrote about in his book Clean Code.

Some languages lend themselves to simplicity. For example, I'm impressed with D, Python, Lua and F# ... all of which have a clean syntax and are rather free of excessive "ceremony". Which is why I have a soft spot in my heart for those languages.

But the languages I use that pay the bills are C++ and C#, and I have a love-hate relationship with both of those languages. (More vehemence for C++, because I've been using it for a very long time.)

Bugs can be written in any language. But languages like C++ that have so many areas of undefined behavior that are easy to accidentally stumble into do no one any favors.

Languages that have contract programming, like Eiffel, D, and Ada 2012, make unit testing a lot less important because the contracts can be specified directly in the code instead of being encoded in unit tests. (That's what unit tests do: they express contracts.)
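For readers who have not seen the idea, the following is only an approximation in Python, which has no built-in contract syntax: plain `assert` statements stand in for the precondition and postcondition that Eiffel, D or Ada 2012 would let you declare as part of the function. The `integer_sqrt` function is a made-up example.

```python
def integer_sqrt(n: int) -> int:
    """Largest integer r such that r * r <= n."""
    # Precondition (what the caller must guarantee).
    assert isinstance(n, int) and n >= 0, "n must be a non-negative integer"

    root = 0
    while (root + 1) * (root + 1) <= n:
        root += 1

    # Postcondition (what the function guarantees in return).
    assert root * root <= n < (root + 1) * (root + 1)
    return root


print(integer_sqrt(10))  # 3
```

A unit test would state the same facts from the outside, one input at a time; here they are checked on every call. Note that Python drops `assert` statements when run with `-O`, so this is a sketch of the concept rather than a real contract system.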

In my experience, statically typed languages -- like Go, C++, D, F#, Swift, TypeScript -- don't offer much better protection than the duck typed languages like Python, JavaScript, Boo for "not making bugs". What the static typing does provide is scaling. Small applications gain little benefit from static typing. But as applications grow, it helps to make sure the pieces are fitting together correctly.

A case in point is Google's Angular: when it was converted from JavaScript to TypeScript, they discovered that a good number of bugs in their code were caught once they had the static typing of TypeScript. (TypeScript transpiles to JavaScript, and the type annotation information is erased. It's a transpile time safety net.)

But I've also worked with a large system based in Objective-C, which has a mix of static type checking and runtime duck typing, due to the nature of it using message passing to objects. (The message passing is reminiscent of Smalltalk.)

When I think of duck typed languages, I usually think of scripting languages. When I want to do something quick-and-dirty I reach for Python. When I want to make something application-like, I reach for a static typed compiled language.

But there are languages out there that bridge the two worlds of sorts. Languages that minimize the ceremony around the static typing, like OCaml, F#, and Swift. They're still all strongly typed, but the burden is more on the shoulders of the compiler, rather than forcing the developer to dot all the i's, and cross all the t's.

So I'd say that static typing catches a small category of bugs. For smaller applications, those kinds of bugs are few. For larger applications, those kinds of bugs can be crippling.

I don't know of any scripting language that supports contract programming as part of the core language. (Educate me if you know of any!)

A vastly bigger source of bugs in programs I work in is mutable global state. By which I am also including local mutable member variables in a class instance... that's a smaller scope global state. Programs that I've seen and I've written that emphasize immutability and segregate immutable data from functions and side-effect free functions seem to produce a lot less bugs.

I'm not sure if the "less bugs" I'm seeing is because I'm a better programmer with those kinds of languages, or if I make less bugs in those languages because it is easier to reason about the correctness of the code. It doesn't have to do with all those languages being statically typed. I believe it does have to do with immutable data and lack of global state being simpler.

Another vast source of bugs I've run into is null pointers. (Damn you Tony Hoare for adding in the null reference to ALGOL W!). That's another area where Haskell, F#, OCaml, Swift outshine C, C++, C#. Objective-C sort of sidestepped the problem with its treatment of the nil object quietly eating messages (well, almost quietly... the eaten message is output to the console log).

It'd be interesting to see how many of those bugs are actual logic bugs vs avoidable type level bugs users hit before even getting to the logic stuff.

Yes!

As I am somehow fascinated with Haskell, I would love to go through all the fixed bugs in some Haskell repos to see if there are some common patterns.

What would you expect to find?

I'd expect Haskell bugs to be more about misunderstood/incorrectly implemented business logic whereas more dynamic languages potentially have a lot more issues that stem from not being able to enforce invariants at compile time. Mind you that they probably still have the same potential logic bugs lurking behind, users just may not have gotten there yet cause some code path led to 23 being added to "foo".

This is based on many years working full time in Ruby, while at the same time running FP user groups and contributing to various compiler projects, both statically and dynamically typed.
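The "23 being added to "foo"" case can be sketched in Python. The `label_count` function below is invented for illustration: the common path works, and the type error only surfaces when a caller happens to reach the other branch.

```python
def label_count(count, suffix=None):
    """Return the count, optionally combined with a suffix."""
    if suffix is None:
        return count
    # Nothing stops a caller from passing an int and a str here;
    # the mismatch is only detected when this line actually runs.
    return count + suffix


print(label_count(23))  # 23

try:
    label_count(23, "foo")
except TypeError as exc:
    print("only found at runtime:", exc)
```

Any logic bug that lives further down that second branch stays hidden until the type error is fixed, which is the point made above about users not having "gotten there yet".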

I would expect something similar.

In the summary of studies that I link in the blog, there is a reference to this talk where "The speaker used data from Github to determine that approximately 2.7% of Python bugs are type errors".

I was quite surprised.

How is Python vastly more complicated than Ruby?

What about #4 and #5: Ruby and Scala? They seem to be the opposite of simplicity. Also OOP languages.
