Great topic! I think your data analysis is fatally flawed, though. The only way to really figure this out is to give similarly experienced programmers a task and see how many errors the solutions in statically typed languages contain vs the dynamically typed solutions. Unfortunately, I'm not aware of any research on that exact topic.
I think Uncle Bob's oversimplified things. On the web, where speed is important and bugs usually aren't very costly, dynamic languages will win.
But in other domains where bugs can be expensive or cost lives (avionics, nuclear power plants, pacemakers, etc.), we might want to use languages and tools that help us ensure there are no bugs (or essentially no bugs). 100% test coverage of a dynamic language that Uncle Bob's talking about doesn't mean there are no bugs. And it certainly doesn't mean you've covered every execution path of the code.
If you look at the software written in Spark/Ada, you can see some really low defect rates. These defect rates are well below anything you could hope to achieve in a dynamic language using TDD, and there is data to back that up. But you end up trading speed for correctness.
I like how you go on to say that the analysis is "fatally flawed" but don't explain how. If the author had done some kind of data manipulation that favored one language (or paradigm) over another, that would be a sign of it being "fatally flawed", but it seems like there's no other explanation for the observed data than the author's conclusion. Obviously it doesn't meet scientific standards for being conclusive; however, why should we prefer the opposite statement (that static types are more bug-free) by default over the author's statement (that simpler languages are more bug-free)? Clearly, the data favors the latter.
"100% test coverage of a dynamic language that Uncle Bob's talking about doesn't mean there are no bugs. And it certainly doesn't mean you've covered every execution path of the code."
This seems to be very far from what the author was talking about. I think you might have misunderstood the article.
Here are some alternate explanations of the data, all hypotheticals that should be considered before the OP's conclusion is accepted as an accurate interpretation of his data:
1. Practitioners of different languages have different reporting habits: they call different things "bugs", they report with varying frequencies, they tend to not care about reporting bugs as much as building the next feature, etc.
2. Bugs come in different sizes, so while Haskell and Python might both have "1 bug", the cost of that bug could vary wildly.
3. There's a ratio of "bugs per feature", so more productive languages show up as more buggy.
4. Bugs are labeled differently, i.e., perhaps Haskell projects tend to have nice "bug" labels just because static typists are more OCD about it, whereas a Python project might have a million bugs but no one labeled them as such (related to my bullet #1).
I agree with others in the comments: in order to properly establish causal relationships, one would need to construct an appropriate experiment. A double-blind, placebo-controlled, randomized design might be a bit tough to construct, although the closer one could get to that, the better.
Perhaps one could construct a randomized crossover though, and that would finally lend some actionable insights into the problem?
I'm not the GP, but I'm guessing that he was thinking of the fact that this could simply mean that more bugs are reported on F# projects, not that more bugs exist.
Thanks a lot for your comments.
I am not aware of any proper study either. In the article I link to the best source that I found, which contains a list of studies and a summary of each. Unfortunately nothing conclusive there either.
You are right about context being important. The same bug in two different contexts can have very different consequences.
I also agree that code coverage is no proof of zero bugs, but neither is static typing.
I would love to see some proper studies to back up your statement about Spark/Ada ;)
I think you really nail it with "trading speed for correctness", a very important tradeoff.
The best source I have off the top of my head is a great talk by Martyn Thomas: [youtu.be/03mUs5NlT6U]
The whole talk is fascinating but I'll point you to the juicy bits in case you're in a hurry.
2 minutes: defect rate of 810 experienced software developers on > 8000 applications
19:22 minutes: defect rates of 5 projects that used 'correct by construction' software development techniques
32:33 minutes: productivity, cost, and defects of the Tokeneer project (zero critical failures found after extensive testing by the NSA).
33:51 minutes: The NSA gave interns with no experience with these techniques the job of adding features to the Tokeneer project and they had amazing results (NSA conclusions at 36 minutes)
38:54 minutes: discussion of a few real-world safety-critical projects developed with these techniques (including defect rates which are fractions of the defect rates for typical projects)
I'm mostly a web guy but I'm really interested in this stuff. I've done a bunch of reading and I'm just beyond a "Hello World" example in Spark (the learning curve is pretty steep compared to picking up Java or something like that).
Anyway my inexperience with Spark/Ada prevents me from being able to tell how honest Thomas is about the benefits and drawbacks of this approach but I'm intrigued all the same.
The talk was really fascinating. Thanks a lot for sharing.
Some thoughts after watching it:
You just added another TODO to my very long list of things that I do not know about.
I think it is very laudable for you to be interested and to be learning something so unusual. I recently wrote about my experience with Clojure on my personal blog; maybe you can relate to it.
I can relate to your Clojure experience. The Pragmatic Programmer (remember that book?) was right: learning another language or paradigm affects how you program and how you think about solving problems.
Martyn Thomas has a whole series of lectures and they are all interesting. You might want to check out:
Anyway, I really got interested in this stuff because I'm working in a code base that is full of bugs (who isn't, right?). I just thought there has to be a better way to develop software, so I started asking myself how 'they' make software for safety-critical applications that doesn't break and isn't full of bugs.
The traditional advice is to turn up the compiler/interpreter warnings. Then you add static analysis. And now PHP 7.1 has optional strict typing, so you convert your code base to run on PHP 7.1 and use some of that. And you write unit tests. And once you're good at that, you switch to TDD.
And all that stuff is good. It's really good in fact but it doesn't help you if you missed a requirement or a whole class of requirements. It also doesn't help if your requirements are ambiguous or contradictory.
So what we're trying to do is get really fast feedback. If we've got something wrong, we want to fix it as soon as possible, because the longer that wrong thing is in the system, the more it will cost to fix. And the next step after everything I mentioned might be formal methods and mathematically verified software. I think of it as an uber-static analyzer, in that it automatically verifies certain properties of your code (and its annotations).
So you can spend your time writing tests and hoping you catch things, or you can spend your time annotating your code in Spark/Ada and letting the tools prove it works, or you can ship buggy software, which in some cases is the right thing to do.
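To make that concrete, here is a rough sketch of what those annotations look like in SPARK 2014 (I made up the package and its Increment operation purely for illustration, and the spec and body would normally live in separate .ads/.adb files). The Pre/Post aspects are the contract, and GNATprove tries to prove that the body satisfies it and can never raise a run-time exception:

```ada
--  counters.ads: the specification carries the contract
package Counters with SPARK_Mode is

   Max : constant := 1_000;

   subtype Count is Natural range 0 .. Max;

   procedure Increment (C : in out Count)
     with Pre  => C < Max,        --  callers must be shown to respect this
          Post => C = C'Old + 1;  --  the tools prove the body delivers this

end Counters;

--  counters.adb: the implementation checked against the contract
package body Counters with SPARK_Mode is

   procedure Increment (C : in out Count) is
   begin
      C := C + 1;  --  no overflow possible: the precondition gives C < Max
   end Increment;

end Counters;
```

If the proof succeeds, you get that particular guarantee without writing a test for it; if it fails, the feedback arrives at analysis time instead of as a bug report.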
The real question for me is which, if any, of these tools and disciplines are appropriate for my role as a web developer.
Most projects spend more than 50% of their budgets testing and fixing defects. Could we spend a fraction of that money up front and do it right the first time by writing software with formal proofs? I don't know the answer yet but I'm working on it.
I will add those two videos to my list for the weekend.
Very interesting thoughts. Let us know how it goes!
So it's been a long time since I last coded in SPARK (nearly 10 years now), but it's worth noting a few things:
1) The really low defect rates reported for systems coded in SPARK aren't simply due to the language features, but also down to the "Correctness by Construction" approach, which emphasises getting things right from the high-level requirements all the way through formal specs and into coding, information flow analysis and proofs. The sooner you find and eliminate the bugs, the less costly removing them is. The language greatly aids this approach due to its static analysis capabilities, but you can improve the defect rates in any language by following a similar approach (you're not going to get them as low, though).
2) By getting rid of the bugs early, you are minimising re-work and (importantly) re-verification when removing them at a later date, so the "speed for correctness" tradeoff isn't as large as you might otherwise expect. Certainly in the domains where you tend to find SPARK (or normal Ada) being used, the cost of testing required for a similar confidence level in other languages can exceed that of the V&V for SPARK.
3) A lot of the applications that demand really low defect rates are aerospace, defence, etc. You'll see more statically typed languages in this arena because of their amenability to verification, but you are unlikely to see these projects pop up on GitHub. That's an understandable limitation of the approach in the original post.
4) There's some good info in this set of slides from Rod Chapman of Praxis about real-world applications, including defects per KLOC: asq509.org/ht/a/GetDocumentAction/... (NB: Praxis developed SPARK from earlier work at the University of Southampton, and is now part of Altran)
5) Even proof of partial correctness doesn't negate the need for testing. Proof of freedom from run-time exceptions (e.g. demonstrably no buffer overruns) is less time-consuming, but of great value.
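To give a flavour of that cheaper level of proof, here's a tiny made-up sketch (the names are hypothetical): GNATprove can show that neither the division nor the array access below can raise Constraint_Error, without saying anything about full functional correctness:

```ada
package Stats with SPARK_Mode is

   type Sample_Index is range 1 .. 100;
   type Samples is array (Sample_Index) of Natural;

   --  Division by zero is impossible: Count is Positive, so Count >= 1.
   function Average (Total : Natural; Count : Positive) return Natural is
     (Total / Count);

   --  The index parameter is of type Sample_Index, so an out-of-range
   --  access (a buffer overrun) is excluded; the index check is proved
   --  rather than tested.
   function Get (S : Samples; I : Sample_Index) return Natural is
     (S (I));

end Stats;
```

That's a much weaker property than proving the functional contracts, but it covers a whole class of failures that you'd otherwise only catch through a lot of testing (or in production).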
Finally, I believe that Tony Hoare's quote was also used in the preface of "High Integrity Software: The SPARK Approach to Safety and Security", which is pretty much the text for SPARK :-)
Thanks a lot for sharing your experience with SPARK!
I agree that bugs are one of the worst sources of wasted time, especially if by the time they are found we have already context-switched, which is usually the case, sometimes by many weeks.
Given that you are not using SPARK anymore, may I ask what your experience has been since then? Have you tried to convince your teams to use it?