Some time ago, we got a fantastic opportunity to interview Simon Peyton Jones, who is a key contributor to the design of Haskell and one of the lead designers of the Glasgow Haskell Compiler (GHC). Currently, Simon is working at Microsoft Research, where he researches functional programming languages.
In the interview, we discussed the past, present, and future of Haskell, Haskell’s benefits and downsides, GHC, walking the line between Haskell being a research and industry language, and multiple other topics.
Below are some of the highlights of the interview, edited for clarity.
From the perspective of a person who’s doing a lot of work on the compiler and who’s done a tremendous amount of work over time to make it possible for all of us to enjoy Haskell – how does it happen that we have the Haskell that we know and love?
My central focus and, certainly, my sole initial focus was on the language and its compiler. But for a programming language to succeed in being useful to anybody, the language and the compiler have to exist within a much larger ecosystem of tools and libraries.
Initially, Haskell had none of that, so a huge step-change was when Cabal and Hackage came along. Until then, there were some libraries for Haskell, but there was no good packaging and distribution mechanism for them. So Cabal and Hackage made a huge step-change in the Haskell ecosystem – they made it possible to take GHC and Haskell, the language, and use it for production applications. I guess that was the early 2000s, something like that, and then there was a huge growth in libraries.
And then, we’ve seen lots of other stuff going on: profiling and debugging, and, more recently, IDEs for Haskell. All of this in aggregate is much bigger than the compiler itself and is not my central focus, but I’m thrilled that it’s happening and I feel that it’s happening rather better now than it has in the past.
How did the ecosystem part of Haskell come together?
All right, well, remember that the Haskell community has grown organically over 30 years. Haskell was born in 1990, so it’s grown slowly and by accretion; it didn’t come into being suddenly.
The second thing to remember is that Haskell has had no corporate sponsor, you know. It’s not like C#, which was born out of Microsoft. So Haskell sort of grew slowly without a central sponsor, and that means that the community’s grown in quite a distributed way. The folks who are building Cabal and Hackage are different from the folks who are doing GHC itself, and they’re rather different from the folks who are doing Stack or IDEs.
That’s a strength, right. Diversity is good, and it has meant that each sort of group of volunteers had autonomy and a sense of agency about what they were doing. But, of course, the more mission-critical Haskell becomes for more organizations or people, the more stress it places on this rather loosely coupled distributed kind of organization.
By way of an example for the language itself – for a long time, here’s how we would get new features into the language. Well, it would basically be Simon Marlow and me, we would think, you know, “we should do this”, and we just put it in and there it was, in GHC. Or an intern would come and spend a summer at Microsoft and would implement something.
But as the language became more important to more people, we had to pay more and more attention to not being disruptive for people, and also it became less satisfactory to say the way you get something in the language is by knowing Simon. Partly because it’s obviously too elitist, but also because it just makes too much of a bottleneck. That led to the GHC proposal process. Of course, it has its faults, but at least it means that if you want to get something into the language, you can write a proposal and have it discussed by the community at large and then have it agreed or not agreed by a committee. That’s a way in which we’ve incrementally moved towards something a bit more transparent and (I very much hope) something that offers more of a voice to more people.
Some people might think that the proposal process lowers the development velocity and even the potential of a feature to end up in the compiler. What is your commentary on this?
There’s two different things going on here. First, there’s a tension between a language and a compiler that is a laboratory for exploring the bleeding edge of what it means to be a good programming language and, particularly, what it means to be a good functional programming language. Haskell has always been a laboratory like that – it explores a particular corner of the design space for programming languages, namely “purely functional programming languages”, and really tries to invest in that.
And being a laboratory, a motherboard to plug in lots of ideas, is in tension with being an utterly reliable baseboard for mission-critical applications in industry.
That is a tension which other languages have mostly addressed by not changing very much, right, by coming down on the side of the latter – becoming very solid and reliable but not able to move very much. Haskell, at the moment, is still moving quite nimbly: you know, we had a lot of discussion about Dependent Haskell, linear types got in recently, so there’s a lot going on at the moment.
To a certain extent, we navigate that tension by a kind of consensus among Haskell’s users that they’re signing up to being part of a rather grand experiment, and the ones who really don’t want that are probably not going to use Haskell. But at the same time, we now pay exhaustive attention to back-compatibility questions and to migration plans, and so forth. We take all these issues much more seriously than we used to, but, nevertheless, I would be very sorry to lose that experimental side.
There’s a second component which is we agree that we keep moving – but how do we keep moving? How do we decide what to put in? The process of putting new stuff in the language – that’s the GHC proposal process. First thing to say is: I think the process is often perceived as being more intimidating or difficult than it really is. It can be slow, and it can be slow for two reasons. Number one is a good reason, which is that there’s a lot of debate, and proposals massively improve, in my opinion, through this debate. Very significant changes happen to many proposals in the process. I really like that. It takes time, and if you just think “I’ve got a good idea, I want to do it”, it’s frustrating to spend three months talking about it with the community and seeing it improved. You may not even think it’s improved, but somehow, there’s a greater consensus involved about it. But I think that’s basically a good thing.
The second thing is not such a good thing, which is that the committee can be slow. Why is that? We advertise that once you say as a proposal author – “please accept or reject or push back this proposal” – then stuff is supposed to happen within a month. And it doesn’t always happen within a month. Why is that?
Well, you know, I’m at fault as much as anybody. Everybody’s busy, and the committee consists entirely of volunteers who are giving their personal time to thinking in detail about language changes that other people propose out of the blue.
So that’s quite a big gift that they are giving to the community and I want to apologize to everybody for whom that has seemed like a delay. If you think you could make that better, please volunteer – next time we ask for nominations, nominate yourself, we need people who are willing to devote thoughtful cycles to doing it.
But sometimes proposals, you know, simple ones – they get proposed and agreed in a few weeks. It isn’t and should not be that intimidating, and I feel apologetic when it is, and if anybody’s feeling bruised by this, come and talk about it specifically, about the process, and say I’m finding this a bit difficult, what can we do?
Could we talk about Haskell best practices in the big picture? As you said, the Haskell language has changed over time tremendously. What are the additions to Haskell that changed the way you program in Haskell the most?
One thing to say is that while Haskell the source language has changed really quite a lot over time, Haskell the core language, the internal language, has changed very little over time. It still contains let, lambda, variables, constants, function applications, and the big change was to add casts and coercions, coercion abstractions. Now that was a single big change that we added – the little internal language had seven constructors and now it has eight, or something like that.
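To give a feel for how small such a core can be, here is an editorial sketch (not code from the interview, and not GHC's actual definition) of a toy expression type with just the five forms named above, plus a tiny evaluator. The names `Expr`, `Value`, `eval`, and `evalInt` are made up for this illustration; the real internal language also has case expressions, types, and the casts and coercions mentioned above, and it is evaluated lazily, whereas this toy evaluator is eager for simplicity.

```haskell
-- A toy core-like language: variables, constants, lambda, application, let.
-- Hypothetical names; this is NOT GHC's Core data type.
data Expr
  = Var String            -- variable
  | Lit Int               -- constant
  | Lam String Expr       -- lambda abstraction
  | App Expr Expr         -- function application
  | Let String Expr Expr  -- non-recursive let

-- Runtime values: integers, or closures capturing their environment.
data Value
  = IntV Int
  | Closure String Expr Env

type Env = [(String, Value)]

-- Evaluate an expression; Nothing signals an unbound variable
-- or an attempt to apply a non-function.
eval :: Env -> Expr -> Maybe Value
eval env (Var x)        = lookup x env
eval _   (Lit n)        = Just (IntV n)
eval env (Lam x body)   = Just (Closure x body env)
eval env (App f a)      = do
  fv <- eval env f
  av <- eval env a
  case fv of
    Closure x body cenv -> eval ((x, av) : cenv) body
    IntV _              -> Nothing
eval env (Let x rhs body) = do
  v <- eval env rhs
  eval ((x, v) : env) body

-- Convenience wrapper: evaluate a closed expression to an Int, if it is one.
evalInt :: Expr -> Maybe Int
evalInt e = case eval [] e of
  Just (IntV n) -> Just n
  _             -> Nothing
```

For example, `evalInt (Let "id" (Lam "x" (Var "x")) (App (Var "id") (Lit 42)))` binds an identity function and applies it to a constant.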
And it’s really quite a small change. I mean, it has huge pervasive consequences. I want to say that somehow, the changes to the surface language are superficial. They’re profoundly influential in how you think about it, but they all compile to this same core language. So that means there is a certain intellectual coherence to everything that’s going on, it’s not just a ragbag of features we keep slapping on, even though it may feel or look like that sometimes.
To return to your question about what changes in the surface language have influenced my own programming practice – the biggest stylistic change that’s happened in GHC recently, I think, is the move towards this “trees that grow” idea. Trees that grow (you can search for that keyword; there’s a paper about it on my homepage) is a way to make sort of extensible data types using Haskell.
This is really useful for Haskell’s abstract syntax tree. Haskell has a very large concrete syntax and so, correspondingly, has a large abstract syntax: the internal data type that describes Haskell programs after you’ve parsed them has dozens of data types and hundreds of constructors. GHC then, during its renaming and type checking phases, decorates this tree with lots of additional stuff: types and scopes, and all sorts of extra stuff gets added onto the tree.
So, at first, we had a tree that was just GHC-specific, but then we realized increasingly that other people would like to parse Haskell themselves for other purposes. So what we really wanted was a sort of base library that contained the core abstract syntax tree, with its dozens of data types and hundreds of constructors, and then some way for GHC to customize that tree, to add all its decorations. That’s the trees that grow idea, and it lives off type families. We use both type families and data families quite extensively to power the trees that grow idea. It’s not what type families and data families were originally intended for, you can use them for all sorts of things, but it’s a major application within GHC, and it’s pushed a sort of stylistic change through the compiler.
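A minimal editorial sketch of the idea, assuming a two-constructor expression type rather than GHC's real syntax tree: each constructor gets an extension field whose type is chosen by a type family indexed on the compiler pass. The names here (`XLit`, `XAdd`, `Parsed`, `Sized`, `annotate`) are invented for the example and are not GHC's.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- One extension point per constructor; the pass index decides its type.
type family XLit pass
type family XAdd pass

-- The shared "base" tree, parameterised by the pass.
data Expr pass
  = Lit (XLit pass) Int
  | Add (XAdd pass) (Expr pass) (Expr pass)

-- Hypothetical passes: straight out of the parser, and after a pass
-- that decorates every node with the size of its subtree.
data Parsed
data Sized

type instance XLit Parsed = ()
type instance XAdd Parsed = ()
type instance XLit Sized  = Int
type instance XAdd Sized  = Int

-- Read the decoration that the Sized pass stored on a node.
size :: Expr Sized -> Int
size (Lit n _)   = n
size (Add n _ _) = n

-- Grow the tree: same shape, richer decorations.
annotate :: Expr Parsed -> Expr Sized
annotate (Lit () v)   = Lit 1 v
annotate (Add () l r) =
  let l' = annotate l
      r' = annotate r
  in  Add (1 + size l' + size r') l' r'
```

The point of the design is that a client who only wants to parse can use `Expr Parsed` and never pay for, or even see, the decorations that a later pass adds.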
If you needed to explain to beginners and people who are just starting out with writing, like, let’s say industrial Haskell, what would you tell them about laziness?
I’d tell them to read John Hughes’s paper “Why Functional Programming Matters”, which is an extremely articulate exposition of how laziness enables you to build more modular programs.
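As an editorial illustration of that modularity argument, here is a sketch in the spirit of the Newton-Raphson square root example from Hughes's paper, rewritten in modern Haskell. The function names are chosen for this example, and it assumes a positive input. Laziness lets the producer of approximations (an infinite list) be written separately from the consumer that decides when to stop.

```haskell
-- Producer: an infinite list of successive approximations to sqrt n.
-- Assumes n > 0. Nothing here knows when to stop.
approximations :: Double -> [Double]
approximations n = iterate next n
  where
    next x = (x + n / x) / 2

-- Consumer: walk a list until two neighbouring values agree to within eps.
-- Knows nothing about square roots.
within :: Double -> [Double] -> Double
within eps (a : b : rest)
  | abs (a - b) <= eps = b
  | otherwise          = within eps (b : rest)
within _ [a] = a
within _ []  = error "within: empty list of approximations"

-- Glue the two pieces together; only as many approximations
-- as the consumer demands are ever computed.
newtonSqrt :: Double -> Double
newtonSqrt n = within 1e-12 (approximations n)
```

In a strict language the generator and the stopping test would typically have to be fused into one loop; here either half can be swapped out independently.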
That said, GHC’s internals have a lot of stuff that’s about inferring strictness through strictness analysis. Also, Haskell has many annotations, bang patterns and seq, and so forth, that let you force the compiler to use call by value. So the intermediate language, in fact, although it is a lazy language, has a lot of support for call-by-value mechanisms as well. So, in effect, lazy evaluation is not really an either/or. Every lazy language, well, certainly Haskell, has a lot of support for strictness, and strict functional languages like ML or OCaml typically have support for laziness as well.
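For readers who have not met those annotations, here is a small editorial sketch (the function names are made up for this example) of the same accumulator loop written once with a bang pattern and once with `seq`. In both versions the running total is evaluated at every step instead of building up a chain of suspended additions.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Bang pattern: the accumulator is forced each time `go` is entered.
sumBang :: [Int] -> Int
sumBang = go 0
  where
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs

-- seq: force the new accumulator before making the recursive call.
sumSeq :: [Int] -> Int
sumSeq = go 0
  where
    go acc []       = acc
    go acc (x : xs) =
      let acc' = acc + x
      in  acc' `seq` go acc' xs
```

With optimisation on, GHC's strictness analysis will often work this out by itself for a loop this simple; the annotations matter most when the analysis cannot see that a value is always needed.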
So then it’s just a question of where you start, and we kind of move towards the middle now. Ideally, I suppose you’d like some glorious unity in which you didn’t have to have a default, but I don’t know how to do that at the moment.
People often say: “well, if you were designing Haskell again, would you make it strict?” So, to begin with, historically, the defining feature of Haskell was that it was pure and lazy. And, in a way, it was remorselessly pure because it was lazy. Because with lazy evaluation, you don’t quite know when anything is going to be evaluated and so, if it’s impure and you have side effects like opening valves or writing to variables, you don’t quite know when those things are going to happen, and if you’re opening valves, you really want to know which valve you’re going to open first.
So a lazy language that is impure seems infeasible. Haskell’s laziness made it impossible for us to contemplate impurity. Now, of course, we still let it in through the back door through unsafePerformIO, but it’s called unsafePerformIO for a good reason – we even put it in the name.
The most helpful thing about laziness in retrospect was that it forced us to be pure, and that meant that Haskell initially couldn’t even do input/output because input/output is impure. That, in due course, led us to Phil Wadler’s brilliant insight of taking ideas that [Inaudible] had been writing about, about monads, and applying them in a very practical context of a programming language. And that led us to our paper “Imperative Functional Programming”, which was the sort of breakthrough moment when we said: “Ah, now we can see how to do both pure and impure functional programming together in the same program without the impurity messing up the purity.”
That was an amazing, amazing moment, and it would not have happened had we not been forced into this pure vision by laziness. But now that it has happened, now that we know how important purity is, could you imagine a pure strict language in which you do not give in to impurity even though you know the evaluation order? I have flirted with the idea that you could redesign the language as a strict language, and, I think, then ten years later people would be saying: “if you were redesigning Haskell, would you make it lazy?” Yeah, the grass is always greener on the other side of the fence, so I’m content with our choice.
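An editorial sketch of what that separation looks like in today's Haskell (the names `square`, `nextTicket`, and `squareOfNextTicket` are invented for this example): the pure function has no `IO` in its type, the effectful one does, and the type checker keeps the two apart while still letting them be combined.

```haskell
import Data.IORef (IORef, modifyIORef', readIORef)

-- Pure: same argument, same result, every time.
square :: Int -> Int
square x = x * x

-- Impure: bumps a mutable counter, so successive calls return
-- different results. The IO in the type records that.
nextTicket :: IORef Int -> IO Int
nextTicket ref = do
  modifyIORef' ref (+ 1)
  readIORef ref

-- Pure code can be applied to the result of an action,
-- but the combined computation is still marked as IO.
squareOfNextTicket :: IORef Int -> IO Int
squareOfNextTicket ref = fmap square (nextTicket ref)
```

There is no way to write a function of type `IORef Int -> Int` that calls `nextTicket` without reaching for an escape hatch such as `unsafePerformIO`, which is exactly the back door described above.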
I very strongly believe that purity is something we should stick to remorselessly, but I’m very open to the possibility of having a pure strict language with really good support for laziness. But I don’t want to shift Haskell to that because it would be too big and disruptive a change for too little a payoff.
Do Haskell engineers need to have a deeper background in mathematics and if not, is there anywhere where it might be helpful to have a background in some mathematical theories?
I’m not sure that you need a deeper background in mathematics, but I think that Haskell appeals to people to whom mathematics also appeals. It’s attractive to a similar set of people, so it’s not, I think, that you make any very direct use of mathematics other than things that every programmer needs, like reasoning. If I’m going to predict what this program is going to do or think about its results, I need to reason in a logical way, and there really is an answer: if I apply this function to this argument, I’m going to get that result. You might call that computational thinking, but you might also call it mathematical thinking, but that’s equally true if you’re writing Java.
I think because Haskell functions by default are pure, they behave like mathematical functions, so that the intuitions you get from math when you say “f is a function that squares its argument” – when you talk about a function in mathematics, it’s always implied that it’s a function without side effects, of course, you call it twice, it gives the same answer, anything else would be stupid – Haskell sort of cleaves to those same sets of intuitions but I don’t think you need any mathematical understanding to make sense of that.
And then you might wonder about category theory and functors and monads, and so forth. Well, I didn’t originally know any category theory, and even now, I’m sort of a weak category theory user, but probably I get most of my intuitions from programming with functors and monads rather than from the category theory itself. So I don’t think you need to know category theory to write Haskell programs either.
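A small editorial sketch of the kind of programming-level intuition meant here, using `Maybe` and made-up function names: `fmap` transforms a result if there is one, and `do` notation chains steps that can each fail. No category theory is needed to read it.

```haskell
-- A computation that can fail: division, with Nothing for a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Functor intuition: apply a function "inside" the Maybe, if a value exists.
percentLabel :: Int -> Int -> Maybe String
percentLabel part whole =
  fmap (\p -> show p ++ "%") (safeDiv (part * 100) whole)

-- Monad intuition: sequence two fallible steps; the first failure
-- short-circuits the rest.
divideTwice :: Int -> Int -> Int -> Maybe Int
divideTwice a b c = do
  q <- safeDiv a b
  safeDiv q c
```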
I think probably more important than a sense about math is, I think – the big thing that Haskell really pushes you towards is taking types seriously, and I don’t just mean as a little safety net to check you don’t add an int to a character. I mean that a type is like a weak theorem. It’s a true statement about what the program does, and you can think of the function definition as the proof of the theorem, but the type is the theorem, and over time you find yourself putting more and more stuff into the types.
People said, you know, what’s the equivalent of UML for Haskell, and after a bit, I realized it’s the types. We don’t need a separate modeling language. The very first thing you do when you start writing Haskell programs, you start writing type declarations for data types and types of functions, and that is a kind of design language or modeling language that enables you to think about the main pieces of your program, and the data flows, and what goes from here to there. And it’s not just a modeling language that’s separate, it becomes a machine-checked part of your program in perpetuity. And that idea of involving more and more in types, that thinking, I think, really is pervasive and important to a Haskell programmer.
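As an editorial illustration of types serving as the design language, here is a sketch for a hypothetical money-transfer feature; every name in it is invented for the example. The data types and the signature of `transfer` are written first and already say what can go in, what can come out, and how it can fail. The function body then has to live up to that statement, and the compiler checks that it does.

```haskell
import qualified Data.Map.Strict as Map

-- The "model": distinct types for things that must not be mixed up.
newtype AccountId = AccountId Int deriving (Eq, Ord, Show)
newtype Cents     = Cents Int     deriving (Eq, Ord, Show)

-- Every way a transfer can fail, spelled out.
data TransferError
  = SameAccount
  | UnknownAccount AccountId
  | InsufficientFunds
  deriving (Eq, Show)

type Ledger = Map.Map AccountId Cents

-- The signature is the design: a transfer either fails with a
-- documented error or yields a new ledger. It cannot do I/O.
transfer :: AccountId -> AccountId -> Cents -> Ledger -> Either TransferError Ledger
transfer from to (Cents amount) ledger
  | from == to = Left SameAccount
  | otherwise  = do
      fromBal <- balanceOf from
      toBal   <- balanceOf to
      if fromBal < amount
        then Left InsufficientFunds
        else Right
               ( Map.insert from (Cents (fromBal - amount))
               . Map.insert to   (Cents (toBal + amount))
               $ ledger
               )
  where
    -- Look up a balance, turning a missing account into a typed error.
    balanceOf acc = case Map.lookup acc ledger of
      Nothing        -> Left (UnknownAccount acc)
      Just (Cents c) -> Right c
```

The sketch deliberately leaves out details such as rejecting negative amounts; pushing that rule into the types as well (for example, a smart constructor for `Cents`) is the kind of step the answer above describes as putting more and more into the types over time.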
It’s much more apparent in Haskell than it is even in other statically typed programming languages. I’m not talking here about Scala or OCaml, which are in the same ballpark, but compared to C or any of these untyped languages, and that’s quite a big cultural shift. And I think it probably does again appeal to mathematicians a bit more than to non-mathematicians, but anybody can accommodate it. You don’t need a math background to program in Haskell.
What do you think about the future of computing? Do you think that we will have big breakthroughs in the way that computers run our code? For example, there is some research on reversible computing and quantum computing and stuff like this. Could you talk about this?
In the 1980s, when I was at Cambridge doing the computer science diploma, there was a lot of excitement around the idea of designing hardware to execute functional programs directly. Indeed, there was a whole conference called Functional Programming and Computer Architecture (FPCA). Some of my first papers were at FPCA. It later merged with the Lisp and Functional Programming conference, they were both every two years, so they merged to become ICFP – the International Conference on Functional Programming.
But it was very interesting, it was “functional programming and computer architectures” right there in the title. Isn’t that an amazing thought? But over time, I think it became clear that first, it’s really hard to compete with Intel: it takes such an enormous investment to build hardware chips that it’s very hard to compete. But also, when you have a really good compiler, what you don’t want to do is to build hardware that interprets at runtime stuff that you could have compiled away statically beforehand. That would be silly, that’s just investing cycles at run time to do stuff that you can just [waves hands].
And that’s really what GHC is spending a lot of its time doing; it took us quite a long time to figure all that out, we wrote a whole book about that. I can see some things where we could do with a bit of extra hardware support, particularly, better support for [Inaudible] and [Inaudible] for garbage collectors, avoiding fetching cache lines for the heap cells that you know to be empty. I think there’s some modest hardware support that could help, I don’t know whether it would be transformational.
Quantum computing may change the world but I don’t think it’ll do so quickly; it’s always 10 or 15 years off, and I don’t think we will write word processors with quantum computing. It probably will start to become of direct utility, but I think it’ll probably be in fairly specialized applications. I don’t really know, but it’s a bit like nuclear fusion, I think it may always be a bit on the horizon. But because it would be transformational, I think it’s worth investing in, I’m just not holding my breath.
I think it’s unlikely that a lot will change for most of us anytime soon from a hardware point of view, I think probably the biggest thing that is changing for us computing wise at the moment is the whole machine learning / artificial intelligence thing.
Now I know it’s terribly flavor of the month and terribly laden with hype, but the fact is that for years and years, you know, for my entire professional life until relatively recently, I thought the way you get a computer to do what you want it to do is to tell it step-by-step, or perhaps declaratively like in Haskell, exactly what to do. Now, machine learning is a complete end run around that whole thing. If somebody said tell me step-by-step a Haskell program to recognize a cat, that would be really hard to do directly in the way that we think about writing algorithmic programs, but a machine learning program is quite good at doing that. Well, the program is pretty simple and it’s only a few dozens or hundreds of lines of code; it’s still a program, of course, still executing on the same old hardware, maybe with a bit of specialized hardware support. But as an approach to telling a computer what you want it to do, it’s utterly different. Functional programming is different, very different from imperative programming, but if you zoom out a bit, then you know machine learning is way, way different again.
I think that’s going to shake loose lots of stuff. We need to think carefully about that. I do think that Haskell can be a really good language for describing machine learning kind of models, and there’s a tremendous amount of experimentation in the machine learning area, so having a language which is easy to move around and explore, and change things would be really good, there’s some Haskell libraries around that.
So I think in terms of the big influences on the future of computing: leave aside the hype, I do think the whole machine learning thing is going to have a slow seismic impact. It’s not going away, it’ll have a long-term impact. I don’t think it’ll directly affect functional programming, but it’ll directly affect computers, it’ll directly affect computer science in a big way.
But I continue to think that functional programming is on this long march. That is to say, if you want to write a program in an algorithmic way, “Plan A”, as it were, then I think that over time functional programming will continue its steady growth and, in fact, infect the mainstream world more and more. I often say that when the limestone of imperative programming is worn away, the granite of functional programming will be revealed underneath. I still think that, so I think there’s a long-term trend there as well.
We would like to thank Simon Peyton Jones for the awesome opportunity and the insightful answers – there’s a lot of hidden gems in the full interview.