This is a collection of ideas about how modern programming languages, and as a result programming in general, could be improved. While I've worded everything as statements rather than questions. I would appreciate any feedback you might have for me. I hope to start a longer-term project from this, but a discussion about the concept is necessary first, to avoid screwing up and looking like a pleb. If there's something you don't follow, feel free to ask in the comments.
I've used a fair few programming languages, and there is none I truly like.
If I write a program, I can't be sure that it won't crash, deadlock or spinlock, I can't be sure it won't steal my moneylink-1, link-2 or leak my data, I will have to rewrite it to work on different hardware, if a non-software engineer hands me their script for use in production, I will probably have to rewrite it...
There is, in short, a distinct lack of metaprogramming. Although the languages we use are generally Turing complete, allowing us to compute anything that is computable, they do not provide adequate primitives to reason about our solution.
The result of our current approach is higher barriers to entry for programmers, more bugs & vulnerabilities, longer development time, stagnation in hardware development, poor quality products and shittier jobs for CS graduates. In a world increasingly relying on software, spaceships explode and people die because of preventable errors.
Functional programmers might expect this post to be a praise of strongly typed, purely functional languages, but even those languages have failed to truly deliver on their promises. Nevertheless, they do hit closer to the mark, so I will use them as a starting point and suggest changes.
I'm going to refer to Haskell and Elm a lot, not because they are the best or the worst, but because these are the pure languages I'm most familiar with. If you know a language that solves all of these issues, please let me know.
Effects have been a thorn in the side of purely functional programming since ever. A program without effects can't do anything, it doesn't even start computing, so what would the point be of writing one? It can therefore be alluring to model effects in a pure language in any way one might find, and monads do seem to fit. However, effects ≠ monads. They are just a way to model sequential black-box state changes. Essentially a way to do imperative programming in a FP-style category. Haskell+IO is conceptually no different from programming in Clink-1, link-2.
Effects as monads are the death of proof systems. Because monads are monoids, they destroy information. Application of an operator on a type in a monoidal category results in that same type.
What does that mean in practice? Consider the following pseudocode examples of theoretical 3rd party libraries:
readImage :: IO imageData readImage = do readFileData "./imgs/cat.jpg"
totallyLegitReadImage :: IO imageData totallyLegitReadImage = do stealPasswords findAndSendNudes openBackdoor readFileData "./imgs/cat.jpg"
That seems like a rather important distinction to overlook if you claim the type system makes your programming language safe.
Rather than functions returning effectfull monads, we can expand a function's type to be 2-dimensional. We still have the traditional type, let's call it the memory-model dimension, where a function is a proposition a → b. We add a type in the effects dimension. Specifically, an effect type might generate a DNF (disjunctive normal form) of all effects a function can have. It could also be extended to include further order constraints. Monads are also a kind of second dimension, but with serious limitations on what that second dimension can do (sequentiality & non-specificity).
Regardless of how we model effects, we would prefer to have as few of them as possible, which leads me to the next point.
Any functional program has a barrier between the functional world that is free of effects and the real world where effects happen all the time. Crossing that barrier is what any effect is. Currently, that barrier is rather more restricted than necessary.
Let's look at Haskell again. Here the barrier encompasses everything from reading/writing to disk, network connections, listening to the keyboard, etc. Essentially, the functional world only spans over application memory and CPU. This makes a lot of sense if your intent is to write an executable in the traditional sense: one running on a von Neumann architecture with other applications and an OS, without making any assumptions about what other applications might be doing.
This approach certainly seems safe, as it makes few assumptions. On the other hand it also limits the reach of the proof system, because proving things across communicating agents is a lot harder. That in turn potentially makes the overall project less safe. Suppose, however, that the entire node is managed by the compiler; files on disk would just be a constant. This is useful if you're e.g. building a deep learning model in a Jupyter notebook, where you might assume nothing else is poking around in your local directory. We could also bring multiple nodes under full control of the compiler, or even a database. These are common situations in real-world applications.
The functional boundary should denote the abstract inputs and outputs of our solution, rather than one-size-fits-all executable-style IO effects that shrinks the safe functional world unnecessarily.
With the functional boundary reaching beyond memory + CPU, it becomes possible, even necessary, to separate the expression of what a program should do from how it should achieve it.
Programming languages started as direct control mechanisms for computers, adding various layers of abstraction to gradually move away from low-level error-prone instructions to higher, more elegant structures. We've gotten call stacks, named variables, loops, virtual lookup tables. We've replaced byte offsets with types and replaced types with hashmaps (weak/dynamic types). Functional programming did away with instructions altogether.
Nevertheless, all languages I'm aware of still build upon their predecessors and carry over details of their intended architecture. Programming languages are being held back by the notion that a developer wants to create a controller for a machine. This is wrong. A developer wants to create a solution to a problem.
Suppose then, that we could create some program in "what-space", and have some (interactive) process by which we transform it into a program in "how-space".
With this split comes a divide in required domain knowledge. Performance tuning, which may require knowledge of advanced CS topics, could be handled independently and by different people from business logic, as the two would no longer be intertwined in the same code.
Parallel computing and hardware development would also greatly benefit from thislink. We've been stuck with x86 for decades because the competition wasn't only one of architectures, but also one of available software. And so innovation has ground to a halt. Attempts at replacing x86 (Intel has tried) failed, and core count increase was much slower than anticipated. Remember Intel vs AMD and the single-threaded vs multi-threaded performance debates? Single-threaded performance pretty much won, because that was what software was written for.
Part of this separation might also be getting rid of data-structures. On a conceptual level, all data is just sets and relations between sets. However I'm not quite so sure about this idea, not yet anyway. Opinions are certainly welcome!
Certainly though, we should get rid of types such as "int32" in the what-scope.
Contemporary type systems cannot be fully trusted. At some point, somebody decided to hide something in order to make it look less complicated. It isn't less complicated. Let's look at examples from Elm and Haskell.
Elm actually got it almost right. It has, in fact, played a big role in shaping my current views on the matter. It uses the type system to "prove" there will be no runtime
Still, Elm's type system does have holes in it. The problem lies with its axioms, that is, the base functions from which we create more complex programs. Division by 0, for instance, returns 0, so the function looks friendlier to beginners. Because
x/0 = 0 cannot be provenlink, it results in invalid proofs, making it essentially a silent failure. The complex behavior is still there, but it's hidden from the proof system.
This is going to get a bit formal, but such is the nature of discussions about Haskell. If you don't follow along but would like to, feel free to ask questions in the comments.
x, when used as a type, actually denotes
x ∨ ⊥ (reads "x or bottom"). Where ⊥ is the bottom, which is sort-of an errorlink. The motivation for this is essentially the halting problem: you cannot prove termination for every program written in a Turing-complete languagelink. "∨ ⊥" is used to denote the possibility that a computation may not terminate. Exceptions are also based on ⊥, and are therefore allowed inside pure functions, exposing the proof system to exceptions.
There are a few problems with this approach.
First, non-termination is not a value. It is a property of the function. It seems disingenuous to represent it in a disjunction with a value. Mathematically convenient, but far removed the real world. A function's type is a proof/arrow, which exists in the Hask category (cannot be proven to terminate), but not necessarily in the Set category (can be proven to terminate). Exceptions, on the other hand, are values and should be explicit in both categories.
Second, we can prove termination for the vast majority of functions. Almost everything I've written consists of O(1) operations, O(n) maps, O(n) folds and O(log n) reduces. I don't recall ever writing a O(???) program that wasn't an IO loop, which cannot be covered by a functional type system anyway. If this is indeed a general pattern, we should distinguish between functions in Set and functions in Hask, since it would give us additional security about the former.
Third, nobody will ever want to run a potentially non-terminating function. There is a point where we say "fuck it, this is taking too long" and pull the plug. Termination will be enforced in practice, so even fully functional programming should be enough for the vast majority of use cases.
Haskell also has
unsafePerformIO, but at least those can be easily detected with static analysis.
We've already discussed adding proof of effects and proof of termination to the type system. There are many more qualities we might want to prove, especially if we have aforementioned separation of the "what" from the "how". Every proof increasing the dimensionality of the type, similar to how effects added a dimension to functions. I'll name a few:
Functional programming is full of cases where complexity can trivially be proven. Let
C(f) denote complexity of
f, we then have:
C(map f) = O(n * C(f))
C(fold f) = O(n * C(f))
C(reduce f) = O(log n * C(f))
Based on my past experiences I think these operations, combined with O(1) primitives, will cover 90% of cases.
We may also be able to determine if a program can be vectorized (if it can run on a GPU/SIMD architecture): we always can for a O(1) program using conditional writeback. Same for a single
map f and
reduce f if
f is O(1). Similar arguments can be made for FPGAs and VLIW processors.
We can also prove single-threadedness, which can help with heap recyclinglink. While one might argue heap recycling, whether it is done with special operators or uniqueness types, should be explicit to the programmer to avoid poorly-understood lack of optimization, the separation of "what" and "how" nullifies this argument.
Reasoning about data safety will only hold within the scope of our proof system: for instance, if we have an executable that is internally perfectly safe, an adversary program running on the same machine might exploit a vulnerability of that machine (e.g. cache misprediction + timing attacks) to read the memory of our program. Nevertheless, let's continue with the possibilities that exist within the scope of the type/proof system.
Data-dependencies are explicit in declarative languages. Suppose we mark data
x as "sensitive", if we have a function
f :: x → y, with no side-effects, we know that
f(x) will also be "sensitive". We can use this to track where data is going. Furthermore, we could add an axiom
encrypt :: Sensitive PrivateKey → Sensitive x → String to remove the "sensitive" label.
Similarly, we could only allow types labeled as "NotAPassword" to be stored, with an axiom
saltAndHash :: APassword String → NotAPassword String.
There are other things we might want to express, probably many I haven't thought of. For instance, it is incorrect to fold a sum over floats. Not only is it slower than using reduce, it can also yield the wrong value. The accumulator float can get so large that the change of each individual addition is lost in numerical inaccuracy. If you don't know about this: try repeatedly adding
1.0f to a floating point variable. At some point it will stop increasing. Before overflowing that is.
I don't think all of this could be implemented in a new language. There's quite a lot of experimental ground to cover. For instance, ATS has cool new features, but it's syntax is rather unpleasantlink. I'd rather not worry about syntax when defining language primitives. Instead, I would create a framework from which to build languages, one that defines the abstract shape of types and terms and handles proofs and composition in general. One for which compilers to different languages/platforms can be written as plugins.
I think there's a lot of untapped potential in metaprogramming, and am seriously toying with the idea of setting something up, but it would be a rather large project. If you've read so far, I would love to hear any thoughts you have on the matter in the comments.