Author: Denis Redozubov
I have been asked several times why I prefer programming languages such as Haskell and Rust, given that they are not the most widely used or popular tools. I wrote this post to demystify what goes on in my head when I think about technology selection.
Developing software that must run for a long time and meet a defined level of reliability is, in a sense, similar to a game of chess. In both cases, it is hard for the human brain to comprehend all the possible scenarios. Experience is of great importance, and every move or choice can be critical. The resemblance goes further: just like chess, development is very much positional, i.e. a whole series of moves may be preparation for a maneuver that wins a single pawn. It may be merely one pawn, but in a serious game it can become a considerable advantage. As in a positional game over the chessboard, the development and evolution of large-scale projects involve constant decision-making aimed at solving major tasks or implementing the project requirements. The effect of all decisions, even minor ones, tends to accumulate by the endgame, that is, by the time the software product is in operation. The difference that complicates matters is that, unlike in chess, you can’t find the best moves in software development just by running a computer engine. That is why we have to make many decisions that lead us to the goal incrementally, and every means of improving our position is worth using.
In a nutshell, decisions can be divided into several categories: architectural, procedural, and instrumental. Architectural decisions determine how we structure the project. Procedural decisions define how we organize the work process and assure the quality and correctness of the implementation. Instrumental decisions, in turn, determine which tools the development team uses to achieve the goal. Today, end-to-end software development relies on a large number of tools: you need to formalize the requirements and the development process, write and test the code, assemble the release, etc. Among all these tasks, the selection of a programming language can be of the greatest importance because it determines the following set of parameters:
- Performance baseline.
- Specifics of software distribution and operation, for example, whether an interpreter is required or whether static linking is possible.
- Ecosystem of reusable libraries and components. I’d like to note that it is not only the number of libraries that matters but also the quality of those relevant for you.
- Support for parallel, concurrent, and asynchronous execution, which may be important for many systems.
- The difficulty people face when learning the technology, which significantly influences both the language community and the retraining of developers.
- Language expressiveness, which is somewhat subjective but is still felt by developers.
Additionally, the selection of a programming language can strongly influence how development is structured. For instance, the tools in the language ecosystem may determine how unit tests are written and what they cover. A good infrastructure for property tests can encourage a team to move in that direction, while the lack of a good infrastructure for unit tests can make them harder to write and maintain.
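To make the idea of a property test concrete, here is a minimal Haskell sketch. The two properties and their names are invented for illustration and do not come from any particular project. Each property is an ordinary function that must return `True` for every input.

```haskell
import Data.List (sort)

-- Property: sorting an already sorted list changes nothing (idempotence).
prop_sortIdempotent :: [Int] -> Bool
prop_sortIdempotent xs = sort (sort xs) == sort xs

-- Property: sorting never changes the number of elements.
prop_sortKeepsLength :: [Int] -> Bool
prop_sortKeepsLength xs = length (sort xs) == length xs
```

Assuming the QuickCheck package is installed, calling `quickCheck prop_sortIdempotent` from `Test.QuickCheck` would check the property against randomly generated lists instead of a handful of hand-picked cases.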
The tools also influence architecture-related issues: reuse of system modules depends on how easy it is, conceptually, to separate units and structure the code. For instance, working explicitly with effect systems enables better code generalization and makes it possible to ensure that a unit of code doesn’t perform any input/output, such as network or disk operations. This lets us reason about both safety and architecture.
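As a rough illustration of effects being visible in types, consider the following Haskell sketch. The function names are hypothetical. The point is only that the first signature contains no `IO`, so (barring deliberate escape hatches such as `unsafePerformIO`) the function cannot read files or talk to the network, while the second one advertises its side effect to every caller.

```haskell
import Data.Char (toUpper)

-- Pure: there is no IO in the type, so this function can only
-- compute a result from its argument.
normalizeName :: String -> String
normalizeName = map toUpper . unwords . words

-- Effectful: IO in the result type tells every caller that
-- this function touches the outside world (here, the disk).
loadName :: FilePath -> IO String
loadName path = normalizeName <$> readFile path
```

Keeping the pure part separate means it can be reused and tested without any file system at hand; for example, `normalizeName "  ada   lovelace "` evaluates to `"ADA LOVELACE"`.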
Considering this, you should be aware that the right choice of programming language for your project and team may have far-reaching implications. Keeping the chess analogy in mind, every minor advantage counts as a point in favor of a language and can play a significant role in large-scale development. Note that I’m talking about selecting development tools in situations with no strict constraints on technology selection, such as a large existing ecosystem already written in a certain language. At Typeable, we are guided by the following criteria when choosing general-purpose languages:
- The programming language should support static typing. This allows the developer to shorten each iteration of code modification and validation. It also significantly reduces the number of bugs, both in terms of functional requirements and software safety.
- Algebraic data types: once you start using them, it’s difficult to overestimate their influence. This is a simple feature that is absolutely necessary for modeling invariants. Sum types are so indispensable that selecting a language where you have to simulate them with other constructs means creating obstacles at the very first step.
- Flexible support for writing and running multithreaded programs. Languages with a GIL (Global Interpreter Lock) fail to meet this requirement from the very beginning. It’s desirable to be able to maximize hardware utilization while working with sufficiently high-level abstractions.
- A sufficient ecosystem of libraries. We also assess their quality, subjectively. We don’t need everything to be available as a library, but basic things such as bindings to popular databases should be.
- Clear minds in the community of developers who work with the language. A developer we would like to see on our team should be interested in CS and development. The opposite is “easy-to-learn” technologies that tempt people to join IT for the sake of easy money, which greatly dilutes the workforce.
- We should have programming languages in our toolbox that allow us to create software that meets strict time and memory requirements.
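To illustrate the point about algebraic data types from the list above, here is a minimal Haskell sketch of invariant modeling with a sum type. The `Payment` type and its states are invented for this example. Each constructor carries exactly the data that makes sense for its state, so a refund reason cannot exist for a payment that was never refunded.

```haskell
-- Hypothetical payment states, used only for illustration.
data Payment
  = Pending                  -- nothing has been paid yet
  | Paid Integer             -- amount in cents
  | Refunded Integer String  -- amount in cents and the refund reason

-- Every state is handled explicitly by pattern matching.
describe :: Payment -> String
describe Pending = "awaiting payment"
describe (Paid cents) = "paid " ++ show cents ++ " cents"
describe (Refunded cents reason) =
  "refunded " ++ show cents ++ " cents: " ++ reason
```

If a new constructor is added later, compiling with GHC’s `-Wall` flag reports every function whose pattern match no longer covers all the cases, which is much harder to get when sum types are simulated with flags and nullable fields.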
Considering all of the above, our toolbox should allow us to hold a steady position in any project that comes our way. Going back to the chess analogy, these are the principles that let us play a highly positional game. A positional game is aimed at creating a long-term position that opens up possibilities for the player and minimizes weaknesses. It is the opposite of an attack-oriented or “sharp” game, which is associated with higher risks: the attacking player strives to end the game before the opponent can build a strong defense. Sharp development includes programming contests, MVPs for marketing experiments, many data science tasks and, in many cases, software written for Computer Science publications. They are similar in that they usually don’t require long-term support; they just have to work for a limited period of time. A positional game, on the other hand, means long-term play where maintainability and updateability are the key characteristics. This is exactly what we do, and we need a solid foundation to be sure that the software we write and update can operate for a long time. Though such projects can also start as an MVP, they are based on quite different assumptions.
So why do we select technology based on exactly these considerations? There are several reasons. First of all, it’s a good idea to set aside fashion and trendiness in technology in order to improve predictability over a long timespan. A time-proven compiler with an active community may be a conservative option, but it is a reliable one, in contrast to the flashy new options that pop up every year. Surely, some of them will move from the latter category to the former, but we will know this only later, probably in a number of years. Instead of following fashion trends, we strive to rely on fundamental Computer Science and the large body of research that has been applied in the programming languages we use. For example, type theory is a discipline close to both mathematics and CS that deals with the fundamental issues of formalizing requirements. This is exactly what we need to write software. Besides, it embodies the combined experience of other people engaged in "exact" sciences, and I believe it would be absurd to ignore this experience. It makes more sense to take such a discipline as a basis than to use nothing or to rely on a subjective opinion based on the personal experience of a particular individual.
Secondly, we look for programming languages and compilers that embrace as many of our principles as possible. This is why, in addition to our favorite Haskell, we’ve put Rust in our toolbox. For real-time requirements and strict constraints on memory utilization, we need something rather low-level. The typing strictness in C is still far from perfect, so if we can use Rust for such tasks, we prefer to do so.
The third reason is that we create software primarily for our clients, and we’d like to protect them from our biases. That’s why, when we select a tool, we can’t exceed the risk level agreed with the client. But even under these conditions we use rather marginal technologies such as GHCJS, because a thorough analysis of the pros and cons still produced an attractive picture for us and our clients. We have already written about how we arrived at this decision: Elm vs. Reflex.
All means and theoretical justifications are worth using when you work with large code bases and complex software, because you have to keep this complexity in check somehow. Our idea of a correct approach is to protect every pawn and to improve our position gradually and carefully, so that the project stays in a good and stable state until the moment it plays a pivotal role in our clients’ businesses.