DEV Community

Ben Lyon

Supersets and their humble beginnings

"As we all know, each language has its own vocabulary. Words in English are not the same as words in French. It is the job of a dictionary to convert words from one language to another. How does the dictionary know which language we are using and convert words to the correct language?"
Raymond Chen, Microsoft Developer, creator of The Old New Thing

Programming languages, much like spoken languages, have developed from the most basic forms of communication (machine code) into flexible, specialized, and succinct tools (C++, C#, Swift). An early programming language such as assembly could be (loosely) compared to Latin. While Latin is still taught in some university programs and isn't nearly as long-winded as machine code, our everyday conversations are held in region-defined inflective languages (compare Old English with Modern English).

Let me take a moment to explain what I mean by 'inflective' language. If I were to call someone a 'Joey', I would be assuming that you know what I mean. If you don't get the reference, I need to explain who Joey is and the source material he comes from. That sort of explicit explanation makes for a longer conversation and leaves more room for the joke to be misunderstood. Inflective language lets you use context-specific shorthand to pack additional meaning into a single word.

The example I'll bring here today is a comparison of machine code to assembly code. Douglas Hofstadter, Professor of Cognitive Science and Comparative Literature at Indiana University, once said that 'looking at a program written in machine language is vaguely comparable to looking at a DNA molecule, atom by atom.' To make this comparison concrete: machine language is a low-level language that the CPU can read directly, meaning that we are working directly with the instruction set of the CPU architecture. This code is nigh-unreadable, and writing it boils down to entering 1s and 0s by hand. While it can be run without any translation step, writing it is time-intensive and highly prone to errors. Machine code can roughly be equated to using punch cards to write your program. Pain.
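
To get a feel for those 1s and 0s, here is a minimal C++ sketch that prints a few bytes of real machine code bit by bit. The bytes are the x86 encoding of `mov eax, 1` followed by `ret`; the names `to_bits` and `kReturnOne` are made up for this illustration.

```cpp
#include <bitset>
#include <cstdint>
#include <string>
#include <vector>

// Render raw bytes as space-separated groups of eight bits,
// which is roughly what "reading machine code" looks like.
std::string to_bits(const std::vector<std::uint8_t>& bytes) {
    std::string out;
    for (std::uint8_t b : bytes) {
        if (!out.empty()) out += ' ';
        out += std::bitset<8>(b).to_string();
    }
    return out;
}

// x86 machine code for "mov eax, 1" followed by "ret".
// (Name is hypothetical; the byte values are the standard encoding.)
const std::vector<std::uint8_t> kReturnOne = {0xB8, 0x01, 0x00, 0x00, 0x00, 0xC3};
```

Calling `to_bits(kReturnOne)` yields `10111000 00000001 00000000 00000000 00000000 11000011`: six bytes of noise to a human, and a complete "return 1" routine to the CPU.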

The extension, or superset, of machine code would be an assembly language. While it is also a low-level language, it provides the human-readable key of...well, keywords (mnemonics). Keywords are a shorthand for the explicit bit patterns they stand for, and they make code easier to write and debug. It takes less time to get the desired result with these keywords, which speeds up development. The really cool thing about both of these forms of programming language is that they are written to interface directly with the CPU through the instruction set of the processor family, meaning that you can write in assembly or in machine code and still usually get the desired result.
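
The keyword-to-bytes idea can be sketched as a tiny assembler. This is only an illustration: the instruction set below (`LOAD`, `ADD`, `HALT` and their opcode values) is invented for the example and does not correspond to any real CPU, and `assemble`, `OpInfo`, and `kMnemonics` are hypothetical names.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical toy instruction set, invented for illustration only.
struct OpInfo {
    std::uint8_t opcode;  // byte the imaginary CPU would execute
    bool takes_operand;   // whether one operand byte follows the opcode
};

const std::map<std::string, OpInfo> kMnemonics = {
    {"LOAD", {0x01, true}},
    {"ADD",  {0x02, true}},
    {"HALT", {0xFF, false}},
};

// Translate one mnemonic per line into the bytes it stands for.
std::vector<std::uint8_t> assemble(const std::string& source) {
    std::vector<std::uint8_t> bytes;
    std::istringstream lines(source);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string mnemonic;
        if (!(fields >> mnemonic)) continue;  // skip blank lines
        auto it = kMnemonics.find(mnemonic);
        if (it == kMnemonics.end())
            throw std::invalid_argument("unknown mnemonic: " + mnemonic);
        bytes.push_back(it->second.opcode);
        if (it->second.takes_operand) {
            int operand = 0;
            if (!(fields >> operand) || operand < 0 || operand > 255)
                throw std::invalid_argument("bad operand for " + mnemonic);
            bytes.push_back(static_cast<std::uint8_t>(operand));
        }
    }
    return bytes;
}
```

With this toy table, `assemble("LOAD 5\nADD 7\nHALT\n")` produces the bytes `01 05 02 07 FF`. The keywords carry no extra power; they are just a friendlier spelling of the same bytes, which is why a typo in a mnemonic can be caught and reported instead of silently becoming a different instruction.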

The modern(ish) equivalent of this comparison is the superset language. A superset language provides all the features of a given language and adds expanded and enhanced functionality on top of its subset counterpart. For example, the C programming language added data types and structures that its predecessor, B, lacked. It supported dynamic memory allocation, allowing more efficient use of what memory was available to computers in 1972. However, C had no language-level data encapsulation, nor did it have any native mechanism for exception handling: error handling had to be implemented by hand. C++, first released commercially in 1985 and very nearly a superset of C, went on to provide this missing functionality natively, expanding C's capability with exception handling, classes, inheritance, polymorphism (the ability of one interface to behave in multiple forms), and data encapsulation, a key element of object-oriented programming. An additional benefit of C++ is that it is syntactically similar to C, so converting your software from one language to the other isn't nearly as monumental a task as rewriting your 1s and 0s in assembly.
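
As a rough sketch of what those additions look like in practice, the snippet below uses encapsulation, inheritance, polymorphism, and exception handling together. The `Account` and `SavingsAccount` classes and the `try_withdraw` helper are hypothetical names chosen for this example, not part of any library.

```cpp
#include <stdexcept>
#include <string>

// Encapsulation: the balance is private, so callers can only
// change it through withdraw().
class Account {
public:
    explicit Account(long cents) : balance_(cents) {}
    virtual ~Account() = default;

    void withdraw(long cents) {
        // Exception handling: report the failure instead of returning an error code.
        if (cents > balance_) throw std::runtime_error("insufficient funds");
        balance_ -= cents;
    }
    long balance() const { return balance_; }

    // Polymorphism: derived classes may override this.
    virtual std::string kind() const { return "basic"; }

private:
    long balance_;
};

// Inheritance: SavingsAccount reuses Account and overrides one behavior.
class SavingsAccount : public Account {
public:
    using Account::Account;
    std::string kind() const override { return "savings"; }
};

// Works with any Account, whatever its concrete type.
std::string try_withdraw(Account& account, long cents) {
    try {
        account.withdraw(cents);
        return account.kind() + ": ok";
    } catch (const std::runtime_error& e) {
        return account.kind() + ": " + e.what();
    }
}
```

Given `SavingsAccount s(100);`, the call `try_withdraw(s, 30)` returns `"savings: ok"`, while `try_withdraw(s, 500)` returns `"savings: insufficient funds"` and leaves the balance untouched. In C, the same design would typically need a struct with a hand-written convention for "private" fields, function pointers for the overridable behavior, and error codes checked at every call site.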

If a simple takeaway can be given, it might be that while languages do change, the current state of programming languages provides a level of flexibility in writing and maintaining our code, letting our inflective language come out and bringing new levels of capability to our work.
