Should programming languages be made for IDEs rather than humans?

DrBearhands on November 01, 2018

This is something I've been thinking about for a while. Why are we creating programming languages for humans, rather than for an IDE, allowing th...
 

You're missing the point: text files are easy. They can be a little verbose sometimes, but as a 'substance of expression', if you will, they're unmatched. You can create or modify text with the single most common human-interface device metaphor in existence, which children now learn to manipulate in or even before schooling. You can read text with a million different programs and style it, filter it, format it, cut and paste and perform a billion different operations on it. A certain clunkiness in writing out equations is a small price to pay for that kind of flexibility.

If you were to "decouple" formatting and semantics like you propose, you wouldn't be able to avoid coupling the semantics to the editor instead. And there'd only be one editor, until you developed something else that could interact with your structured representation of a syntax tree. This isn't to say it can't or shouldn't be done -- APL had its day, educational tools like Scratch do well, and there are of course a variety of "no programming experience required" flowcharting and modeling languages -- but it's an idea that can only compete against text files in some pretty specific niches.

 

If you were to "decouple" formatting and semantics like you propose, you wouldn't be able to avoid coupling the semantics to the editor instead

Not necessarily, css/html is also a decoupling of formatting and semantics. The main point here being that the semantics on their own are no longer supposed to be read by humans.

 

HTML and CSS also aren't programming languages as such but markup and styling languages (although HTML5+CSS3 are evidently Turing complete if you're willing to put in the effort). But this is actually an interesting point: WYSIWYG markup editors which truly decoupled formatting or visual layout from the semantics of data binding and interaction were a thing in the 00s before everybody realized they were terrible and concentrated on building better plain-text templating languages instead.

HTML and CSS also aren't programming languages

True, but I wasn't suggesting using exactly those. In most languages, a program has a certain structure (rather like an AST, but not entirely). That is the "meaning" of a program. Where HTML has an <img> tag, an imperative program has a while loop.

Indentation, variable names, operators, import statements... those are all styling for humans. You could drop all of them and the language would be just as expressive, but not as readable.
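As a rough illustration of the layout part of this claim, here is a minimal sketch using Python's standard `ast` module. The two source strings are made up for the example. They differ in spacing, blank lines, and comments, yet parse to the same tree. Note that names are still stored in the tree, so treating them as styling would need a separate mapping that Python itself does not provide.

```python
import ast

# Two made-up programs that differ only in layout and comments.
compact = "total=0\nfor x in [1,2,3]:\n total+=x\n"
spaced = """
total = 0

for x in [1, 2, 3]:
    # add each element
    total += x
"""


def structure(source: str) -> str:
    """Return a dump of the syntax tree; line numbers and comments are not included."""
    return ast.dump(ast.parse(source))


same = structure(compact) == structure(spaced)
print(same)
```

Renaming a variable does change the dump, which is the part of the argument that current languages do not support out of the box.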

 

This separation already exists. The source code defines the semantics, and the configuration of the editor describes the styling.

 

I think looking at Smalltalk will answer a lot of your questions. Smalltalk is exactly what you're describing: a language built for an IDE. It can't be used outside the IDE, as it doesn't store its data in plain text, but rather an image format. The IDE provides all sorts of nice features and analysis to the user, and ideas from Smalltalk have influenced many other languages and IDEs since its creation in the 1970s. So why aren't we all using Smalltalk? I think the key lies in interoperability. Smalltalk is a world of its own. It doesn't interoperate well with tools that exist outside of the Smalltalk world. For example, you can't really benefit from git when you can't understand how to merge code in a binary image format. When I need to accomplish a specific task (say, some sort of build task), I need to know how to accomplish everything I need in the Smalltalk world, using its tools. Smalltalk goes against the Unix philosophy: it doesn't do one thing, it does everything because it's a mini virtual machine.

I don't have the authority to say if this is the entire reason we don't code in Smalltalk-like languages, but I think it's part of it. There are plenty of new languages trying to push programming languages in new, interactive directions (ex: Eve), but none of them have gained critical mass or mind share. There must be some intrinsic reasons that languages like this don't take off. Hope this provided some insight!

 

I'm not an expert on the question, but as far as I know the main reason for Smalltalk's failure was its licensing model: it was very expensive. I consider Ruby the closest reincarnation of Smalltalk's OOP model (which is widely used). Ruby doesn't have a "forced" IDE, though.

 

I've honestly never used Smalltalk, guess I will have to take a look at it.

 

I haven't used it much either, but what I've seen has been interesting. Pharo (pharo.org/) is a pretty modern implementation.

 

I devoted my entire PhD to the pursuit of a programming environment that goes beyond just text. At the core, I decouple what developers read and write from what's stored on disk. This enables significant enhancements to both the development UI and the program code compared to text-based systems.

I developed a prototype IDE called Envision to explore these ideas. Here is a youtube playlist with 5 short videos highlighting features you might find interesting.

In case you want to dig into the research here is the project page at ETH Zurich that has freely available PDFs of all our publications. I recommend just looking at the final PhD dissertation, as it contains extended versions of all the papers:
Envision: Reinventing the Integrated Development Environment. All the publications (and especially the dissertation) contain lots of screenshots that illustrate the main points.

As of now, Envision is somewhat on hold, as I've finished my PhD, but I hope to get back to developing it more actively soon. You can find Envision's code on GitHub.

 

Although I think we definitely should look for alternative ways to program, the concept of Envision is IMHO a dead-end. I've seen approaches like this before, one has even made it into a product I had to use at work. Everybody hated it. Here's the problem:

visually rich presentations of code

As long as code is text in your thoughts, there's no better visualisation for it than (syntax highlighted) text. Graphical representation of text is utter shit. It makes it hard to write, hard to version/diff, and hard to view anywhere else than in the IDE it has been developed in.

If you want to abstract code, don't try to display code. Slice code into packages/modules, display them as icons and orchestrate them! That's the way to go. I'm pretty sure about it. However you then have added the complexity of another abstraction layer.

 

Thanks for your comments, Thorsten. I am very curious what product you used that everybody hated, would you mind sharing?

I have myself used a few visual systems at work, such as Labview and Siemens Plant Simulator. It's true that these systems are not easy to use outside of their specific domain. Unlike these systems, Envision has been designed from the ground up to be generally applicable.

Regarding some of your other points:

code is text in your thoughts

Software developers do not "think in text". Developers think in abstractions (such as classes, functions, modules), control flow (branches, loops), data flow (steps of algorithms and data transformation), etc. Thinking in text would imply that the syntax of a language (as opposed to its semantics) may somehow influence the design of a system or a function, which is not the case.

Once we have decided on a design we have to create a corresponding program. This is mostly done as text, but doesn't have to be. As long as whatever editor we're using nicely maps to our mental model, things can work out smoothly.

(syntax highlighted) text

Syntax highlighted text is in fact a basic form of a visually rich presentation of code. One way to think of the visual aspect of Envision is syntax highlighting on steroids.

Graphical representation of text is utter shit

Graphical representation of "text" might be utter shit, but we're talking about graphical representation of programs. For example:

  • If you want to explain to a new team member how your system works, would you write a bunch of text or draw a diagram with the major components of your system?
  • If you are teaching someone a graph algorithm, would you just write a bunch of text or draw a graph diagram?

Graphical representations absolutely include text where it's the best way to communicate something. E.g., in most cases showing expressions as text is a great option.

hard to write

We have specifically designed Envision to support keyboard-based editing and shown that it is as fast as typing in a text editor.

hard to version/diff

Again, we have specifically designed a version control system that integrates with Git and provides a number of improvements over standard text-based diffs, both in terms of presentation and diff accuracy. You may want to watch the corresponding video and/or see the paper.

hard to view anywhere else than in the IDE

As long as the storage format is open and simple (both of which are the case for Envision), any number of editors can be made for it and show it in any number of ways. Take, for example, png image files. You can open/view/edit them in a number of different programs, each with its own strengths and weaknesses.

Slice code into packages/modules, display them as icons and orchestrate them!

I agree. This is part of what Envision does.

complexity of another abstraction layer

This is true, but I see it as a strength, not a weakness. This extra layer allows us to decouple the backend (program structure/code) from the frontend (editor/visualizations/text) and enables both to evolve in ways that are impossible if they are coupled.

Thank you for your detailed reply. The "tool from hell" is SwissRisk's X-Gen, a transformation tool being used at some banks. One might think that it's highly specialised for orders and trades, but it's rather generic and can handle any data as long as you're using XML. But here's the point: the design philosophy of the IDE seems to be based on the assumption that typing (like on a keyboard) is bad. Unfortunately that's the only "innovative" idea, thus the graphical building blocks that you can drag'n'drop in the IDE are in fact just representations of elements of structured programming. What does that mean? Well, in order to program something like this...

if (input.trade[0].type == "foo") { output.info = "hello world"; }

...which you can type in a matter of seconds, you will have to complete the following steps in X-Gen:

  1. create a new transformation.
  2. select the size of your workspace for the transformation. your options are "DIN A3" and "DIN A4". (yep, that's true.)
  3. drag'n'drop the [IF] square on the sheet
  4. open details of the [IF] square and enter 'input.trade[0].type == "foo"' (yes, here you're allowed to type something)
  5. drag'n'drop the [THEN] square onto the [IF] square
  6. open details of the [THEN] square and enter 'output.info = "hello world"'
  7. I also think that one needs to draw some arrows between the squares so that X-Gen knows the order of your squares (i.e. control flow) - well, we only have one square in this example that needs arrows, so 2 arrows are enough: start -> [IF[THEN]] -> end.

You see, this tool really represents text as graphics. It does not even try to step up onto the next abstraction layer, it just makes it really hard to write code by disabling typing for all the keywords.

Why did I write "code is text in your thoughts"? After a long day of coding it happens that I dream of code and then I really see text. Syntax highlighted code. But that's probably just because I stared at it for countless hours. It's not what I think when I am working on code. So yes, I was wrong. Developers think in abstractions, I totally agree with you on that.

I guess this all leads to the question: What's a good abstraction layer for graphical representation? I'm pretty sure the answer is "it depends". When documenting/presenting I like to use Visio diagrams (and ASCII diagrams) for giving an overview of the system I'm working on. However, these diagrams have very different grades of detail, depending on the importance of the components for the audience. So a shape can represent a bunch of hosts (not important) as well as a single function or REST call (important). An IDE, on the other hand, should present a consistent level of abstraction with a similar grade of detail for all (technically) equivalent components.

 

I'll definitely check that dissertation out!

 

This sounds a lot like the structure editor Facebook was working on, but extending the idea further so that the limits on what a node (e.g. a variable name) can contain are removed - very intriguing!

 

Donald Knuth described Literate Programming in 1979.

One of my coworkers at a previous company was Raymond Chen. As a grad student under Donald Knuth, he got to program using Donald Knuth's Literate Programming.

Raymond recommends against that style of programming.

 

Why does Raymond recommend against that style of programming?

 

Tooling is very poor. Debugging is very difficult. Documentation-and-code still become out-of-sync just as comments-and-code become out-of-sync, despite proximity (in both scenarios).

As yet-another-alternative to traditional text-based source code files, there are some potential novel ideas from Bret Victor, some alternative IDEs such as found in Lego Mindstorms that are visually oriented rather than text-oriented, Smalltalk style IDE where the code is in the general environment, old DEC Forté style (pre-JavaScript) IDE where the code was in a database backing store, and novel ways of having text-based source as in Light Table.

So there are people working on the leading edge. Maybe one of those concepts will become mainstream.

 

Finally, someone else who gets it! "Code as text" as a paradigm feels painfully outdated. It seems so obvious that we can do better. The comments here are a pretty good guide to what pitfalls we'd need to avoid:
don't be Scratch, interop with GitHub, find a way to leverage whatever the hell the vim power-user community is. Don't just be literate programming. It feels doable, though.

Have you ever tried the Lightbox IDE? It lets you put print statements in your code and see what they evaluate to on an example input inline, for multiple test cases, as you edit. It's a big step towards the feel of programming in a spreadsheet while using a real language.

 
 

And here I got the name wrong - it's Light Table, not Lightbox

 

Have you tried LabView?

Aren't some parts of MATLAB supposed to help with this sort of thing? (I'm not experienced with MATLAB, it's an assumption based on what I've heard about it.)

But... LabView is awful: if you need to refactor, it's very difficult. If you need to debug, forget it. Plain text code is easy and perfect for standard software development. For scientific development (i.e. mathematics, graphics), which involves complex equations, I would expect there are libraries which allow you to express math formulas as plain code?

Why would you want to use spaces in variable names?

Why is snake Vs camel case a problem?

If you have a variable which holds the value for P(a|b), then use a creative name that says what that value represents. I don't know what that expression is, so assume it's something like ambient_pressure; I don't care how its value is calculated, the name is descriptive of what it is.

A huge problem in code which I deal with on a daily basis is reading stuff like this (Python syntax):

def cv_to_ddv(cf_df):

I mean, what the hell is that? No comments, nothing, and the guy who wrote it left the company!!! I now have to go search where it's used and try to interpret its use to understand this function's purpose... So it turns out it means:
Convert compensation voltage to derived dispersion field

So the name is totally rubbish. Naming stuff is one of the hardest things in writing software because it describes what you are doing. When you look at some complex equation you will "read" it, so text should also be able to be used to describe it.

 

I've used MATLAB a little bit. I might have missed something, but I think it made the problem worse by just turning everything into non-standard operators to stay within the ASCII characters and monospace/text format.

LabView I know nothing about.

Why would you want to use spaces in variable names?

Why is snake Vs camel case a problem?

Because we create variable names composed of multiple words. fooBar is less readable than foo_bar, which is less readable than foo-bar, which is less readable than foo bar. Spaces are also easiest to write. The reason not to use spaces is that they conflict with syntax. There are also some gestalt principles (characters of a variable are close together), but there are other options for that.

If you have a variable which holds the value for P(a|b), [...]

P(a|b) is a mathematical notation for "probability of a being true given that b is true". That's a lot of words to write out. This was a real-world problem I've had, especially because I also needed P(a|¬b) and many similar variables. The resulting code was unreadable using full-length variable names.
The meaning of P(a|b) is well understood by people who have a minimal background in Bayesian statistics. So, essentially, it is the right name.

More generally speaking though, because variable names are styling for humans, you could have multiple names for the same variable and use whatever suits you most in a certain situation, e.g. short or long. Although both at the same time sounds like a very bad idea :-P.
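To make the "names are styling" idea concrete, here is a minimal sketch under made-up assumptions: the stored program refers to variables only by numeric ids, and each reader supplies their own table of display names. The `Var` class, the tuple-based expression format, and both name tables are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A variable in the hypothetical stored form: an id, no name."""
    id: int


# Stored form of Bayes' rule, var1 * var2 / var3, without any names.
stored = ("div", ("mul", Var(1), Var(2)), Var(3))

# Two interchangeable "stylesheets" for the same variables.
short_names = {1: "P(a|b)", 2: "P(b)", 3: "P(a)"}
long_names = {1: "prob_a_given_b", 2: "prob_b", 3: "prob_a"}

OPERATORS = {"mul": "*", "div": "/"}


def render(node, names):
    """Render the stored expression as text using the given display names."""
    if isinstance(node, Var):
        return names[node.id]
    op, left, right = node
    return f"({render(left, names)} {OPERATORS[op]} {render(right, names)})"


print(render(stored, short_names))
print(render(stored, long_names))
```

Both renderings describe the same stored expression; only the display table differs.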

 

In terms of readability using camel vs snake I have to disagree as I've never had an issue reading either syntax, but everyone is different, so for you it's a fair point.

The point you make about naming variables is very true; it's very difficult to map mathematical names to human readable without being obtrusively long. So again, I guess if you do a lot of it being able to use reserved chars in a variable name could be useful...

Thanks for explaining what P(a|b) is, I've never come across that before :)

 

If you're doing math, text sucks.

Text-format math is harder to read, but it is much easier to edit and write.

Navigating a one-dimensional line of text can be done with two buttons; add two more and you can add line-oriented editing, but that's optional. Editing a multi-dimensional equation, like your version of the distance formula, means you have to come up with an interface for selecting just the radical, or just the equation that you're taking the root of; you can't do that with normal arrows and drag-select.

It's the same set of problems that any kind of WYSIWYG has, now that I think of it. Just because source code is read more often than it's written doesn't mean you can completely neglect the editing experience.

 

A very good point!

I do think this is solvable. E.g. if you write LaTeX markup, there is a line-based counterpart to the formulas in the compiled PDF. If the relation between markup and formula is isomorphic, navigating with arrow keys in the formula is possible, because it is in the markup.

 

i.e. we display the code as plain text, and that's what it is.

That's what it is to us - obviously to the interpreter/compiler/running process it's something else.

What is that something else that we could use to represent a program that is at the same time a plain text file? Some form of data format that could also be read as the AST of the program...

Say, you've got an MSc in AI - you must've heard of a once-popular AI language called Lisp at some point? You know, the one where the code is the data and the data is the code? Where you can see the AST right in front of you because of the ridiculously simple syntax?
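For readers who haven't seen the "code is data" idea, here is a minimal sketch in Python rather than Lisp. It assumes a tiny made-up expression format (nested lists with the operator first) and supports only `+` and `*`. The point is that the program is an ordinary list, so other code can inspect and rewrite it before running it.

```python
import operator

# Only two operators are supported in this sketch.
OPERATORS = {"+": operator.add, "*": operator.mul}


def evaluate(expr):
    """Evaluate a nested-list expression such as ["+", 1, ["*", 2, 3]]."""
    if isinstance(expr, (int, float)):
        return expr
    op, *args = expr
    values = [evaluate(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPERATORS[op](result, value)
    return result


def replace_operator(expr, old, new):
    """Return a copy of the program with one operator swapped for another."""
    if not isinstance(expr, list):
        return expr
    op, *args = expr
    return [new if op == old else op] + [replace_operator(a, old, new) for a in args]


program = ["+", 1, ["*", 2, 3]]
rewritten = replace_operator(program, "*", "+")
print(evaluate(program), evaluate(rewritten))
```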

Image based coding is so last century sadly, despite the most popular IDE in the world being Microsoft Excel. Plain text is... well, plain. No real worries about reading and writing - or forwards compatibility. Even Smalltalk can be represented as a text file.

By removing the coupling of formatting and semantics this way, we can also use wildly different formatting (e.g. a graph editor) on different systems, but modify the same underlying semantics.

Try something from Wolfram

 

I'm not really sure what you're trying to say here...

 

Fair enough... reading it back I'm not sure either!

 

The main argument against designing a language for an IDE is that it ties it to the IDE. If you spot a simple typo on GitHub you can't change it (if you're even able to view it on GitHub). If you're on a new device you have a text editor, but you don't have an IDE. If you're on a phone you can't read it. You can't share code snippets with others unless they're on the same IDE.

While it's an interesting idea (although not an entirely new one), there are too many practical arguments stacked against it.

I think the more useful approach would be to let the IDE offer features to display the code better (such as a formula view for your above example, which could format it more legibly) that integrate into the editing process. One could argue that that's a large part of what IDEs are about.

 

IDE features would definitely solve some of these problems, but I feel this would be more a case of treating a symptom than fixing the underlying problem.

You're absolutely right about practicality, but sometimes you have to take a step back to make a step forward.

 

This exists, in the form of LabVIEW, Simulink, and other such tools. They are even widely used in industry because they're excellent for expressing complicated mathematics.

However, as others have pointed out, they all suffer in some way or another from portability issues. Until we have an equivalent of the ASCII and Unicode standards for these model-based languages, they simply aren't very likely to gain the kind of traction text-based languages have. The tooling will never get to that point without an open and popular standard.

Additionally, for that kind of programming to kick off, something must be able to take up the role that C and C++ currently fill as the backbone of close-to-hardware software. I don't think it's impossible, but I also don't think there's enough incentive to put in that effort right now. Modern tooling has made C/C++ programming highly productive, and getting a non-text language up to feature-parity and portability-parity would be unjustifiably expensive right now.

 

Most IDEs are bloated enough already. Moving away from raw text to something with a massive amount of overhead is only going to create other problems. A job I had years ago involved programming in a visual system represented by a tree, where you dragged and dropped objects and edited their properties. It was truly the most awful thing ever. Imagine reading a textbook in the format of a pop-up book.

 

Intentional programming and, to some extent, model-driven architecture (MDA) went there. The idea with intentional programming was that you had intents that were composed of other intents all the way down to machine code. Instead of defining a language with syntax, you had abstract syntax trees that you could translate into less abstract syntax trees using transformations. Editing your intent could be done in a UI, using a DSL, or anything. At the high end you'd have DSLs, complex UIs, etc. transforming stuff to running code. MDA was more or less the same idea but focused around UML and the UML meta language. The former never really got out of the prototype stage. Charles Simonyi (one of the early MS millionaires and inventor of the infamous Hungarian notation) has apparently been running intentsoft.com/ for the last 20 years or so, but they are not currently promoting any products. MDA was briefly popular and I recall some seriously misguided projects that were using it (shudder).

Also, Eclipse is a descendant of VisualAge, which was a Smalltalk, and later, Java IDE that actually stored code in a DB instead of files to facilitate working with the code in a structured way. Smalltalk of course always worked that way.

Eclipse later went back to storing files, but they did do something cool, which was to incrementally maintain an abstract syntax tree of the code base. This was the compromise that allowed them to stop using a database, and this is why it has its own incremental Java compiler: a normal compiler would be way too slow. The Eclipse compiler tends to only lag behind what you type by a few hundred ms. This is also what enables complex refactorings, quick fixes, and other sophisticated AST transformations.

IntelliJ of course does very similar things, except they never really figured out incrementally updating the AST and instead implemented a lot of the same refactorings a bit differently. As a consequence, it is a lot slower doing things that are essentially happening in real time on Eclipse.

E.g. launching a unit test after a 1 character change on IntelliJ can be painfully slow whereas it is instant in Eclipse. I regularly end up waiting seconds or even tens of seconds for IntelliJ to catch up. It also loses the plot entirely quite often, meaning that its view of the world gets out of sync with what is actually in the files. At least it lies to me frequently about things being broken, or worse, not broken. Also, you need to frequently do manual refreshes and rebuilds, and I occasionally just rage quit it so that it can figure out reality on startup. Eclipse always felt a lot faster and more robust in this respect.
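The general idea of keeping syntax trees up to date without reparsing everything can be sketched as follows. This is not how Eclipse implements it; it is a deliberately coarse illustration that works per file and uses Python's `ast` module, and the `IncrementalIndex` class and file names are invented for the example. A real IDE works at a much finer granularity than whole files.

```python
import ast
import hashlib


class IncrementalIndex:
    """Keep one syntax tree per file and reparse only files whose text changed."""

    def __init__(self):
        self._trees = {}       # path -> (content digest, parsed tree)
        self.parse_count = 0   # how many real parses have happened

    def update(self, path: str, source: str) -> ast.AST:
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        cached = self._trees.get(path)
        if cached is not None and cached[0] == digest:
            return cached[1]   # unchanged file: reuse the existing tree
        tree = ast.parse(source)
        self.parse_count += 1
        self._trees[path] = (digest, tree)
        return tree


index = IncrementalIndex()
index.update("a.py", "x = 1\n")
index.update("b.py", "y = 2\n")
index.update("a.py", "x = 1\n")   # same text, served from the cache
index.update("b.py", "y = 3\n")   # changed text, parsed again
print(index.parse_count)
```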

 

All languages should be homoiconic! In which case, you can easily (meta)program some representation.

And I know it's not exactly what you're describing, but check out Ballerina. You can auto-generate sequence diagrams from it (it's "cloud-native", so the assumption is that most of your code can be represented as such).

 

Now, I think a compromise would be a kind of graphical interface that generates "normal" code, say C#. That way, especially for beginners, it is way easier to use this new IDE. However, if you want to, you can still use a traditional IDE like Visual Studio, if you feel more comfortable with that. In order to implement another visual IDE, you also don't require a new standardized save format: you just use the plain code.

 

For illustrative purposes say you wish to have both text and graph based 'views'. The two views need to be isomorphic. Generally, programming languages aren't made with that requirement in mind.
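A small sketch of why this matters, using Python's `ast` module (`ast.unparse` needs Python 3.9 or later). Here the syntax tree stands in for a second "view" of a made-up snippet. Going from text to tree and back keeps the structure but loses the comment and the original spacing, so the two views are not isomorphic.

```python
import ast

# Made-up snippet with a comment and irregular spacing.
source = "x = 1  # the answer\ny   =   x+2\n"

tree = ast.parse(source)
regenerated = ast.unparse(tree)   # text view rebuilt from the tree view

structure_preserved = ast.dump(ast.parse(regenerated)) == ast.dump(tree)
text_preserved = regenerated == source
comment_preserved = "#" in regenerated

print(structure_preserved, text_preserved, comment_preserved)
```

A language designed for several views would have to store comments and similar details in the shared representation so that no view loses them.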

That said, UE4 has something similar that allows interop between graph and text, although I forgot the name.

 

As someone who prefers text editors like vim and emacs to a full IDE, I hate the idea. I understand that other people love IDEs, but I would refuse to use a language that requires me to use an IDE. Not to mention that it would be difficult to use tools like grep, sed, etc. with such a language.

 

My feeling is that IDEs should be built to express ideas and transform them into code for people who find text based code difficult or inconvenient.

Writing a language to meet requirements of an IDE would limit the potential of the language.

We have too many languages and runtime environments already. Stop writing languages; just build IDEs that make languages more accessible or less verbose, or so you don't have to use semicolons, or so you can be whitespace delimited.

Whatever you don't like about whatever language you work in, it can be hidden from you with an IDE, and we don't have to reimplement whole library ecosystems to achieve what can currently be done with little work in any major language.

Also, the more languages and runtime environments we have, the larger our global threat surface and the more bugs we won't find until it's too late.

 

Mathematica notebooks are very much what you are describing, but, as several people have said already, this approach has several drawbacks. Most importantly, they are hard to version control, but they also tie you to a specific editor which may be laggy or unpredictable. In practice, I tend to avoid notebooks and instead write scripts when I want to use Mathematica.

Perhaps a partial solution is to use more Unicode; in Julia, for example, you can write things like √(x^2+y^2). Some other languages also support Unicode to varying degrees.
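As one example of "varying degrees", Python accepts Unicode letters in identifiers but not mathematical symbols such as √, so the square root still has to be spelled out. A minimal sketch (the function name and arguments are made up):

```python
import math


def distance(x1, y1, x2, y2):
    """Euclidean distance, using Greek letters as ordinary identifiers."""
    Δx = x2 - x1
    Δy = y2 - y1
    return math.sqrt(Δx**2 + Δy**2)


π = math.pi  # Greek letters are valid names; a symbol like √ is not

print(distance(0, 0, 3, 4))
```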

 

So what if we changed our files from being plain text to something richer and more structured?

No, please don't.

If you depend on a colorful IDE to understand your code, you probably made a mistake.

Or write in Piet.

 

What makes you say that? Obviously you must have some different experience from mine that makes you dislike this idea.

 

My experiences involve being highly annoyed by rainbow colors in my terminal. :-)

Colors are not really part of the argument.

I'm talking more about what .doc files are to .txt files, but noting that part of the final look should be determined by personal settings. Like in the text vs graph editor example. Or in your case, having or not having syntax highlighting.

 

Plain text is still the only universally available format.

It will load on any device you can possibly think of, and there's only one single thing that can cause incompatibilities between platforms and that's line endings which is a problem that is trivially solved.
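To show how small the line-ending fix is, here is a minimal sketch. The function name is made up; it converts Windows (CRLF) and old Mac (CR) line endings to Unix (LF).

```python
def normalize_newlines(text: str) -> str:
    """Convert CRLF and lone CR line endings to LF."""
    # Replace CRLF first so that its CR is not converted a second time.
    return text.replace("\r\n", "\n").replace("\r", "\n")


mixed = "first\r\nsecond\rthird\n"
print(normalize_newlines(mixed).splitlines())
```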

Tabs vs spaces? CamelCase or snake_case? These are all inconsequential in the grand scheme of things. If I write some plain text code on this phone then I can share it with literally anyone in the world and it will load and compile/run on their machine.

The only way a new code format can become a replacement for plain text is by achieving the same ubiquitousness.

Good luck with that :)

 

Plain text code still needs to be executed some way, so it isn't really ubiquitous. And even if it were, why would you need that in practice?

 

You want to take development to the next level. Perhaps you need something more, a new challenge 😉

 

I think what you need is a complex and fancy plug-in on the IDE side. Programming languages should stay in a raw state; the IDE and its extensions are what turn them into a polished product that is ready to use. It is easier to add a plug-in to an IDE than to take something back out of a programming language.

 
 

I'm currently learning iOS development with Xcode and Swift, and the whole code-generator stuff feels kinda ugly to me.