lostghost

Posted on Jul 13

Wow, everything is graph!

#programming

There are two sides to programming, two approaches.

One approach is low-level. A computer is a collection of ciruits, some more programmable than others. A process sees a virtualized memory layout, which allows it to access devices over a bus - memory being just one of the devices. And what programming is about, is learning how to shove bits down the pipe - as fast as the hardware allows it. Those who appreciate computers, myself included, can see how this view is appealing. It correlates with the imperative/procedural way of describing computations.

Image source

The second approach is mathematical. A program is a transformation of inputs to outputs - it is a function. Functions operate on data - sets, vectors, numbers, sets of numbers. Functions can also depend on their own results - thus you get a finite state machine. There are queue networks, Petri nets - robust mathematical models. This approach is more in line with the functional way of describing computations.

Image source

But there is a third approach, more oriented towards computational systems. In my view, a system is a collection of interacting parts, that together perform a higher-level functionality. And the natural way to represent such a system is with a graph, where each node is also a subgraph. This maps well to the object graph.

Image source

The object graph unites two graphs - the execution graph, and the data graph, by the virtue of objects containing both fields and methods. To me, this leads to a natural question - if what we create is a graph, why are we programming text?

So much of programming complexity is in keeping track of how things relate to each other - and by changing a part of the system, what other parts will be affected. A graph answers that question much better than text - yet so far, it's power was only successfully utilized in quick diagrams on a whiteboard, and BPMN engines. Is the program really that hard to represent as a graph, and also - is it too hard to actually program?

Image source

Issue is that it's two graphs - execution and data, which makes for a disjointed view. Could we create one compelling graph, when data has non-trivial cross-references, while also participating in different executions? As a bonus, how to represent metaprogramming - if execution also changes?

Image source

Data consists of entities referencing each other. But all references aren't equal. Some carry ownership information, others don't - like direct fields in a struct vs pointers. Some allow for mutation, others are read-only. Some allow for concurrent access, others have mutual exclusion.

With modern programming practices, the following optimal design emerges. We want to avoid shared mutable data. Ownership gives responsibility of managing the lifecyle, and memory. For mutation by non-owner - borrowing needs to take place, to ensure that a reference stays valid for the duration of the mutation, kind of like in a database transaction. With this, we can represent data as squares, and the data that they own - as squares within. Data that they simply reference - as squares without, connected with arrows.

Here, Foo owns Bar, and Bar references Baz without ownership.

Now for the call graph - methods belong to objects, making them squares within squares. Methods call each other - this is represented by arrows.

So how do we unite the two? By establishing that first there are methods, and data belongs to methods. But what about data belonging to different methods at different times? Who said our graph needs to be static - data can move as the system progresses.

And the data never needs to be in two executions at the same time - it's either copied or borrowed.

So this solves it! But wait. This is a pretty nice way of representing a snapshot of a live system - but what about actually programming a system? How do we represent a system for that goal?

What's important to realise is that programming as we know it, isn't actually programming - it's meta-programming.

Imagine a live system. It has instructions - when a new order is placed, update and display the sum of the cost. When a user creates an order, that changes the state of the system - how is this not programming?

Now imagine a program. It has instructions - when a system is started, register a repository for orders, a web form for creating them, and a processing pipeline, which responds to changes in the repository, and calculates the sum. It's a program over the live system - it is a meta-program.

Now imagine a meta-program. It has instructions - when a web form for creating orders is registered, register copies of it for the user, an internal copy for an admin, and a demo copy for the sales department. This is a program over the meta program - making it meta^2.

So in reality, you're always modifying a graph - just, it's usually not the live system graph, but a graph above it, with instructions of how to modify the underlying graph. This meta-stack can be as high as you want - and the workflow for inspecing and modifying any level of those graphs is the same. And there is no code, which is "dead", and merely a template for constructing a system - all graphs are live graphs, and you are viewing their live state, as it is.

Hopefully you found information in this post to be insightful, and it gave you a broader perspective on what programming is all about. I believe graphs to be the future, and if you agree or disagree - let me know what you think, by writing a comment. Thanks for reading.

DEV Community

Wow, everything is graph!

Top comments (0)