Data directed programming
Alan Campora ・1 min read
Introduction
Some days ago I found a course about computer science called Structure and Interpretation of Computer Programs, and I've been learning many interesting things through it. Data directed programming is one of the topics I would like to share with you today.
In the following post we're going to create an interpreter for arithmetic operations using complex numbers. So if you like maths and Lisp keep reading the post, and if not, you will see some cool concepts anyways!
Complex numbers
First, we should be aware about how to represent complex numbers.TL;DR any complex number can be represented in two forms:
Rectangular: z = x + y * i such that x is the real part and y*i: is the imaginary part.
Polar: x = R*cos(a) and y = R*sen(a) such that R : represents the magnitude and a the angle
Equivalences: a = arct(y,x) and R = sqrt( x^{2} , y^{2} )
Therefore, we have four important constants, x, y, magnitude (R) and angle (a).
Let's start to define the operations. For this example, we're going to use the addition and multiplication.
z1+z2
realpart(z1+z2) = (realpart z1) + (realpart z2)
imaginarypart(z1+z2) = (imaginarypart z1) + (imaginarypart z2)z1*z2
magnitude(z1*z2)= (magnitude z1) * (magnitude z2)
angle(z1*z2) = (angle z1) + (angle z2)
Enough of maths! Now that we know how a complex number is represented and how the operations are done, we are ready to start the development of the interpreter!
Operations
The previous operations can be defined in Lisp in the following way
(define (+c z1 z2)
(createfromrealimaginary
(+(realpart z1) (realpart z2))
(+(imaginarypart z1) (imaginarypart z2))))
(define (*c z1 z2)
(createfrommagnitudeangle
(*(magnitude z1) (magnitude z2))
(+(angle z1) (angle z2))))
Let's code!
Representing abstract data
First, we need to take an important design decision, how are we going to represent the numbers? The rectangular form seems to be a pretty straightforward implementation. Therefore, complex numbers will be represented using an ordered pair with the realpart and the imaginarypart (x,y). To find the angle and the magnitude, we should use the trigonometric relations.
(define (createfromrealimaginary x y)
(cons x y )) ;;;cons returns a pair
(define (realpart z)
(car z )) ;;; car returns the first element
(define (imaginarypart z )
(cdr z)) ;;; cdr returns the first element
(define (createfrommagnitudeangle r a )
(cons (* r (cos a)) (* r (sin a))))
(define (angle z)
(atan (imaginarypart z) (realpart z))
(define (magnitude z)
(sqrt (+ (square (realpart z)) (square (imaginarypart z))))
The problem
The solution was running well, but after a few months our partner Hal implemented other strategy. He chose the polar form (r,a) for representing the complex numbers.
(define (createfrommagnitudeangle r a)
(cons (r a))
(define (angle z)
(car z))
(define (magnitude z)
(cdr z))
(define (createfromrealimaginary x y)
(cons (sqrt (+ (square x) (square Y)))
(atan y x)))
(define (realpart z)
(* (magnitude z) (cos(angle z))))
(define (imaginarypart z)
(* (magnitude z) (sin(angle z))))
Both solutions are ok. Non of them is better but now we have two incompatible representations because we're duplicating the definitions. Now we want both representations to coexist, without interfering with each other.A picture is worth a thousand words, so let's take a look to the system structure that we would like to create.
The interpreter is divided in 2 layers, one for the operators, and the other for the representation of complex numbers.
First approach
Tagged data
The operations in the middle abstract layer (realpart, magnitude, angle, imaginarypart) should work without being affected by the different definitions of complex numbers. Therefore, this layer should acts as an interface. But how can we achieve this behaviour? What if we tag the data with the type in the following way?
In order to manipulate this tagged data, we should implement a function that attaches the type to any content.
(define (attachtype typetag content)
(cons typetag content))
(define (content data)
(cdr data))
(define (type data)
(car data))
Where can we add this types? Constructors sounds like a good place to do it! Let's refactor both representations
;;;our package
(define (createfromrealimaginaryrectangular x y)
(attachtype 'rectangular (cons x y )))
(define (realpartrectangular z)
(car z ))
(define (imaginarypartrectangular z )
(cdr z))
(define (createfrommagnitudeanglerectangular r a )
(attachtype 'rectangular
(cons (* r (cos a)) (* r (sin a)))))
(define (anglerectangular z)
(atan (imaginarypart z) (realpart x))
(define (magnituderectangular z)
(sqrt (+ (square (realpart z)) (square (imaginarypart z))))
;;;Hal's package
(define (createfrommagnitudeanglepolar r a)
(attachtype 'polar (cons (r a)))
(define (anglepolar z)
(car z))
(define (magnitudepolar z)
(cdr z))
(define (createfromrealimaginarypolar x y)
(attachtype 'polar
(cons (sqrt (+ (square x) (square Y)))
(atan y x))))
(define (realpartpolar z)
(* (magnitude z) (cos(angle z))))
(define (imaginarypartpolar z)
(* (magnitude z) (sin(angle z))))
Two main changes have been implemented in this refactor.
 every constructor is attaching the type,
'rectangular
or'polar
at the beginning of each complex number .  changing function names to avoid naming conflicts
Regarding the operations +c *c, we can still use them. Since each representation works well in different scenarios, lets use both of them.
(define (createfromrealimaginary x y)
(createfromrealimaginaryrectangular xy))
(define (createfrommagnitudeangle r a)
(createfrommagnitudeanglepolar r a))
Dispatching on type
Once the data was tagged with each type and once we have different functions for both representations, it's possible to implement a *Manager that will take care of calling the corresponding function.
;;;Manager
(define (rectangular? z)
(eq? 'rectangular (type z)))
(define (polar? z)
(eq? 'polar (type z)))
(define (realpart z)
(cond (rectangular? (type z))
(realpartrectangular (content z))
(realpartpolar (content z))))
(define (imaginarypart z)
(cond (rectangular? (type z))
(imaginarypartrectangular (content z))
(imaginarypartpolar (content z))))
(define (angle z)
(cond (rectangular? (type z))
(anglerectangular (content z))
(anglepolar (content z))))
(define (magnitude z)
(cond (rectangular? (type z))
(magnituderectangular (content z))
(magnitudepolar (content z))))
Each function will check the type of the data received, and after that, the Manager knows which representation should be called. In each case, the parameter of the called function will be the content of the data.
This technique receives the name of Dispatching on type, we use it to keep the modularity of our system.
Anyway, this solution has two main weaknesses
 It's necessary to add many different conditions for each each representation.
 Developers must warrantee that the are no two functions with the same name.
The solution
Data directed programming
Is the manager necessary? It is responsible of calling different functions but it doesn't do much. In fact, it's acting like the following table
When it matches the type and the operator, it knows the correct function to apply.
But let's assume that our system has two functions to represent that table internally
put operatorkey typekey item

get operatorkey typekey
Using put
, each developer is responsible of mounting their functions to the system table in the following way
;;;mount functions in system
(put 'rectangular 'realpart
(lambda(z) (car z) )
(put 'rectangular 'imaginarypart
(lambda(z)(cdr z)))
(put 'rectangular 'angle
(lambda(z)(atan(imaginarypart z) (realpart z)
(put 'rectangular 'magnitude
(lambda(z)(sqrt (+ (square (realpart z)) (square (imaginarypart z)))))
(put 'rectangular 'createfromrealimaginary
(lambda( x y )
(attachtype 'rectangular (cons x y ))
(put 'rectangular 'createfrommagnitudeangle
(lambda( r a )
(attachtype 'rectangular
(cons (* r (cos a)) (* r (sin a))))))
To get completely rid of the manager, we need to create an interface with the operators.
(define (apply operator arg)
(let (fn (get (type arg) operator )))
(if (not (null? fn ))
(fn (content arg))
("error: no operator")))
(define (realpart z)
(apply 'realpart z))
(define (imaginarypart z)
(apply 'imaginarypar z))
(define (angle z)
(apply 'angle z))
(define (magnitude z)
(apply 'magnitude z))
The advantage over the previous solution, is that we don't have a central point of control, in fact the responsibility is moved to each representation. They achieve that mounting themselves to the system using the function put
. Moreover, if there are 100 hundred new representations of complex numbers, the above code won't suffer any change!
To fix the naming conflicts, now we're using lambda expressions.
Conclusion
To wrap up, for me it was the first time I heard about those concepts. Behind them, what we're really trying to do is to create a polymorphism for the representations.
I hope this example was clear enough for all of you! Normally, we're far from maths when we're working, but from my point of view, it's also interesting to mix programming and maths.
Furthermore, everything was implemented using lisp (scheme), which is a beautiful language =D (I know what you're thinking, you don't like to see many ()()()
but you get used to them, I promise).
Thanks for reading!
Datadriven programming can probably have different meanings, but here is the one I use it for: it is a style of programming in which specialization is done through data structures and not boilerplate code.
You can learn many programming languages here: hackr.io/
Examples are probably a better way to understand the concept.
Instead of writing (pseudocode):
DoSomething(a,b)
DoSomething(c,d)
DoSomething(e,f)[
You write:
things_to_do = [ [a,b], [c,d], [e,f] ]
for each thing in things_to_do
DoSomething(thing)
If you have firstclass functions, instead of writing:
switch x
case a
f(y)
case b
g(y)
You write:
switch = { a: f, b: g}
switchx
Now, more realistic examples. This one is extracted from a Ruby library I wrote, and it makes some methods defined on Arrays work on Strings. I think it illustrates the power of datadriven programming coupled with metaprogramming quite well.
methods = %w{first first= head head= init last last= tail}
methods.each{n define_method(n){*a,&b chars.to_a.send(n,*a,&b).join}}
Actually, the examples seem to be similar to the issue we're solving in the article. If I try to adapt your solution to the issue of having two different representations, it could look like this: data = {"polar": function realPart(){...}, "rectagular": function realPart(){}}, and then somehow you give that data to the switch =D.
It's great to see that data driven programming can also be implemented in contexts!
Great post! Thanks
Thanks for this! 😄
Cool post. Does the fact that polar coordinates can be the same ([1, π] = [1, 3π] = [1, 5π], etc.) complicate matters at all or is that simply handled by the trig functions?
Great question! Actually I haven't thought about that before.
In fact, that could be an issue because some users could want to represent angles with degrees and others using radians. So once again, we could try to tag the angle with the type ('radians 'degrees) and handle that internally using the technique that we saw before!