Alan Campora

Posted on May 31, 2017

Data directed programming

#computerscience #lisp #scheme

Introduction

Some days ago I found a course about computer science called Structure and Interpretation of Computer Programs, and I've been learning many interesting things through it. Data directed programming is one of the topics I would like to share with you today.

In the following post we're going to create an interpreter for arithmetic operations using complex numbers. So if you like maths and Lisp keep reading the post, and if not, you will see some cool concepts anyways!

Complex numbers

First, we should be aware about how to represent complex numbers.TL;DR any complex number can be represented in two forms:

Rectangular: z = x + y * i such that x is the real part and y*i: is the imaginary part.

Polar: x = R*cos(a) and y = R*sen(a) such that R : represents the magnitude and a the angle

Equivalences: a = arct(y,x) and R = sqrt( x^2 , y^2 )

Therefore, we have four important constants, x, y, magnitude (R) and angle (a).

Let's start to define the operations. For this example, we're going to use the addition and multiplication.

z1+z2
real-part(z1+z2) = (real-part z1) + (real-part z2)
imaginary-part(z1+z2) = (imaginary-part z1) + (imaginary-part z2)

z1*z2
magnitude(z1*z2)= (magnitude z1) * (magnitude z2)
angle(z1*z2) = (angle z1) + (angle z2)

Enough of maths! Now that we know how a complex number is represented and how the operations are done, we are ready to start the development of the interpreter!

Operations

The previous operations can be defined in Lisp in the following way


(define (+c z1 z2) 
    (create-from-real-imaginary
        (+(real-part z1) (real-part z2)) 
        (+(imaginary-part z1) (imaginary-part z2))))

(define (*c z1 z2) 
    (create-from-magnitude-angle
        (*(magnitude z1) (magnitude z2)) 
        (+(angle z1) (angle z2))))

Let's code!

Representing abstract data

First, we need to take an important design decision, how are we going to represent the numbers? The rectangular form seems to be a pretty straightforward implementation. Therefore, complex numbers will be represented using an ordered pair with the real-part and the imaginary-part (x,y). To find the angle and the magnitude, we should use the trigonometric relations.


(define (create-from-real-imaginary x y) 
    (cons x y )) ;;;cons returns a pair

(define (real-part z) 
    (car z )) ;;; car returns the first element

(define (imaginary-part z ) 
    (cdr z)) ;;; cdr returns the first element

(define (create-from-magnitude-angle r a ) 
    (cons (* r (cos a)) (* r (sin a))))

(define (angle z)
    (atan (imaginary-part z) (real-part z))

(define (magnitude z) 
    (sqrt (+ (square (real-part z)) (square (imaginary-part z))))

The problem

The solution was running well, but after a few months our partner Hal implemented other strategy. He chose the polar form (r,a) for representing the complex numbers.

(define (create-from-magnitude-angle r a) 
    (cons (r a)) 

(define (angle z) 
    (car z))

(define (magnitude z)
    (cdr z))

(define (create-from-real-imaginary x y) 
    (cons (sqrt (+ (square x) (square Y)))
          (atan y x)))

(define (real-part z) 
    (* (magnitude z) (cos(angle z)))) 

(define (imaginary-part z) 
    (* (magnitude z) (sin(angle z))))

Both solutions are ok. Non of them is better but now we have two incompatible representations because we're duplicating the definitions. Now we want both representations to coexist, without interfering with each other.A picture is worth a thousand words, so let's take a look to the system structure that we would like to create.

The interpreter is divided in 2 layers, one for the operators, and the other for the representation of complex numbers.

First approach

Tagged data

The operations in the middle abstract layer (real-part, magnitude, angle, imaginary-part) should work without being affected by the different definitions of complex numbers. Therefore, this layer should acts as an interface. But how can we achieve this behaviour? What if we tag the data with the type in the following way?

In order to manipulate this tagged data, we should implement a function that attaches the type to any content.

(define (attach-type type-tag content)
    (cons type-tag content))

(define (content data)
    (cdr data))

(define (type data)
    (car data))

Where can we add this types? Constructors sounds like a good place to do it! Let's refactor both representations


;;;our package 

(define (create-from-real-imaginary-rectangular x y) 
    (attach-type 'rectangular (cons x y )))

(define (real-part-rectangular z) 
    (car z )) 

(define (imaginary-part-rectangular z ) 
    (cdr z)) 

(define (create-from-magnitude-angle-rectangular r a ) 
    (attach-type 'rectangular 
        (cons (* r (cos a)) (* r (sin a)))))

(define (angle-rectangular z)
    (atan (imaginary-part z) (real-part x))

(define (magnitude-rectangular z) 
    (sqrt (+ (square (real-part z)) (square (imaginary-part z))))

;;;Hal's package

(define (create-from-magnitude-angle-polar r a) 
    (attach-type 'polar (cons (r a)))

(define (angle-polar z) 
    (car z))

(define (magnitude-polar z)
    (cdr z))

(define (create-from-real-imaginary-polar x y) 
    (attach-type 'polar 
        (cons (sqrt (+ (square x) (square Y))) 
              (atan y x))))

(define (real-part-polar z) 
    (* (magnitude z) (cos(angle z)))) 

(define (imaginary-part-polar z) 
    (* (magnitude z) (sin(angle z))))

Two main changes have been implemented in this refactor.

every constructor is attaching the type, 'rectangular or 'polar at the beginning of each complex number .
changing function names to avoid naming conflicts

Regarding the operations +c *c, we can still use them. Since each representation works well in different scenarios, lets use both of them.

(define (create-from-real-imaginary x y)
    (create-from-real-imaginary-rectangular xy))

(define (create-from-magnitude-angle r a)
    (create-from-magnitude-angle-polar r a))

Dispatching on type

Once the data was tagged with each type and once we have different functions for both representations, it's possible to implement a *Manager that will take care of calling the corresponding function.

;;;Manager
(define (rectangular? z)
    (eq? 'rectangular (type z)))

(define (polar? z)
    (eq? 'polar (type z)))

(define (real-part z)
    (cond (rectangular? (type z))
          (real-part-rectangular (content z))
          (real-part-polar (content z))))  

(define (imaginary-part z)
    (cond (rectangular? (type z))
          (imaginary-part-rectangular (content z))
          (imaginary-part-polar (content z))))

(define (angle z)
    (cond (rectangular? (type z))
          (angle-rectangular (content z))
          (angle-polar (content z))))

(define (magnitude z)
    (cond (rectangular? (type z))
          (magnitude-rectangular (content z))
          (magnitude-polar (content z))))

Each function will check the type of the data received, and after that, the Manager knows which representation should be called. In each case, the parameter of the called function will be the content of the data.
This technique receives the name of Dispatching on type, we use it to keep the modularity of our system.
Anyway, this solution has two main weaknesses

It's necessary to add many different conditions for each each representation.
Developers must warrantee that the are no two functions with the same name.

The solution

Data directed programming

Is the manager necessary? It is responsible of calling different functions but it doesn't do much. In fact, it's acting like the following table

When it matches the type and the operator, it knows the correct function to apply.

But let's assume that our system has two functions to represent that table internally

put operator-key type-key item
get operator-key type-key

Using put, each developer is responsible of mounting their functions to the system table in the following way

    ;;;mount functions in system

    (put 'rectangular 'real-part 
        (lambda(z) (car z) )
    (put 'rectangular 'imaginary-part 
        (lambda(z)(cdr z)))
    (put 'rectangular 'angle 
        (lambda(z)(atan(imaginary-part z) (real-part z)
    (put 'rectangular 'magnitude 
        (lambda(z)(sqrt (+ (square (real-part z)) (square (imaginary-part z)))))

    (put 'rectangular 'create-from-real-imaginary
        (lambda( x y ) 
            (attach-type 'rectangular (cons x y ))

    (put 'rectangular 'create-from-magnitude-angle
        (lambda( r a ) 
            (attach-type 'rectangular 
                (cons (* r (cos a)) (* r (sin a))))))

To get completely rid of the manager, we need to create an interface with the operators.

(define (apply operator arg)
    (let (fn (get (type arg) operator )))
    (if (not (null? fn ))
        (fn (content arg))
        ("error: no operator")))

(define (real-part z) 
    (apply 'real-part z))
(define (imaginary-part z) 
    (apply 'imaginary-par z))
(define (angle z) 
    (apply 'angle z))
(define (magnitude z) 
    (apply 'magnitude z))

The advantage over the previous solution, is that we don't have a central point of control, in fact the responsibility is moved to each representation. They achieve that mounting themselves to the system using the function put. Moreover, if there are 100 hundred new representations of complex numbers, the above code won't suffer any change!
To fix the naming conflicts, now we're using lambda expressions.

Conclusion

To wrap up, for me it was the first time I heard about those concepts. Behind them, what we're really trying to do is to create a polymorphism for the representations.

I hope this example was clear enough for all of you! Normally, we're far from maths when we're working, but from my point of view, it's also interesting to mix programming and maths.

Furthermore, everything was implemented using lisp (scheme), which is a beautiful language =D (I know what you're thinking, you don't like to see many ()()() but you get used to them, I promise).

Thanks for reading!

Top comments (6)

nihar • Jun 1 '17

Data-driven programming can probably have different meanings, but here is the one I use it for: it is a style of programming in which specialization is done through data structures and not boilerplate code.
You can learn many programming languages here: hackr.io/

Examples are probably a better way to understand the concept.

Instead of writing (pseudocode):

DoSomething(a,b)
DoSomething(c,d)
DoSomething(e,f)[

You write:

things_to_do = [ [a,b], [c,d], [e,f] ]
for each thing in things_to_do
DoSomething(thing)

If you have first-class functions, instead of writing:

switch x
case a
f(y)
case b
g(y)

You write:

switch = { a: f, b: g}
switchx

Now, more realistic examples. This one is extracted from a Ruby library I wrote, and it makes some methods defined on Arrays work on Strings. I think it illustrates the power of data-driven programming coupled with metaprogramming quite well.

methods = %w{first first= head head= init last last= tail}
methods.each{|n| define_method(n){|*a,&b| chars.to_a.send(n,*a,&b).join}}

Alan Campora • Jun 1 '17 • Edited

Actually, the examples seem to be similar to the issue we're solving in the article. If I try to adapt your solution to the issue of having two different representations, it could look like this: data = {"polar": function realPart(){...}, "rectagular": function realPart(){}}, and then somehow you give that data to the switch =D.
It's great to see that data driven programming can also be implemented in contexts!

Verito Black • May 31 '17

Great post! Thanks

Walker Harrison • May 31 '17

Cool post. Does the fact that polar coordinates can be the same ([1, π] = [1, 3π] = [1, 5π], etc.) complicate matters at all or is that simply handled by the trig functions?

Alan Campora • May 31 '17 • Edited

Great question! Actually I haven't thought about that before.
In fact, that could be an issue because some users could want to represent angles with degrees and others using radians. So once again, we could try to tag the angle with the type ('radians 'degrees) and handle that internally using the technique that we saw before!

Ben Halpern • May 31 '17

Thanks for this! 😄