You've heard the term, and you've probably even used them, but what exactly is a closure? A closure is a combination of code and data, and closures have become a staple of modern programming. They are a natural functional feature, quite useful even if you don't fully understand them. Let's take a closer look and demystify this curious construct.
A simple example
We can start with a simple local form:
defn pine = -> {
    var x = 1
    defn incr = -> {
        print(x)
        x = x + 1
    }
    incr() // prints 1
    incr() // prints 2
}
Here we define a function pine. Inside that function we define another function called incr. The x variable, used inside incr, is not defined there: it's one of the local variables of the pine function. The function incr is a closure of its code and the variables in the surrounding scope.
By calling incr twice we see that x persists between calls. Moreover, x is shared between the scopes:
incr() // 1
print(x) // 2
x = 4
incr() // 4
Both incr and the pine scope are referring to the same x. This depends a bit on the language. In Leaf, shown here, this sharing is the default. In C++, and other languages, the closure can instead clone its variables; in this case the value is not shared, but a copy is taken when the function is created. For example:
defn pine = -> {
    var x = 5
    // a copy of `x` is made now
    defn incr = -> clone {
        print(x)
        x = x + 1
    }
    incr() // 5
    print(x) // still 5 (would be 6 if shared)
    x = 1
    incr() // 6 (would be 1 if shared)
    print(x) // 1
}
Subsequent calls to incr use the same x, but the one in pine is independent of it. This creates two different x variables. These are both types of closures, as they combine code and a data scope. The results are quite different, so it's important to know which type you're dealing with.
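The same contrast can be sketched in C++, where the capture list states which behaviour you get. This is only an illustration under my own assumptions: the function names shared_capture and cloned_capture are made up for this example, and the closure returns the value instead of printing it so the results can be collected.

```cpp
#include <vector>

// Shared: [&x] captures x by reference, so the closure and the
// enclosing scope work on the same variable.
std::vector<int> shared_capture() {
    int x = 5;
    auto incr = [&x]() { int seen = x; x = x + 1; return seen; };
    std::vector<int> out;
    out.push_back(incr()); // 5
    out.push_back(x);      // 6, the closure changed our x
    x = 1;
    out.push_back(incr()); // 1, the closure sees our assignment
    out.push_back(x);      // 2
    return out;
}

// Cloned: [x] copies x when the lambda is created. `mutable` lets the
// closure modify its own copy, which persists between calls.
std::vector<int> cloned_capture() {
    int x = 5;
    auto incr = [x]() mutable { int seen = x; x = x + 1; return seen; };
    std::vector<int> out;
    out.push_back(incr()); // 5
    out.push_back(x);      // still 5
    x = 1;
    out.push_back(incr()); // 6, from the closure's private copy
    out.push_back(x);      // 1
    return out;
}
```

The cloned version yields the same 5, 5, 6, 1 sequence as the Leaf clone example above.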
Higher-order functions
When we combine closures with higher-order functions we get interesting new possibilities. For example, the data scope for a closure can persist outside of the function that creates it.
defn addr = (x) -> {
    var f = (y) -> {
        return x + y
    }
    return f
}
var a5 = addr(5)
print( a5(1) ) // 6
print( a5(3) ) // 8
var a7 = addr(7)
print( a7(1) ) // 8
print( a7(3) ) // 10
addr(5) is creating a function that adds 5 to a number. We aren't calling that function from within addr though; instead we assign it to the variable a5 and then call it. When addr(7) is called it creates a new environment: its x value is not the same as the one from a5.
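For comparison, here is roughly how addr could look in C++. It's a sketch, not a translation of how Leaf does it: I'm assuming std::function as the return type and a by-copy capture, so each returned closure carries its own x.

```cpp
#include <functional>

// Returns a closure that adds x to its argument.
std::function<int(int)> addr(int x) {
    // x is captured by copy, so the closure keeps its own x
    // after addr has returned.
    return [x](int y) { return x + y; };
}
```

Calling addr(5) and addr(7) gives two independent closures, one holding 5 and the other holding 7.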
A closure can be passed to a function that knows nothing about the surrounding scope.
defn key_sort( data : list｢any｣, key : string ) {
    sort( data, (a,b) -> {
        return a#key < b#key
    })
}
The sort function expects a comparator for two objects. It does not know how that comparison is done, or that it might be accessing data from the enclosing scope. This code demonstrates that the closure, which we pass to sort, is truly enclosing the code and data; it's not just some syntax trickery.
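A C++ sketch of the same idea, using std::sort from the standard library. The Record alias and the choice of a string-to-int map are assumptions of mine to keep the example small; they stand in for the untyped objects in the Leaf version.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical record type: named integer fields.
using Record = std::map<std::string, int>;

// Sorts records by the value stored under `key`.
// Assumes every record has that key; at() throws otherwise.
void key_sort(std::vector<Record>& data, const std::string& key) {
    // std::sort knows nothing about `key`; the comparator closure
    // carries it in from this scope.
    std::sort(data.begin(), data.end(),
              [&key](const Record& a, const Record& b) {
                  return a.at(key) < b.at(key);
              });
}
```

Capturing key by reference is safe here because the closure is only used while key_sort is still running.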
A compiler may well opt to use trickery in many cases where closures are used. The most generic approach, one that works for sort and return, has a bit of overhead. The simpler approach, shown with incr earlier, can often be compiled without any runtime closure support.
A technical nit for the highly interested
The above explains basically what a closure is, and how it can be used. If you're into language theory, you might not be satisfied with some details. If you're not into language theory, feel free to stop reading now and go about having fun with closures.
Let's go back to this code:
defn pine = -> {
    var x = 1
    defn incr = -> {
        print(x)
        x = x + 1
    }
    incr() // 1
    incr() // 2
}
I called incr the closure, which may not be technically true; this depends on the language. In Leaf it's not yet a closure, it's a function that takes a "context". It's not much different from this function in C:
#include <stdio.h>

/* The variables of pine that incr needs. */
struct pine_context {
    int x;
};

void incr( struct pine_context * pctx ) {
    printf( "%d\n", pctx->x );
    pctx->x++;
}
It's when the function is called, incr(), that a closure is made. The compiler sees that the function is expecting a pine_context environment; it finds one in the immediately enclosing scope, that of pine, which seems logical. It binds the context to the function and then makes the call.
This is one place where a compiler could instead use trickery. It avoids creating any closure object and instead calls the incr function with the pine_context. This optimization allows using local closures without overhead cost.
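To make the two paths concrete, here is a hand-written C++ sketch of what such a lowering might look like. It is not what the Leaf compiler emits; incr_closure and pine_local are names I invented, and incr returns the value instead of printing it.

```cpp
#include <vector>

struct pine_context {
    int x;
};

// The raw function: it only works when handed a context.
int incr(pine_context* pctx) {
    int printed = pctx->x;
    pctx->x++;
    return printed;
}

// A hand-rolled closure object: the code plus the bound context.
struct incr_closure {
    int (*fn)(pine_context*);
    pine_context* ctx;
    int operator()() const { return fn(ctx); }
};

std::vector<int> pine_local() {
    pine_context ctx{1};
    // Generic path: bind the context to the function, then call.
    incr_closure bound{&incr, &ctx};
    int first = bound();
    // Shortcut: no closure object, pass the context directly.
    int second = incr(&ctx);
    return {first, second, ctx.x};
}
```

Both calls touch the same context, so pine_local returns 1, 2 and a final x of 3.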
This binding also needs to happen if incr escapes the scope where it's declared.
defn pine = -> {
    var x = 1
    defn incr = -> {
        print(x)
        x = x + 1
    }
    return incr
}
var q = pine()
q() // 1
q() // 2
When the compiler encounters the return statement, it realizes it needs to bind the incr function to the pine_context before it returns. After the return statement, there will be no further opportunity to find the context.
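In C++ the programmer has to arrange this by hand, because a local variable dies when its function returns. One possible sketch, assuming a heap-allocated context held by std::shared_ptr (pine_escaping is a hypothetical name, and the closure returns the value rather than printing it):

```cpp
#include <functional>
#include <memory>

std::function<int()> pine_escaping() {
    // The context lives on the heap so it can outlive this call.
    auto x = std::make_shared<int>(1);
    // Copying the shared_ptr into the closure binds the context to it;
    // the int is freed once the last closure copy is destroyed.
    return [x]() {
        int printed = *x;
        *x = *x + 1;
        return printed;
    };
}
```

Capturing a plain local by reference here would leave the returned closure pointing at a variable that no longer exists.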
Isn't a closure just a class and instance in disguise? Good observation. Yes: there are usually syntactic differences, and some behavioural ones, but they both model the same relationship between data and code. In the Leaf compiler, they mostly share the same processing code.
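As a small illustration of that relationship, here is the adder from earlier written as a C++ class. Adder is a name I made up for this sketch; the member x_ plays the role of the captured variable and the call operator plays the role of the function body.

```cpp
// The data (x_) and the code (operator()) of the adder closure,
// spelled out as a class.
class Adder {
public:
    explicit Adder(int x) : x_(x) {}
    int operator()(int y) const { return x_ + y; }

private:
    int x_;
};
```

An instance such as Adder a5(5) can then be called like a function: a5(1) gives 6.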
Just the surfaces
Languages can behave quite differently with closures. It's important to understand that fundamentally a closure is just a combination of a function and some data. Knowing whether the scope is being shared or cloned is an important detail. Where the language allows it, as C++ does, you also need to know which parts of the context are shared and which are cloned.
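A short C++ sketch of such a mixed context, with names of my own choosing: total is shared with the enclosing scope, while step is cloned when the closure is created.

```cpp
int mixed_capture() {
    int total = 0;
    int step = 2;
    // &total: shared with this scope. step: copied into the closure.
    auto bump = [&total, step]() { total += step; };
    bump();     // total is now 2
    step = 100; // does not affect the closure's copy of step
    bump();     // total is now 4
    return total;
}
```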
I've strayed a bit into Leaf-specific behaviour, just to give an idea of the complexities involved. If you'd like to dive even deeper into how closures are implemented, then let me know.