This article was published on Monday, October 8, 2018 by Eytan Manor @ The Guild Blog
The idea for this article popped into my mind while working on my
Webflow/React transpiler. All I wanted to do was take a JS
code string and transform it in such a way that globals won't be redefined if they are already defined:
/* In */
foo = 'foo'
/* Out */
if (typeof window.foo === 'undefined') window.foo = 'foo'
At the beginning I thought I could do that with some help from a regular expression; but boy was I
wrong.
A regular expression is simply not enough because it ignores the concept of scoped variables
completely and treats the code as if it were plain text. To determine whether a variable is global,
what we need to ask ourselves is: Is this variable already declared in the current scope or one of
its parent scopes?
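To make the problem concrete, here is a minimal sketch of the regex approach going wrong. The pattern and the helper name guardAssignments are made up for this illustration; they are not taken from the transpiler.

```javascript
// Naive attempt: guard every bare `name = value` line with a regex.
// The pattern and helper name are hypothetical, for illustration only.
function guardAssignments(code) {
  return code.replace(
    /^(\s*)([A-Za-z_$][\w$]*)\s*=\s*(.+)$/gm,
    (match, indent, name, value) =>
      `${indent}if (typeof window.${name} === 'undefined') window.${name} = ${value}`
  )
}

const input = [
  "foo = 'foo'",
  'function init() {',
  '  var bar',
  "  bar = 'bar'", // `bar` is local to init(), so it must NOT be rewritten
  '}'
].join('\n')

const output = guardAssignments(input)
console.log(output)
```

The regex rewrites both assignments, including the one to the local variable bar, because nothing in the text of that line says which scope bar belongs to.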
The way to answer such a question is to break the code down into nodes, where each node
represents a part of our code and all the nodes are connected to each other in a relational
manner. This whole node formation is called an AST (abstract syntax tree), which can be used to
easily look up scopes, variables, and other elements which are related to our code.
For example, the following function can be broken down into such a tree:
function foo(x) {
  if (x > 10) {
    var a = 2
    return a * x
  }
  return x + 10
}
Example taken from Lachezar Nickolov's article about JS ASTs.
Obviously, breaking down our code into nodes is not a walk in the park. Luckily, we have a tool
called Babel which already does that.
Babel to the Rescue
Babel is a project which originally started to transform the latest ES20XX syntax into ES5 syntax
for better browser compatibility. As the ECMAScript committee keeps updating the standards of the
ECMAScript language, plug-ins provide an excellent and maintainable solution to easily update the
Babel compiler's behavior.
Babel is made out of numerous components which work together to bring the latest ECMAScript syntax
to life. Specifically, the code transformation flow works with the following components, in the
following order:
- The parser parses the code string into a data structure called an AST (abstract syntax tree) using @babel/parser.
- The AST is manipulated by pre-defined plug-ins which use @babel/traverse.
- The AST is transformed back into code using @babel/generator.
Now you have a better understanding of Babel and you can actually understand what's happening when
you build a plug-in; and speaking of which, how do we do that?
Building and Using a Babel Plugin
First I would like us to understand Babel's generated AST, as this is essential for building the
plug-in: the plug-in is going to manipulate the AST, and therefore we need to understand it.
If you go to astexplorer.net you'll find an amazing tool that will transform code into an AST.
Let's take the code foo = "foo" as an example. The generated AST should look like so:
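A trimmed-down, hand-written approximation of that tree can be written as a plain object. The node types are the ones @babel/parser reports (other parsers label the right-hand node Literal rather than StringLiteral); real output also carries positions and other metadata, which are left out here. The describe helper is hypothetical and only prints an outline.

```javascript
// Hand-written, simplified tree for: foo = "foo"
const ast = {
  type: 'ExpressionStatement',
  expression: {
    type: 'AssignmentExpression',
    operator: '=',
    left: { type: 'Identifier', name: 'foo' },
    right: { type: 'StringLiteral', value: 'foo' }
  }
}

// Hypothetical helper: collect one indented line per node, depth first.
function describe(node, depth = 0, lines = []) {
  lines.push('  '.repeat(depth) + node.type)
  for (const value of Object.values(node)) {
    if (value && typeof value === 'object' && typeof value.type === 'string') {
      describe(value, depth + 1, lines)
    }
  }
  return lines
}

const outline = describe(ast)
console.log(outline.join('\n'))
```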
As you can see, each node in the tree represents a part of the code, and it's recursive. The
assignment expression foo = "foo" uses the operator =, the operand on the left is an identifier
named foo, and the operand on the right is a literal with the value "foo". So that's how it goes:
each part of the code can be represented as a node which is made out of other nodes, and each node
has a type and additional properties based on its type.
Now let's say that we would like to change the value "foo" to "bar". Hypothetically speaking,
what we would have to do is grab the corresponding literal node and change its value from
"foo" to "bar". Let's take this simple example and turn it into a plug-in.
I've prepared a quick template project that you can use to quickly write plug-ins and test them by
transforming sample code. The project can be downloaded by cloning this repository. The project
contains the following files:
- in.js - includes the input code that we would like to transform.
- out.js - includes the output of the code we've just transformed.
- transform.js - takes the code in in.js, transforms it, and writes the new code to out.js.
- plugin.js - the transformation plug-in that will be applied throughout the transformation.
To implement our plug-in, copy the following content and paste it in the in.js file:
foo = 'foo'
and the following content to the plugin.js file:
module.exports = () => {
  return {
    visitor: {
      // Called for every assignment, e.g. foo = 'foo'
      AssignmentExpression(path) {
        if (
          path.node.left.type === 'Identifier' &&
          path.node.left.name === 'foo' &&
          // @babel/parser labels string literals as StringLiteral
          path.node.right.type === 'StringLiteral' &&
          path.node.right.value === 'foo'
        ) {
          path.node.right.value = 'bar'
        }
      }
    }
  }
}
To initiate the transformation, simply run node transform.js. Now open the out.js file, and you
should see the following content:
foo = 'bar'
The visitor property is where the actual manipulation of the AST should be done. Babel walks
through the tree and runs the handlers for each specified node type. In our case, whenever the
visitor encounters a node of type AssignmentExpression, it will replace the right operand with
"bar" in case we assign the "foo" value to foo. We can add a manipulation handler for any node
type that we want: it can be AssignmentExpression, Identifier, Literal, or even Program,
which is the root node of the AST.
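To get a feeling for what "runs the handlers for each specified node type" means, here is a toy walker over a hand-written tree. It is not Babel's implementation: the walk function is made up for this sketch, and its handlers receive the raw node, whereas Babel hands a path object to each handler.

```javascript
// Toy traversal, NOT Babel's implementation:
// call visitor[node.type] for every node in the tree.
function walk(node, visitor) {
  if (!node || typeof node.type !== 'string') return
  const handler = visitor[node.type]
  if (handler) handler(node)
  for (const value of Object.values(node)) {
    if (Array.isArray(value)) value.forEach((child) => walk(child, visitor))
    else if (value && typeof value === 'object') walk(value, visitor)
  }
}

// Hand-written, simplified tree for: foo = 'foo'
const program = {
  type: 'Program',
  body: [
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'AssignmentExpression',
        operator: '=',
        left: { type: 'Identifier', name: 'foo' },
        right: { type: 'StringLiteral', value: 'foo' }
      }
    }
  ]
}

const visited = []
walk(program, {
  Program(node) {
    visited.push(node.type)
  },
  AssignmentExpression(node) {
    visited.push(node.type)
    if (node.left.name === 'foo' && node.right.value === 'foo') {
      node.right.value = 'bar'
    }
  }
})

console.log(visited)
console.log(program.body[0].expression.right.value)
```

Only the node types that have a handler trigger any work; every other node is simply walked through.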
So going back to the main purpose for which we gathered, I'll first provide you with a reminder:
/* In */
foo = 'foo'
/* Out */
if (typeof window.foo === 'undefined') window.foo = 'foo'
We will first take all global assignments and turn them into member assignment expressions of
window to prevent confusion and potential misunderstandings. I like to start by first exploring the
desired AST output, and then writing the plug-in itself accordingly:
module.exports = ({ types: t }) => {
  return {
    visitor: {
      AssignmentExpression(path) {
        if (path.node.left.type === 'Identifier' && !path.scope.hasBinding(path.node.left.name)) {
          path.node.left = t.memberExpression(
            t.identifier('window'),
            t.identifier(path.node.left.name)
          )
        }
      }
    }
  }
}
I will now introduce you to 2 new concepts that I haven't mentioned before but are used in the
plug-in above:
- The types object is a Lodash-esque utility library for AST nodes. It contains methods for building, validating, and converting AST nodes. It's useful for cleaning up AST logic with well thought out utility methods. Its builder methods are the camel-cased equivalents of the node types (for example, t.memberExpression for MemberExpression). All types are defined in @babel/types, and furthermore, I recommend looking at the source code as you build the plug-in in order to find the desired node creators' signatures, since most of it is not documented. More information regarding types can be found here.
- Just like the types object, the scope object contains utilities which are related to the current node's scope. It can check whether a variable is defined or not, generate unique variable IDs, or rename variables. In the plug-in above, we used the hasBinding() method to check whether the identifier has a corresponding declared variable or not by climbing up the AST. More information regarding scope can be found here.
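The idea of "climbing up" to look for a declaration can be pictured with a toy scope chain. This is only a model of the concept, not Babel's Scope class; ToyScope and its declare method are invented for this sketch.

```javascript
// Toy model of scope lookup, NOT Babel's Scope class.
class ToyScope {
  constructor(parent = null) {
    this.parent = parent
    this.bindings = new Set()
  }

  declare(name) {
    this.bindings.add(name)
  }

  // True if `name` is declared in this scope or in any enclosing scope.
  hasBinding(name) {
    for (let scope = this; scope; scope = scope.parent) {
      if (scope.bindings.has(name)) return true
    }
    return false
  }
}

const programScope = new ToyScope()
programScope.declare('counter') // e.g. `let counter` at the top level

const functionScope = new ToyScope(programScope)
functionScope.declare('bar') // e.g. `var bar` inside a function

console.log(functionScope.hasBinding('bar')) // true
console.log(functionScope.hasBinding('counter')) // true
console.log(functionScope.hasBinding('foo')) // false
console.log(programScope.hasBinding('bar')) // false
```

An assignment to foo from inside the function finds no binding anywhere up the chain, which is exactly the case the plug-in treats as a global.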
Now we will add the missing piece of the puzzle, which is transforming assignment expressions into
conditional assignment expressions. So we want to turn this code:
window.foo = 'foo'
Into this code:
if (typeof window.foo === 'undefined') window.foo = 'foo'
If you investigate that code's AST you'll see that we're dealing with 3 new node types:
- UnaryExpression - typeof window.foo
- BinaryExpression - ... === 'undefined'
- IfStatement - if (...)
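The way these three node types nest can be sketched with plain objects. This is a simplified, hand-written approximation of the tree for the target code; the variable names are arbitrary, and positions and optional fields are left out.

```javascript
// window.foo
const target = {
  type: 'MemberExpression',
  object: { type: 'Identifier', name: 'window' },
  property: { type: 'Identifier', name: 'foo' }
}

// typeof window.foo
const typeofNode = { type: 'UnaryExpression', operator: 'typeof', argument: target }

// typeof window.foo === 'undefined'
const testNode = {
  type: 'BinaryExpression',
  operator: '===',
  left: typeofNode,
  right: { type: 'StringLiteral', value: 'undefined' }
}

// if (typeof window.foo === 'undefined') window.foo = 'foo'
const ifNode = {
  type: 'IfStatement',
  test: testNode,
  consequent: {
    type: 'ExpressionStatement',
    expression: {
      type: 'AssignmentExpression',
      operator: '=',
      left: target,
      right: { type: 'StringLiteral', value: 'foo' }
    }
  }
}

console.log(ifNode.test.type, ifNode.test.left.type)
```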
Notice how each node is composed out of the one above it. Accordingly, we will update our plug-in.
We will keep the old logic, where we turn global variables into members of window, and on top of
that, we will make it conditional with the IfStatement:
module.exports = ({ types: t }) => {
  return {
    visitor: {
      AssignmentExpression(path) {
        // Step 1: turn assignments to undeclared variables into window.<name> assignments
        if (path.node.left.type === 'Identifier' && !path.scope.hasBinding(path.node.left.name)) {
          path.node.left = t.memberExpression(
            t.identifier('window'),
            t.identifier(path.node.left.name)
          )
        }
        // Step 2: wrap window.<name> assignments in a typeof ... === 'undefined' check
        if (path.node.left.type === 'MemberExpression' && path.node.left.object.name === 'window') {
          const typeofNode = t.unaryExpression('typeof', path.node.left)
          const isNodeUndefined = t.binaryExpression(
            '===',
            typeofNode,
            t.stringLiteral('undefined')
          )
          const ifNodeUndefined = t.ifStatement(isNodeUndefined, t.expressionStatement(path.node))
          path.replaceWith(ifNodeUndefined)
          // Don't traverse into the node we have just created
          path.skip()
        }
      }
    }
  }
}
So basically, what we do here is check whether we are dealing with a window member assignment
expression, and if so we create the conditional statement and replace the current node with it.
A few notes:
- Without getting fancy with the explanation, I've created a nested ExpressionStatement inside the IfStatement simply because this is what is expected of me, according to the AST.
- I've used the replaceWith method to replace the current node with the newly created one. More about manipulation methods like replaceWith can be found here.
- Normally the AssignmentExpression handler should be called again, because technically I've created a new node of that type when we called the replaceWith method, but since I don't want to run another traversal for newly created nodes, I've called the skip method; otherwise I would have had an infinite recursion. More about visiting methods like skip can be found here.
So there you go, by now the plug-in should be complete. It's not the most complex plug-in out there,
but it's definitely a good example for this intro that will give you a good basis for further
plug-ins that you'll build down the road.
As a recap, whenever you forget for any reason how a plug-in works, go through this article. As you
work on the plug-in itself, investigate the desired AST outcome at astexplorer.net, and for API
docs I recommend working with this wonderful handbook.