Build Custom JavaScript Code Transformation Tools: Complete AST Guide for Plugin Development

#programming #devto #javascript #softwareengineering

Let's talk about making your own tools that change JavaScript code before it runs. I build these kinds of tools often, and I want to show you how it's done, step by step. Think of it like teaching a robot to read your code, understand it, and then rewrite it to be better or do new things. We'll start from the absolute beginning.

Everything begins with something called an Abstract Syntax Tree, or AST. It sounds complex, but it's just a way to turn your code into a structured map. Instead of seeing let x = 5; as text, the AST sees it as a "VariableDeclaration" node with a kind of "let," containing a "VariableDeclarator" that has an "Identifier" (x) and a "NumericLiteral" (5).

You need a parser to make this tree. I usually use @babel/parser. You give it a string of code, and it gives you back this detailed tree object.

const parser = require('@babel/parser');
const code = `let greeting = "hello, world";`;
const ast = parser.parse(code, { sourceType: 'module' });

console.log(JSON.stringify(ast, null, 2));
// You'll see a big JSON object describing the tree structure.
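To make the shape concrete, here is a trimmed, hand-written approximation of the tree for let x = 5;. Treat it as a sketch under assumptions: real @babel/parser output wraps this in File and Program nodes and adds position fields such as start, end, and loc, which I've left out. The printTypes helper is a hypothetical name used only for this illustration.

```javascript
// Hand-written, trimmed approximation of the node for `let x = 5;`.
// Real parser output also carries position data (start, end, loc).
const declaration = {
  type: 'VariableDeclaration',
  kind: 'let',
  declarations: [
    {
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'x' },
      init: { type: 'NumericLiteral', value: 5 }
    }
  ]
};

// Hypothetical helper: list node types, indented by depth in the tree.
function printTypes(node, depth = 0, lines = []) {
  if (Array.isArray(node)) {
    node.forEach(child => printTypes(child, depth, lines));
  } else if (node && typeof node.type === 'string') {
    lines.push('  '.repeat(depth) + node.type);
    Object.values(node).forEach(value => {
      if (value && typeof value === 'object') printTypes(value, depth + 1, lines);
    });
  }
  return lines;
}

console.log(printTypes(declaration).join('\n'));
// VariableDeclaration
//   VariableDeclarator
//     Identifier
//     NumericLiteral
```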

Once you have the tree, you need to walk through it and make changes. This is where the visitor pattern comes in. You tell your tool, "When you see a node of this type, run my function." It's like leaving instructions at every intersection in a maze.

const traverse = require('@babel/traverse').default;
const t = require('@babel/types');

// Let's find all string literals and log them.
traverse(ast, {
  StringLiteral(path) {
    console.log(`Found a string: "${path.node.value}" at line ${path.node.loc.start.line}`);
    // The 'path' object gives us access to the node and methods to change it.
  }
});
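If the visitor pattern still feels abstract, a tiny dependency-free walker shows the idea. This is only a sketch of the concept, not how @babel/traverse is implemented: it has no path objects, no scopes, and no way to replace nodes. The walk function and the hand-written tree are my own illustrative assumptions.

```javascript
// Minimal visitor-pattern walker over plain AST-shaped objects.
// Not Babel's implementation: no paths, no scopes, no replacement.
function walk(node, visitors) {
  if (Array.isArray(node)) {
    node.forEach(child => walk(child, visitors));
    return;
  }
  if (!node || typeof node.type !== 'string') return;

  // Run the handler registered for this node type, if any.
  const handler = visitors[node.type];
  if (handler) handler(node);

  // Recurse into every property that could hold child nodes.
  for (const value of Object.values(node)) {
    if (value && typeof value === 'object') walk(value, visitors);
  }
}

// Hand-written tree for: let greeting = "hello, world";
const tree = {
  type: 'VariableDeclaration',
  kind: 'let',
  declarations: [{
    type: 'VariableDeclarator',
    id: { type: 'Identifier', name: 'greeting' },
    init: { type: 'StringLiteral', value: 'hello, world' }
  }]
};

const found = [];
walk(tree, {
  StringLiteral(node) { found.push(node.value); },
  Identifier(node) { found.push(node.name); }
});
console.log(found); // [ 'greeting', 'hello, world' ]
```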

The real power is in changing the tree. Let's say I want to replace every string "hello, world" with "hola, mundo". I can do that by checking the node's value and replacing it.

traverse(ast, {
  StringLiteral(path) {
    if (path.node.value === 'hello, world') {
      path.replaceWith(t.stringLiteral('hola, mundo'));
      // t.stringLiteral() is a builder from @babel/types that creates a new node.
    }
  }
});

This is the core of any plugin. You find the patterns you care about and swap in new nodes.

Now, let's get more organized. A simple visitor is fine, but for bigger tools, I create a class to manage different transformations. This keeps my code clean.

class SimpleTransformer {
  constructor() {
    this.visitors = {};
  }

  addVisitor(nodeType, handler) {
    this.visitors[nodeType] = handler;
  }

  transform(ast) {
    traverse(ast, this.visitors);
    return ast;
  }
}

const myTransformer = new SimpleTransformer();
myTransformer.addVisitor('StringLiteral', (path) => {
  path.node.value = path.node.value.toUpperCase();
});

const newAST = myTransformer.transform(ast);
// traverse mutates the tree in place, so newAST is the same object as ast.
// All strings in the code are now uppercase.
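One thing to watch with the class above: addVisitor stores a single handler per node type, so registering a second StringLiteral handler replaces the first. Here is a hedged sketch of one way around that, keeping a list of handlers per type and merging them into one visitor object. VisitorRegistry and the stand-in fakePath object are hypothetical names for this example; in a real tool the result of toVisitor() would be handed to traverse.

```javascript
// Sketch: keep a list of handlers per node type and merge them on demand.
class VisitorRegistry {
  constructor() {
    this.handlers = new Map(); // node type -> array of handler functions
  }

  add(nodeType, handler) {
    if (!this.handlers.has(nodeType)) this.handlers.set(nodeType, []);
    this.handlers.get(nodeType).push(handler);
    return this; // allow chaining
  }

  // Build one visitor object whose methods call every registered handler in order.
  toVisitor() {
    const visitor = {};
    for (const [nodeType, list] of this.handlers) {
      visitor[nodeType] = (path) => list.forEach(handler => handler(path));
    }
    return visitor;
  }
}

// Two handlers for the same node type; both run, in registration order.
const registry = new VisitorRegistry()
  .add('StringLiteral', path => { path.node.value = path.node.value.trim(); })
  .add('StringLiteral', path => { path.node.value = path.node.value.toUpperCase(); });

// Stand-in for the path object that a real traversal would supply.
const fakePath = { node: { type: 'StringLiteral', value: '  hello  ' } };
registry.toVisitor().StringLiteral(fakePath);
console.log(fakePath.node.value); // HELLO
```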

A common need is to understand where variables come from. This is called scope tracking. When you see console.log(x), you need to know if x was declared with let in this function, or if it's a global variable. Building a scope tracker helps with refactoring and finding bugs.

I build a system of connected scopes. Each function or block creates a new scope that sits inside its parent scope.

class SimpleScope {
  constructor(parent = null) {
    this.parent = parent;
    this.bindings = new Map(); // Stores variable names and their info
  }

  addBinding(name, node) {
    this.bindings.set(name, { node, references: [] });
  }

  findBinding(name) {
    if (this.bindings.has(name)) {
      return this.bindings.get(name);
    }
    if (this.parent) {
      return this.parent.findBinding(name);
    }
    return null; // Not found anywhere
  }
}

// Let's use it while traversing the AST.
let currentScope = new SimpleScope();

traverse(ast, {
  FunctionDeclaration(path) {
    // Entering a function creates a new scope.
    const functionScope = new SimpleScope(currentScope);
    const oldScope = currentScope;
    currentScope = functionScope;

    // The function name itself is a binding in the parent scope.
    oldScope.addBinding(path.node.id.name, path.node.id);

    // Parameters are bindings in the function's own scope.
    path.node.params.forEach(param => {
      if (t.isIdentifier(param)) {
        currentScope.addBinding(param.name, param);
      }
    });

    // Traverse the function body
    path.traverse({
      VariableDeclarator(childPath) {
        if (t.isIdentifier(childPath.node.id)) {
          currentScope.addBinding(childPath.node.id.name, childPath.node.id);
        }
      },
      Identifier(childPath) {
        // Count only real references. isReferencedIdentifier() skips declaration
        // names, parameters, and member names such as `log` in console.log,
        // while still counting identifiers used in an initializer (let a = b).
        if (childPath.isReferencedIdentifier()) {
          const binding = currentScope.findBinding(childPath.node.name);
          if (binding) {
            binding.references.push(childPath.node);
            console.log(`Variable "${childPath.node.name}" is used here.`);
          }
        }
      }
    });

    // Exit the function scope
    currentScope = oldScope;
    path.skip(); // Don't traverse this node's children again
  }
});
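The key behavior of a scope chain is shadowing: a lookup should stop at the nearest scope that declares the name. Here is a small standalone sketch of that lookup, independent of any AST. The Scope class below is a simplified stand-in I wrote for this example (with hypothetical declare and lookup methods), not the SimpleScope class itself.

```javascript
// Simplified scope chain: each scope knows its parent and its own bindings.
class Scope {
  constructor(parent = null) {
    this.parent = parent;
    this.bindings = new Map();
  }

  declare(name, kind) {
    this.bindings.set(name, { kind });
  }

  // Walk outward from this scope until some scope declares the name.
  lookup(name) {
    for (let scope = this; scope; scope = scope.parent) {
      if (scope.bindings.has(name)) return scope.bindings.get(name);
    }
    return null;
  }
}

// Models:
//   let count = 0;
//   let step = 1;
//   function tick(count) { return count + step; }
const globalScope = new Scope();
globalScope.declare('count', 'let');
globalScope.declare('step', 'let');
globalScope.declare('tick', 'function');

const tickScope = new Scope(globalScope);
tickScope.declare('count', 'param');

console.log(tickScope.lookup('count').kind);   // param (inner binding shadows the outer one)
console.log(tickScope.lookup('step').kind);    // let (found by walking up to the parent)
console.log(tickScope.lookup('missing'));      // null
console.log(globalScope.lookup('count').kind); // let
```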

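With scope information, I can build a plugin that finds unused variables. If a binding has zero references, it's a candidate for deletion, as long as its initializer has no side effects (removing const x = setup(); would also remove the call).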

function findUnusedVariables(ast) {
  const unused = [];
  const allBindings = new Map();

  // First pass: collect all bindings
  traverse(ast, {
    VariableDeclarator(path) {
      if (t.isIdentifier(path.node.id)) {
        allBindings.set(path.node.id.name, { node: path.node, refCount: 0 });
      }
    },
    FunctionDeclaration(path) {
      allBindings.set(path.node.id.name, { node: path.node.id, refCount: 0 });
    }
  });

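  // Second pass: count references.
  // Note: bindings are keyed by name only, so shadowed variables in different
  // scopes are counted together; treat the result as a first approximation.
  traverse(ast, {
    Identifier(path) {
      const name = path.node.name;
      // isReferencedIdentifier() excludes declaration names but still counts
      // uses inside initializers, e.g. `b` in `const a = b;`.
      if (allBindings.has(name) && path.isReferencedIdentifier()) {
        allBindings.get(name).refCount++;
      }
    }
  });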

  // Find bindings with zero references
  for (const [name, data] of allBindings) {
    if (data.refCount === 0) {
      unused.push({ name, node: data.node });
    }
  }
  return unused;
}
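Counting by name alone gets confused when an inner variable shadows an outer one. As a rough sketch of the scope-aware version, here is the same idea on a hand-built scope tree instead of a real AST. The makeScope and findUnused names, and the shape of the scope objects, are assumptions made for this example only.

```javascript
// Each scope lists the names it declares and the names referenced directly in it.
function makeScope(declared, referenced, children = []) {
  return { declared, referenced, children };
}

function findUnused(root) {
  const unused = [];

  function visit(scope, outerChain, depth) {
    const own = new Map(scope.declared.map(name => [name, false])); // name -> used?
    const chain = [own, ...outerChain]; // innermost scope first

    // Resolve each reference to the nearest enclosing declaration.
    for (const name of scope.referenced) {
      const owner = chain.find(bindings => bindings.has(name));
      if (owner) owner.set(name, true);
    }

    // Nested scopes may still use names declared here, so visit them first.
    scope.children.forEach(child => visit(child, chain, depth + 1));

    for (const [name, used] of own) {
      if (!used) unused.push({ name, depth });
    }
  }

  visit(root, [], 0);
  return unused;
}

// Models:
//   let total = 0;
//   let label = 'sum';                 // never referenced
//   function add(total) { return 1; }  // parameter shadows outer `total`, never used
//   add(total);
const program = makeScope(
  ['total', 'label', 'add'],
  ['add', 'total'],
  [makeScope(['total'], [])]
);

console.log(findUnused(program));
// [ { name: 'total', depth: 1 }, { name: 'label', depth: 0 } ]
```

A name-only count would treat both total variables as used, because the name appears in add(total). Tracking usage per declaration catches the unused parameter.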

After changing the AST, I need to turn it back into code I can run. This is code generation. Babel provides a generator for this, but sometimes I need custom formatting.

const generate = require('@babel/generator').default;

const transformedCode = generate(newAST).code;
console.log(transformedCode);

If I'm making big changes, I also need source maps. A source map is a file that links the new code back to the original. This is crucial for debugging, so when you get an error on line 5 of the new code, your browser can show you it actually came from line 12 of your original file.

const result = generate(newAST, {
  sourceMaps: true,
  sourceFileName: 'original.js'
}, code); // Pass the original source string

console.log(result.code);
console.log(result.map); // This is the source map object
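To show what a source map lets a debugger do, here is a simplified lookup over already-decoded mappings. This is a sketch under assumptions: real source maps store the same information compactly as a Base64 VLQ string in their mappings field, and the sample positions and the originalPositionFor function below are made up for illustration.

```javascript
// Decoded, simplified mappings: each entry ties a position in the generated
// code to a position in the original file.
const mappings = [
  { generated: { line: 1, column: 0 }, original: { line: 3, column: 0 } },
  { generated: { line: 5, column: 0 }, original: { line: 12, column: 2 } },
  { generated: { line: 5, column: 14 }, original: { line: 12, column: 20 } },
  { generated: { line: 8, column: 0 }, original: { line: 20, column: 0 } }
];

// Find the last mapping on the same generated line at or before the column.
function originalPositionFor(line, column) {
  let best = null;
  for (const entry of mappings) {
    const g = entry.generated;
    const atOrBefore = g.line === line && g.column <= column;
    if (atOrBefore && (!best || g.column > best.generated.column)) best = entry;
  }
  return best ? best.original : null;
}

console.log(originalPositionFor(5, 9));  // { line: 12, column: 2 }
console.log(originalPositionFor(5, 30)); // { line: 12, column: 20 }
console.log(originalPositionFor(2, 0));  // null
```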

One of the most exciting things I do is add new syntax. Imagine if JavaScript had a "pipe" operator |> like some other languages, where x |> f means f(x). We can't run that in a browser today, but I can make a plugin that changes it into regular JavaScript.

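I'd start by parsing the new syntax. With @babel/parser that means enabling its pipelineOperator plugin (which takes a proposal option); for syntax Babel doesn't know at all, I'd have to modify or fork the parser. Then, I write a visitor to transform it.

// Assumes the AST was parsed with the parser's pipelineOperator plugin enabled.
// Babel represents `left |> right` as a BinaryExpression whose operator is '|>'.
// There is no separate 'PipelineExpression' node type, and traverse rejects
// visitor keys it doesn't recognize.
traverse(ast, {
  BinaryExpression(path) {
    if (path.node.operator !== '|>') return;

    // A node like: left |> right
    const { left, right } = path.node;

    // We need to turn this into a function call: right(left)
    let callExpression;
    if (t.isCallExpression(right)) {
      // If right is already a call like 'f(y)', we need 'f(left, y)'
      callExpression = t.callExpression(right.callee, [left, ...right.arguments]);
    } else {
      // If right is just an identifier like 'f', we need 'f(left)'
      callExpression = t.callExpression(right, [left]);
    }
    // Nested pipelines inside the new call are visited next and rewritten too.
    path.replaceWith(callExpression);
  }
});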

// So, `const y = x |> double |> addFive;` becomes:
// `const y = addFive(double(x));`
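To check the rewrite rule without a parser, here is a dependency-free sketch that applies the same two cases to hand-built nodes and prints the result. The id, pipe, call, desugar, and print helpers are hypothetical names for this example, and the pipeline is modeled as a BinaryExpression with the operator '|>'.

```javascript
// Tiny node builders for the sketch.
const id = name => ({ type: 'Identifier', name });
const pipe = (left, right) => ({ type: 'BinaryExpression', operator: '|>', left, right });
const call = (callee, args) => ({ type: 'CallExpression', callee, arguments: args });

// Rewrite pipelines into calls, handling inner pipelines first.
function desugar(node) {
  if (node.type === 'BinaryExpression' && node.operator === '|>') {
    const left = desugar(node.left);
    const right = desugar(node.right);
    return right.type === 'CallExpression'
      ? call(right.callee, [left, ...right.arguments]) // x |> f(y)  ->  f(x, y)
      : call(right, [left]);                           // x |> f     ->  f(x)
  }
  if (node.type === 'CallExpression') {
    return call(desugar(node.callee), node.arguments.map(desugar));
  }
  return node;
}

// Minimal printer for the two node types the sketch produces.
function print(node) {
  switch (node.type) {
    case 'Identifier': return node.name;
    case 'CallExpression':
      return `${print(node.callee)}(${node.arguments.map(print).join(', ')})`;
    default: throw new Error(`Cannot print ${node.type}`);
  }
}

// x |> double |> addFive groups from the left: (x |> double) |> addFive
const chained = pipe(pipe(id('x'), id('double')), id('addFive'));
console.log(print(desugar(chained))); // addFive(double(x))

// x |> add(y) follows the rule above and becomes add(x, y)
const withArgs = pipe(id('x'), call(id('add'), [id('y')]));
console.log(print(desugar(withArgs))); // add(x, y)
```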

I also build plugins for performance. One simple trick is constant folding. If you write const x = 2 * 3 * 5;, why calculate that at runtime? My plugin can do the math once during compilation and make it const x = 30;.

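traverse(ast, {
  BinaryExpression: {
    // Fold on exit so children are handled first: 2 * 3 * 5 parses as
    // (2 * 3) * 5, and the outer node only has two literal operands after
    // the inner one has become 6.
    exit(path) {
      const { left, right, operator } = path.node;
      // Check if both sides are simple number literals
      if (t.isNumericLiteral(left) && t.isNumericLiteral(right)) {
        let result;
        switch (operator) {
          case '+': result = left.value + right.value; break;
          case '-': result = left.value - right.value; break;
          case '*': result = left.value * right.value; break;
          case '/': result = left.value / right.value; break;
          default: return; // Don't fold other operators
        }
        // Leave Infinity and NaN (e.g. 1 / 0) for runtime.
        if (!Number.isFinite(result)) return;
        // valueToNode also copes with negative results, which it emits as -n.
        path.replaceWith(t.valueToNode(result));
      }
    }
  }
});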
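The order of folding matters, because 2 * 3 * 5 is really (2 * 3) * 5 in the tree. Here is a dependency-free sketch of bottom-up folding on hand-built nodes that shows why children have to be folded before their parent. The num, bin, and fold helpers are hypothetical names for this example.

```javascript
// Tiny node builders for the sketch.
const num = value => ({ type: 'NumericLiteral', value });
const bin = (operator, left, right) => ({ type: 'BinaryExpression', operator, left, right });

const operations = {
  '+': (a, b) => a + b,
  '-': (a, b) => a - b,
  '*': (a, b) => a * b,
  '/': (a, b) => a / b
};

// Fold children first, then try to fold the node itself.
function fold(node) {
  if (node.type !== 'BinaryExpression') return node;
  const left = fold(node.left);
  const right = fold(node.right);
  const apply = operations[node.operator];
  if (apply && left.type === 'NumericLiteral' && right.type === 'NumericLiteral') {
    const result = apply(left.value, right.value);
    if (Number.isFinite(result)) return num(result); // skip Infinity and NaN
  }
  return { ...node, left, right };
}

// 2 * 3 * 5 parses as (2 * 3) * 5
const product = bin('*', bin('*', num(2), num(3)), num(5));
console.log(fold(product)); // { type: 'NumericLiteral', value: 30 }

// x + 2 * 3: only the constant part folds
const mixed = bin('+', { type: 'Identifier', name: 'x' }, bin('*', num(2), num(3)));
console.log(fold(mixed).right); // { type: 'NumericLiteral', value: 6 }
```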

Another performance plugin is dead code elimination. If I see an if (false) block, I know the code inside can never run. I can remove the entire block.

traverse(ast, {
  IfStatement(path) {
    const test = path.node.test;
    if (t.isBooleanLiteral(test)) {
      if (test.value === true) {
        // if (true) { ... } -> just keep the 'consequent' block
        path.replaceWith(t.isBlockStatement(path.node.consequent) ? path.node.consequent : t.blockStatement([path.node.consequent]));
      } else if (test.value === false && path.node.alternate) {
        // if (false) { ... } else { ... } -> keep the 'alternate' block
        path.replaceWith(t.isBlockStatement(path.node.alternate) ? path.node.alternate : t.blockStatement([path.node.alternate]));
      } else if (test.value === false) {
        // if (false) { ... } -> remove the entire statement
        path.remove();
      }
    }
  }
});
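The three cases are easier to see side by side on a flat list of statements. This is a simplified sketch on hand-built nodes rather than Babel paths; pruneStatements, the builder helpers, and the label field on the fake blocks are assumptions for this example only.

```javascript
// Tiny node builders; `label` exists only to identify blocks in this demo.
const bool = value => ({ type: 'BooleanLiteral', value });
const block = label => ({ type: 'BlockStatement', label });
const ifStmt = (test, consequent, alternate = null) =>
  ({ type: 'IfStatement', test, consequent, alternate });

function pruneStatements(statements) {
  const kept = [];
  for (const statement of statements) {
    const isConstantIf =
      statement.type === 'IfStatement' && statement.test.type === 'BooleanLiteral';
    if (!isConstantIf) {
      kept.push(statement);
      continue;
    }
    // Keep only the branch that can actually run, if there is one.
    const live = statement.test.value ? statement.consequent : statement.alternate;
    if (live) kept.push(live);
  }
  return kept;
}

const statements = [
  ifStmt(bool(true), block('A')),                           // if (true) { A }
  ifStmt(bool(false), block('B')),                          // if (false) { B }
  ifStmt(bool(false), block('C'), block('D')),              // if (false) { C } else { D }
  ifStmt({ type: 'Identifier', name: 'flag' }, block('E'))  // not constant: left alone
];

console.log(pruneStatements(statements).map(s => s.label || s.type));
// [ 'A', 'D', 'IfStatement' ]
```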

For larger projects, I might add a custom type system. JavaScript doesn't check types until runtime, but I can make a plugin that checks them at build time. I could create a special comment syntax or use JavaScript's existing JSDoc comments.

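// My custom syntax: `let x: number = 5;`
// One way to parse it without writing a parser: Babel's parser accepts this
// form when its 'typescript' or 'flow' plugin is enabled and stores the
// `: number` part on the identifier as typeAnnotation.
traverse(ast, {
  VariableDeclarator(path) {
    const id = path.node.id;
    if (t.isIdentifier(id) && id.typeAnnotation) {
      // Print the annotation back to source text, e.g. 'number'. Keyword types
      // have no `.name` property, so reading one directly gives undefined.
      // `generate` is the @babel/generator import from earlier.
      const typeName = generate(id.typeAnnotation.typeAnnotation).code;

      // Add a runtime check in development builds
      if (process.env.NODE_ENV === 'development') {
        const checkBlock = t.expressionStatement(
          t.callExpression(
            // assert.type is a hypothetical runtime helper, not a built-in.
            t.memberExpression(t.identifier('assert'), t.identifier('type')),
            [
              t.identifier(id.name), // check the variable, not a copy of its initializer
              t.stringLiteral(typeName),
              t.stringLiteral(id.name)
            ]
          )
        );
        // Insert this check after the variable declaration
        path.parentPath.insertAfter(checkBlock);
      }
      // Remove the type annotation from the final code
      id.typeAnnotation = null;
    }
  }
});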
// This transforms `let x: number = 5;` into:
// `let x = 5; assert.type(x, 'number', 'x');`
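The generated code needs something to call at runtime. Here is a minimal sketch of what that check could look like. The assertType function is hypothetical; in the setup above it would be exposed as assert.type, and this version only understands the names that typeof returns, plus 'null'.

```javascript
// Hypothetical runtime helper; the generated code above would call it as assert.type.
function assertType(value, typeName, variableName) {
  // typeof null is 'object', so report null explicitly.
  const actual = value === null ? 'null' : typeof value;
  if (actual !== typeName) {
    throw new TypeError(
      `Expected "${variableName}" to be ${typeName}, but got ${actual}`
    );
  }
  return value;
}

let x = 5;
assertType(x, 'number', 'x'); // passes silently

try {
  let name = 42;
  assertType(name, 'string', 'name');
} catch (error) {
  console.log(error.message); // Expected "name" to be string, but got number
}
```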

Putting it all together, building a plugin is about finding a pattern in the AST and applying a rule. I start small, with a single transformation. I test it on a tiny piece of code. Then I gradually handle more complex cases and edge conditions.

The goal is to make the computer do repetitive work, enforce team rules, or even create a small language that makes sense just for our project. It turns the build process from a slow, mysterious step into an active assistant that improves my code as I write it. It's not magic; it's just a careful, step-by-step process of reading a tree and writing a new one.


101 Books

101 Books is an AI-driven publishing company co-founded by author Aarav Joshi. By leveraging advanced AI technology, we keep our publishing costs incredibly low—some books are priced as low as $4—making quality knowledge accessible to everyone.
