DEV Community

Kim Brandwijk

From 10,000 to 18: Drastically Improving Luau Bytecode Decompilation Quality

A deep dive into the techniques that reduced unnamed temporaries by 98.8%

Introduction

When reverse-engineering compiled Luau bytecode (Luau is the Lua-derived scripting language and VM used by Roblox and various game engines), decompilers face a fundamental challenge: the compiler has optimized away much of the structure that made the original code readable. Variable names become register numbers, control flow becomes jumps, and elegant expressions become sequences of register operations.

I spent several weeks improving medal, an open-source Luau decompiler, contributing 7 pull requests that transformed output like this:

-- BEFORE: Unreadable mess of temporaries
local v1_ = treesData.growingTrees
local v2_ = #v1_
local v3_ = 1
while v3_ <= v2_ do
    local v4_ = v1_[v3_]
    local v5_ = v4_.treeTypeIndex
    local v6_ = self.pooledAnimals:getOrCreateNext()
    v6_:setPosition(v4_.x, v4_.y, v4_.z)
    v3_ = v3_ + 1
end

Into this:

-- AFTER: Clean, readable code with original names
local growingTrees = treesData.growingTrees
local numGrowingTrees = #growingTrees
local growingTreeIndex = 1
while growingTreeIndex <= numGrowingTrees do
    local tree = growingTrees[growingTreeIndex]
    local treeTypeIndex = tree.treeTypeIndex
    local animal = self.pooledAnimals:getOrCreateNext()
    animal:setPosition(tree.x, tree.y, tree.z)
    growingTreeIndex = growingTreeIndex + 1
end

This article walks through each optimization technique, showing the bytecode patterns that cause problems and the AST transformations that fix them.


How Decompilation Works

Before diving into specific optimizations, let's understand the overall decompilation pipeline and the key data structures involved.

The Decompilation Pipeline

┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  Bytecode   │───▶│     CFG     │───▶│     SSA     │───▶│     AST     │───▶│   Source    │
│   (Input)   │    │Construction │    │   Form      │    │  Building   │    │   (Output)  │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
     Parse            Lift              Transform          Structure          Format

Stage 1: Bytecode Parsing
Read the binary bytecode format, extracting:

  • Instructions (opcodes + operands)
  • Constant pool (strings, numbers, tables)
  • Debug information (variable names, line numbers)
  • Function prototypes (nested functions, upvalue references)

Stage 2: CFG Construction
Build a Control Flow Graph - a directed graph where:

  • Nodes are "basic blocks" (sequences of instructions with no jumps in the middle)
  • Edges represent possible execution paths (jumps, branches, fall-through)

Stage 3: SSA Transformation
Convert to Static Single Assignment form for analysis (explained below).

Stage 4: AST Building
Reconstruct high-level Abstract Syntax Tree structures:

  • Convert CFG patterns back to if/while/for statements
  • Collapse temporaries into expressions
  • Apply pattern-matching optimizations

Stage 5: Code Generation
Format the AST as readable Lua source code.

What is a CFG?

A Control Flow Graph represents all possible execution paths through a program. Each node is a "basic block" - a straight-line sequence of instructions with:

  • One entry point (execution always starts at the first instruction)
  • One exit point (execution always ends at the last instruction)

Original bytecode:           CFG representation:

0: LOADN R0, 0              ┌──────────────────┐
1: LOADN R1, 10             │ Block 0 (entry)  │
2: JUMPIFLT R0, R1, +5  ───▶│ R0 = 0           │
                            │ R1 = 10          │
                            │ if R0 < R1 goto 1│
                            └────────┬─────────┘
                                     │
                         ┌───────────┴───────────┐
                         ▼                       ▼
                ┌─────────────────┐    ┌─────────────────┐
                │    Block 1      │    │    Block 2      │
                │ (loop body)     │    │ (after loop)    │
                │ R0 = R0 + 1     │    │ return R0       │
                │ jump to Block 0 │    └─────────────────┘
                └────────┬────────┘
                         │
                         └──────────▶ (back to Block 0)

The CFG makes control flow explicit, which is essential for:

  • Detecting loops (back-edges in the graph)
  • Finding unreachable code
  • Understanding variable lifetimes
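
To make the basic-block definition concrete, here is a minimal sketch of how a lifter might find block boundaries. The Instr type and block_leaders function are hypothetical and not taken from medal, and jump targets are absolute instruction indices for simplicity (real Luau jumps use relative offsets). A block starts at the entry point, at every jump target, and right after every jump or return.

```rust
use std::collections::BTreeSet;

/// A hypothetical, simplified instruction: only what block splitting needs.
#[derive(Debug, Clone, Copy)]
enum Instr {
    /// Any instruction that simply falls through to the next one.
    Plain,
    /// Unconditional jump to an absolute instruction index.
    Jump(usize),
    /// Conditional jump: either goes to the target or falls through.
    JumpIf(usize),
    /// Leaves the function.
    Return,
}

/// Returns the index of the first instruction of every basic block.
fn block_leaders(code: &[Instr]) -> Vec<usize> {
    let mut leaders = BTreeSet::new();
    if !code.is_empty() {
        leaders.insert(0); // the entry point always starts a block
    }
    for (pc, instr) in code.iter().enumerate() {
        match *instr {
            Instr::Jump(target) | Instr::JumpIf(target) => {
                leaders.insert(target); // a jump target starts a block
                if pc + 1 < code.len() {
                    leaders.insert(pc + 1); // so does whatever follows a jump
                }
            }
            Instr::Return => {
                if pc + 1 < code.len() {
                    leaders.insert(pc + 1);
                }
            }
            Instr::Plain => {}
        }
    }
    leaders.into_iter().collect()
}
```

Each block then runs from one leader up to (but not including) the next, and the CFG edges come from the last instruction of each block.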

What is SSA?

Static Single Assignment is an intermediate representation where every variable is assigned exactly once. This simplifies analysis because you always know where a value came from.

-- Original code
x = 1
x = x + 1
y = x * 2

-- SSA form
x₁ = 1
x₂ = x₁ + 1
y₁ = x₂ * 2

Each assignment creates a new "version" of the variable, written here as a subscript. But what happens when control flow merges?

-- Original
if condition then
    x = 1
else
    x = 2
end
print(x)  -- Which x?

SSA uses phi functions (φ) at merge points:

-- SSA form
if condition then
    x₁ = 1
else
    x₂ = 2
end
x₃ = φ(x₁, x₂)  -- "x₃ is x₁ if we came from then-branch, x₂ if from else-branch"
print(x₃)

Why SSA matters for decompilation:

  • Copy propagation: If x₂ = x₁, we can replace all uses of x₂ with x₁
  • Dead code elimination: If a variable version is never used, remove its assignment
  • Name preservation: When merging versions, we can prefer the one with a debug name
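
As an illustration of the copy propagation point, the sketch below follows a chain of SSA copies back to its source. It is a simplified stand-in, not medal's implementation: versions are plain strings such as "x1" and "x2" (standing in for x₁ and x₂), and the resolve function and the copies map are hypothetical names.

```rust
use std::collections::HashMap;

/// Follows a chain of SSA copies (`to = from`) back to its original source.
/// `copies` maps each copied version to the version it was copied from.
fn resolve<'a>(copies: &HashMap<&'a str, &'a str>, mut name: &'a str) -> &'a str {
    // In well-formed SSA every version is defined exactly once and before
    // its uses, so a chain of plain copies cannot loop back on itself.
    while let Some(&source) = copies.get(name) {
        name = source;
    }
    name
}
```

With x2 = x1 and x3 = x2 recorded in the map, every use of x3 can be rewritten to x1, after which both copies are dead and can be removed.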

What is an AST?

An Abstract Syntax Tree represents the hierarchical structure of source code. Unlike bytecode (a flat list of instructions), an AST captures nesting and relationships.

-- Source code
if x > 0 then
    y = x * 2
end
-- AST representation
IfStatement
├── condition: BinaryOp(>)
│   ├── left: Local("x")
│   └── right: Number(0)
├── then_block: Block
│   └── AssignStatement
│       ├── left: Local("y")
│       └── right: BinaryOp(*)
│           ├── left: Local("x")
│           └── right: Number(2)
└── else_block: Block (empty)

Key AST node types:

  • Statements: Assign, If, While, For, Return, Call (as statement)
  • Expressions (RValues): Local, Number, String, Binary, Unary, Call, Index, Closure
  • LValues (assignable): Local, Index (table field), Global

The decompiler's job is to reconstruct this tree structure from flat bytecode, then format it as readable source.
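
The last step, turning the tree back into text, can be sketched with a toy AST. The Expr and Stmt types below are hypothetical and far smaller than a real decompiler's node set; the formatter also skips operator precedence and parentheses, which a real one would need.

```rust
use std::fmt;

/// A tiny expression tree (illustrative only).
enum Expr {
    Local(String),
    Number(f64),
    Binary(Box<Expr>, &'static str, Box<Expr>),
}

/// A tiny statement tree (illustrative only).
enum Stmt {
    Assign(String, Expr),
    If { condition: Expr, then_block: Vec<Stmt> },
}

impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Local(name) => write!(f, "{}", name),
            Expr::Number(n) => write!(f, "{}", n),
            Expr::Binary(left, op, right) => write!(f, "{} {} {}", left, op, right),
        }
    }
}

/// Appends `stmt` to `out`, indenting nested blocks by four spaces per level.
fn render(stmt: &Stmt, indent: usize, out: &mut String) {
    let pad = "    ".repeat(indent);
    match stmt {
        Stmt::Assign(target, value) => {
            out.push_str(&format!("{}{} = {}\n", pad, target, value));
        }
        Stmt::If { condition, then_block } => {
            out.push_str(&format!("{}if {} then\n", pad, condition));
            for inner in then_block {
                render(inner, indent + 1, out);
            }
            out.push_str(&format!("{}end\n", pad));
        }
    }
}
```

Building the IfStatement tree shown above and passing it to render with an indent of 0 reproduces the three-line source snippet.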

Where Optimizations Happen

Different optimizations happen at different stages:

Stage     Optimizations
CFG       Loop detection, unreachable code removal
SSA       Copy propagation, dead code elimination, name preservation
AST       Pattern matching (and-chains, ternaries), expression inlining
Format    Method notation (: vs .), spacing, indentation

Most of the improvements in this article happen at the SSA and AST stages, where we have enough structure to recognize patterns but haven't yet committed to final output.


The Anatomy of Luau Bytecode

Before diving into optimizations, let's understand what we're working with. Luau bytecode is a register-based VM instruction set. Here's what a simple function looks like:

-- Original source
function Color.new(r, g, b, a)
    local self = setmetatable({}, Color_mt)
    self.r = r or 0
    self.g = g or 0
    self.b = b or 0
    self.a = a or 1
    return self
end

Bytecode:

=== Function 2 (new) ===
Parameters: 4, Stack: 7, Upvalues: 1

     0: LOP_NEWTABLE    R5, 3, 0          -- Create empty table
     2: LOP_GETUPVAL    R6, 0             -- Get Color_mt upvalue
     3: LOP_FASTCALL2   setmetatable, R5  -- setmetatable({}, Color_mt)
     5: LOP_GETIMPORT   R4, setmetatable
     7: LOP_CALL        R4, 3, 2          -- Result in R4
     8: LOP_ORK         R5, R0, K2        -- r or 0
     9: LOP_SETTABLEKS  R5, R4, "r"       -- self.r = result
    11: LOP_ORK         R5, R1, K2        -- g or 0
    12: LOP_SETTABLEKS  R5, R4, "g"       -- self.g = result
    ...
    20: LOP_RETURN      R4, 2             -- return self

Local Debug Info:
  R0: "r"    (PC 0-21)
  R1: "g"    (PC 0-21)
  R2: "b"    (PC 0-21)
  R3: "a"    (PC 0-21)
  R4: "self" (PC 8-21)

Key observations:

  1. Registers are reused - R5 is used as a scratch register multiple times
  2. Debug info exists but is partial - Parameters and some locals have names, but temporaries don't
  3. Structure is flattened - The compiler doesn't preserve expression boundaries
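
Because registers are reused, a name only applies to a register within a PC range. A minimal sketch of that lookup follows; the LocalDebugInfo layout and the name_at function are hypothetical, and treating the range as half-open (start inclusive, end exclusive) is an assumption made for this example.

```rust
/// One entry of a function's local debug info (hypothetical layout).
struct LocalDebugInfo {
    register: u8,
    name: &'static str,
    start_pc: usize, // first instruction where the name is valid
    end_pc: usize,   // first instruction where it is no longer valid (assumed)
}

/// Finds the source name of `register` at instruction `pc`, if any.
fn name_at(locals: &[LocalDebugInfo], register: u8, pc: usize) -> Option<&'static str> {
    locals
        .iter()
        .find(|l| l.register == register && l.start_pc <= pc && pc < l.end_pc)
        .map(|l| l.name)
}
```

With the debug info from the listing above, R4 resolves to "self" from PC 8 onward but has no name before that, and the scratch register R5 never has one.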

Preserving Debug Names Through SSA

The Problem

Luau bytecode includes debug information mapping registers to variable names. But during SSA (Static Single Assignment) transformation, these names were being lost:

-- Raw decompilation output
local v1_ = tonumber(x)
local v2_ = tonumber(y)
local v3_ = tonumber(z)

The bytecode actually had debug names:

Local Debug Info:
  R0: "x" (PC 0-15)
  R1: "y" (PC 0-15)
  R2: "z" (PC 0-15)

The Solution

The fix involved three changes:

  1. Parse debug info from bytecode (luau-lifter/src/deserializer/function.rs)
  2. Preserve names during lifting (luau-lifter/src/lifter.rs)
  3. Prefer named locals during SSA destruction (cfg/src/ssa/destruct.rs)

The SSA destruction phase was the critical fix. When merging SSA versions, we now check which version has a debug name:

// In destruct.rs - prefer named locals when selecting representatives
fn select_representative(&self, versions: &[RcLocal]) -> RcLocal {
    // Prefer version with a name
    for version in versions {
        if version.name().is_some() {
            return version.clone();
        }
    }
    versions[0].clone()
}

Result

-- After: Original names preserved
x = tonumber(x)
y = tonumber(y)
z = tonumber(z)

Fixing Loop Variable Inlining

The Problem

The loop variable in while-loop conditions was being replaced by its initial value:

-- Broken output
while 1 <= numGrowingTrees do
    -- uses growingTreeIndex internally
end

The Bytecode Pattern

     0: LOP_LOADN      R2, 1           -- growingTreeIndex = 1
     5: LOP_JUMPIFNOTLE R2, R1, +20    -- while growingTreeIndex <= numGrowingTrees
    10: LOP_GETTABLE   R4, R0, R2      -- trees[growingTreeIndex]
    ...
    18: LOP_ADDK       R2, R2, K1      -- growingTreeIndex = growingTreeIndex + 1
    20: LOP_JUMPBACK   -15             -- loop back

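The SSA inlining pass saw R2 = 1 and tried to substitute 1 everywhere R2 was used - including the loop condition. That is wrong because R2 is reassigned inside the loop body, so at the loop header its value is either the initial 1 or the incremented value from the previous iteration.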

The Solution

Detect loop headers via back-edge analysis and protect loop phi parameters:

// In inline.rs
fn is_loop_header(&self, block_id: BlockId) -> bool {
    // A block is a loop header if any predecessor has an equal or higher
    // block ID (a back-edge from within the loop, or a self-loop). This
    // relies on block IDs following bytecode order.
    self.function.predecessors(block_id)
        .any(|pred| pred >= block_id)
}

fn should_inline(&self, local: &RcLocal, block_id: BlockId) -> bool {
    if self.is_loop_header(block_id) {
        // Don't inline phi parameters in loop headers
        if self.is_phi_parameter(local, block_id) {
            return false;
        }
    }
    true
}
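
The block-ID heuristic can be tried out on a tiny graph. The sketch below is self-contained and does not use medal's CFG types: predecessors and is_loop_header here are hypothetical free functions over a plain edge list, and the heuristic assumes block IDs follow bytecode order. A dominator-based back-edge check would be the more general approach.

```rust
use std::collections::HashMap;

type BlockId = usize;

/// Builds a predecessor map from a list of CFG edges `(from, to)`.
fn predecessors(edges: &[(BlockId, BlockId)]) -> HashMap<BlockId, Vec<BlockId>> {
    let mut preds: HashMap<BlockId, Vec<BlockId>> = HashMap::new();
    for &(from, to) in edges {
        preds.entry(to).or_default().push(from);
    }
    preds
}

/// Heuristic: a block is a loop header if some predecessor comes at or
/// after it in layout order (a back-edge, including a self-loop).
fn is_loop_header(preds: &HashMap<BlockId, Vec<BlockId>>, block: BlockId) -> bool {
    preds
        .get(&block)
        .map_or(false, |ps| ps.iter().any(|&pred| pred >= block))
}
```

For the edges 0→1, 1→2, 2→1 and 1→3, only block 1 is reported as a loop header, so only its phi parameters would be protected from inlining.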

Propagating Upvalue Names

The Problem

When a local is captured as an upvalue by a nested closure, the closure's bytecode might not have the debug name even though the parent function does:

-- Broken output
function ConfigurationManager:configurationKeyIterator()
    return function()
        -- upvalues: (ref) v_u_3_, (copy) numElements
        if numElements <= v_u_3_ then
            return nil
        end
        v_u_3_ = v_u_3_ + 1

The Solution

During upvalue linking, propagate names from parent locals:

// In link_upvalues()
fn link_upvalues(&mut self, parent_func: &Function, child_func: &mut Function) {
    for (upval_idx, parent_local) in child_func.upvalue_sources.iter().enumerate() {
        let child_upval = &mut child_func.upvalues[upval_idx];

        // If parent has a name and child doesn't, propagate it
        if let Some(name) = parent_local.name() {
            child_upval.set_name_if_unnamed(name);
        }
    }
}

Result

-- After: Upvalue inherits parent's name
function ConfigurationManager:configurationKeyIterator()
    return function()
        -- upvalues: (ref) currentIndex, (copy) numElements
        if numElements <= currentIndex then
            return nil
        end
        currentIndex = currentIndex + 1
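
Stripped of the decompiler's own types, the propagation rule is small enough to test in isolation. In this sketch the Local struct and propagate_upvalue_names are hypothetical stand-ins: sources[i] is assumed to be the parent local captured by upvalues[i].

```rust
/// A local or upvalue with an optional debug name (illustrative only).
#[derive(Debug, Default)]
struct Local {
    name: Option<String>,
}

/// Copies names from the captured parent locals onto unnamed child upvalues.
/// `sources[i]` is the parent local captured by `upvalues[i]`.
fn propagate_upvalue_names(sources: &[Local], upvalues: &mut [Local]) {
    for (source, upvalue) in sources.iter().zip(upvalues.iter_mut()) {
        // Never overwrite a name the child already has.
        if upvalue.name.is_none() {
            upvalue.name = source.name.clone();
        }
    }
}
```

The "only if unnamed" check matters: when the child's own debug info already names the upvalue, that name is kept.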

Self Parameter to Method Notation

The Problem

Lua supports two calling conventions for methods:

  • obj.method(obj, args) - explicit self
  • obj:method(args) - implicit self (syntactic sugar)

The decompiler was always outputting the explicit form:

function BaseMission.initialize(self)
function BaseMission.delete(self)
function BaseMission.update(self, dt)

The Solution

Detect functions defined on tables with self as first parameter:

fn should_convert_to_method(func: &Function) -> bool {
    // Must be assigned to a table field (Class.method pattern)
    let is_table_method = matches!(&func.definition_site,
        Some(LValue::Index(idx)) if idx.table.is_global());

    // First parameter must be named "self"
    let has_self_param = func.parameters.first()
        .and_then(|p| p.name())
        .map_or(false, |n| n == "self");

    is_table_method && has_self_param
}

When formatting, convert to colon notation and remove the self parameter:

use std::fmt::{self, Display};

impl Display for NamedFunction {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        if self.use_method_notation {
            // Output: function Table:method(other_params)
            write!(f, "function {}:{}({})",
                self.table_name,
                self.method_name,
                self.params_without_self())
        } else {
            // Output: function Table.method(self, other_params)
            write!(f, "function {}.{}({})", ...)
        }
    }
}

Result

-- After: Idiomatic Lua
function BaseMission:initialize()
function BaseMission:delete()
function BaseMission:update(dt)
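
The formatting rule on its own can be sketched as a plain function over strings. function_header is a hypothetical helper written for this example; it only looks at the parameter name, whereas the check above also requires the function to be defined on a table.

```rust
/// Formats a function header, using colon notation when the first
/// parameter is named `self`.
fn function_header(table: &str, method: &str, params: &[&str]) -> String {
    match params.split_first() {
        // Drop the implicit `self` and switch to `Table:method(...)`.
        Some((first, rest)) if *first == "self" => {
            format!("function {}:{}({})", table, method, rest.join(", "))
        }
        // Otherwise keep the explicit `Table.method(...)` form.
        _ => format!("function {}.{}({})", table, method, params.join(", ")),
    }
}
```

A function such as Color.new(r, g, b, a) has no self parameter and therefore keeps the dot form.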

Boolean Ternary Simplification

The Problem

The Luau compiler transforms boolean expressions into control flow:

-- Source
MobileHUD.new(g_server ~= nil, ...)

-- Compiled to bytecode, then decompiled
local v3_
if g_server == nil then
    v3_ = false
else
    v3_ = true
end
MobileHUD.new(v3_, ...)

The Bytecode Pattern

     0: LOP_GETGLOBAL  R5, "g_server"
     2: LOP_JUMPIFNIL  R5, +4           -- if g_server == nil
     4: LOP_LOADB      R3, 1, +2        -- v3_ = true, skip next
     6: LOP_LOADB      R3, 0            -- v3_ = false
     8: ...                             -- continue

The Solution

Pattern match on the AST after initial decompilation:

fn simplify_boolean_ternaries(block: &mut Block) {
    for i in 0..block.len() {
        // Look for: if cond then x = true else x = false end
        let Statement::If(if_stat) = &block[i] else { continue };
        let (Some(then_assign), Some(else_assign)) = (
            single_bool_assign(&if_stat.then_block),
            single_bool_assign(&if_stat.else_block),
        ) else { continue };

        // Both branches must assign to the same variable
        if then_assign.target != else_assign.target {
            continue;
        }

        // Build the replacement first, so the borrow of block[i] ends
        // before the statement is overwritten
        let replacement = match (then_assign.value, else_assign.value) {
            // x = cond
            (true, false) => assign(then_assign.target.clone(),
                if_stat.condition.clone()),
            // x = not cond
            (false, true) => assign(then_assign.target.clone(),
                negate(if_stat.condition.clone())),
            // Same constant in both branches: not a ternary
            _ => continue,
        };
        block[i] = replacement;
    }
}

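Negating a comparison flips its operator instead of wrapping it in not. Here the condition g_server == nil selects false in the then-branch, so the value is not (g_server == nil), which is emitted as g_server ~= nil.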

Result

-- After: Direct boolean expression
MobileHUD.new(g_server ~= nil, ...)
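
The negation step can be sketched on a toy expression type. The Expr enum and negate function below are hypothetical and cover only equality tests. Ordering comparisons are deliberately not flipped (not (a < b) differs from a >= b when NaN is involved), and not not x is not collapsed because it changes the value whenever x is not a boolean.

```rust
/// A tiny condition tree (illustrative only).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Nil,
    Local(String),
    Eq(Box<Expr>, Box<Expr>),
    Ne(Box<Expr>, Box<Expr>),
    Not(Box<Expr>),
}

/// Negates a condition, flipping `==` / `~=` instead of wrapping in `not`.
fn negate(expr: Expr) -> Expr {
    match expr {
        Expr::Eq(left, right) => Expr::Ne(left, right),
        Expr::Ne(left, right) => Expr::Eq(left, right),
        // Anything else is wrapped as-is.
        other => Expr::Not(Box::new(other)),
    }
}
```

Applied to g_server == nil this yields g_server ~= nil, which is the expression the original source contained.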

Numeric For-Loop Inlining

The Problem

For-loop bounds were extracted into temporaries:

local v2_ = select("#", ...)
for i = 1, v2_ do
    -- body
end

The Bytecode Pattern

     0: LOP_GETIMPORT  R2, select
     2: LOP_LOADK      R3, "#"
     3: LOP_GETVARARGS R4, -1
     4: LOP_CALL       R2, -1, 2        -- v2_ = select("#", ...)
     5: LOP_LOADN      R3, 1            -- initial = 1
     6: LOP_MOVE       R4, R2           -- limit = v2_
     7: LOP_LOADN      R5, 1            -- step = 1
     8: LOP_FORNPREP   R3, +10          -- for i = R3, R4, R5

The Solution

Search backwards from for-loops to find assignments that can be inlined:

fn inline_into_numeric_for_loops(&self, block: &mut Block) {
    let mut for_idx = 0;
    while for_idx < block.len() {
        // Only numeric for-loops whose limit is a plain local qualify
        let limit_local = match &block[for_idx] {
            Statement::NumericFor(nf) => match &nf.limit {
                RValue::Local(local) => local.clone(),
                _ => { for_idx += 1; continue; }
            },
            _ => { for_idx += 1; continue; }
        };

        // Search backwards for the assignment to this local
        let def_idx = (0..for_idx).rev().find(|&idx| {
            matches!(&block[idx], Statement::Assign(a) if a.targets_local(&limit_local))
        });

        if let Some(assign_idx) = def_idx {
            // Inline the RHS directly into the for-loop
            let new_limit = block[assign_idx].as_assign().right[0].clone();
            block[for_idx].as_numeric_for_mut().limit = new_limit;
            block.remove(assign_idx);
            // The loop moved up one slot, so for_idx now points at the
            // statement that used to follow it
            continue;
        }
        for_idx += 1;
    }
}

Result

-- After: Expression inlined
for i = 1, select("#", ...) do
    -- body
end

And-Chain and Multi-Return Collapse

And-Chain Simplification

Pattern: local v = a; if v then v = b end

-- Before
local v = self.isValid
if v then
    v = self.isReady
end
return v

-- After
local v = self.isValid and self.isReady
return v

The transformation detects when:

  1. A local is assigned a value
  2. The next statement is if local then local = other_value end
  3. The local is the condition AND is reassigned in the then-branch
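
The three conditions above can be sketched as a pattern match over a toy AST. Everything here is hypothetical (the Expr and Stmt enums, the reads helper, and this version of simplify_and_chains) and much simpler than medal's real node types. One extra guard is included: the reassigned value must not read the local itself.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Local(String),
    Field(String, String), // e.g. self.isValid as ("self", "isValid")
    And(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum Stmt {
    LocalAssign(String, Expr),
    Assign(String, Expr),
    If { condition: Expr, then_block: Vec<Stmt> },
    Return(Expr),
}

/// Does `expr` read the local called `name`?
fn reads(expr: &Expr, name: &str) -> bool {
    match expr {
        Expr::Local(n) => n == name,
        Expr::Field(base, _) => base == name,
        Expr::And(left, right) => reads(left, name) || reads(right, name),
    }
}

/// Rewrites `local v = a; if v then v = b end` into `local v = a and b`.
fn simplify_and_chains(block: &mut Vec<Stmt>) {
    let mut i = 0;
    while i + 1 < block.len() {
        let merged = match (&block[i], &block[i + 1]) {
            (
                Stmt::LocalAssign(name, first),
                Stmt::If { condition: Expr::Local(cond), then_block },
            ) if cond == name => match then_block.as_slice() {
                // The then-branch must be exactly one reassignment of the local
                [Stmt::Assign(target, second)]
                    if target == name && !reads(second, name) =>
                {
                    Some(Stmt::LocalAssign(
                        name.clone(),
                        Expr::And(Box::new(first.clone()), Box::new(second.clone())),
                    ))
                }
                _ => None,
            },
            _ => None,
        };
        match merged {
            Some(stmt) => {
                block[i] = stmt;
                block.remove(i + 1);
                // Stay on `i`: a longer chain (a and b and c) may continue here
            }
            None => i += 1,
        }
    }
}
```

Because the index is not advanced after a merge, a sequence of several such if-statements folds into one left-nested and-chain.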

Multi-Return Collapse

Pattern: Multiple returns captured then immediately assigned to fields

-- Before
local v1_, v2_, v3_ = getWorldTranslation(node)
self.x = v1_
self.y = v2_
self.z = v3_

-- After
self.x, self.y, self.z = getWorldTranslation(node)

The implementation searches for where each temporary is used:

fn collapse_multi_return_assignments(block: &mut Block) -> Option<()> {
    // Find: local temps = call()
    let assign_idx = find_multi_return_assign(block)?;
    // Clone the temps so the borrow of `block` ends before it is mutated
    let temps = block[assign_idx].as_assign().left.clone();

    // For each temp, find where it's used in subsequent statements
    let mut targets = Vec::new();
    for temp in &temps {
        if temp.is_underscore() {
            targets.push(LValue::Local(underscore()));
            continue;
        }

        // Search for: target = temp
        let target = find_temp_usage(block, assign_idx + 1, temp)?;
        targets.push(target);
    }

    // Rewrite to: targets = call()
    let call = block[assign_idx].as_assign().right[0].clone();
    block[assign_idx] = Assign::new(targets, vec![call]).into();

    // Remove the now-dead individual assignments
    remove_temp_assignments(block, assign_idx + 1, &temps);
    Some(())
}

Side-Effect Lookahead and SSA Name Preservation

This set of optimizations together achieved a 98.5% reduction in unnamed temporaries.

Preserving Named Variables in SSA Copy Propagation

This was the most impactful single change. During SSA copy propagation, when eliminating copies like named = temp, prefer keeping named:

// In construct.rs - propagate_copies()
if assign.is_copy() {
    let (from, to) = (assign.right, assign.left);

    // Prefer keeping the variable with a debug name
    let from_has_name = from.name().is_some();
    let to_has_name = to.name().is_some();

    if from_has_name && !to_has_name {
        // from has name, to doesn't -> keep from, map to → from
        self.local_map.insert(to.clone(), from.clone());
    } else {
        // Keep to (original behavior)
        self.local_map.insert(from.clone(), to.clone());
    }
}

This single change reduced temporaries from ~2176 to ~611 (72% reduction).

Side-Effect Inlining with Lookahead

Previously, temporaries with side effects (like function calls) could only be inlined into the immediately following statement. This failed for:

local v4_ = tostring(x)
local v5_ = y              -- intervening statement
local v6_ = tostring(v5_)
foo(v4_ .. v6_)            -- v4_ used here, but not adjacent

The fix: look ahead to find where the temporary is used, verifying no intervening statement modifies the values it depends on:

fn find_safe_inline_target(&self, local: &RcLocal, rvalue: &RValue,
                            block: &Block, def_idx: usize) -> Option<usize> {
    // What locals does this RValue read?
    let reads: HashSet<_> = rvalue.values_read().into_iter().collect();

    // Scan forward looking for where local is used
    for i in (def_idx + 1)..block.len() {
        let stmt = &block[i];

        // Check if this statement writes to any local we need to read
        let writes = stmt.values_written();
        if writes.iter().any(|w| reads.contains(w)) {
            return None;  // Can't inline past this
        }

        // Check if this statement uses our local
        if stmt.values_read().contains(local) {
            return Some(i);
        }
    }
    None
}

Or-Chain Simplification

Pattern: local v = x; if not v then v = y end → local v = x or y

-- Before
local activeElement = self:getActiveElement()
if not activeElement then
    activeElement = self:getDefaultElement()
end

-- After
local activeElement = self:getActiveElement() or self:getDefaultElement()

Write-Only Local Simplification

Pattern: Locals declared then only written (never read) become _:

-- Before
local v28_, v29_
v28_, rotY, v29_ = getWorldRotation(self.node)

-- After
_, rotY, _ = getWorldRotation(self.node)
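
The core of this rewrite is a per-target check against the set of locals that are read anywhere. The sketch below uses plain strings instead of the decompiler's local type; discard_write_only and its parameters are hypothetical names, and the read set is assumed to have been collected by an earlier pass over the whole function.

```rust
use std::collections::HashSet;

/// Replaces assignment targets that are never read with `_`.
/// `targets` are the left-hand side names; `read` holds every local
/// that is read somewhere in the function.
fn discard_write_only(targets: &[&str], read: &HashSet<&str>) -> Vec<String> {
    targets
        .iter()
        .map(|&target| {
            if read.contains(target) {
                target.to_string()
            } else {
                "_".to_string()
            }
        })
        .collect()
}
```

Once every occurrence of a temporary has been replaced by _, its separate local declaration is no longer needed and can be dropped, as in the example above.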

Dead Local Removal

Locals that are assigned but never read are eliminated:

-- Before: v8_ is never used
local v8_ = numKeys
for i = 2, numKeys do
    -- uses numKeys directly
end

-- After
for i = 2, numKeys do
    -- v8_ eliminated
end

__set_list Pattern Simplification

The Luau compiler generates SETLIST for array literals, which decompiles to:

-- Before
local v35_ = {}
__set_list(v35_, 1, {a, b, c})
return v35_

-- After
return {a, b, c}

Inlining Into Nested Blocks

The Problem

Side-effect-free values defined at parent scope couldn't be inlined into nested if/while/for blocks:

local v14_ = true
if direction ~= nil then
    pressedAccept = v14_   -- v14_ not inlined
end

The inliner was clearing pending inlines before recursing into nested blocks.

The Solution

Pass deferred inlines down into nested blocks:

fn apply_pending_deferred_to_nested(&mut self, statement: &mut Statement) {
    if self.deferred_inlines.is_empty() {
        return;
    }

    match statement {
        Statement::If(if_stat) => {
            self.apply_deferred_to_block(&mut if_stat.then_block);
            self.apply_deferred_to_block(&mut if_stat.else_block);
        }
        Statement::While(while_stat) => {
            self.apply_deferred_to_block(&mut while_stat.block);
        }
        // ... other control structures
        _ => {}
    }
}

fn apply_deferred_to_block(&mut self, block: &mut Block) {
    for stmt in block.iter_mut() {
        // Apply any pending inlines to this statement
        for local in stmt.values_read() {
            if let Some(rvalue) = self.deferred_inlines.remove(&local) {
                self.replace_local_with_rvalue(stmt, &local, rvalue);
            }
        }
        // Recurse into nested blocks
        self.apply_pending_deferred_to_nested(stmt);
    }
}

Result

-- After: v14_ inlined
if direction ~= nil then
    pressedAccept = true
end

Numeric For-Loop Shadowing Fix

The Problem

The for-loop inlining was working backwards - it was replacing good variable names with temporaries instead of the other way around:

-- Before (broken)
local v27_ = minX
for minX = v27_, maxX, step do  -- Wrong direction!

The Bytecode Pattern

     0: LOP_GETUPVAL   R3, 0             -- minX from upvalue
     1: LOP_MOVE       R4, R3            -- Copy to R4 (for-loop init)
     2: LOP_GETUPVAL   R5, 1             -- maxX
     3: LOP_LOADN      R6, 1             -- step = 1
     4: LOP_FORNPREP   R4, +10           -- for minX = R4, R5, R6

Local Debug Info:
  R3: "minX" (PC 0-20)    -- Original variable
  R4: "minX" (PC 4-15)    -- For-loop counter shadows it

The issue: register R3 has the debug name "minX", but R4 (the for-loop counter) also has the name "minX". When we saw R4 = R3, we were inlining in the wrong direction - replacing the named R3 with a temporary instead of eliminating the temporary.

The Solution

Only inline when the ASSIGNED local (LHS) is an unnamed temporary:

fn inline_into_numeric_for_loops(block: &mut Block) {
    // ...
    // Only inline if the ASSIGNED local is an unnamed temporary
    let assigned_local = assign.left[0].as_local()?;
    let is_unnamed_temp = assigned_local.name()
        .map_or(true, |n| n.starts_with('v') && n.ends_with('_'));

    if !is_unnamed_temp {
        continue;  // Don't inline away named variables
    }
    // ... proceed with inlining
}

Result

-- After (correct)
for minX = minX, maxX, step do

Intervening Statements in SetList Collapse

The Problem

The __set_list simplification only worked when the table creation was immediately followed by the SetList call. Real code often has intervening statements:

local v3_ = {}
local i = self.componentJoints[spec.frontAxisJoint].rotLimit  -- intervening!
__set_list(v3_, 1, {unpack(i)})
spec.rotLimit = v3_

The Bytecode Pattern

     0: LOP_NEWTABLE   R3, 0, 1          -- v3_ = {}
     2: LOP_GETTABLEKS R4, R0, "componentJoints"
     4: LOP_GETTABLEKS R5, R1, "frontAxisJoint"
     6: LOP_GETTABLE   R4, R4, R5
     7: LOP_GETTABLEKS R4, R4, "rotLimit"  -- i = self.componentJoints[...].rotLimit
     9: LOP_DUPTABLE   R5, K0             -- Prepare array for SETLIST
    11: LOP_GETIMPORT  R6, unpack
    13: LOP_MOVE       R7, R4
    14: LOP_CALL       R6, 2, -1          -- unpack(i)
    15: LOP_SETLIST    R5, R6, -1         -- __set_list call
    17: LOP_MOVE       R5, R3             -- Copy table
    18: LOP_SETTABLEKS R5, R0, "rotLimit" -- spec.rotLimit = v3_

The NEWTABLE (offset 0) and SETLIST (offset 15) have unrelated instructions between them. The decompiler needs to recognize that these can still be collapsed.

The Solution

Allow safe intervening statements - those that don't read or write the table local:

fn find_set_list_with_intervening(block: &Block, table_local: &RcLocal,
                                   start_idx: usize) -> Option<(usize, Vec<usize>)> {
    let mut intervening_indices = Vec::new();

    for i in start_idx..block.len() {
        let stmt = &block[i];

        // Found the SetList call?
        if is_set_list_for(stmt, table_local) {
            return Some((i, intervening_indices));
        }

        // Check if this statement touches the table local
        if let Statement::Assign(assign) = stmt {
            let reads = assign.values_read();
            let writes = assign.values_written();

            if reads.contains(table_local) || writes.contains(table_local) {
                return None;  // Can't skip this statement
            }

            intervening_indices.push(i);
        } else {
            return None;  // Non-assignment statement, stop
        }
    }
    None
}

Result

-- After: intervening statement preserved, table eliminated
local i = self.componentJoints[spec.frontAxisJoint].rotLimit
spec.rotLimit = {unpack(i)}

Post-Simplification Inlining Pass

The Problem

After simplify_and_chains collapses split and-chain assignments, some temporaries become single-def-single-use but aren't inlined because the inlining pass already ran:

-- After simplify_and_chains:
local v69_ = A and B and C == D  -- now single def!
hotspot:setVisible(v69_)         -- single use!

The Solution

Add a new pass inline_single_use_unnamed_temps that runs after and-chain simplification:

pub fn inline_single_use_unnamed_temps(block: &mut Block) {
    loop {
        let mut changed = false;
        let mut i = 0;

        while i < block.0.len().saturating_sub(1) {
            if let Statement::Assign(assign) = &block.0[i] {
                // Check: single LHS that's an unnamed temp
                if assign.left.len() != 1 { i += 1; continue; }
                let Some(local) = assign.left[0].as_local().cloned() else { i += 1; continue; };

                if !is_unnamed_temp(&local) { i += 1; continue; }

                // Check: exactly one use in the next statement
                let uses = count_local_uses(&block.0[i + 1], &local);

                if uses == 1 {
                    // Inline! Clone the RHS first so the borrow of block.0[i]
                    // ends before the next statement is mutated
                    let rvalue = assign.right[0].clone();
                    replace_local_in_statement(&mut block.0[i + 1], &local, &rvalue);
                    block.0.remove(i);
                    changed = true;
                    continue;
                }
            }
            i += 1;
        }

        if !changed { break; }
    }
}

Result

-- After: temp inlined
hotspot:setVisible(A and B and C == D)

SetList in Call Arguments

The Problem

Array literals passed as function arguments were still being emitted as a temporary table followed by a __set_list call:

local v81_ = {}
__set_list(v81_, 1, {translation[1], translation[2], translation[3]})
local j = j(node, get, set, v81_, time)

The Bytecode Pattern

     0: LOP_NEWTABLE   R10, 0, 3         -- v81_ = {}
     2: LOP_DUPTABLE   R11, K0           -- Prepare array
     4: LOP_GETTABLE   R12, R5, K1       -- translation[1]
     6: LOP_GETTABLE   R13, R5, K2       -- translation[2]
     8: LOP_GETTABLE   R14, R5, K3       -- translation[3]
    10: LOP_SETLIST    R11, R12, 3       -- __set_list(v81_, 1, {...})
    12: LOP_GETLOCAL   R11, 0            -- j (function)
    14: LOP_MOVE       R12, R1           -- node
    15: LOP_MOVE       R13, R2           -- get
    16: LOP_MOVE       R14, R3           -- set
    17: LOP_MOVE       R15, R10          -- v81_ as argument
    18: LOP_MOVE       R16, R4           -- time
    19: LOP_CALL       R11, 6, 2         -- j(node, get, set, v81_, time)

The table in R10 is copied into an argument register at offset 17 and consumed by the call at offset 19. Previous patterns only handled tables used in direct assignments or returns, not as call arguments.

The Solution

Pattern 4: Detect when the table local is used as a call argument:

// Check if table local is used in a call on the RHS of an assignment.
// The statement is borrowed mutably because the arguments are rewritten in place.
if let Statement::Assign(next_assign) = &mut block[usage_idx] {
    for rvalue in &mut next_assign.right {
        // Handle both direct calls and Select-wrapped calls
        let call_args = match rvalue {
            RValue::Call(call) => Some(&mut call.arguments),
            RValue::MethodCall(mc) => Some(&mut mc.arguments),
            RValue::Select(sel) => {
                if let RValue::Call(call) = &mut *sel.value {
                    Some(&mut call.arguments)
                } else { None }
            }
            _ => None,
        };

        if let Some(args) = call_args {
            // Find and replace the table local in arguments
            for arg in args.iter_mut() {
                if arg.as_local() == Some(&table_local) {
                    // Replace with the array literal
                    *arg = array_literal.clone();
                    // Remove table creation and __set_list
                }
            }
        }
    }
}

Result

-- After: inline array literal in call
local j = j(node, get, set, {translation[1], translation[2], translation[3]}, time)
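The same fold can be tried in isolation. The sketch below uses a hypothetical three-statement toy AST (`NewTable`, `SetList`, `Call`) rather than medal's types, and it assumes the table local is not read again after the call; a real pass would have to verify that.

```rust
// Hypothetical toy AST; medal's real types differ.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Var(String),
    Array(Vec<Expr>), // array literal such as {a, b, c}
}

#[derive(Clone, Debug, PartialEq)]
enum Stmt {
    NewTable(String),           // local t = {}
    SetList(String, Vec<Expr>), // __set_list(t, 1, {...})
    Call(String, Vec<Expr>),    // f(args...)
}

fn is_var(expr: &Expr, name: &str) -> bool {
    matches!(expr, Expr::Var(v) if v == name)
}

/// Folds `local t = {}; __set_list(t, 1, {...}); f(..., t, ...)` into a
/// single call that takes the array literal directly.
/// Assumption: `t` is not read after the call.
fn fold_set_list_into_call(block: &mut Vec<Stmt>) {
    let mut i = 0;
    while i + 2 < block.len() {
        // Build the replacement first so no borrow is held during mutation.
        let folded = match (&block[i], &block[i + 1], &block[i + 2]) {
            (Stmt::NewTable(table), Stmt::SetList(target, items), Stmt::Call(func, args))
                if table == target
                    && args.iter().filter(|a| is_var(a, table)).count() == 1 =>
            {
                let new_args = args
                    .iter()
                    .map(|a| {
                        if is_var(a, table) {
                            Expr::Array(items.clone())
                        } else {
                            a.clone()
                        }
                    })
                    .collect();
                Some(Stmt::Call(func.clone(), new_args))
            }
            _ => None,
        };
        if let Some(call) = folded {
            block.remove(i); // drop `local t = {}`
            block.remove(i); // drop `__set_list(t, 1, {...})`
            block[i] = call; // replace the call with the folded version
        }
        i += 1;
    }
}
```

Requiring exactly one occurrence of the table local among the arguments keeps the sketch from duplicating the literal, which would change the program if the callee compares or mutates its arguments.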

Nested Block Recursion for For-Loops

The Problem

Numeric for-loop inlining only processed the top-level block, missing patterns inside if/while/for bodies:

for x = x, z, cellSize do
    local v81_ = _              -- nested!
    for _ = v81_, maxY, cellSize do

The Bytecode Pattern

     0: LOP_MOVE       R3, R0            -- x (outer loop init)
     1: LOP_MOVE       R4, R2            -- z (outer loop limit)
     2: LOP_MOVE       R5, R6            -- cellSize (outer loop step)
     3: LOP_FORNPREP   R3, +20           -- for x = x, z, cellSize
     -- Inside outer loop body:
     5: LOP_LOADNIL    R7                -- _ (the variable named underscore)
     6: LOP_MOVE       R8, R7            -- v81_ = _ (copy for inner loop)
     7: LOP_GETUPVAL   R9, 0             -- maxY
     8: LOP_MOVE       R10, R6           -- cellSize
     9: LOP_FORNPREP   R8, +10           -- for _ = v81_, maxY, cellSize

The pattern v81_ = _ followed by for _ = v81_ appears inside the outer for-loop's body. The inlining pass wasn't recursing into nested blocks, so this remained unoptimized.

The Solution

Recurse into nested blocks:

fn inline_into_numeric_for_loops(block: &mut Block) {
    // Process this block's statements
    for statement in &mut block.0 {
        // ... handle numeric for at this level ...

        // Recurse into nested blocks
        match statement {
            Statement::If(if_stat) => {
                inline_into_numeric_for_loops(&mut if_stat.then_block.lock());
                inline_into_numeric_for_loops(&mut if_stat.else_block.lock());
            }
            Statement::While(while_stat) => {
                inline_into_numeric_for_loops(&mut while_stat.block.lock());
            }
            Statement::NumericFor(nf) => {
                inline_into_numeric_for_loops(&mut nf.block.lock());
            }
            Statement::GenericFor(gf) => {
                inline_into_numeric_for_loops(&mut gf.block.lock());
            }
            Statement::Repeat(rep) => {
                inline_into_numeric_for_loops(&mut rep.block.lock());
            }
            _ => {}
        }
    }
}

Result

-- After: nested for-loop also simplified
for x = x, z, cellSize do
    for _ = _, maxY, cellSize do
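A compact way to see why the recursion matters is to run the fold on a toy tree where the pattern only occurs one level down. The `Stmt` type below is a hypothetical simplification (loop bounds other than the initial value are omitted), and the sketch assumes the temp is not read inside the loop body.

```rust
// Hypothetical toy AST; only the loop's initial value is modelled.
#[derive(Clone, Debug, PartialEq)]
enum Stmt {
    Local(String, String),                        // local <name> = <source variable>
    NumericFor { init: String, body: Vec<Stmt> }, // for _ = <init>, ... do <body> end
    If { then_block: Vec<Stmt>, else_block: Vec<Stmt> },
}

fn inline_for_inits(block: &mut Vec<Stmt>) {
    // Step 1: fold `local tmp = src` directly followed by `for _ = tmp, ...`.
    let mut i = 0;
    while i + 1 < block.len() {
        let source = match (&block[i], &block[i + 1]) {
            (Stmt::Local(tmp, src), Stmt::NumericFor { init, .. }) if tmp == init => {
                Some(src.clone())
            }
            _ => None,
        };
        if let Some(src) = source {
            block.remove(i);
            if let Stmt::NumericFor { init, .. } = &mut block[i] {
                *init = src;
            }
        } else {
            i += 1;
        }
    }

    // Step 2: apply the same pass to every nested block.
    for stmt in block.iter_mut() {
        match stmt {
            Stmt::NumericFor { body, .. } => inline_for_inits(body),
            Stmt::If { then_block, else_block } => {
                inline_for_inits(then_block);
                inline_for_inits(else_block);
            }
            Stmt::Local(..) => {}
        }
    }
}
```

Without step 2 the outer loop is visited but its body is never scanned, which is exactly the situation that left `v81_` in the output.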

Boolean Condition Simplification

The Problem

A temporary captures a boolean expression and is then used as both the condition and the assigned value:

local v23_ = isActive and isEnabled
if v23_ then
    self.shouldProcess = v23_
end

The Bytecode Pattern

     0: LOP_GETUPVAL   R3, 0             -- isActive
     1: LOP_JUMPIFNOT  R3, +4            -- if not isActive, skip
     3: LOP_GETUPVAL   R3, 1             -- isEnabled (short-circuit)
     5: LOP_JUMPIFNOT  R3, +6            -- if v23_ then
     7: LOP_GETTABLEKS R4, R0, "shouldProcess"
     9: LOP_MOVE       R5, R3            -- self.shouldProcess = v23_
    10: LOP_SETTABLEKS R5, R0, "shouldProcess"

The and-expression is evaluated into R3, then R3 is tested. If the test passes, R3 is assigned to the field. Inside the then-branch we know R3 is truthy, so when the expression yields a boolean we can replace it with true.

The Solution

Inside the then-branch, v23_ is known to be truthy. Replace it with true and inline the expression into the condition:

fn simplify_boolean_condition_assignments(block: &mut Block) {
    let mut i = 0;
    while i + 1 < block.0.len() {
        // Pattern: local v = <bool>; if v then ... v ... end
        let Statement::Assign(assign) = &block.0[i] else { i += 1; continue };

        // Must be single assignment to unnamed temp
        if assign.left.len() != 1 || assign.right.len() != 1 { i += 1; continue; }
        let Some(local) = assign.left[0].as_local().cloned() else { i += 1; continue };
        if !is_unnamed_temp(&local) { i += 1; continue; }
        if !is_boolean_expression(&assign.right[0]) { i += 1; continue; }

        // Clone the expression so the borrow of block.0[i] ends here
        let condition = assign.right[0].clone();

        let Statement::If(if_stat) = &mut block.0[i + 1] else { i += 1; continue };

        // Condition must be exactly this local
        if if_stat.condition.as_local() != Some(&local) { i += 1; continue; }

        // Replace uses of local with `true` in then-branch
        replace_local_with_true_in_block(&mut if_stat.then_block.lock(), &local);

        // Inline the boolean expression into the condition
        if_stat.condition = condition;
        block.0.remove(i);  // Remove the assignment; the if statement is now at index i
    }
}

Result

-- After: temp eliminated, true substituted
if isActive and isEnabled then
    self.shouldProcess = true
end

Function Call Boolean Patterns

The Problem

The boolean condition simplification only worked for boolean expressions. But function calls returning booleans (or truthy values) have the same pattern:

local v46_ = superFunc(self)
if v46_ then
    turnOn = v46_
end

The negated case needed handling as well:

local v48_ = superFunc(self)
if not v48_ then
    turnOff = v48_
end

The Bytecode Pattern

     0: LOP_GETUPVAL   R5, 0             -- superFunc
     1: LOP_MOVE       R6, R0            -- self
     2: LOP_CALL       R5, 2, 2          -- v46_ = superFunc(self)
     3: LOP_JUMPIFNOT  R5, +5            -- if v46_ then
     5: LOP_MOVE       R6, R5            -- turnOn = v46_
     6: LOP_SETUPVAL   R6, 1

The call result in R5 is tested, then assigned. For the negated case:

     3: LOP_JUMPIF     R5, +5            -- if not v48_ (note: JUMPIF vs JUMPIFNOT)
     5: LOP_MOVE       R6, R5            -- turnOff = v48_

Inside the then-branch after if v we know v is truthy; after if not v we know v is falsy.
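One caveat is worth spelling out: in Lua and Luau only nil and false are falsy, so "truthy" covers numbers, strings and tables too. The sketch below models that rule with a hypothetical `Value` enum (not a medal type) and checks when substituting a boolean constant stores exactly the value the original code would have stored.

```rust
/// A small model of Lua values, enough to illustrate truthiness.
/// Hypothetical type for this sketch only.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Nil,
    Bool(bool),
    Number(f64),
    Str(String),
}

/// In Lua and Luau only `nil` and `false` are falsy.
fn is_truthy(v: &Value) -> bool {
    !matches!(v, Value::Nil | Value::Bool(false))
}

/// The constant substituted for `v` inside the then-branch:
/// `true` for `if v then`, `false` for `if not v then`.
fn substituted_constant(is_negated: bool) -> Value {
    Value::Bool(!is_negated)
}

/// True when replacing `v` by the constant leaves the stored value unchanged.
fn substitution_is_exact(v: &Value, is_negated: bool) -> bool {
    // The then-branch only runs when the (possibly negated) condition holds.
    let branch_taken = is_truthy(v) != is_negated;
    // If the branch is skipped, nothing is stored, so nothing can differ.
    !branch_taken || *v == substituted_constant(is_negated)
}
```

For a call that returns `true` or `false` the substitution is exact. For a call that returns, say, a number or nil, the rewritten code stores a boolean where the original stored the number or nil, so the output is easier to read but not strictly equivalent.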

The Solution

Extend the pattern to handle Call, MethodCall, and Select (for multi-return), plus detect negation:

// Extend RHS check
let rhs = &assign.right[0];
let is_boolean_expr = is_boolean_expression(rhs);
let is_call_like = matches!(rhs,
    RValue::Call(_) | RValue::MethodCall(_) | RValue::Select(_));

if !is_boolean_expr && !is_call_like {
    continue;
}

// Handle negated conditions: if not v then x = v end
let (cond_local, is_negated) = if let Some(l) = if_stat.condition.as_local() {
    (l, false)
} else if let Some(unary) = if_stat.condition.as_unary() {
    if matches!(unary.operation, UnaryOperation::Not) {
        if let Some(l) = unary.value.as_local() {
            (l, true)
        } else { continue; }
    } else { continue; }
} else { continue; };

// Replace with appropriate boolean
if is_negated {
    // In "if not v then", v is false in the then-branch
    replace_local_with_false_in_block(&mut if_stat.then_block.lock(), &local);
} else {
    // In "if v then", v is true in the then-branch
    replace_local_with_true_in_block(&mut if_stat.then_block.lock(), &local);
}

Result

-- After: function call inlined, temp replaced with boolean
if superFunc(self) then
    turnOn = true
end

if not superFunc(self) then
    turnOff = false
end

Recursive Use Counting Fix

The Problem

The use-counting for determining if a temp can be inlined wasn't recursing into nested blocks:

local v23_ = getValue()
if v23_ then           -- use 1
    process(v23_)      -- use 2 (in nested block - was missed!)
end

This led to incorrect inlining that removed the temp even though it was used twice.

The Solution

Count uses recursively across all nested blocks:

fn count_local_uses_recursive(block: &Block, local: &RcLocal) -> usize {
    let mut count = 0;

    for statement in &block.0 {
        // Count direct uses
        count += statement.values_read().iter()
            .filter(|l| *l == local)
            .count();

        // Recurse into nested blocks
        match statement {
            Statement::If(if_stat) => {
                count += count_local_uses_recursive(&if_stat.then_block.lock(), local);
                count += count_local_uses_recursive(&if_stat.else_block.lock(), local);
            }
            Statement::While(w) => {
                count += count_local_uses_recursive(&w.block.lock(), local);
            }
            Statement::NumericFor(nf) => {
                count += count_local_uses_recursive(&nf.block.lock(), local);
            }
            Statement::GenericFor(gf) => {
                count += count_local_uses_recursive(&gf.block.lock(), local);
            }
            Statement::Repeat(r) => {
                count += count_local_uses_recursive(&r.block.lock(), local);
            }
            _ => {}
        }
    }

    count
}
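The difference between the two counting strategies is easy to reproduce on the example from the problem statement. This is a standalone sketch with a hypothetical two-variant `Stmt` type; it is not medal's `Statement`, and variables are plain strings.

```rust
// Hypothetical toy AST: variables are plain strings.
#[derive(Debug)]
enum Stmt {
    Call(Vec<String>), // f(args...), where each argument is a variable name
    If { condition: String, then_block: Vec<Stmt> },
}

/// Uses that appear directly in the statement, ignoring nested blocks.
fn direct_uses(stmt: &Stmt, name: &str) -> usize {
    match stmt {
        Stmt::Call(args) => args.iter().filter(|a| a.as_str() == name).count(),
        Stmt::If { condition, .. } => usize::from(condition == name),
    }
}

/// The old behaviour: only looks at the current block.
fn count_uses_shallow(block: &[Stmt], name: &str) -> usize {
    block.iter().map(|s| direct_uses(s, name)).sum()
}

/// The fixed behaviour: also descends into nested blocks.
fn count_uses_recursive(block: &[Stmt], name: &str) -> usize {
    block
        .iter()
        .map(|s| {
            let nested = match s {
                Stmt::If { then_block, .. } => count_uses_recursive(then_block, name),
                Stmt::Call(_) => 0,
            };
            direct_uses(s, name) + nested
        })
        .sum()
}
```

For `if v23_ then process(v23_) end`, the shallow count sees one use (the condition) and the recursive count sees two, which is what stops the single-use inliner from removing the temp.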

The Numbers

| Metric | Before | After | Reduction |
| --- | --- | --- | --- |
| Unnamed temporaries | ~10,000 | 18 | 99.8% |
| v_u_ upvalue temps | 79 | 0 | 100% |
| Files with temps | 500+ | 11 | 97.8% |
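The reduction column is plain arithmetic on the before and after counts. A tiny helper (the name `reduction_percent` is just for this sketch) makes it easy to recheck; since "~10,000" and "500+" are approximate, the first and third percentages are approximate as well.

```rust
/// Percentage reduction from `before` to `after`.
fn reduction_percent(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}
```

With the table's figures this gives roughly 99.8 for 10,000 to 18, exactly 100 for 79 to 0, and roughly 97.8 for 500 to 11.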

Remaining Patterns (The Hard Ones)

The 18 remaining temporaries fall into categories that are difficult or impossible to optimize:

For-loop bound preservation - Semantically required:

   local v27_ = minX
   for minX = v27_, maxX, step do  -- Can't inline; minX is shadowed

Boolean short-circuit chains - Used multiple times:

   local v = condition1 and condition2
   if v then
       v = condition3   -- v is reassigned
   end

Complex control flow results - Values computed across branches that can't be simplified to expressions


Lessons Learned

1. SSA is Your Friend (and Enemy)

SSA form makes many optimizations easy (copy propagation, dead code elimination), but it can also destroy information. The key insight: preserve names during SSA transformations, not just after.

2. Order Matters

The order of optimization passes is critical:

  1. Parse debug info early
  2. SSA construction with name preservation
  3. SSA-based optimizations
  4. SSA destruction (prefer named representatives)
  5. AST-level pattern matching
  6. Final cleanup passes

Running pattern matching before SSA destruction would miss opportunities; running it too early interferes with other passes.

3. Test on Real Code

A real-world codebase (~500 Luau files, ~200K lines) was invaluable. Synthetic tests miss edge cases; real code has all the weird patterns.

4. Bytecode Debug Info is Gold

Even partial debug information dramatically improves output quality. Always parse it, always preserve it.


Conclusion

Decompilation is fundamentally about reconstructing intent from implementation. The compiler has thrown away variable names, flattened expressions, and transformed elegant source into efficient bytecode. Our job is to reverse that transformation.

The techniques here - SSA name preservation, pattern matching, lookahead analysis - are applicable to any decompiler. The key is understanding what the compiler does and doing the opposite.

All code is available in the medal repository. PRs welcome.


This work was done to improve reverse engineering of Luau bytecode for modding purposes.
