Hercules Lemke Merscher

Posted on Sep 1 • Originally published at bitmaybewise.substack.com

Debugging shift-reduce conflicts: Lessons learned building Tsonnet's parser

#tsonnet #jsonnet #compiler #ocaml

Building Tsonnet has been a wild ride of compiler adventures, and one thing that kept tripping me up was those cryptic shift-reduce conflict warnings that menhir would throw at me. You know the ones:

Warning: 2 shift/reduce conflicts were arbitrarily resolved.

Arbitrarily resolved? That sounds threatening. What exactly is menhir doing to my beautiful grammar, and why should I care?

Turns out, understanding these conflicts is crucial for anyone building parsers. Let me walk you through what I learned by building a simple arithmetic parser that demonstrates exactly what goes wrong -- and how to fix it.

What is shifting and reducing anyway?

Before we dive into conflicts, let's understand what shifting and reducing actually mean. When your menhir-generated parser is running, it maintains two key data structures:

A stack (holding parsing states and partially-built AST nodes)
An input stream (the tokens from ocamllex)

Shifting means: "Take the next token from the input stream and push it onto the stack, along with a new parser state."

Let's trace through parsing 2 + 3 * 4. Our ocamllex tokenizer produces: [INT(2); PLUS; INT(3); TIMES; INT(4)]

Here's what happens during shifting:

(* Initial state *)
Stack: [State0]
Input: [INT(2); PLUS; INT(3); TIMES; INT(4)]

(* Shift INT(2) *)
Stack: [State0; INT(2), StateX]  
Input: [PLUS; INT(3); TIMES; INT(4)]

(* Shift PLUS *)
Stack: [State0; INT(2), StateX; PLUS, StateY]
Input: [INT(3); TIMES; INT(4)]

The exact state numbers depend on how menhir generates the state machine for your specific grammar.

The shift action says: "I'm in the middle of parsing some rule(s), and I need to consume this next token to make progress."

Reducing is the flip side: "I've collected enough tokens to complete a grammar rule, so let me pop items from the stack and build an AST node."

(* After seeing INT(2), we could reduce immediately *)
Stack: [State0; INT(2), StateX] -> [State0; expr(Int 2), StateX]

Building a conflict-prone parser

Let's create a complete example that shows these conflicts in action. I'll give you all the code needed to reproduce this:

First, the dependencies (install with opam install menhir).

Then, our files:

dune-project:

(lang dune 3.0)

(using menhir 3.0)

(package
 (name arithmetic)
 (depends ocaml dune menhir))

lib/ast.ml:

type expr =
  | Int of int
  | Add of expr * expr
  | Mul of expr * expr

let rec string_of_expr = function
  | Int n -> string_of_int n
  | Add (e1, e2) -> Printf.sprintf "(%s + %s)" (string_of_expr e1) (string_of_expr e2)
  | Mul (e1, e2) -> Printf.sprintf "(%s * %s)" (string_of_expr e1) (string_of_expr e2)

let rec eval = function
  | Int n -> n
  | Add (e1, e2) -> eval e1 + eval e2
  | Mul (e1, e2) -> eval e1 * eval e2

lib/lexer.mll:

{
  open Parser

  exception SyntaxError of string
}

let white = [' ' '\t']
let newline = '\r' | '\n' | "\r\n"
let digit = ['0'-'9']
let int = digit+

rule token = parse
  | white+     { token lexbuf }
  | newline    { Lexing.new_line lexbuf; token lexbuf }
  | int        { INT (int_of_string (Lexing.lexeme lexbuf)) }
  | '+'        { PLUS }
  | '*'        { TIMES }
  | '('        { LPAREN }
  | ')'        { RPAREN }
  | eof        { EOF }
  | _          { raise (SyntaxError ("Unexpected char: " ^ Lexing.lexeme lexbuf)) }

lib/parser.mly (the conflict-prone version):

%{
  open Ast
%}

%token <int> INT
%token PLUS TIMES
%token LPAREN RPAREN
%token EOF

%start <Ast.expr> main

%%

main:
  | expr EOF { $1 }

(* This grammar has shift/reduce conflicts! *)
expr:
  | INT { Int $1 }
  | expr PLUS expr { Add ($1, $3) }
  | expr TIMES expr { Mul ($1, $3) }
  | LPAREN expr RPAREN { $2 }

lib/dune:

(library
 (public_name arithmetic)
 (name arithmetic))

(ocamllex lexer)

(menhir
 (modules parser)
 (flags --dump))

bin/main.ml:

open Arithmetic

let parse_string s =
  let lexbuf = Lexing.from_string s in
  try
    Ok (Parser.main Lexer.token lexbuf)
  with
  | Lexer.SyntaxError msg -> Error ("Lexer error: " ^ msg)
  | Parser.Error ->
    let pos = Lexing.lexeme_start_p lexbuf in
    Error (Printf.sprintf "Parser error at line %d, character %d" 
           pos.pos_lnum (pos.pos_cnum - pos.pos_bol))

let test_expressions = [
  "2 + 3 * 4";
  "2 * 3 + 4";
  "(2 + 3) * 4";
  "2 + (3 * 4)";
  "1 + 2 + 3";
  "1 * 2 * 3";
]

let () =
  Printf.printf "=== Arithmetic Expression Parser ===\n\n";
  List.iter (fun expr ->
    Printf.printf "Input: %s\n" expr;
    match parse_string expr with
    | Ok ast ->
      Printf.printf "  Parsed as: %s\n" (Ast.string_of_expr ast);
      Printf.printf "  Evaluates to: %d\n" (Ast.eval ast);
    | Error msg ->
      Printf.printf "  Error: %s\n" msg;
    Printf.printf "\n"
  ) test_expressions

The moment of truth

Build and run this:

dune build
Warning: 2 states have shift/reduce conflicts.
Warning: 4 shift/reduce conflicts were arbitrarily resolved.

Here's what you'll get:

dune exec ./bin/main.exe
Input: 2 + 3 * 4
  Parsed as: (2 + (3 * 4))
  Evaluates to: 14

Input: 2 * 3 + 4
  Parsed as: (2 * (3 + 4))
  Evaluates to: 14

Wait, what?! That second result is mathematically wrong. It should be ((2 * 3) + 4) = 10, not (2 * (3 + 4)) = 14.

Breaking down the conflict

When parsing 2 * 3 + 4, after seeing 2 * 3, menhir faces a choice:

Shift the +: Keep building, eventually getting 2 * (3 + 4)
Reduce the *: Finish multiplication first, eventually getting (2 * 3) + 4

Menhir consistently chose to shift, making everything right-associative. It resolved the conflict by picking a strategy, but not necessarily the correct one for arithmetic.

The conflict report in _build/default/lib/parser.conflicts shows exactly what happened:

** Conflict (shift/reduce) in state 8.
** Token involved: PLUS
** This state is reached from main after reading:

    expr TIMES expr

** The derivations that appear below have the following common factor:
** (The question mark symbol (?) represents the spot where the derivations begin to differ.)

    main
    expr EOF
    expr (?)

** In state 8, looking ahead at PLUS, reducing production
** expr -> expr TIMES expr
** is permitted because of the following sub-derivation:

    expr PLUS expr
    expr PLUS expr TIMES expr // lookahead token appears

** In state 8, looking ahead at PLUS, shifting is permitted
** because of the following sub-derivation:

    expr PLUS expr
    expr TIMES expr PLUS expr // lookahead token appears

Menhir is telling us: "I saw expr TIMES expr and now I see a PLUS. I could either finish the multiplication (reduce) or keep reading (shift). I chose to shift."

The fix: precedence declarations

Here's where the magic happens. Update your parser with precedence rules:

%{
  open Ast
%}

%token <int> INT
%token PLUS TIMES
%token LPAREN RPAREN
%token EOF

%start <Ast.expr> main

(* Precedence declarations: lower precedence first *)
%left PLUS
%left TIMES

%%

main:
  | expr EOF { $1 }

expr:
  | INT { Int $1 }
  | expr PLUS expr { Add ($1, $3) }
  | expr TIMES expr { Mul ($1, $3) }
  | LPAREN expr RPAREN { $2 }

The precedence declarations tell menhir:

%left PLUS: + is left-associative with precedence level 1
%left TIMES: * is left-associative with precedence level 2 (higher)

Higher precedence = tighter binding, so * binds tighter than +.

Now rebuild and run:

dune build  # No more conflict warnings!

dune exec ./bin/main.exe
Input: 2 + 3 * 4
  Parsed as: (2 + (3 * 4))    # Still correct
  Evaluates to: 14

Input: 2 * 3 + 4
  Parsed as: ((2 * 3) + 4)    # Now correct!
  Evaluates to: 10            # Right answer!

Conclusion

This is what I wish someone had told me when I started working on Tsonnet: shift-reduce conflicts aren't bugs -- they're missing specifications.

Our grammar was ambiguous, and precedence declarations make it unambiguous. When menhir encounters our earlier conflict, instead of arbitrarily choosing, it now compares precedence:

TIMES has higher precedence than PLUS
So reduce the TIMES first: ((2 * 3) + 4)

While building Tsonnet, I ran into these conflicts constantly. Object field evaluation, variable declarations, array indexing -- they all create similar ambiguities. The lesson? Don't ignore those menhir warnings. Each conflict represents a place where your language's semantics are unclear.

Sometimes the "arbitrary" resolution happens to work by accident (like our 2 + 3 * 4 case), but sometimes it produces completely wrong results (like our 2 * 3 + 4 case). Better to be explicit about what you want.

The next time menhir tells you about shift-reduce conflicts, don't panic. It's not breaking your parser -- it's asking for clarification about your language's design. Take the time to understand what the conflicts mean and resolve them explicitly with precedence declarations.

Building a compiler? Check out the Tsonnet series where I document the complete journey from zero to a working Jsonnet-compatible interpreter. It's been quite the parsing adventure!

Don't let your parsing conflicts resolve themselves arbitrarily! Subscribe to Bit Maybe Wise for more compiler adventures, complete with all the precedence declarations you can handle.

Photo by Claudio Schwarz on Unsplash