DEV Community

kyb8801

sympy.parse_expr will run os.system if you let it. Here's the AST gate that stopped me from shipping the RCE.

I was building an MCP server that accepts a measurement formula as a string from an LLM, parses it with sympy, and evaluates it via Monte Carlo. Five minutes of integration. Thirteen tests. Twelve passed.

The thirteenth was a safety test. The formula it passed was:

"__import__('os').system('echo PWNED')"

The test expected a ValueError. Instead I got a test failure message and the string PWNED printed to my terminal.

Let me spell that out. sympy.parse_expr, with the default arguments I was using, actually invoked os.system on my machine. My own parser, running in my own test process, shelled out and echoed text into my terminal. If I had shipped this to production, any LLM user with a sufficiently creative prompt could have had that same shell-out happen on a Cloud Run instance under my billing account.

Why this happens

sympy.parse_expr is implemented on top of Python's eval(). Unless the global dictionary it evaluates in is explicitly locked down, the builtins stay reachable from it: eval itself adds a __builtins__ entry to any globals dict that lacks one, which exposes __import__, open, compile, exec, getattr, and friends. A string that looks like a math expression but references __import__ is resolved like any other Python expression, and runs.
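The same behavior is easy to reproduce with plain eval() and no sympy at all. The snippet below is a standalone illustration, not sympy's internals, and its payload imports the harmless math module in place of os.

```python
# Standalone illustration with plain eval(); no sympy involved.
# The payload imports the harmless `math` module instead of `os`.
payload = "__import__('math').sqrt(16)"

# 1. An "empty" globals dict does not stay empty: eval() inserts a
#    "__builtins__" entry before running the expression.
scope = {}
result = eval(payload, scope)
builtins_injected = "__builtins__" in scope

# 2. Supplying an explicit, empty "__builtins__" removes the bare name,
#    so the same payload now fails at name lookup.
try:
    eval(payload, {"__builtins__": {}})
    blocked = False
except NameError:
    blocked = True

print(result, builtins_injected, blocked)  # 4.0 True True
```

Emptying __builtins__ only removes bare names. An attribute walk that starts from any object still in scope can reach the real builtins again, which is why the gate in this post works on syntax rather than on the evaluation scope.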

This is documented sympy behavior with warnings in the docs. Warnings do not save you when you are moving fast on launch day.

The vulnerability surface is broader than __import__. Any of these can give an attacker code execution depending on what is in scope:

  • __import__('os').system(...): direct shell out
  • __builtins__.__dict__['eval'](...): re-enter eval
  • (0).__class__.__bases__[0].__subclasses__(): enumerate every loaded class and pick a useful one
  • getattr(__builtins__, 'open')('/etc/passwd').read(): file read
  • f.__globals__['__builtins__']['exec']('...'), for any pure-Python function f in scope: exec via a function-attribute walk (built-in methods such as ''.join have no __globals__, so the walk has to start from a Python-level function)

The fix that actually works: validate at the AST level before sympy sees the string

import ast

_DANGEROUS_NAMES = frozenset({
    "__import__", "__builtins__", "__class__", "__subclasses__",
    "eval", "exec", "compile", "open", "globals", "locals",
    "getattr", "setattr", "delattr", "exit", "quit",
})

_FORBIDDEN_AST_NODES = (
    ast.Attribute,    # blocks "os.system" and "obj.__class__"
    ast.Subscript,    # blocks "x[0]" indexing into dunders
    ast.Lambda,
    ast.GeneratorExp,
    ast.ListComp,
    ast.SetComp,      # same family as ListComp and DictComp
    ast.DictComp,
    ast.Starred,
    ast.JoinedStr,
    ast.FormattedValue,
    ast.NamedExpr,
)

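def _validate_formula_ast(formula: str) -> None:
    """Raise ValueError if the formula contains a forbidden construct or name."""
    try:
        # mode="eval" accepts a single expression; statements fail to parse.
        tree = ast.parse(formula, mode="eval")
    except SyntaxError as exc:
        # Surface parse failures as ValueError so callers handle one error type.
        raise ValueError(f"Disallowed formula syntax: {exc.msg}") from exc
    for node in ast.walk(tree):
        if isinstance(node, _FORBIDDEN_AST_NODES):
            raise ValueError(
                f"Disallowed formula construct: {type(node).__name__}"
            )
        if isinstance(node, ast.Name):
            if node.id in _DANGEROUS_NAMES:
                raise ValueError(f"Disallowed identifier: {node.id!r}")
            if node.id.startswith("__") and node.id.endswith("__"):
                raise ValueError(f"Dunder identifiers are not allowed: {node.id!r}")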

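The gate runs before sympy ever sees the string. What gets through is what a measurement model needs: arithmetic operators, unary minus, named inputs, and calls to math functions such as exp, log, sin, sqrt. Strictly speaking, the code above is a denylist: it rejects known-dangerous node types and names, but it does not restrict which other names may be called. A stricter gate would also check every call target against an explicit set of allowed math functions. Anything the gate rejects raises a clean ValueError that names the offending construct.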

ast.parse(..., mode="eval") is the standard library's free safety net here. It refuses to parse a statement (passing import os raises SyntaxError inside ast.parse), and it gives you a typed tree you can walk and filter without ever evaluating anything.
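Both properties can be checked with the standard library alone. The helper name parses_as_expression is made up for this sketch.

```python
import ast

def parses_as_expression(source: str) -> bool:
    """Return True if `source` parses as a single Python expression."""
    try:
        ast.parse(source, mode="eval")
        return True
    except SyntaxError:
        return False

# Statements are rejected by the parser itself.
print(parses_as_expression("import os"))   # False
print(parses_as_expression("x = 1"))       # False

# Expressions parse, including hostile ones: parsing is not evaluating,
# so nothing below shells out.
print(parses_as_expression("(V * R) / (V + I)"))                      # True
print(parses_as_expression("__import__('os').system('echo PWNED')"))  # True

# The result is a typed tree that can be inspected without running anything.
tree = ast.parse("V * R", mode="eval")
print(type(tree).__name__, type(tree.body).__name__)  # Expression BinOp
```

The fourth call is the important one. mode="eval" only filters out statements, so the tree walk still has to do the real filtering.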

Why blocking ast.Attribute matters specifically

The most common attack path through sympy.parse_expr is attribute access: os.system, ().__class__.__bases__[0].__subclasses__(), and the __globals__-walk family. Blocking every ast.Attribute node and every ast.Subscript node at the parser stage cuts off every payload that works by walking from one object to another, which covers the escapes listed earlier. What remains is direct calls to dangerous bare names such as __import__ or open, and the name check handles those.
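One way to check this against the payloads from earlier is to list the node types each string parses into. node_types is a hypothetical helper for this sketch, and nothing here is evaluated.

```python
import ast

def node_types(source: str) -> set[str]:
    """Names of every AST node type that appears in `source`."""
    tree = ast.parse(source, mode="eval")
    return {type(node).__name__ for node in ast.walk(tree)}

payloads = [
    "__import__('os').system('echo PWNED')",
    "(0).__class__.__bases__[0].__subclasses__()",
    "getattr(__builtins__, 'open')('/etc/passwd').read()",
]

# Every payload above contains at least one Attribute node.
for p in payloads:
    kinds = node_types(p)
    print("Attribute" in kinds, "Subscript" in kinds)

# A bare call has no Attribute node at all, so the node filter alone
# would let it through. The identifier check is what catches it.
bare = node_types("__import__('os')")
print("Attribute" in bare, "Name" in bare)  # False True
```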

The cost is that your accepted formula grammar becomes "arithmetic on bare names + calls to a whitelisted set of math functions." For a measurement uncertainty calculator, that is exactly the grammar you want. For more general DSLs, the whitelist gets longer but the pattern is the same.
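That grammar can also be enforced as a strict allowlist instead of a list of forbidden nodes. The following is a minimal sketch, not the code from the repo: ALLOWED_NODES, ALLOWED_FUNCS, and validate_allowlist are names invented here, and both sets are assumptions to adjust for your own model.

```python
import ast

# Node types a plain arithmetic formula may contain (assumed set).
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Call, ast.Name,
    ast.Constant, ast.Load,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.USub, ast.UAdd,
)
# Math functions a formula may call (assumed set).
ALLOWED_FUNCS = frozenset({"exp", "log", "sin", "cos", "sqrt"})

def validate_allowlist(formula: str, inputs: frozenset) -> None:
    """Raise ValueError unless `formula` is arithmetic on the names in
    `inputs` plus positional calls to ALLOWED_FUNCS."""
    try:
        tree = ast.parse(formula, mode="eval")
    except SyntaxError as exc:
        raise ValueError(f"Disallowed formula syntax: {exc.msg}") from exc
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            # Only direct calls to allowlisted names, no keyword arguments.
            func = node.func
            if not (isinstance(func, ast.Name) and func.id in ALLOWED_FUNCS):
                raise ValueError("Disallowed function call")
            if node.keywords:
                raise ValueError("Disallowed keyword argument")
        elif isinstance(node, ast.Name):
            # Every name must be a declared input or an allowed function.
            if node.id not in inputs and node.id not in ALLOWED_FUNCS:
                raise ValueError(f"Disallowed identifier: {node.id!r}")
        elif isinstance(node, ast.Constant):
            # Numbers only: no strings, bytes, None, or Ellipsis.
            if not isinstance(node.value, (int, float)):
                raise ValueError(f"Disallowed constant: {node.value!r}")
        elif not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"Disallowed formula construct: {type(node).__name__}")

# Accepted: arithmetic on declared inputs plus an allowed function.
validate_allowlist("sqrt(V**2 + R**2) / 2", frozenset({"V", "R"}))
```

Passing the expected input names (for example, the keys of the estimates dict) means an unknown identifier is rejected by default. A new dangerous builtin then needs no update to the gate.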

Verifying the gate

import pytest

# `propagate` is the server's formula entry point; import it from your own module.

def test_mc_blocks_import_trick():
    bad = "__import__('os').system('echo PWNED')"
    with pytest.raises(ValueError, match=r"(?i)disallowed|dunder|attribute"):
        propagate(formula=bad, estimates={}, components=[])

def test_mc_blocks_attribute_walk():
    bad = "(0).__class__.__bases__[0].__subclasses__()"
    with pytest.raises(ValueError, match=r"(?i)disallowed|dunder|attribute"):
        propagate(formula=bad, estimates={}, components=[])

def test_mc_allows_real_formula():
    good = "(V * R) / (V + I)"
    # Should not raise. The downstream sympy call will lambdify it.
    propagate(formula=good, estimates={"V": 10.0, "R": 100.0, "I": 0.1}, components=[])

The blocking tests assert not only that the call raises, but that the error message contains "disallowed", "dunder", or "attribute". A ValueError with any other message would have come from sympy further downstream, which would mean the gate had let the payload through. The safe version passed. The shell never ran. The test suite ended that evening at twenty-six green.

The pattern, generalized

Any time you accept a string-as-code from a source you do not trust — LLM output, web input, webhook payload, any of it — the pattern is:

  1. Parse with ast.parse(s, mode="eval") — refuses statements outright
  2. Walk the tree with ast.walk(tree) and filter against a whitelist of allowed node types and names
  3. Reject explicitly with a typed error
  4. Then hand the sanitized input to whatever permissive parser you actually wanted to use
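Put together, the four steps might look like the sketch below. The names FormulaRejected, gate, and safe_parse are invented for illustration, and the gate shown is deliberately tiny. The inner parser is passed in as a callable so the sketch does not assume any particular library.

```python
import ast
from typing import Callable, TypeVar

T = TypeVar("T")

class FormulaRejected(ValueError):
    """Typed error raised when input fails the gate (step 3)."""

def gate(source: str) -> None:
    # Step 1: parse as a single expression; statements do not parse.
    try:
        tree = ast.parse(source, mode="eval")
    except SyntaxError as exc:
        raise FormulaRejected(f"not an expression: {exc.msg}") from exc
    # Step 2: walk the tree and filter. A production gate would use a
    # full allowlist; this one only rejects attribute access, subscripts,
    # and names that start with a double underscore.
    for node in ast.walk(tree):
        if isinstance(node, (ast.Attribute, ast.Subscript)):
            raise FormulaRejected(f"disallowed construct: {type(node).__name__}")
        if isinstance(node, ast.Name) and node.id.startswith("__"):
            raise FormulaRejected(f"disallowed identifier: {node.id!r}")

def safe_parse(source: str, parser: Callable[[str], T]) -> T:
    gate(source)           # steps 1-3: raises before the parser is touched
    return parser(source)  # step 4: the permissive parser runs last

# Usage with a stand-in parser that records what it was given.
seen: list[str] = []

def recording_parser(s: str) -> str:
    seen.append(s)
    return s.upper()

print(safe_parse("a + b", recording_parser))  # A + B
try:
    safe_parse("__import__('os').system('echo PWNED')", recording_parser)
except FormulaRejected as err:
    print("rejected:", err)
print(seen)  # ['a + b']
```

The hostile string never reaches the parser, which is the point of the ordering. In a sympy-based server the parser argument would be sympy.parse_expr or a thin wrapper around it.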

Defense in depth, with the permissive parser as the inner layer. This is what kept me from shipping the RCE to a hosted MCP endpoint that any LLM could call.

If you are building an MCP server that takes a formula, a query, a filter expression, a JSONata path, or any other string-as-code from your LLM, please go check whether your parser is eval-based and what its global dictionary contains by default. The bug is shipping today in more code than its authors realize.


Repo: github.com/kyb8801/measurement-uncertainty-mcp. The gate above lives in math_kernel.py. Live MCP: measurement-uncertainty.mcpize.run.