DEV Community

Vivis Dev

Originally published at pythonkoans.substack.com

Exploring the dangerous power of unquoted Python strings, and how they caused CVE-2024-9287

The Peril of Unquoted Arguments

We often need to run commands in shells [1] using Python. Python's subprocess module is cross-platform [2] and helps you do this.

Shell commands are separated by command delimiters (such as ‘;’, ‘&&’, ‘||’ and ‘\n’ in POSIX [3]). Argument delimiters, on the other hand, define how a single command’s arguments are split. On POSIX-compliant systems, words on a command line are split on blanks, and the IFS [4] variable defines the characters used to split the results of unquoted expansions (such as $path) into separate arguments. By default, IFS is set to split on spaces, newlines and tabs.

When a path or argument to a command contains a space, the shell does not see a single continuous entity. It sees two distinct things. A single path has become two arguments.

Let us begin with a simple example.

Part 1: The Space as a Separator

Consider the creation of a directory with a space in its name.
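The original listing is not preserved here, so the following is only a minimal sketch of the idea. It assumes os.mkdir as the call and uses a temporary directory (my choice, so the example cleans up after itself).

```python
import os
import tempfile

# Work inside a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    # The path goes to the operating system as one string; no shell parses it.
    os.mkdir(os.path.join(tmp, "my new path"))
    created = os.listdir(tmp)

print(created)  # ['my new path']
```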

When we use the os module it understands the path as a single string because it does not involve the shell's interpretation. The directory is created as intended with a space in its name.

Now let us use the subprocess module with the shell=True flag. This flag instructs Python to pass our command to the shell for execution as a single string.
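A small sketch of what such a call might look like, assuming a POSIX shell and a temporary working directory (both are my assumptions, not taken from the original listing):

```python
import os
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # shell=True hands the whole string to the shell,
    # which splits it on whitespace before running mkdir.
    subprocess.run("mkdir my new path", shell=True, cwd=tmp, check=True)
    created = sorted(os.listdir(tmp))

print(created)  # ['my', 'new', 'path']
```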

When this code runs, the shell sees the command as mkdir my new path. The shell interprets this as a command to create a directory named my, a second directory named new, and a third named path. The single path becomes three paths.

Part 2: The Command Injection

The danger is not only in misinterpretation but also in malicious injection.

Consider what happens if the path comes from an external source, such as a user-supplied argument.

The shell sees mkdir my; rm -rf /. The semicolon is a command separator in the shell. The shell will first execute mkdir my and then execute rm -rf /, which attempts to recursively delete everything under the root directory.
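The injection described above can be reproduced safely. The sketch below swaps the destructive rm -rf / for a harmless echo INJECTED and runs inside a temporary directory. The helper name unsafe_mkdir is hypothetical, and a POSIX shell is assumed.

```python
import os
import subprocess
import tempfile


def unsafe_mkdir(user_input: str, cwd: str) -> subprocess.CompletedProcess:
    # UNSAFE on purpose: untrusted text is pasted straight into a shell command.
    return subprocess.run(
        f"mkdir {user_input}",
        shell=True,
        cwd=cwd,
        capture_output=True,
        text=True,
    )


with tempfile.TemporaryDirectory() as tmp:
    # "echo INJECTED" stands in for whatever an attacker would really run.
    result = unsafe_mkdir("my; echo INJECTED", tmp)
    created = sorted(os.listdir(tmp))

print(created)                # ['my']
print(result.stdout.strip())  # INJECTED
```

Only the directory my is created, and the second command runs with the same privileges as the Python process.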

The unquoted path has allowed the user to inject a new command. This is a profound and dangerous failure of boundary. A simple space or semicolon can shatter the integrity of the system.

Part 3: The Principle of Quoting

To prevent this we must use quoting. Quoting places a protective barrier around the string telling the shell to treat it as a single unit regardless of its contents.

The shell now sees mkdir "my; rm -rf /". The entire string is treated as a single argument for mkdir. No additional command is executed. The semicolon is rendered harmless, a mere character within the string.
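One way to apply quoting from Python is shlex.quote from the standard library. This is a sketch under my own assumptions (a POSIX shell, a harmless echo in place of rm, a temporary directory). Note that shlex.quote emits single quotes rather than the double quotes shown above.

```python
import os
import shlex
import subprocess
import tempfile

user_input = "my; echo INJECTED"

# shlex.quote wraps the value so a POSIX shell reads it as one word.
command = f"mkdir {shlex.quote(user_input)}"
print(command)  # mkdir 'my; echo INJECTED'

with tempfile.TemporaryDirectory() as tmp:
    result = subprocess.run(
        command, shell=True, cwd=tmp, capture_output=True, text=True
    )
    created = os.listdir(tmp)

print(created)  # ['my; echo INJECTED']
```

Wrapping the value in double quotes by hand is weaker: input that itself contains a double quote or a $(...) substitution can still break out, which is why a dedicated quoting function is the safer choice.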

Part 4: The Path of Wisdom

The wise path is the one that avoids the shell entirely when it is not needed. Many linters (like ruff) can flag uses of shell=True for you.

Here we pass a list of arguments to subprocess.run. Python does not pass a single string to the shell. Instead it executes mkdir directly as a separate process and passes user_input as its first argument. The shell is never involved and the risk is eliminated.
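A minimal sketch of the list form, assuming a POSIX system where mkdir is an executable on PATH. The variable user_input and the temporary directory are illustrative.

```python
import os
import subprocess
import tempfile

user_input = "my; echo INJECTED"

with tempfile.TemporaryDirectory() as tmp:
    # No shell=True: each list item reaches mkdir as exactly one argument.
    subprocess.run(["mkdir", user_input], cwd=tmp, check=True)
    created = os.listdir(tmp)

print(created)  # ['my; echo INJECTED']
```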

This is the preferred way. It is clean and safe.

Part 5: A Real-World Command Injection (CVE-2024-9287)

Command injection is not just a theoretical problem. In 2024, a high-severity vulnerability was found to affect all Python versions <= 3.13:

A vulnerability has been found in the CPython venv module and CLI where path names provided when creating a virtual environment were not quoted properly, allowing the creator to inject commands into virtual environment "activation" scripts (ie "source venv/bin/activate"). This means that attacker-controlled virtual environments are able to run commands when the virtual environment is activated. Virtual environments which are not created by an attacker or which aren't activated before being used (ie "./venv/bin/python") are not affected.

Previously, when a virtual environment was created, the activation scripts (activate, activate.bat, etc.) would use the environment name provided by the user to construct the venv path without enclosing it in quotation marks.

For example, a name like my test venv would be written into the script as /home/user/my test venv. An attacker could craft a virtual environment name like my-venv-with-space-and-command; malicious_command which would be interpreted by the shell as a path followed by an additional command to execute.

The fix [5] was to ensure that all paths written into the venv activation scripts are properly quoted using the shlex module. The fix uses shlex.quote to ensure that any special characters or spaces in the path are escaped or enclosed in single quotes. This prevents the shell from misinterpreting the path as separate arguments or commands.
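To make the idea concrete, here is a simplified sketch of rendering a script template with and without quoting. It is not the CPython code: the template text, the placeholder handling, and the function names render_unquoted and render_quoted are all hypothetical and only illustrate the principle.

```python
import shlex

# Hypothetical, simplified template; not the real activate script.
TEMPLATE = "VIRTUAL_ENV=__VENV_DIR__\nexport VIRTUAL_ENV\n"


def render_unquoted(venv_dir: str) -> str:
    # Vulnerable pattern: the raw path is pasted into shell source code.
    return TEMPLATE.replace("__VENV_DIR__", venv_dir)


def render_quoted(venv_dir: str) -> str:
    # Safer pattern: quote the path before it becomes shell source code.
    return TEMPLATE.replace("__VENV_DIR__", shlex.quote(venv_dir))


malicious = "/tmp/my-venv; echo INJECTED"
unsafe_script = render_unquoted(malicious)
safe_script = render_quoted(malicious)

print(unsafe_script.splitlines()[0])  # VIRTUAL_ENV=/tmp/my-venv; echo INJECTED
print(safe_script.splitlines()[0])    # VIRTUAL_ENV='/tmp/my-venv; echo INJECTED'
```

In the first rendering, sourcing the script would run the text after the semicolon as a separate command. In the second, the whole path is a single quoted word.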

Forging the Blade

The Master Blacksmith’s lesson was simple. Just as a piece of metal may appear to be a single piece of steel but splits into shards when struck by a hammer, unquoted command strings are split into different arguments or commands by the shell.

When constructing commands in Python:

  • Avoid shell=True and use subprocess.run([…]) instead

  • If you must use subprocess with shell=True, quote every argument you place in the command string (for example with shlex.quote)

  • When constructing shell commands outside of subprocess, use shlex to avoid command injection from untrusted input.
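For the last point, one option is shlex.join (Python 3.8+), which quotes each item of an argument list and joins the results into one command string. A brief sketch with an illustrative argument list:

```python
import shlex

args = ["mkdir", "my new path; echo INJECTED"]

# shlex.join quotes each item as needed, then joins them with spaces.
command = shlex.join(args)
print(command)  # mkdir 'my new path; echo INJECTED'

# Splitting the string with POSIX rules recovers the original arguments.
round_trip = shlex.split(command)
print(round_trip == args)  # True
```

This is useful when the command string is written into a script or sent to another machine rather than run through subprocess directly. The shlex quoting functions target POSIX shells.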

The blacksmith does not strike a piece of steel without thought. So too should the developer not pass a path to the shell without care.

1

A shell is a program that provides an interface between the user and the operating system (OS). It’s called a “shell” because it surrounds the kernel (the “core”) and lets you interact with it. The shell takes commands (from your keyboard, a script, or another program), interprets them, and asks the OS to run the corresponding programs or built-in functions.

2

The subprocess module is available on all platforms except mobile (i.e. Android, iOS) and webassembly (i.e. WASI).

3

POSIX is an IEEE standard (IEEE 1003) defining a common API and shell behavior for Unix-like systems. It ensures that programs and scripts written on one POSIX-compliant system will work on another.

4

IFS can be made to split on other characters by changing its value before running a command. For example: IFS=, read a b c <<< "one,two,three"; echo "$a | $b | $c" will split the comma-delimited string into the three variables a, b and c.

5

Because the issue affected all versions of Python, a patch was created for all 3.x versions. The commit diff for Python 3.12 can be found here.
