Rachit Avasthi
Fixing PySpark on Windows: Downgrading from Python 3.13 to 3.11 (Complete Guide)

If you’re trying to run PySpark on Windows with Python 3.13, you’ll quickly run into errors like:

AttributeError: module 'socketserver' has no attribute 'UnixStreamServer'


This can be frustrating, especially when your code is perfectly fine.

In this post, I’ll walk you through a complete, working setup for PySpark on Windows by:

  • Installing Python 3.11 alongside Python 3.13
  • Creating a clean virtual environment
  • Installing a compatible PySpark version
  • Optionally fixing Windows-specific Spark warnings using winutils.exe

This setup is stable, beginner-friendly, and recommended for learning and local development.


Why PySpark Fails with Python 3.13

The problem isn't your code; it's compatibility.

  • PySpark does not yet support Python 3.13
  • PySpark 4.x has known issues on Windows
  • Some internal APIs were removed in Python 3.13 that PySpark still relies on

✅ The correct combination on Windows is:

  • Python 3.11
  • PySpark 3.5.x
  • Java 8 or 11

Step 1: Install Python 3.11 (Side-by-Side)

Do not uninstall Python 3.13. Instead, install Python 3.11 alongside it.

  1. Download Python 3.11 (64-bit):

    https://www.python.org/downloads/release/python-3119/

  2. Run the installer:

    • Check “Add Python to PATH”
    • Click Customize installation
    • Enable “Install for all users”
  3. Finish the installation

Verify it worked:

py -3.11 --version


You should see:

Python 3.11.x


Step 2: Allow Virtual Environment Activation in PowerShell

By default, Windows blocks script execution, which prevents virtual environments from activating.

Run this once:

Set-ExecutionPolicy RemoteSigned -Scope CurrentUser


Press Y to confirm.

This change is safe and only applies to your user account.


Step 3: Create a Python 3.11 Virtual Environment

Navigate to your project directory:

cd C:\Users\User\Desktop\Training\Week5


Remove any old virtual environment:

Remove-Item -Recurse -Force venv


Create a new one using Python 3.11:

py -3.11 -m venv venv


Activate it:

venv\Scripts\activate


You should now see:

(venv)


Confirm the Python version:

python --version


Expected output:

Python 3.11.x

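You can also confirm from inside Python that the venv is active: in a virtual environment, `sys.prefix` points at the venv directory while `sys.base_prefix` points at the base installation (this is the standard detection method described in the `venv` docs). A small sketch:

```python
import sys

def in_venv(prefix: str = sys.prefix, base_prefix: str = sys.base_prefix) -> bool:
    """A virtual environment is active exactly when the two prefixes differ."""
    return prefix != base_prefix

print("venv active:", in_venv())
print("interpreter:", sys.executable)
```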

Step 4: Install the Correct PySpark Version

Do not install the latest PySpark blindly.

❌ Avoid

pip install pyspark


✅ Install the Windows-safe version

pip install pyspark==3.5.1


Verify:

pip show pyspark


You should see:

Version: 3.5.1

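If you want a script to fail fast when the wrong PySpark build is present, you can check the installed version programmatically with the standard library's `importlib.metadata`. A minimal sketch; the helper name is invented for this post:

```python
from importlib.metadata import PackageNotFoundError, version

def is_exact_version(package: str, expected: str) -> bool:
    """Return True only if `package` is installed at exactly `expected`."""
    try:
        return version(package) == expected
    except PackageNotFoundError:
        # Not installed at all also counts as "wrong version".
        return False

print("pyspark pinned to 3.5.1:", is_exact_version("pyspark", "3.5.1"))
```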

Step 5: Test Your Spark Setup

Create a file called Lab1.py:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("Test").getOrCreate()

df = spark.range(10)
df.show()

spark.stop()


Run it:

python Lab1.py


If you see numbers from 0 to 9, Spark is running successfully 🎉


Common Mistakes to Avoid

  • Running Python explicitly from 3.13:

    C:\...\Python313\python.exe Lab1.py
    
    
  • Installing PySpark 4.x on Windows

  • Using Python 3.12 or newer with Spark

  • Forgetting to activate the virtual environment

Golden Rule

When (venv) is active, always use python, never a full Python path.
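One way to enforce the golden rule in code is to check which interpreter is actually running a script. The sketch below assumes the venv folder is named venv, as in this guide; it uses `PureWindowsPath` so the path splitting matches Windows conventions:

```python
import sys
from pathlib import PureWindowsPath

def uses_project_venv(executable: str) -> bool:
    """True if the interpreter path contains a directory named 'venv'."""
    return "venv" in PureWindowsPath(executable).parts

# sys.executable is the interpreter actually running this script.
if not uses_project_venv(sys.executable):
    print("Warning: not running from the venv interpreter:", sys.executable)
```

This would have caught the first mistake above, where Lab1.py was launched with the Python 3.13 executable directly.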


Optional: Fix winutils.exe Warnings on Windows

You may see warnings like:

Did not find winutils.exe
HADOOP_HOME and hadoop.home.dir are unset


Spark works fine without winutils, but adding it removes these warnings.


Which winutils Version to Use

  • Hadoop version: 3.3.6
  • Compatible with Spark 3.5.x

Setup winutils

  1. Create this folder:

     C:\hadoop\bin\

  2. Place winutils.exe inside:

     C:\hadoop\bin\winutils.exe

  3. Set the environment variables. Run these in Command Prompt (cmd), not PowerShell, since PowerShell does not expand %PATH%:

     setx HADOOP_HOME C:\hadoop
     setx PATH "%PATH%;C:\hadoop\bin"

  4. Restart your terminal and verify:

     winutils.exe

If usage info prints, it’s working.
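You can also sanity-check the layout from Python before launching Spark. The path convention here follows the warning message above (Spark looks under HADOOP_HOME\bin); the helper is illustrative, not a Spark API:

```python
import os
from pathlib import PureWindowsPath

def expected_winutils(hadoop_home: str) -> str:
    """Path where Spark expects winutils.exe, given HADOOP_HOME."""
    return str(PureWindowsPath(hadoop_home) / "bin" / "winutils.exe")

# Fall back to this guide's conventional location if the variable is unset.
home = os.environ.get("HADOOP_HOME", r"C:\hadoop")
print("Expecting winutils at:", expected_winutils(home))
```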


Final Working Setup

Component            Version
-------------------  -------
Python               3.11.x
PySpark              3.5.1
Hadoop (winutils)    3.3.6
OS                   Windows
Virtual Environment  Enabled

Conclusion

Setting up PySpark on Windows requires careful version alignment, but once configured correctly, it works reliably.

By:

  • Keeping Python 3.13 installed
  • Using Python 3.11 in a virtual environment
  • Pinning PySpark to 3.5.1
  • Optionally configuring winutils

you now have a stable Spark development environment on Windows.

Happy coding 🚀

