In the evolving world of data engineering, developing Python-based workloads in Snowflake (via Snowpark, Python UDFs, or Stored Procedures) has become increasingly popular. However, as pipelines become more complex, a critical question arises: How should we develop and maintain our Python code for Snowflake?
While the convenience of browser-based editors like Snowflake Workspaces is fine for quick scripts, there is a significant "Developer Experience (DevX) Gap" that emerges when you try to build production-grade Python code in a browser tab.
Why I'm writing this blog
I've seen many Data Engineers and Analytics Engineers fall into the "UI Trap" of writing complex Python logic directly in Snowflake, only to struggle with inconsistent environments, broken dependencies, and the frustration of "it works on my machine, but not on others" problems. This blog is born out of a desire to share a better way.
My goal is to encourage people to step out of the browser and into a professional local development environment. By establishing repeatable local dev environments where every developer uses the same Python version, the same dependencies, and the same tooling, we can build Python-based features that are not just functional and robust, but most importantly maintainable by others.
One way to democratize data-rich features in a product is to make code easier to develop and maintain with consistent tools. This is why we need to focus on local DevX!
What we'll cover
We will explore the merits of a local-first approach to Snowflake Python development, specifically focusing on:
- Deterministic Python versions with pyenv and .python-version
- Robust dependency management with Poetry and pyproject.toml
- Consistent tooling configured in a single file
- Simplified task execution with Poe the Poet
Python version management with pyenv
pyenv is a tool for managing multiple Python versions on macOS and Linux. It lets you install several Python versions side by side and select one per project through a .python-version file in the project root.
Why this matters for DevX: By pinning the Python version in version control, you ensure that:
- Every developer uses the same Python version for the project.
- Your CI/CD pipeline can install the exact same version.
- You avoid subtle bugs that arise from Python version differences.
- Dependencies work consistently (some packages require specific Python versions).
- Debugging is easier when issues are reproducible across all environments.
Setup:
- Install pyenv:
  brew install pyenv
- Create a .python-version file in the project root:
  3.10
- Install the Python version specified in the .python-version file:
  pyenv install
- Verify the desired Python version is installed and is set for the project:
  pyenv version
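The same check can be scripted, for example as a guard at the start of a CI job. The sketch below is only an illustration: the helper names matches_pinned_version and check_python_version are hypothetical, and it assumes the .python-version file holds a plain numeric CPython version such as 3.10 or 3.10.14 on its first line.

```python
import sys
from pathlib import Path


def matches_pinned_version(pinned: str, running: tuple[int, ...]) -> bool:
    """Return True if the running interpreter satisfies the pinned version.

    The pin is treated as a prefix, so "3.10" matches 3.10.4 and 3.10.14
    but not 3.11.0. Assumes a purely numeric pin (hypothetical helper).
    """
    pinned_parts = tuple(int(part) for part in pinned.strip().split("."))
    return running[: len(pinned_parts)] == pinned_parts


def check_python_version(project_root: Path = Path(".")) -> bool:
    """Compare the running interpreter with the project's .python-version file."""
    pinned = (project_root / ".python-version").read_text().splitlines()[0]
    return matches_pinned_version(pinned, tuple(sys.version_info[:3]))
```

Calling check_python_version() from the project root returns False when the interpreter on your PATH is not the one pinned for the project, which is usually a sign that pyenv's shims are not active in the current shell.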
Dependency management with Poetry and the pyproject.toml file
Poetry is a tool for dependency management and packaging in Python. It allows you to declare the packages your project depends on and it will manage (install/update) them for you. Poetry offers a lockfile to ensure repeatable installs, and can build your project for distribution.
It uses the pyproject.toml file (which we'll explore next) as its source of truth.
Why this matters for DevX: With pyproject.toml and Poetry, and with poetry.lock committed to version control, you've eliminated the "works on my machine, but not on others" problem at the dependency level. Every developer and every CI/CD runner installs the exact same versions of every package, every time!
Installing Poetry
- Install Poetry using Homebrew:
  brew install poetry
- Verify the installation:
  poetry --version
- Configure Poetry to create virtual environments in the project directory (recommended for better DevX). This ensures that when you run poetry install, it creates a .venv folder directly in the project, making it easy to activate and manage:
  poetry config virtualenvs.in-project true
The pyproject.toml file
The pyproject.toml file was introduced by PEP 518 and extended with standard project metadata by PEP 621. It replaces the need for multiple configuration files (setup.py, requirements.txt, setup.cfg, etc.) with one unified file. It uses the TOML (Tom's Obvious, Minimal Language) format.
Benefits:
- Single source of truth: All project configuration lives in one file.
- Version constraints: You can specify package versions according to Poetry's dependency specification and version constraints.
- Deterministic builds: Poetry generates a poetry.lock file that pins every dependency, both direct (what you specify) and transitive (dependencies of your dependencies), ensuring identical installs across environments.
- Tool configuration: You can configure multiple tools in the same file (no need for separate config files).
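To make the version constraints concrete, here is a minimal sketch of how a caret constraint such as ^23.0.0 expands into a version range under Poetry's documented caret rules. The function caret_range is a hypothetical helper written for this illustration, not part of Poetry, and it only handles full MAJOR.MINOR.PATCH constraints.

```python
def caret_range(constraint: str) -> tuple[str, str]:
    """Expand a caret constraint into (inclusive lower bound, exclusive upper bound).

    Hypothetical helper for illustration; only handles "^MAJOR.MINOR.PATCH".
    """
    if not constraint.startswith("^"):
        raise ValueError(f"not a caret constraint: {constraint!r}")
    parts = constraint[1:].split(".")
    if len(parts) != 3:
        raise ValueError("this sketch only handles MAJOR.MINOR.PATCH")
    major, minor, patch = (int(part) for part in parts)

    # The upper bound bumps the left-most non-zero component.
    if major > 0:
        upper = (major + 1, 0, 0)
    elif minor > 0:
        upper = (0, minor + 1, 0)
    else:
        upper = (0, 0, patch + 1)

    lower = f"{major}.{minor}.{patch}"
    return lower, ".".join(str(number) for number in upper)


for spec in ("^23.0.0", "^0.27.0"):
    low, high = caret_range(spec)
    print(f"{spec} allows >={low},<{high}")
# ^23.0.0 allows >=23.0.0,<24.0.0
# ^0.27.0 allows >=0.27.0,<0.28.0
```

Note how a 0.x constraint is much tighter than a 1.x or later one: ^0.27.0 only admits 0.27.x releases. An exact pin such as "1.33.0" admits only that one version.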
Example pyproject.toml file:
[project]
name = "PROJECT_NAME"
version = "0.1.0"
description = "PROJECT_DESCRIPTION"
authors = [
{name = "YOUR_NAME",email = "youremail@domain.com"}
]
readme = "README.md"
# Production dependencies that your code needs to run.
[tool.poetry.dependencies]
python = ">=3.10,<3.11"
snowflake-snowpark-python = "1.33.0" # Snowflake Snowpark Python library
pydantic = "2.11.7" # Data validation library in Python
# Development-only tools that aren't needed in production.
[tool.poetry.group.dev.dependencies]
black = "^23.0.0" # Code formatter
pylint = "^3.0.0" # Linter for code quality
isort = "^5.13.2" # Import statement organiser
poethepoet = "^0.27.0" # Task runner for simplifying development tasks
pytest = "^8.1.2" # Testing framework for Python
pytest-xdist = "^3.0.0" # Run tests in parallel for faster execution
pytest-cov = "^5.0.0" # Generate code coverage reports
[build-system]
requires = ["poetry-core>=2.0.0,<3.0.0"]
build-backend = "poetry.core.masonry.api"
# Configure all your tools
[tool.black]
line-length = 120
target-version = ['py310']
[tool.isort]
profile = "black"
multi_line_output = 3
# Pylint reads its options from sub-tables of [tool.pylint]
[tool.pylint.format]
max-line-length = 120
[tool.pylint.main]
fail-under = 9.5
# Configure tasks for Poe the Poet
[tool.poe.tasks]
# Private tasks (prefixed with _ to hide from the help menu)
_format_black = "black ."
_format_isort = "isort ."
_pylint = "pylint src/"
# Public tasks that compose the individual tools
format = ["_format_black", "_format_isort"]
lint = ["format", "_pylint"]
test = "pytest --cov -vv"
Installing dependencies
Once your pyproject.toml is set up, installing all dependencies (including dev dependencies) is a single command, poetry install. It will:
- Create a virtual environment (in .venv if you configured Poetry to do so).
- Install all dependencies (including dev dependencies).
- Generate poetry.lock if it doesn't exist yet; if it already exists, install exactly the versions it pins, which is what makes installs reproducible across environments.
For new projects where you haven't written code yet, you'll need to use the --no-root flag:
poetry install --no-root
Why --no-root is needed initially:
When you first create a project manually or with poetry init, Poetry assumes you're building a package. If you run poetry install without any code, you'll get an error:
Installing the current project: example-project (0.1.0)
Error: The current project could not be installed: No file/folder found for package example-project
The --no-root flag tells Poetry to skip installing your project as a package and only install the dependencies you've specified.
When you won't need --no-root:
Once you've written code and added a packages section to your pyproject.toml file like the example below, you can use the standard poetry install command (without --no-root):
[tool.poetry]
packages = [{include = "<YOUR_PACKAGE_NAME>", from = "src"}]
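As a rough illustration of what "written code" might look like in this layout, the sketch below shows one pure-Python function and a matching pytest test. Everything here is hypothetical: the file names, the function normalize_column_name, and its behaviour are placeholders rather than part of any Snowflake or Poetry API. In a real project the two parts would live in separate files and the test would import the function from your package.

```python
# Hypothetical contents of src/<YOUR_PACKAGE_NAME>/transforms.py
import re


def normalize_column_name(name: str) -> str:
    """Turn a free-form label into an upper-case, underscore-separated identifier."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    return cleaned.strip("_").upper()


# Hypothetical contents of tests/test_transforms.py
# (in a real project, import normalize_column_name from your package here)
def test_normalize_column_name():
    assert normalize_column_name(" Order ID ") == "ORDER_ID"
    assert normalize_column_name("unit-price (usd)") == "UNIT_PRICE_USD"
```

Keeping logic like this in plain functions means it can be unit-tested locally with pytest, without a Snowflake connection.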
Configuring VS Code (optional):
To use the project's virtual environment in VS Code / Cursor for IntelliSense, debugging, and running code in the IDE:
- Press Cmd+Shift+P (or Ctrl+Shift+P on Windows/Linux)
- Type "Python: Select Interpreter"
- Select "Enter interpreter path"
- Enter the path to your project's virtual environment: ./<PROJECT_ROOT>/.venv/bin/python (adjust the path to match your project location)
- VS Code will now use the same Python environment as Poetry, giving you access to all installed packages and proper code completion.
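If you are unsure whether the IDE (or a terminal) really picked up the project's environment, a quick check from Python can help. This is a minimal sketch that assumes the in-project .venv location configured earlier; uses_project_venv is a hypothetical helper name.

```python
import sys
from pathlib import Path


def uses_project_venv(project_root: Path, interpreter_prefix: str = sys.prefix) -> bool:
    """Return True if the interpreter's prefix is the project's .venv directory.

    Inside a virtual environment, sys.prefix points at the environment folder.
    Hypothetical helper; assumes the environment lives at <project_root>/.venv.
    """
    expected = (project_root / ".venv").resolve()
    return Path(interpreter_prefix).resolve() == expected
```

Running uses_project_venv(Path.cwd()) in the IDE's Python terminal from the project root should return True once the interpreter is selected correctly.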
Poe the Poet: Simplifying development tasks
Poe the Poet is a task runner that lets you define common development commands in your pyproject.toml file. Instead of remembering long commands like poetry run black . && poetry run isort . && poetry run pylint src/, you can create a simple alias and run poetry run poe lint. See the [tool.poe.tasks] section in the example pyproject.toml file above for the configuration.
Benefits:
- Consistency: Everyone on your team uses the same commands
- Simplicity: poe lint instead of remembering multiple flags
- Composability: Chain tasks together (e.g., lint runs format then pylint)
- Documentation: Tasks are self-documenting in pyproject.toml
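To picture what a composed task such as lint does, here is a minimal sketch of sequential execution: run each command in order and stop at the first failure. This is not how Poe the Poet is implemented; run_sequence is a hypothetical helper that only mirrors the default behaviour of a sequence task.

```python
import subprocess
import sys


def run_sequence(commands: list[list[str]]) -> int:
    """Run commands in order and return the first non-zero exit code, or 0."""
    for command in commands:
        result = subprocess.run(command, check=False)
        if result.returncode != 0:
            # Stop here so later steps never run against a failed earlier step.
            return result.returncode
    return 0


# Two trivial commands that both succeed, using the current interpreter.
ok = [sys.executable, "-c", "pass"]
print(run_sequence([ok, ok]))
```

Under this sketch, the lint task from the example configuration corresponds roughly to run_sequence([["black", "."], ["isort", "."], ["pylint", "src/"]]), so pylint only runs if both formatters succeed.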
Wrapping Up
You now have a solid foundation for local Snowflake Python development:
- ✅ Deterministic Python versions with pyenv and .python-version
- ✅ Robust dependency management with Poetry and pyproject.toml
- ✅ Consistent tooling configured in a single file
- ✅ Simplified task execution with Poe the Poet
This setup eliminates the "it works on my machine, but not on others" problem at its source. Every developer on your team will have the exact same environment, the same dependencies, and the same tooling automatically.
The DevX payoff: By investing in these foundations, you're not just setting up tools, you're creating an environment where Data Engineers can focus on building features instead of fighting with configuration. This is how we democratize data development.
I hope you find this guide helpful. If you have questions or feedback, I'd love to hear from you!