Arunkumar Panneerselvam
Automating Your Local DBT & Snowflake Playground with Python

> Harnessing Python to reverse-engineer Snowflake metadata as dbt sources — an easy, efficient playground setup

Introduction

Everyone learns differently. Whether it's diving into textbooks, absorbing visual content, or hands-on tinkering, the path to mastery varies widely. Personally, I find the best way is by doing: building, automating, debugging, and iterating until something clicks.

Recently, I explored how to automatically generate a full dbt project structure from a Snowflake trial account, taking away the toil of manual YAML and SQL file creation. If you’ve dabbled in dbt or Snowflake, you know setting up sources can be time-consuming, especially across schemas and tables.

This guide, however, is not a beginner’s tutorial on Snowflake or dbt basics. I assume you’ve got some familiarity already. Instead, it focuses on streamlining setup using Python scripts and Visual Studio Code (or any coding environment you prefer), enabling rapid experimentation and transformation.

Why Automate Source Generation?

Manual source definition is repetitive and error-prone. When working with complex databases, initial setup becomes a bottleneck.

Using a Python script to introspect Snowflake metadata and generate compliant dbt source files has clear benefits:

  • Saves hours of setup time
  • Avoids human mistakes in YAML/SQL
  • Scales effortlessly as schemas or tables grow
  • Sets a foundation for continuous and reproducible transformations

Setting Up Your Development Environment

You’ll need a few tools ready:

  • Python 3.13+: download and install
  • Pip: the Python package manager; install if not available
  • Virtualenv: to isolate your dependencies

Step 1 — Install Python

Download Python from python.org and install it first. I recommend installing the latest version of Python 3.

You can verify that Python has been installed correctly by running:

python --version

You should receive something like:

Python 3.13.1

Step 2 — Install pip

Download the get-pip.py bootstrap script:

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py

Then install it:

python get-pip.py

You should see something similar to this:

PS C:\Users\arun> curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
PS C:\Users\arun> python get-pip.py
Collecting pip
  Using cached pip-25.2-py3-none-any.whl.metadata (4.7 kB)
Using cached pip-25.2-py3-none-any.whl (1.8 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 25.2
    Uninstalling pip-25.2:
      Successfully uninstalled pip-25.2
Successfully installed pip-25.2

Verify the installation:

python -m pip --version

You should get something like:

PS C:\Users\arun> python -m pip --version
pip 25.2 from C:\Users\arun\AppData\Local\Programs\Python\Python313\Lib\site-packages\pip (python 3.13)
Enter fullscreen mode Exit fullscreen mode

If you get errors when running pip, it might be because the environment variables are not set up properly; make sure Python's Scripts directory is included in your PATH.

Step 3 — Install virtualenv

Virtualenv is a tool in Python that allows you to create isolated environments where you can install packages and dependencies without affecting the global Python installation. It helps manage project-specific dependencies and avoids conflicts between different projects.

You can run the whole project without virtualenv, but I prefer isolating projects in case I need different Python or library versions.

python -m pip install --user virtualenv


Step 4 — Create the Virtual Environment

Now we are ready to create our virtual environment: first we create it, then we activate it. Go to the directory where you want to create this project and run:

python -m venv dbt-env

This will create the environment in a dbt-env subdirectory of that directory.

Check the Python venv user guide for details.

Step 5 — Activate the Virtual Environment

Once the virtual environment is created, you will need to activate it. In PowerShell, from the directory where you created the environment, activate it like this:

.\dbt-env\Scripts\Activate.ps1

Once activated, the environment name, (dbt-env), is shown in green at the left of your prompt.

When you want to operate within this project, ensure that the dbt-env environment is activated, as shown above. Additionally, all subsequent installations and operations must occur within this directory or its subdirectories.
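
If you are ever unsure whether the environment is active, you can ask Python which interpreter is running (a quick check, assuming the dbt-env name used above):

# Run inside the activated environment; the printed path should point
# into dbt-env (e.g. ...\dbt-env\Scripts\python.exe on Windows).
import sys
print(sys.executable)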

Installing dbt

Installing dbt with pip couldn't be simpler. Follow the instructions in the dbt documentation; I am summarizing them below for Snowflake specifically.

Install dbt-core:

python -m pip install dbt-core

Then install the Snowflake adapter:

python -m pip install dbt-snowflake

You can list the libraries and their versions installed in your environment:

pip list

It should show your libraries, including dbt-core and dbt-snowflake (note: I may have more libraries than you, as I made wider installations).

Snowflake Instance

Creating a Trial Snowflake Account

If you don’t have access to an Enterprise Snowflake account, the easiest and cheapest way to complete this step is with a trial account. We will also be generating the data sources from the trial account's objects.

Complete the Snowflake Signup

Then select the edition and the cloud provider; I selected Microsoft Azure. You will be asked for the purpose of the trial account, and that's it: your trial account is created!

You will receive an email to activate your Snowflake account and the link to your instance. Click on that link and activate it. Upon signup, note your account identifier, username, password, default warehouse, and database details. These are essential for connecting from your scripts and dbt profiles.
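
Before wiring anything into dbt, it can be worth confirming those credentials from Python. Below is a minimal connectivity check (an illustrative sketch, not part of the original setup; it assumes you have run pip install snowflake-connector-python):

# connect_test.py - quick Snowflake connectivity check (illustrative)
import snowflake.connector

# Fill in the values you noted at signup; the account identifier format
# (<organization>-<account_name>) is explained in the next section.
conn = snowflake.connector.connect(
    account="<organization>-<account_name>",
    user="<username>",
    password="<password>",
)
cur = conn.cursor()
cur.execute("select current_version()")
print("Connected. Snowflake version:", cur.fetchone()[0])
conn.close()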

Initialize Your dbt Project

With the environment active, start your dbt project. Run the dbt init command to initialize the project as shown below. Make sure the environment has been activated as shown in Step 5, “Activate the Virtual Environment”.

dbt init snowflake_dbt_main

During setup, choose Snowflake, input your credentials, and specify the default schema, warehouse, etc.

(dbt-main) PS D:\dbt\dbt-main> dbt init snowflake_dbt_main
00:11:48  Running with dbt=1.10.11
00:11:48  
Your new dbt project "snowflake_dbt_main" was created!

For more information on how to configure the profiles.yml file,
please consult the dbt documentation here:

  https://docs.getdbt.com/docs/configure-your-profile

One more thing:

Need help? Don't hesitate to reach out to us via GitHub issues or on Slack:

  https://community.getdbt.com/

Happy modeling!

00:11:48  Setting up your profile.
Which database would you like to use?
[1] snowflake

(Don't see the one you want? https://docs.getdbt.com/docs/available-adapters)

Enter a number: 1
account (https://<this_value>.snowflakecomputing.com): xxxxxxx
user (dev username): xxxxxxxx
[1] password
[2] keypair
[3] sso
Desired authentication type option (enter a number): 1
password (dev password):
role (dev role): ACCOUNTADMIN
warehouse (warehouse name): COMPUTE_WH
database (default database that dbt will build objects in): SNOWFLAKE_SAMPLE_DATA
schema (default schema that dbt will build objects in): TPCDS_SF100TCL
threads (1 or more) [1]:
00:15:27  Profile snowflake_dbt_main written to C:\Users\arun\.dbt\profiles.yml using target's profile_template.yml and your supplied values. Run 'dbt debug' to validate the connection.
(dbt-main) PS D:\dbt\dbt-main> 

Provide a valid account and username. The account identifier has two segments, <organization>-<account_name>. You might get something like ED25756-YUNGSKL, where the organization is ED25756 and the account name is YUNGSKL. You can also find this in the Snowflake account URL, or under profile > Account > Account details.

The project is now initialized, and you will see the standard dbt directories (models, analyses, macros, seeds, snapshots, tests) inside snowflake_dbt_main.

Snowflake Data Sources

To create the data sources from our newly created trial Snowflake account, the Python script below does exactly that: it connects to your Snowflake account, pulls the tables and views from a specific warehouse, database, and schema, and creates the dbt objects.

The Python Automation Script
The heart of this project — a Python script — connects to Snowflake, extracts metadata on tables, views, and columns, then auto-generates:

  • YAML files describing data sources
  • SQL files querying these sources with a standard wrapper

This eliminates manual YAML/SQL creation.

The script code is available for your use; just modify it to match your Snowflake hierarchy.

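The full script is not reproduced here, but the sketch below shows its overall shape: query INFORMATION_SCHEMA for table, view, and column metadata, then write one source YAML file and one SQL wrapper per object. Treat it as a minimal reconstruction; the exact queries, templates, and helper logic in the real script may differ, and it assumes snowflake-connector-python and pandas are installed:

# data_generate.py - a minimal sketch of the generator (illustrative
# reconstruction; the real script's queries and templates may differ).
import os

import pandas as pd
import snowflake.connector

DATABASE = "SNOWFLAKE_SAMPLE_DATA"
SCHEMAS = ["TPCH_SF100", "TPCH_SF1000"]
WAREHOUSE = "COMPUTE_WH"

models_path = os.environ["DBT_MODELS_PATH"]
print(f"Models directory: {models_path}")

# Connect using the environment variables set earlier.
conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse=WAREHOUSE,
    database=DATABASE,
)
print("Connected to Snowflake successfully.")

# Pull table, view, and column metadata from INFORMATION_SCHEMA.
cur = conn.cursor()
cur.execute("""
    select c.table_schema, c.table_name, t.table_type,
           c.column_name, c.data_type,
           case when c.is_nullable = 'NO' then 'not null' else 'null' end
    from information_schema.columns c
    join information_schema.tables t
      on t.table_schema = c.table_schema and t.table_name = c.table_name
    order by c.table_schema, c.table_name, c.ordinal_position
""")
meta = pd.DataFrame(cur.fetchall(), columns=[
    "schema", "object_name", "object_type",
    "column_name", "data_type", "mandatory",
])
print("Sample metadata:")
print(meta.head())

for schema in SCHEMAS:
    out_dir = os.path.join(models_path, f"{schema}_src")
    print(f"Creating directory: {out_dir}")
    os.makedirs(out_dir, exist_ok=True)

    for obj_name, obj in meta[meta["schema"] == schema].groupby("object_name"):
        obj_type = obj["object_type"].iloc[0]
        print(f"Processing object: {obj_name} with type {obj_type}")

        # One dbt source definition (YAML) per object.
        cols = "\n".join(
            f"          - name: {r.column_name}\n            data_type: {r.data_type}"
            for r in obj.itertuples()
        )
        yml = (
            "version: 2\n"
            "sources:\n"
            f"  - name: {schema}\n"
            f"    database: {DATABASE}\n"
            f"    schema: {schema}\n"
            "    tables:\n"
            f"      - name: {obj_name}\n"
            "        columns:\n"
            f"{cols}\n"
        )
        yml_path = os.path.join(out_dir, f"{obj_name}.yml")
        with open(yml_path, "w") as f:
            f.write(yml)
        print(f"Created YAML file: {yml_path}")

        # One SQL model wrapping the source with a standard select.
        sql_path = os.path.join(out_dir, f"{obj_name}.sql")
        with open(sql_path, "w") as f:
            f.write(f"select * from {{{{ source('{schema}', '{obj_name}') }}}}\n")
        print(f"Created SQL file: {sql_path}")

conn.close()

One caveat if you extend this: dbt requires model names to be unique across a project, so when generating multiple schemas you may want to prefix the generated file names with the schema name.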

To run this script, you will need the following information:

  • Snowflake account identifier, user, and password. These can be fetched from the earlier setup steps, or from your Snowflake account.
  • dbt models path. You can get this by right-clicking the models folder in the IDE (Visual Studio Code) and copying its path.

You can set the environment variables before execution by adding the names and values under Windows' "Environment Variables" settings.

Create the following environment variables:

DBT_MODELS_PATH
SNOWFLAKE_ACCOUNT
SNOWFLAKE_USER
SNOWFLAKE_PASSWORD

with the appropriate values as they are needed for the script to execute.

or

Ensure you have environment variables configured:

If you are using PowerShell:

$env:DBT_MODELS_PATH = "D:\dbt\dbt-main\snowflake_dbt_main\models"
$env:SNOWFLAKE_USER = "your_user"
$env:SNOWFLAKE_PASSWORD = "your_password"
$env:SNOWFLAKE_ACCOUNT = "your_account"

If on Linux:

export SNOWFLAKE_USER=your_user
export SNOWFLAKE_PASSWORD=your_password
export SNOWFLAKE_ACCOUNT=your_account
export DBT_MODELS_PATH=/full/path/to/your/dbt/models
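
Before running the script, you can quickly confirm Python can see all four variables (a hypothetical helper, not part of the original script):

# check_env.py - sanity-check the required environment variables (illustrative)
import os

REQUIRED = ["DBT_MODELS_PATH", "SNOWFLAKE_ACCOUNT", "SNOWFLAKE_USER", "SNOWFLAKE_PASSWORD"]
missing = [name for name in REQUIRED if not os.environ.get(name)]
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All required variables are set.")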

Running the Automation

Once you run it, since your account is a trial and already contains sample data, the script will automatically create the dbt source definitions and models for all the tables and views in the SNOWFLAKE_SAMPLE_DATA database for the schemas below:

['TPCH_SF100', 'TPCH_SF1000']

It will use the default warehouse COMPUTE_WH. Look for creation messages in the terminal and verify files under the models directory.

python .\data_generate.py

The output should be similar to:

PS D:\dbt> python .\data_generate.py
Models directory: D:\dbt\dbt-main\snowflake_dbt_main\models
Connected to Snowflake successfully.
Sample metadata:
               schema                 object_name object_type   column_name data_type mandatory
0  INFORMATION_SCHEMA            APPLICABLE_ROLES        VIEW       GRANTEE   varchar  not null
1  INFORMATION_SCHEMA            APPLICABLE_ROLES        VIEW     ROLE_NAME   varchar  not null
2  INFORMATION_SCHEMA            APPLICABLE_ROLES        VIEW    ROLE_OWNER   varchar  not null
3  INFORMATION_SCHEMA            APPLICABLE_ROLES        VIEW  IS_GRANTABLE   varchar      null
4  INFORMATION_SCHEMA  APPLICATION_SPECIFICATIONS        VIEW         LABEL   varchar  not null
Creating directory: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src
Creating directory: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src
Directory creation completed with code: 0
Starting to create YAML files for schema TPCH_SF100...
Generating YAML files in: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src
Processing object: CUSTOMER with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\CUSTOMER.yml
Processing object: LINEITEM with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\LINEITEM.yml
Processing object: NATION with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\NATION.yml
Processing object: ORDERS with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\ORDERS.yml
Processing object: PART with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\PART.yml
Processing object: PARTSUPP with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\PARTSUPP.yml
Processing object: REGION with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\REGION.yml
Processing object: SUPPLIER with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\SUPPLIER.yml
Starting to create YAML files for schema TPCH_SF1000...
Generating YAML files in: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src
Processing object: CUSTOMER with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\CUSTOMER.yml
Processing object: LINEITEM with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\LINEITEM.yml
Processing object: NATION with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\NATION.yml
Processing object: ORDERS with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\ORDERS.yml
Processing object: PART with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\PART.yml
Processing object: PARTSUPP with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\PARTSUPP.yml
Processing object: REGION with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\REGION.yml
Processing object: SUPPLIER with type BASE TABLE
Created YAML file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\SUPPLIER.yml
Starting to create SQL models for schema TPCH_SF100...
Generating SQL files in: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src
Processing SQL model: CUSTOMER (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\CUSTOMER.sql
Processing SQL model: LINEITEM (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\LINEITEM.sql
Processing SQL model: NATION (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\NATION.sql
Processing SQL model: ORDERS (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\ORDERS.sql
Processing SQL model: PART (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\PART.sql
Processing SQL model: PARTSUPP (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\PARTSUPP.sql
Processing SQL model: REGION (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\REGION.sql
Processing SQL model: SUPPLIER (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF100_src\SUPPLIER.sql
Starting to create SQL models for schema TPCH_SF1000...
Generating SQL files in: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src
Processing SQL model: CUSTOMER (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\CUSTOMER.sql
Processing SQL model: LINEITEM (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\LINEITEM.sql
Processing SQL model: NATION (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\NATION.sql
Processing SQL model: ORDERS (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\ORDERS.sql
Processing SQL model: PART (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\PART.sql
Processing SQL model: PARTSUPP (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\PARTSUPP.sql
Processing SQL model: REGION (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\REGION.sql
Processing SQL model: SUPPLIER (type BASE TABLE)
Created SQL file: D:\dbt\dbt-main\snowflake_dbt_main\models\TPCH_SF1000_src\SUPPLIER.sql
PS D:\dbt>

You should now be able to see the generated models in your project's models directory.

Start Transforming!

With your sources in place, use dbt commands:

dbt run
dbt test

Enjoy a frictionless start on your Snowflake + dbt transformations.

Conclusion

Setting up a dbt environment connected to Snowflake no longer needs to feel tedious or intimidating. With a bit of scripting magic, you can automate the heavy lifting and focus on what truly matters — crafting data transformations and insights.

Experiment, extend, and share your improvements!

🚀 Enjoyed this guide? Give it a clap, share your feedback in the comments, or ask questions below!

💬 Looking forward to your insights and stories — let’s grow together in the data engineering world!
