Your Go‑To Guide for Snowflake Architecture, Security, and Maturity Basics
Photo by Scott Blake on Unsplash
tl;dr
In this first post of a series focused on architecture, I start with what I am most often asked to do: evaluate Snowflake architectures or build them from scratch. This article provides a checklist for evaluating and setting up your first secure Snowflake foundation. Like any architecture, it can be adapted, or parts of it skipped, depending on your compliance level and needs.
The Foundational Checklist
I. Account & Initial Security Configuration
Enable Multi-Factor Authentication (MFA) for All Users:
- Snowflake Feature: Integrated MFA support is enabled by default; DO NOT DISABLE it. Passkeys are now supported as well. A sketch of enforcing enrollment via an authentication policy follows below.
- Why: Critical first line of defense against compromised credentials. Enforce for all users, especially administrators.
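As a hedged sketch (object names are illustrative, and parameter availability may vary by release, so verify against current Snowflake documentation), MFA enrollment can be enforced account-wide through an authentication policy along these lines:
-- Sketch: require MFA enrollment via an authentication policy (names are illustrative)
CREATE AUTHENTICATION POLICY security.policies.require_mfa
  MFA_ENROLLMENT = REQUIRED
  COMMENT = 'Require MFA enrollment for all users';

ALTER ACCOUNT SET AUTHENTICATION POLICY security.policies.require_mfa;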
Set Up Authentication Policies (e.g., for Federated Authentication/SSO):
- Snowflake Feature: Support for SAML 2.0 for SSO.
- Why: Centralizes identity management and enhances user experience while enforcing organizational authentication standards.
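A minimal sketch of a SAML 2.0 security integration, assuming a generic IdP (the issuer, SSO URL, and certificate values are placeholders for your provider):
-- Sketch: SAML 2.0 SSO integration (values are placeholders for your IdP)
CREATE SECURITY INTEGRATION corp_sso
  TYPE = SAML2
  ENABLED = TRUE
  SAML2_ISSUER = 'https://idp.example.com/issuer'
  SAML2_SSO_URL = 'https://idp.example.com/sso/saml'
  SAML2_PROVIDER = 'CUSTOM'
  SAML2_X509_CERT = '<base64-encoded IdP certificate>';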
Define and Implement Strong Password Policies:
- Snowflake Feature: Account-level password policies (minimum length of 12 characters, complexity, history); see the sketch below. Better still, delegate authentication to your SSO service where possible. Password expiry is often enforced, but I am not a fan of it: it is better to actively monitor failed attempts and force password resets when needed, because expiries push users into bad habits.
- Why: Reduces the risk of brute-force attacks and weak password usage.
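A minimal sketch of an account-level password policy (the policy name and schema are illustrative; tune the values to your own standards):
-- Sketch: account-level password policy (names are illustrative)
CREATE PASSWORD POLICY security.policies.account_password_policy
  PASSWORD_MIN_LENGTH = 12
  PASSWORD_MIN_UPPER_CASE_CHARS = 1
  PASSWORD_MIN_LOWER_CASE_CHARS = 1
  PASSWORD_MIN_NUMERIC_CHARS = 1
  PASSWORD_MIN_SPECIAL_CHARS = 1
  PASSWORD_HISTORY = 10
  PASSWORD_MAX_RETRIES = 5
  PASSWORD_LOCKOUT_TIME_MINS = 30;

ALTER ACCOUNT SET PASSWORD POLICY security.policies.account_password_policy;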
Regularly Review User Accounts & Deactivate Dormant Ones:
- Snowflake Practice: Query the SNOWFLAKE.ACCOUNT_USAGE.USERS view on a quarterly basis. I typically create calendar reminders for the appropriate people to do this and document the outcome in a short ticket or document.
-- Query to identify Snowflake users who have not logged in within the last 90 days
SELECT
    name AS username,
    created_on,
    last_success_login,
    IFF(last_success_login IS NULL, 'NEVER', TO_VARCHAR(last_success_login)) AS last_login_display,
    disabled,
    deleted_on
FROM
    SNOWFLAKE.ACCOUNT_USAGE.USERS
WHERE
    deleted_on IS NULL  -- exclude users that have already been dropped
    AND (last_success_login IS NULL OR last_success_login < DATEADD(day, -90, CURRENT_DATE()))
ORDER BY
    created_on DESC;
- Why: Minimizes attack surface by removing unnecessary access points.
Limit the Number of ACCOUNTADMIN Users:
- Snowflake Practice: Grant ACCOUNTADMIN role sparingly. Use other system-defined roles (SYSADMIN, SECURITYADMIN, USERADMIN) for delegated administration.
- Why: Principle of least privilege; reduces the impact of a compromised high-privilege account.
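A quick way to review who holds the role; the second statement is a sketch against ACCOUNT_USAGE, which has some ingestion latency:
-- Who currently has ACCOUNTADMIN?
SHOW GRANTS OF ROLE ACCOUNTADMIN;

-- Sketch: same check via ACCOUNT_USAGE (subject to latency)
SELECT grantee_name, granted_to, granted_by, created_on
FROM SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_USERS
WHERE role = 'ACCOUNTADMIN' AND deleted_on IS NULL;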
Configure Session Policies:
- Snowflake Feature: Control idle session timeouts and client session policies.
- Recommended:
SESSION_IDLE_TIMEOUT_MINS = 30
SESSION_UI_IDLE_TIMEOUT_MINS = 30
ALLOWED_SECONDARY_ROLES = ('SYSADMIN', 'ENV_ADMIN')
- Why: Reduces the risk of unauthorized access from unattended sessions. Notice that I do not recommend ALL for ALLOWED_SECONDARY_ROLES; although Snowflake does not provide explicit guidance here, I feel secondary roles should be managed regardless. A minimal session policy sketch follows below.
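A minimal session policy sketch covering the two timeouts above (object names are illustrative; where the secondary-role restriction is configured depends on your edition and release, so check current documentation for that part):
-- Sketch: session policy with 30-minute idle timeouts (names are illustrative)
CREATE SESSION POLICY security.policies.default_session_policy
  SESSION_IDLE_TIMEOUT_MINS = 30
  SESSION_UI_IDLE_TIMEOUT_MINS = 30;

ALTER ACCOUNT SET SESSION POLICY security.policies.default_session_policy;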
II. Role-Based Access Control (RBAC) & Identity Management
Design a Granular and Hierarchical Role Structure:
- Snowflake Feature: Flexible role hierarchy (roles can inherit privileges from other roles).
USER --(is assigned)--> ROLE --(is granted)--> PRIVILEGES --(are applied to)--> OBJECT
Example: USER_ALICE (Data Analyst) -> ROLE_DATA_READER -> SELECT (read) -> TABLE_CUSTOMER_TRANSACTIONS (e.g., in the Sales database, Marketing schema)
Example: USER_BOB (Data Engineer) -> ROLE_DATA_WRITER -> SELECT, INSERT, UPDATE, DELETE (read/write) -> TABLE_CUSTOMER_TRANSACTIONS (same object)
- Why: Enables clear separation of duties and adherence to the principle of least privilege. Map roles to business functions.
Grant Privileges to Roles, Not Directly to Users:
- Snowflake Practice: Assign users to roles, and grant database object privileges (SELECT, INSERT, etc.) to those roles, as in the sketch below. Although Snowflake now supports UBAC, I would not generally recommend it outside very specific use cases.
- Why: Simplifies access management and auditing. Easier to manage permissions as users change roles.
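A short sketch using the hypothetical objects from the diagram above:
-- Sketch: privileges go to the role, the role goes to the user (names are hypothetical)
GRANT USAGE ON DATABASE sales TO ROLE role_data_reader;
GRANT USAGE ON SCHEMA sales.marketing TO ROLE role_data_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA sales.marketing TO ROLE role_data_reader;
GRANT ROLE role_data_reader TO USER user_alice;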
Utilize Future Grants for Simplified Privilege Management:
- Snowflake Feature: GRANT … ON FUTURE … TO ROLE …
- Why: Automatically grants specified privileges on new objects created within a schema/database to a role, reducing manual intervention.
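For example, continuing with the same hypothetical names:
-- Sketch: new tables in this schema automatically become readable by the role
GRANT SELECT ON FUTURE TABLES IN SCHEMA sales.marketing TO ROLE role_data_reader;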
Implement Managed Access Schemas where Appropriate:
- Snowflake Feature: Schemas where object owners lose the ability to grant access, centralizing grant management with the schema owner (or a role with MANAGE GRANTS privilege).
- Why: Enforces stricter control over who can grant access to objects within a schema.
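A short sketch (schema names are illustrative):
-- Sketch: create a managed access schema, or convert an existing one
CREATE SCHEMA sales.restricted WITH MANAGED ACCESS;
ALTER SCHEMA sales.marketing ENABLE MANAGED ACCESS;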
Conduct Quarterly Audits of Role Privileges and User Assignments:
- Snowflake Practice: Query SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_ROLES and SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_USERS.
SELECT
    grantee_name,
    privilege,
    granted_on,
    name AS object_name,
    granted_by,
    created_on
FROM SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_ROLES
WHERE deleted_on IS NULL;
- Why: Ensures access rights remain appropriate and identifies any privilege creep.
Consider SCIM for Automated User Provisioning/Deprovisioning:
- Snowflake Feature: SCIM support for integrating with Identity Providers (IdPs) like Okta, Azure AD.
- Why: Automates user lifecycle management, ensuring timely removal of access when users leave or change roles.
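A hedged sketch of an Okta SCIM integration (the provisioner role follows the pattern in Snowflake's documentation; adapt the client and role for your IdP):
-- Sketch: SCIM provisioning integration for Okta (adapt for your IdP)
CREATE ROLE IF NOT EXISTS okta_provisioner;
GRANT CREATE USER ON ACCOUNT TO ROLE okta_provisioner;
GRANT CREATE ROLE ON ACCOUNT TO ROLE okta_provisioner;
GRANT ROLE okta_provisioner TO ROLE ACCOUNTADMIN;

CREATE SECURITY INTEGRATION okta_scim
  TYPE = SCIM
  SCIM_CLIENT = 'OKTA'
  RUN_AS_ROLE = 'OKTA_PROVISIONER';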
III. Data Protection & Encryption
You get end-to-end encryption by default, but you should still understand how Snowflake’s end-to-end encryption works:
- Snowflake Feature: Automatic encryption of all data at rest (AES-256) and in transit (TLS).
- Why: Provides a foundational layer of data protection managed by Snowflake.
Classify Data and Apply Appropriate Tagging:
- Snowflake Feature: Automatic object tagging with the sensitive-data auto-classification feature; a manual tagging sketch follows below. One caution: you should still profile your data on a somewhat regular basis, in case automatic tagging does not cover 100% of your use cases.
- Why: Helps identify sensitive data, track lineage, and can be used in conjunction with masking/row-access policies or for cost allocation.
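A small sketch of manual tagging to complement auto-classification (tag, table, and column names are hypothetical):
-- Sketch: create a sensitivity tag and apply it to a column (names are hypothetical)
CREATE TAG governance.tags.data_sensitivity
  ALLOWED_VALUES 'public', 'internal', 'confidential', 'pii';

ALTER TABLE sales.marketing.customer_transactions
  MODIFY COLUMN email SET TAG governance.tags.data_sensitivity = 'pii';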
Utilize Dynamic Data Masking for Sensitive Columns:
- Snowflake Feature: Create masking policies to selectively mask data at query time based on user role or other conditions.
- Why: Allows users to query tables containing sensitive data without exposing the actual sensitive values, based on their authorization.
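A minimal masking policy sketch (role, table, and column names are hypothetical):
-- Sketch: mask email addresses for everyone except an authorized role
CREATE MASKING POLICY governance.policies.email_mask AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('ROLE_PII_READER') THEN val
    ELSE '***MASKED***'
  END;

ALTER TABLE sales.marketing.customer_transactions
  MODIFY COLUMN email SET MASKING POLICY governance.policies.email_mask;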
Leverage Secure Data Sharing for Controlled External Access:
- Snowflake Feature: Share data with other Snowflake accounts (Readers or Full Accounts) without data movement.
- Why: Provides a secure and governed way to share live data with partners, customers, or other business units.
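A sketch of a direct share (database, schema, table, and account identifiers are hypothetical):
-- Sketch: share a single table with another Snowflake account
CREATE SHARE partner_share;
GRANT USAGE ON DATABASE sales TO SHARE partner_share;
GRANT USAGE ON SCHEMA sales.marketing TO SHARE partner_share;
GRANT SELECT ON TABLE sales.marketing.customer_transactions TO SHARE partner_share;
ALTER SHARE partner_share ADD ACCOUNTS = myorg.partner_account;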
Optional — Implement Customer-Managed Keys (Tri-Secret Secure) if Required by Policy:
- Snowflake Feature: Tri-Secret Secure (requires Business Critical edition or higher).
- Why: Provides an additional layer of control by allowing you to manage your own master key in AWS Key Management Service (KMS) or Azure Key Vault.
Optional — Implement Row-Access Policies for Fine-Grained Row-Level Security:
- Snowflake Feature: Create policies to filter which rows are visible to users based on their role or attributes; see the sketch below. My caution here is that this is typically needed only in higher-compliance environments and data clean rooms, but it is a good tool whenever you need to hide rows from certain users.
- Why: Enables different users or roles to see different subsets of rows in the same table.
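A minimal row access policy sketch (the roles, schema, and region column are hypothetical):
-- Sketch: EMEA analysts only see EMEA rows; a global role sees everything
CREATE ROW ACCESS POLICY governance.policies.region_filter AS (region VARCHAR) RETURNS BOOLEAN ->
  CURRENT_ROLE() IN ('SYSADMIN', 'ROLE_SALES_GLOBAL')
  OR (CURRENT_ROLE() = 'ROLE_SALES_EMEA' AND region = 'EMEA');

ALTER TABLE sales.marketing.customer_transactions
  ADD ROW ACCESS POLICY governance.policies.region_filter ON (region);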
IV. Network & Connectivity Security
Implement Network Policies to Restrict Access by IP Address:
- Snowflake Feature: Account-level and user-level network policies.
- Why: Limits connections to your Snowflake account from trusted IP addresses or ranges, reducing the risk of unauthorized external access.
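A short sketch using documentation-range IPs (replace them with your own trusted ranges before applying anything):
-- Sketch: restrict access to corporate CIDR ranges (example IPs only)
CREATE NETWORK POLICY corporate_access
  ALLOWED_IP_LIST = ('198.51.100.0/24', '203.0.113.0/24');

ALTER ACCOUNT SET NETWORK_POLICY = corporate_access;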
Utilize Private Connectivity (AWS PrivateLink, Azure Private Link, Google Cloud Private Service Connect):
- Snowflake Feature: Options for establishing private connections between your VPC/VNet and Snowflake.
- Why: Ensures data traffic does not traverse the public internet, enhancing security for sensitive workloads.
Secure Client Connections (Drivers, Connectors):
- Snowflake Practice: Keep drivers (JDBC, ODBC, Python connector, etc.) up to date. Ideally, deploy software through a CI/CD pipeline that scans for vulnerabilities and flags outdated drivers. Keeping every driver fully current can be hard, but at a minimum have a process that picks up major security updates. Also use secure connection parameters.
- Why: Protects data in transit between client applications and Snowflake.
V. Data Governance & Lifecycle Management
Define Data Retention and Time Travel Policies:
- Snowflake Feature: Configurable data retention period for Time Travel (default 1 day; up to 90 days on Enterprise Edition and above). Caution: if a table is reloaded daily or very frequently, set its retention short, because long Time Travel on high-churn tables becomes very expensive. Decide on your RPO and set retention accordingly; 7 to 15 days is not a bad option. Remember that Fail-safe gives you seven extra days on top. See the sketch below.
- Why: Enables recovery from accidental data modification/deletion, but also has cost implications. Align with business and compliance needs.
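Retention is a single parameter that can be set at the account, database, schema, or table level; a sketch with hypothetical objects:
-- Sketch: moderate retention by default, minimal retention on a high-churn staging table
ALTER DATABASE sales SET DATA_RETENTION_TIME_IN_DAYS = 14;
ALTER TABLE staging.raw.daily_load SET DATA_RETENTION_TIME_IN_DAYS = 1;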
Implement Secure Views for Controlled Data Exposure:
- Snowflake Feature: Create views to abstract underlying table structures, join data, or restrict column access.
- Why: Provides a controlled interface for users to access data without giving direct access to base tables.
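For example (the view and columns are hypothetical):
-- Sketch: expose a curated, limited column set through a secure view
CREATE SECURE VIEW sales.marketing.v_customer_summary AS
SELECT customer_id, region, total_spend
FROM sales.marketing.customer_transactions;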
Manage Object Dependencies and Change Management:
- Snowflake Practice: Use tools or scripts to track dependencies before dropping or altering objects, and implement a change management process. This is where moving toward a full SDLC process pays off: everything that defines your Snowflake environment should be treated as code and live in a Git repository.
- Why: Prevents unintended impact on downstream processes or user access.
Consider Data Masking/Anonymization for Non-Production Environments:
Snowflake Practice:
- Single accounts: Use cloning (CREATE … CLONE) for creating dev/test environments, then apply masking policies or transform data to remove sensitivity.
- Multi-account structure: Use cloning (CREATE … CLONE) for creating dev/test environments, then apply masking policies or transform data to remove sensitivity, plus database sharing in lower accounts.
- Why: Reduces the risk of exposing sensitive production data in lower environments.
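A sketch of the single-account pattern (names are hypothetical, and it assumes the email_mask policy from section III; skip the ALTER if the cloned column already carries a policy):
-- Sketch: zero-copy clone production into dev, then enforce masking on sensitive columns
CREATE DATABASE sales_dev CLONE sales;

ALTER TABLE sales_dev.marketing.customer_transactions
  MODIFY COLUMN email SET MASKING POLICY governance.policies.email_mask;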
VI. Monitoring, Auditing & Cost Management (Security Context)
Regularly Monitor Access History and Login Attempts:
- Snowflake Feature: SNOWFLAKE.ACCOUNT_USAGE.LOGIN_HISTORY, QUERY_HISTORY, ACCESS_HISTORY views.
- Why: Helps detect suspicious activity, unauthorized access attempts, and understand data access patterns.
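A sketch for reviewing failed logins over the last 7 days (ACCOUNT_USAGE views have some ingestion latency):
-- Failed login attempts in the last 7 days
SELECT
    event_timestamp,
    user_name,
    client_ip,
    reported_client_type,
    error_message
FROM SNOWFLAKE.ACCOUNT_USAGE.LOGIN_HISTORY
WHERE is_success = 'NO'
  AND event_timestamp > DATEADD(day, -7, CURRENT_TIMESTAMP())
ORDER BY event_timestamp DESC;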
Set Up Alerts for Critical Security Events:
- Snowflake Practice: Use tasks or external tools to query audit logs and trigger alerts (e.g., failed logins, privilege escalations).
- Why: Enables timely response to potential security incidents.
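A hedged sketch of a native Snowflake alert on failed logins; the warehouse and notification integration names are hypothetical, and ACCOUNT_USAGE latency makes this a coarse control rather than real-time detection:
-- Sketch: hourly alert on failed logins (warehouse and email integration are hypothetical)
CREATE OR REPLACE ALERT security.monitoring.failed_login_alert
  WAREHOUSE = monitoring_wh
  SCHEDULE = '60 MINUTE'
  IF (EXISTS (
        SELECT 1
        FROM SNOWFLAKE.ACCOUNT_USAGE.LOGIN_HISTORY
        WHERE is_success = 'NO'
          AND event_timestamp > DATEADD(hour, -2, CURRENT_TIMESTAMP())
  ))
  THEN CALL SYSTEM$SEND_EMAIL(
        'security_email_int',
        'secops@example.com',
        'Snowflake: failed login attempts detected',
        'Review SNOWFLAKE.ACCOUNT_USAGE.LOGIN_HISTORY for details.');

ALTER ALERT security.monitoring.failed_login_alert RESUME;  -- alerts are created suspended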
Monitor Usage of Secure Features (Masking, Row Access Policies):
- Snowflake Feature: SNOWFLAKE.ACCOUNT_USAGE.MASKING_POLICIES, ROW_ACCESS_POLICIES views.
- Why: Ensure policies are being applied as expected and identify any configuration issues.
Understand Cost Implications of Security Features:
- Snowflake Practice: Features like long Time Travel retention, frequent cloning for masked environments, and extensive query history can impact storage and compute costs.
- Why: Balance security requirements with budget considerations.
VII. Secure Development & Deployment Practices
Securely Manage Snowflake Credentials:
- Snowflake Practice: Avoid embedding credentials directly in code, scripts, or configuration files. Instead:
- Use Snowflake key pair authentication for programmatic access, with the Snowflake user created as TYPE = SERVICE (see the sketch after this list). Store private keys securely and encrypted, not in version control.
- Leverage environment variables to inject credentials into applications or scripts at runtime.
- Integrate with secret management tools (e.g., AWS Secrets Manager, Azure Key Vault, HashiCorp Vault) for centralized and secure storage and retrieval of credentials.
- Utilize OAuth for applications accessing Snowflake, avoiding direct credential storage.
- Why: Reduces the risk of accidental exposure of sensitive credentials, maintains control over credential lifecycle, and improves auditing capabilities.
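A sketch of a key pair service user; the public key is a placeholder, and the user and role names are hypothetical:
-- Sketch: service user for programmatic access via key pair authentication
CREATE USER svc_reporting
  TYPE = SERVICE
  DEFAULT_ROLE = role_data_reader
  RSA_PUBLIC_KEY = '<paste_rsa_public_key_here>';

GRANT ROLE role_data_reader TO USER svc_reporting;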
Parameterize SQL Queries for Snowflake:
- Snowflake Practice: When building applications that execute SQL against Snowflake, use parameterized queries or bind variables. This applies to all drivers (JDBC, ODBC, Python Connector, etc.).
- Example (Python): cursor.execute("SELECT * FROM mytable WHERE id = %s", (user_id,)) instead of constructing the query string directly with the user_id.
- Utilize Snowflake's stored procedures for complex logic or recurring tasks, passing parameters rather than hardcoding values.
- Why: Protects against SQL injection attacks, improves query performance by allowing Snowflake to reuse query plans, and enhances security by preventing data exposure through query manipulation.
Version Control Snowflake Objects and Configurations:
- Snowflake Practice: Treat Snowflake database objects (tables, views, procedures, functions), security configurations (roles, grants, policies), and deployment scripts as code.
- Store DDL (CREATE/ALTER statements), DML (INSERT/UPDATE/DELETE scripts), and policy definitions (masking, row access) in a Git repository or other version control system.
- Use Infrastructure-as-Code (IaC) tools like Terraform or Snowflake’s SnowSQL scripting to automate deployment and configuration management.
- Establish a defined branching and merging strategy for Snowflake changes, ensuring code review and control.
- Why: Enables change tracking, collaboration, rollback capability, and automation of Snowflake environment management, contributing to both security and reliability.
Isolate Development, Testing, and Production Environments in Snowflake:
- Snowflake Practice: Establish separate Snowflake accounts or database environments for development, testing (QA/staging), and production. Separate accounts offer stronger isolation, especially for billing and top-level security settings, but can increase complexity for data movement if not using features like sharing effectively. Most environments will function well with just separate databases unless your security team decides separate accounts are required.
- Use cloning (CREATE … CLONE) to copy databases and objects from production to lower environments for testing, ensuring data consistency (and mask or anonymize this cloned data!).
- Apply rigorous change control processes for promoting changes from development through to production.
- Why: Prevents accidental or unauthorized changes from affecting production data, allows thorough testing in realistic conditions, and reduces the blast radius of potential errors or security vulnerabilities.
Implement Automated Testing for Snowflake Objects and Security Policies:
- Snowflake Practice: Develop automated tests to validate the correctness and security of Snowflake objects and configurations.
- Write SQL scripts or use testing frameworks (e.g., Python's pytest with the Snowflake Connector) to verify data integrity, query results, and policy enforcement.
- Automate testing as part of your CI/CD pipeline for Snowflake changes, ensuring new code or configuration adheres to security standards.
- Why: Detects bugs, inconsistencies, and security misconfigurations early, automates repetitive validation tasks, and ensures consistent quality in Snowflake deployments.
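As one small hedged example, a policy-enforcement check that could run in CI under a non-privileged role; it assumes the email_mask policy and hypothetical objects from earlier sections:
-- Sketch: fail the pipeline if the reader role can see unmasked email addresses
USE ROLE role_data_reader;
SELECT COUNT(*) AS unmasked_rows
FROM sales.marketing.customer_transactions
WHERE email IS NOT NULL
  AND email NOT LIKE '%MASKED%';
-- Expected result: 0. A non-zero count means the masking policy is not enforced for this role.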
Follow Secure Code Review Practices for Snowflake:
- Snowflake Practice: Require peer review for all SQL scripts, policy changes, and deployment configurations before they are applied to Snowflake environments.
- Focus reviews on security considerations, such as access control, data masking, and SQL injection prevention.
- Document code review findings and ensure they are addressed before code is deployed.
- Why: Identifies potential errors, security vulnerabilities, and non-optimal practices early in the development process, and promotes knowledge sharing among team members.
Regularly Audit and Review Deployment Processes:
- Snowflake Practice: Periodically review the processes used to deploy and manage changes in your Snowflake environment.
- Analyze deployment logs, audit trails, and change requests to identify areas for improvement.
- Update deployment procedures, automation scripts, and security guidelines as needed.
- Why: Ensures deployment practices remain effective, secure, and aligned with evolving security requirements and good practices.
Tools Recommendations
- Terraform: Recommended for managing infrastructure declaratively, ensuring consistency, drift detection, and rollback capabilities. Terraform is excellent for managing the “scaffolding” and security aspects of your Snowflake environment: Databases, Schemas, Warehouses, Users, Roles, Grants, Network Policies, Session Policies, and Resource Monitors.
- dbt: Ideal for data transformations, automated testing, documentation, and lineage management within Snowflake databases. dbt excels at managing the lifecycle of your data models (tables and views built via SQL SELECT statements) within Snowflake.
- CI/CD: Ideal for Git workflows and testing. GitHub works quite well, but this is mostly a matter of preference for whatever CI/CD platform you already use. Automate the testing and deployment of all changes coming from Git.
- Optional — Airflow: Excellent for orchestration and scheduling complex workflows involving Snowflake.
Phased Implementation Approach
Adopt a phased implementation strategy:
- Phase 1: Initial setup (account security, SSO, MFA, basic RBAC, Terraform for IaC, an initial dbt implementation with a basic model of 1 fact and 2 dimensions, reporting connected via a service account, dev environment). The security benefit here is primarily in establishing secure, repeatable data transformation and deployment patterns early on, and in using a service account for reporting.
- Phase 2: Intermediate steps (expand data model, expand RBAC, data augmentation from marketplace if required, production).
- Phase 3: Advanced maturity (Dynamic data masking, column-level policies, full CI/CD integration, integration into security dashboards).
Find a printable list here.
Summary
This checklist, which you can also find as a printable document at this link, shows that a solid Snowflake foundation starts with strong account security, robust RBAC, built‑in encryption, and disciplined governance. To evolve toward advanced maturity, layer in monitoring, cost awareness, and secure DevOps.
Please let me know what you think in the comments, or find me on LinkedIn and give me feedback.
I am Augusto Rosa, a Snowflake Data Superhero, Snowflake SME, and Snowflake Toronto User-Group Organizer. I am also the Head of Data, Cloud, & Security Architecture at Archetype Consulting. You can follow me on LinkedIn.
Subscribe to my Medium blog https://blog.augustorosa.com and Archetype Consulting blogs https://blog.archetypeconsulting.com/ for the most interesting Data Engineering and Snowflake news.