DEV Community

Thesius Code
Thesius Code

Posted on • Originally published at datanest-stores.pages.dev

Advanced SQL Query Library

Advanced SQL Query Library

200+ production-ready SQL queries organized by analytics use case: cohort analysis, funnel metrics, retention, revenue, user segmentation, and operational reporting. Each query includes inline comments, schema assumptions, and dialect notes for PostgreSQL, BigQuery, Snowflake, and Redshift.

Key Features

  • 200+ Production Queries — tested patterns, not textbook examples
  • 8 Analytics Domains — cohort, funnel, retention, revenue, segmentation, engagement, operations, ad-hoc
  • Multi-Dialect Support — notes for PostgreSQL, BigQuery, Snowflake, and Redshift syntax differences
  • Schema Documentation — expected table structures and column definitions for every query
  • Performance Annotations — index recommendations and query plan considerations
  • Composable CTEs — modular query building blocks you can chain together
  • Window Function Patterns — 30+ window function recipes for ranking, running totals, and gaps

Quick Start

  1. Browse queries by category in src/queries/
  2. Read the schema assumptions in docs/schema.md
  3. Copy the query and update table/column names for your warehouse
  4. Check dialect notes if not using PostgreSQL
-- Quick win: Revenue by month with MoM growth
SELECT
    DATE_TRUNC('month', order_date) AS month,
    SUM(amount) AS revenue,
    LAG(SUM(amount)) OVER (ORDER BY DATE_TRUNC('month', order_date)) AS prev_month,
    ROUND(
        (SUM(amount) - LAG(SUM(amount)) OVER (ORDER BY DATE_TRUNC('month', order_date)))
        / NULLIF(LAG(SUM(amount)) OVER (ORDER BY DATE_TRUNC('month', order_date)), 0)
        * 100, 1
    ) AS mom_growth_pct
FROM orders
WHERE order_status = 'completed'
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;
Enter fullscreen mode Exit fullscreen mode

Usage Examples

Cohort Retention Analysis

-- Monthly cohort retention matrix
WITH user_cohorts AS (
    SELECT
        user_id,
        DATE_TRUNC('month', MIN(event_date)) AS cohort_month
    FROM events
    WHERE event_type = 'signup'
    GROUP BY user_id
),
user_activity AS (
    SELECT DISTINCT
        e.user_id,
        c.cohort_month,
        DATE_TRUNC('month', e.event_date) AS activity_month
    FROM events e
    JOIN user_cohorts c ON e.user_id = c.user_id
),
retention AS (
    SELECT
        cohort_month,
        DATE_DIFF('month', cohort_month, activity_month) AS month_number,
        COUNT(DISTINCT user_id) AS active_users
    FROM user_activity
    GROUP BY cohort_month, month_number
)
SELECT
    cohort_month,
    month_number,
    active_users,
    ROUND(
        active_users * 100.0
        / FIRST_VALUE(active_users) OVER (
            PARTITION BY cohort_month ORDER BY month_number
        ), 1
    ) AS retention_pct
FROM retention
ORDER BY cohort_month, month_number;
-- Index: events(user_id, event_date, event_type)
Enter fullscreen mode Exit fullscreen mode

Conversion Funnel Analysis

-- Step-by-step funnel with conversion rates
WITH funnel AS (
    SELECT
        session_id,
        MAX(CASE WHEN event = 'page_view' THEN 1 ELSE 0 END) AS step_1_view,
        MAX(CASE WHEN event = 'add_to_cart' THEN 1 ELSE 0 END) AS step_2_cart,
        MAX(CASE WHEN event = 'begin_checkout' THEN 1 ELSE 0 END) AS step_3_checkout,
        MAX(CASE WHEN event = 'purchase' THEN 1 ELSE 0 END) AS step_4_purchase
    FROM events
    WHERE event_date >= CURRENT_DATE - INTERVAL '30 days'
    GROUP BY session_id
)
SELECT
    'Page View' AS step,
    SUM(step_1_view) AS users,
    100.0 AS pct_of_top
UNION ALL
SELECT 'Add to Cart', SUM(step_2_cart),
    ROUND(SUM(step_2_cart) * 100.0 / NULLIF(SUM(step_1_view), 0), 1)
FROM funnel
UNION ALL
SELECT 'Begin Checkout', SUM(step_3_checkout),
    ROUND(SUM(step_3_checkout) * 100.0 / NULLIF(SUM(step_1_view), 0), 1)
FROM funnel
UNION ALL
SELECT 'Purchase', SUM(step_4_purchase),
    ROUND(SUM(step_4_purchase) * 100.0 / NULLIF(SUM(step_1_view), 0), 1)
FROM funnel;
Enter fullscreen mode Exit fullscreen mode

Window Functions: Running Total with Reset

-- Running total that resets each month
SELECT
    order_date,
    amount,
    SUM(amount) OVER (
        PARTITION BY DATE_TRUNC('month', order_date)
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS mtd_running_total,
    ROW_NUMBER() OVER (
        PARTITION BY DATE_TRUNC('month', order_date)
        ORDER BY amount DESC
    ) AS rank_in_month
FROM orders
WHERE order_status = 'completed'
ORDER BY order_date;
Enter fullscreen mode Exit fullscreen mode

Query Index

Category Count Key Queries
Cohort Analysis 25 Retention matrix, cohort LTV, cohort comparison
Funnel Metrics 20 Step conversion, drop-off, time-to-convert
Retention 20 D1/D7/D30, rolling retention, resurrection
Revenue 30 MRR, NRR, ARPU, revenue by segment, growth
Segmentation 20 RFM, behavioral, demographic, value-based
Engagement 25 DAU/MAU, session depth, feature usage, stickiness
Operations 30 SLA tracking, queue times, error rates
Window Functions 30 Ranking, running totals, gaps, islands, LAG/LEAD

Dialect Differences

Function PostgreSQL BigQuery Snowflake Redshift
Date truncate DATE_TRUNC('month', d) DATE_TRUNC(d, MONTH) DATE_TRUNC('MONTH', d) DATE_TRUNC('month', d)
Date diff d1 - d2 DATE_DIFF(d1, d2, DAY) DATEDIFF('day', d2, d1) DATEDIFF('day', d2, d1)
String agg STRING_AGG(s, ',') STRING_AGG(s, ',') LISTAGG(s, ',') LISTAGG(s, ',')

Best Practices

  1. Use CTEs over subqueries — CTEs are readable, debuggable, and often optimized the same way
  2. Always qualify columnst.column_name prevents ambiguity in joins
  3. Filter early — push WHERE clauses as close to the base tables as possible
  4. Use NULLIF in divisionx / NULLIF(y, 0) prevents division-by-zero errors
  5. Index your join columns — missing indexes are the #1 cause of slow analytics queries
  6. Comment your assumptions — future readers need to know why, not just what

Troubleshooting

Issue Cause Fix
Query times out Full table scan on large table Add date filter; check indexes on join/filter columns
Duplicate rows in results Incorrect join cardinality Use DISTINCT or verify join produces expected 1:1 or 1:N
NULL results in aggregations NULLs in SUM/COUNT Use COALESCE or filter NULLs explicitly
Different results across dialects DATE_DIFF argument order Check dialect reference table above

This is 1 of 11 resources in the Data Analyst Toolkit toolkit. Get the complete [Advanced SQL Query Library] with all files, templates, and documentation for $29.

Get the Full Kit →

Or grab the entire Data Analyst Toolkit bundle (11 products) for $129 — save 30%.

Get the Complete Bundle →


Related Articles

Top comments (0)