DEV Community

Sangmin Lee

Posted on • Originally published at claudeguide.io

How to Use Claude for Data Analysis: Practical Guide

Originally published at claudeguide.io/how-to-use-claude-for-data-analysis

Claude is most useful for data analysis when you paste the data directly into the conversation and ask specific questions — not "analyse this", but "what's the trend in column X", "which customers churned in Q3", or "write Python to calculate the 30-day moving average." Claude can read CSVs, interpret summaries, write analysis code, explain statistical outputs, and generate visualisation code. This guide covers the practical patterns that get useful results.


Pattern 1: Paste CSV data for direct analysis

For datasets under 10,000 rows, paste directly:

Here's my sales data from Q1:

date,product,revenue,units
2026-01-02,Widget A,1250,50
2026-01-02,Widget B,890,35
2026-01-03,Widget A,1100,44
...

1. What's the total revenue by product?
2. Which day had the highest sales?
3. What's the week-over-week growth trend?

Claude reads the CSV headers, understands the data structure, and answers analytical questions directly. For straightforward aggregations and trends, no code is needed.
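A quick way to sanity-check an aggregation answer is to recompute it locally. The sketch below uses only the three rows visible in the example prompt (the elided rows are not reconstructed) and Python's standard library; the function names are illustrative.

```python
import csv
import io
from collections import defaultdict

# Only the three rows shown in the example prompt; the real file has more.
SAMPLE_CSV = """date,product,revenue,units
2026-01-02,Widget A,1250,50
2026-01-02,Widget B,890,35
2026-01-03,Widget A,1100,44
"""

def revenue_by_product(csv_text):
    """Sum the revenue column for each product."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["product"]] += float(row["revenue"])
    return dict(totals)

def best_day(csv_text):
    """Return the (date, revenue) pair with the highest total revenue."""
    daily = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        daily[row["date"]] += float(row["revenue"])
    return max(daily.items(), key=lambda item: item[1])

totals = revenue_by_product(SAMPLE_CSV)
top_day = best_day(SAMPLE_CSV)
print(totals)   # {'Widget A': 2350.0, 'Widget B': 890.0}
print(top_day)  # ('2026-01-02', 2140.0)
```

If the numbers Claude reports differ from a check like this, ask it to show the calculation row by row.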

For larger datasets: paste a sample (first 20 rows + data description) and ask Claude to write pandas code you run yourself.
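One way to prepare that sample is sketched below, using only the standard library. The helper name `build_sample_for_prompt` and the generated demo rows are hypothetical; in practice you would read the text from your own file.

```python
import csv
import io

def build_sample_for_prompt(csv_text, n_rows=20):
    """Return a one-line data description followed by the header and first n_rows rows."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = list(reader)
    shown = rows[:n_rows]

    # Re-serialise with csv.writer so values containing commas stay quoted.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(shown)

    description = (
        f"Full dataset: {len(rows)} rows, {len(header)} columns "
        f"({', '.join(header)}). Showing the first {len(shown)} rows."
    )
    return description + "\n\n" + out.getvalue()

# Hypothetical demo data: 31 generated rows standing in for a real export.
demo_lines = ["date,product,revenue,units"] + [
    f"2026-01-{day:02d},Widget A,{1000 + day},{40 + day}" for day in range(1, 32)
]
demo_csv = "\n".join(demo_lines) + "\n"

prompt_text = build_sample_for_prompt(demo_csv)
print(prompt_text)
```

Stating the full row count in the description helps Claude write code that scales to the whole file rather than to the 20 rows it can see.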


Pattern 2: Ask Claude to write analysis code

For repeatable analysis or large datasets:

I have a pandas DataFrame called `df` with these columns:
- user_id (int)
- signup_date (datetime)
- plan (str: 'free', 'starter', 'pro')
- monthly_revenue (float)
- last_active_date (datetime)
- churned (bool)

Write Python code to:
1. Calculate churn rate by plan
2. Find the median time-to-churn for churned users
3. Identify the cohort month (based on signup_date) with the highest churn rate

Given a schema like this, Claude typically writes idiomatic pandas code that handles nulls, datetime parsing, and groupby operations. Review the code and test it on a small sample before running it on the full dataset.

Best practice: specify the exact DataFrame schema in your prompt. Claude's code quality is much higher when it knows the column names and types.
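For the prompt above, the code you get back might look roughly like the following sketch. The six-row DataFrame is made-up illustration data, and the sketch assumes that `last_active_date` marks the churn date for churned users, which the schema does not state.

```python
import pandas as pd

# Tiny hypothetical DataFrame matching the schema in the prompt.
df = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6],
    "signup_date": pd.to_datetime(
        ["2025-07-01", "2025-07-15", "2025-08-03", "2025-08-20", "2025-08-25", "2025-09-10"]
    ),
    "plan": ["free", "free", "starter", "free", "pro", "starter"],
    "monthly_revenue": [0.0, 0.0, 19.0, 0.0, 49.0, 19.0],
    "last_active_date": pd.to_datetime(
        ["2025-08-15", "2025-12-01", "2025-12-01", "2025-09-19", "2025-12-01", "2025-11-09"]
    ),
    "churned": [True, False, False, True, False, True],
})

# 1. Churn rate by plan: the mean of a boolean column is the share of True values.
churn_by_plan = df.groupby("plan")["churned"].mean()

# 2. Median time-to-churn in days (assumes last_active_date is the churn date).
churned_users = df[df["churned"]]
days_to_churn = (churned_users["last_active_date"] - churned_users["signup_date"]).dt.days
median_days_to_churn = days_to_churn.median()

# 3. Signup cohort month with the highest churn rate.
cohort_month = df["signup_date"].dt.to_period("M")
churn_by_cohort = df.groupby(cohort_month)["churned"].mean()
worst_cohort = churn_by_cohort.idxmax()

print(churn_by_plan)
print(median_days_to_churn)
print(worst_cohort)
```

With real data, check how many users sit behind each cohort before reading much into its churn rate; in this toy frame the worst cohort contains a single user.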


Pattern 3: Interpret statistical output

After running analysis code, paste the output back for interpretation:

Here's the output from my churn analysis:

Plan    Churn Rate    Median Days to Churn    N
free    0.34          45                       2,840
starter 0.18          120                      1,230
pro     0.08          280                      450

The overall cohort analysis shows August 2025 had a 40% churn rate vs. 
the baseline of 18–22%.

What are the likely explanations for the August spike? 
What data would I need to confirm or rule out each hypothesis?

Claude interprets statistical patterns, suggests causal hypotheses, and identifies what additional data would confirm them.


Pattern 4: Generate visualisation code

Using the churn DataFrame described above, write matplotlib/seaborn code that produces:
- A line chart showing monthly churn rate over time
- A bar chart comparing churn rate by plan

Use a professional-looking style with clear labels and titles.

I'm using Python 3.12 with matplotlib 3.8 and seaborn 0.13.

Specify your library versions so that Claude targets the API you have installed and avoids functions deprecated in that release.
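As an illustration of the bar chart request, here is a minimal matplotlib sketch. It plots the churn rates from the example output table in Pattern 3; the styling choices and the output path are assumptions, and the line chart is omitted because the source gives no monthly series to plot.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

# Churn rate by plan, copied from the example output table in Pattern 3.
plans = ["free", "starter", "pro"]
churn_rates = [0.34, 0.18, 0.08]

fig, ax = plt.subplots(figsize=(6, 4))
bars = ax.bar(plans, churn_rates, color="steelblue")
ax.set_title("Churn rate by plan")
ax.set_xlabel("Plan")
ax.set_ylabel("Churn rate")
ax.set_ylim(0, 0.4)
ax.bar_label(bars, fmt="%.2f")  # bar_label is available from matplotlib 3.4
fig.tight_layout()

# Write the chart to the system temp directory (path chosen for illustration).
out_path = os.path.join(tempfile.gettempdir(), "churn_by_plan.png")
fig.savefig(out_path, dpi=150)
```

The comment on `bar_label` shows why versions matter: on a matplotlib release older than 3.4 that call would fail, and Claude would need to annotate the bars another way.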


Pattern 5: SQL query writing

For database analysis:

I have a PostgreSQL database with these tables:

users (id, email, created_at, plan, country)
events (id, user_id, event_name, properties, created_at)
subscriptions (id, user_id, plan, started_at, ended_at, mrr)

Write a SQL query to:
Find users who signed up in January 2026, upgraded from free to starter 
within 30 days of signup, and then upgraded to pro within 90 days.
Show their signup date, first upgrade date, second upgrade date, and current MRR.

Claude writes complex joins, window functions, and date arithmetic reliably when given the full schema.
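To show the shape of query this prompt can produce, the sketch below runs one against an in-memory SQLite database standing in for PostgreSQL. Several things here are assumptions rather than part of the original schema description: `subscriptions` holds one row per plan period, the 90-day window is measured from signup, current MRR comes from the row whose `ended_at` is NULL, and every sample row is invented. The query also does not verify that the user started on the free plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, created_at TEXT,
                    plan TEXT, country TEXT);
CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, user_id INTEGER, plan TEXT,
                            started_at TEXT, ended_at TEXT, mrr REAL);
""")

# Hypothetical rows: only user 1 meets every condition.
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?, ?)", [
    (1, "a@example.com", "2026-01-05", "pro", "US"),
    (2, "b@example.com", "2026-01-10", "starter", "DE"),  # never reaches pro
    (3, "c@example.com", "2026-01-15", "pro", "KR"),      # first upgrade after 45 days
    (4, "d@example.com", "2025-12-20", "pro", "US"),      # signed up in December
])
conn.executemany("INSERT INTO subscriptions VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 1, "starter", "2026-01-20", "2026-03-01", 19),
    (2, 1, "pro", "2026-03-01", None, 49),
    (3, 2, "starter", "2026-01-25", None, 19),
    (4, 3, "starter", "2026-03-01", "2026-03-20", 19),
    (5, 3, "pro", "2026-03-20", None, 49),
    (6, 4, "starter", "2025-12-25", "2026-01-10", 19),
    (7, 4, "pro", "2026-01-10", None, 49),
])

QUERY = """
SELECT u.id,
       u.created_at  AS signup_date,
       s1.started_at AS first_upgrade_date,
       s2.started_at AS second_upgrade_date,
       cur.mrr       AS current_mrr
FROM users AS u
JOIN subscriptions AS s1 ON s1.user_id = u.id AND s1.plan = 'starter'
JOIN subscriptions AS s2 ON s2.user_id = u.id AND s2.plan = 'pro'
LEFT JOIN subscriptions AS cur ON cur.user_id = u.id AND cur.ended_at IS NULL
WHERE u.created_at >= '2026-01-01' AND u.created_at < '2026-02-01'
  -- In PostgreSQL: s1.started_at <= u.created_at + INTERVAL '30 days'
  AND julianday(s1.started_at) - julianday(u.created_at) <= 30
  AND s2.started_at > s1.started_at
  AND julianday(s2.started_at) - julianday(u.created_at) <= 90
ORDER BY u.id
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, '2026-01-05', '2026-01-20', '2026-03-01', 49.0)]
```

Ambiguities such as "within 90 days of what?" are worth spelling out in the prompt, since the query silently encodes whichever reading Claude picks.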


Claude's data analysis capabilities

Strong:

  • Aggregations and groupings (sum, count, average by category)
  • Time series analysis (trends, seasonality, moving averages)
  • Cohort analysis (retention, churn by cohort)
  • Statistical interpretation (explaining p-values, confidence intervals, correlation)
  • Code generation (pandas, SQL, R, Excel formulas)

Moderate:

  • Pattern recognition in complex multi-dimensional data
  • Causal inference and A/B test interpretation
  • Forecasting (can write code for ARIMA/Prophet, but won't run it)

Weak:

  • Running code and observing output directly (unless via Claude Code or computer use)
  • Very large datasets (paste samples + have Claude write code instead)
  • Domain-specific statistical methods not well-represented in training data

Asking the right questions

Too vague (produces generic analysis):

Analyse this sales data and tell me what you find.

Specific and useful:

For this sales data:
1. Is the decline in Widget A revenue driven by fewer units sold, lower price, 
   or a mix? Show the calculation.
2. Which sales rep had the best quarter-over-quarter growth from Q4 to Q1?
3. If current trends continue, what's the projected Q2 revenue? 
   Show your calculation method.

The more specific your questions, the more actionable the analysis.


Data privacy considerations

For sensitive data (PII, financial records, health data):

  • Anonymise before pasting: replace names with IDs, generalise ages to buckets
  • Consider whether you're allowed to share data with a third-party AI service
  • For highly sensitive data, have Claude write analysis code you run locally instead of pasting the data
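The anonymisation step can be sketched with the standard library. The column names `name` and `age`, the `cust_` ID format, and the sample records are all hypothetical; this shows the idea and does not amount to a complete de-identification procedure.

```python
import csv
import io

def age_bucket(age, width=10):
    """Map an exact age to a coarse range such as '30-39'."""
    low = (int(age) // width) * width
    return f"{low}-{low + width - 1}"

def anonymise(csv_text, name_col="name", age_col="age"):
    """Replace names with stable opaque IDs and exact ages with buckets."""
    ids = {}
    out = io.StringIO()
    reader = csv.DictReader(io.StringIO(csv_text))
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # The same name always maps to the same ID, so per-customer analysis still works.
        row[name_col] = ids.setdefault(row[name_col], f"cust_{len(ids) + 1:04d}")
        row[age_col] = age_bucket(row[age_col])
        writer.writerow(row)
    return out.getvalue(), ids

# Hypothetical records for illustration only.
RAW = """name,age,revenue
Alice Example,34,1250
Bob Example,47,890
Alice Example,34,1100
"""

anonymised, id_map = anonymise(RAW)
print(anonymised)
```

Keep `id_map` on your own machine: it is the key that links the IDs back to real people, so it should never be pasted alongside the data.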

Frequently asked questions

Can Claude access my database directly?
Via the API with custom tools, yes. Claude can call a database query function you define. Via the claude.ai chat interface, no — you need to paste data or results. For production data analysis automation, build a Claude agent with database access.
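A database query function of that kind might be sketched as follows. The tool name `run_sql_query`, the SELECT-only guard, and the in-memory SQLite table are assumptions for illustration. The dictionary follows the name, description, and input_schema shape used for tool definitions in the Anthropic Messages API; the API call itself is left out.

```python
import sqlite3

# Tool definition: a name, a description, and a JSON Schema for the input.
QUERY_TOOL = {
    "name": "run_sql_query",  # hypothetical tool name
    "description": "Run a read-only SQL SELECT query against the analytics database.",
    "input_schema": {
        "type": "object",
        "properties": {
            "sql": {"type": "string", "description": "A single SELECT statement."}
        },
        "required": ["sql"],
    },
}

def run_sql_query(conn, sql, max_rows=100):
    """Execute the query the model requested, refusing anything but a SELECT."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    cursor = conn.execute(sql)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchmany(max_rows)]

# Hypothetical demo database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, plan TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "free"), (2, "free"), (3, "pro")])

rows = run_sql_query(conn, "SELECT plan, COUNT(*) AS n FROM users GROUP BY plan ORDER BY plan")
print(rows)  # [{'plan': 'free', 'n': 2}, {'plan': 'pro', 'n': 1}]
```

Your application would pass `QUERY_TOOL` in the request's tool list and call `run_sql_query` with the input from any tool-use block in the response. The prefix check is only a first line of defence; a read-only database role is the safer control.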

How much data can I paste into Claude?
Up to about 200,000 tokens, the standard context window for current Claude models. That's roughly 150,000 words of English text, or about 10,000–20,000 rows of typical CSV data depending on how wide each row is. For larger datasets, paste a sample and have Claude write code you run on the full dataset.
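To get a rough sense of whether a file will fit before pasting it, a back-of-the-envelope check like the one below can help. The figure of 4 characters per token is a common rule of thumb for English prose and an assumption here; numeric CSV data probably tokenizes less efficiently, so treat the result as optimistic. The `reserve_tokens` default is an arbitrary allowance for your question and the reply.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Very rough token estimate; real counts depend on the tokenizer and the data."""
    return int(len(text) / chars_per_token)

def fits_in_context(text, context_tokens=200_000, reserve_tokens=20_000):
    """Check whether text likely fits, leaving room for the question and the reply."""
    return estimate_tokens(text) <= context_tokens - reserve_tokens

# One 28-character row from the Pattern 1 example, repeated to simulate larger files.
row = "2026-01-02,Widget A,1250,50\n"

small_file = row * 10_000
large_file = row * 30_000

print(estimate_tokens(small_file), fits_in_context(small_file))  # 70000 True
print(estimate_tokens(large_file), fits_in_context(large_file))  # 210000 False
```

When the estimate is anywhere near the limit, fall back to the sample-plus-code approach rather than pasting the full file.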

Is Claude better than dedicated BI tools (Tableau, Looker)?
Different tools for different purposes. BI tools are better for persistent dashboards, scheduled reports, and sharing with non-technical stakeholders. Claude is better for exploratory analysis ("I have a question and need to dig into the data now"), writing analysis code, and interpreting complex patterns.

Can Claude write Excel formulas?
Yes, including complex nested formulas. Describe what you want in plain English: "I need a formula in column F that calculates the running total of column C only for rows where column B = 'Active'". Claude writes the exact Excel formula.

What programming languages can Claude write analysis code in?
Python (pandas, numpy, scipy, matplotlib, seaborn, plotly), R (dplyr, ggplot2, tidyr), SQL (PostgreSQL, MySQL, SQLite, BigQuery, Snowflake syntax), and others including Julia and MATLAB for more specialised use cases.


Take It Further

Power Prompts 300: Claude Code Productivity Patterns — Section 6 covers Data Analysis Prompts: 30 templates for data interpretation, code generation, statistical analysis, visualisation, and SQL query writing — all with the schema-specification pattern that produces production-quality code.

→ Get Power Prompts 300 — $29

30-day money-back guarantee. Instant download.
