DEV Community

ClawGear
ClawGear

Posted on

35 ChatGPT Prompts for Data Analysts: Extract Insights Faster, Communicate Findings, and Level Up Your Skills

Data analysts spend countless hours wrangling messy datasets, writing complex queries, and translating numbers into stories that drive decisions. ChatGPT can act as a tireless collaborator — helping you explore data faster, debug SQL in seconds, and craft executive-ready reports with clarity. Whether you're a junior analyst building your toolkit or a senior analyst looking to automate repetitive work, these 35 prompts will change how you work.

Data Exploration and EDA

1. Understand a new dataset at a glance

I have a dataset with the following columns: [list columns and data types]. Give me a structured EDA plan including: key questions to ask, potential data quality issues to check, distributions to examine, and relationships between variables worth exploring.
Enter fullscreen mode Exit fullscreen mode

2. Identify and handle missing data

I'm analyzing a dataset where [column name] has 18% missing values and [column name] has 4% missing values. Suggest the most appropriate strategies for handling each, explain the trade-offs between deletion, mean/median imputation, and model-based imputation, and give me Python code to implement your recommendation.
Enter fullscreen mode Exit fullscreen mode

3. Detect outliers intelligently

I have a numeric column [column name] in my dataset representing [what it measures]. Walk me through three different outlier detection methods (IQR, Z-score, and isolation forest), explain when each is most appropriate, and provide Python code to apply all three and flag the results.
Enter fullscreen mode Exit fullscreen mode

4. Profile a dataset for a stakeholder summary

I need to create a data profile summary for a non-technical stakeholder. Given these column names and sample values: [paste sample]. Write a plain-English summary describing what the dataset contains, its time range, key metrics, and any obvious data quality concerns I should flag before analysis.
Enter fullscreen mode Exit fullscreen mode

5. Explore correlations and relationships

I want to understand relationships between variables in my dataset. The columns are: [list columns]. Suggest which variable pairs are most worth examining for correlation, recommend the right correlation method for each pair (Pearson, Spearman, or Cramér's V), and give me Python code using pandas and seaborn to visualize the top relationships.
Enter fullscreen mode Exit fullscreen mode

SQL and Query Writing

6. Write a complex aggregation query

Write a SQL query for [database type: PostgreSQL/MySQL/BigQuery/Snowflake] that calculates the [metric] for each [dimension] over the last [time period], filtered by [condition], and ranked from highest to lowest. Table name is [table], relevant columns are [columns]. Add comments explaining each section.
Enter fullscreen mode Exit fullscreen mode

7. Debug a broken SQL query

This SQL query is returning incorrect results / throwing an error. Here is the query: [paste query]. Here is the error message or unexpected output: [describe issue]. The table schema is: [schema]. Diagnose the problem, explain why it's happening, and provide a corrected version.
Enter fullscreen mode Exit fullscreen mode

8. Optimize a slow-running query

This query is running too slowly on a table with [X] million rows: [paste query]. Suggest specific optimizations including indexing strategies, query restructuring, avoiding full table scans, and any database-specific features for [database type] that could speed up execution. Rewrite the optimized version.
Enter fullscreen mode Exit fullscreen mode

9. Write a cohort analysis query

Write a SQL query to perform a cohort analysis on user retention. I have a table called [table name] with columns: user_id, event_date, and event_type. Define cohorts by the month of a user's first event, then calculate retention rates for months 1 through 6 after acquisition. Use [PostgreSQL/BigQuery/Snowflake] syntax.
Enter fullscreen mode Exit fullscreen mode

10. Convert a business question into SQL

Translate this business question into a SQL query: "[business question, e.g., Which product categories had the highest revenue growth in Q1 vs Q4 last year, broken down by region?]". My database has these tables and columns: [describe schema]. Write the query and explain your logic step by step.
Enter fullscreen mode Exit fullscreen mode

Data Visualization and Dashboard Design

11. Choose the right chart type

I need to visualize the following: [describe what you want to show, e.g., the change in monthly active users over 18 months across 4 product lines]. Recommend the most effective chart type, explain why it works better than alternatives, and give me Python code using matplotlib or plotly to create a publication-quality version.
Enter fullscreen mode Exit fullscreen mode

12. Design a KPI dashboard layout

I'm building a dashboard for [audience: e.g., the marketing team / the C-suite] focused on [topic: e.g., campaign performance]. Suggest a dashboard layout including: which KPIs to feature prominently, what supporting charts to include, how to organize sections for scannability, and any interactivity features that would add value in [Tableau/Power BI/Looker].
Enter fullscreen mode Exit fullscreen mode

13. Write Python code for a polished visualization

Write Python code using plotly to create an interactive [chart type] showing [what it shows]. The data has these columns: [columns]. Requirements: use a professional color palette, include proper axis labels and a title, add tooltips showing [details], and make it suitable for embedding in a stakeholder report.
Enter fullscreen mode Exit fullscreen mode

14. Critique and improve a visualization

I'm going to describe a chart I've created and I want your feedback. Chart type: [type]. What it shows: [description]. Current issues I suspect: [your concerns]. Provide a structured critique covering data-ink ratio, color choices, labeling, axis scaling, and whether the chart type matches the message. Then suggest a redesigned version.
Enter fullscreen mode Exit fullscreen mode

15. Build a Matplotlib dashboard with subplots

Write Python code using matplotlib to create a single-figure dashboard with 4 subplots arranged in a 2x2 grid. The charts should show: [chart 1 description], [chart 2 description], [chart 3 description], [chart 4 description]. Use a consistent color scheme, add a main title, and ensure the layout doesn't have overlapping labels. Use this sample data structure: [describe columns].
Enter fullscreen mode Exit fullscreen mode

Statistical Analysis and Interpretation

16. Choose and run the right statistical test

I want to determine whether [describe what you're testing, e.g., conversion rates differ significantly between two landing page variants]. My sample sizes are [n1] and [n2], and my data looks like [describe distribution/type]. Recommend the correct statistical test, explain the assumptions I need to check, and provide Python code to run it and interpret the output.
Enter fullscreen mode Exit fullscreen mode

17. Interpret A/B test results

I ran an A/B test with these results: Control group: [n] users, [x]% conversion. Treatment group: [n] users, [x]% conversion. Test ran for [duration]. Interpret these results: is the difference statistically significant? What is the practical significance? What confidence level is appropriate here? Should we ship the change? Explain your reasoning.
Enter fullscreen mode Exit fullscreen mode

18. Explain a statistical concept in plain language

Explain [statistical concept: e.g., p-value / confidence interval / statistical power / regression to the mean] in plain language suitable for a business audience. Use a concrete example relevant to [industry/domain]. Then give me a one-paragraph summary I can paste into a report to explain why this concept matters for our analysis.
Enter fullscreen mode Exit fullscreen mode

19. Identify and address confounding variables

I'm analyzing the relationship between [variable A] and [variable B] in my dataset. The context is [describe business situation]. Identify potential confounding variables I should control for, explain how each could bias my results, and suggest analytical approaches (stratification, regression controls, matching) to address them.
Enter fullscreen mode Exit fullscreen mode

20. Validate a regression model

I built a linear regression model to predict [target variable] using these features: [list features]. My model has R-squared of [value], RMSE of [value], and the residual plot looks like [describe]. Evaluate whether this model is reliable, identify potential issues (overfitting, heteroscedasticity, multicollinearity), and suggest specific next steps to improve it.
Enter fullscreen mode Exit fullscreen mode

Business Reporting and Stakeholder Communication

21. Write an executive summary from data

I have the following analysis results: [paste key numbers and findings]. Write a 3-paragraph executive summary for the [C-suite / VP of Marketing / board of directors] that leads with the most important insight, supports it with 2-3 key data points, and ends with a clear recommendation. Use direct, confident language and avoid technical jargon.
Enter fullscreen mode Exit fullscreen mode

22. Turn a data dump into a narrative

Here are raw metrics from our [monthly/quarterly] analysis: [paste metrics]. Help me craft a data narrative that tells a coherent story. Identify the single most important trend, suggest a narrative arc (context, complication, resolution), and give me a structured outline I can use for a 5-minute stakeholder presentation.
Enter fullscreen mode Exit fullscreen mode

23. Respond to a stakeholder challenging your findings

A stakeholder pushed back on my analysis saying: "[their exact objection, e.g., 'this data doesn't match what we see on the ground' or 'the sample size is too small to be meaningful']". My analysis methodology was: [brief description]. Help me draft a professional, confident response that addresses their concern directly, acknowledges any valid points, and defends the integrity of my work where appropriate.
Enter fullscreen mode Exit fullscreen mode

24. Write a data caveat section

I'm writing a report on [topic] and need to include a limitations and caveats section. My data sources are [sources], the time period is [dates], and known issues include [describe any gaps, sampling biases, or measurement issues]. Write a professional caveats section that is transparent about limitations without undermining confidence in the overall findings.
Enter fullscreen mode Exit fullscreen mode

25. Create a metrics definition document

Create a metrics definition document for the following KPIs used by our [team/department]: [list metrics, e.g., DAU, churn rate, NPS, CAC, LTV]. For each metric include: plain-English definition, exact calculation formula, data source, update frequency, and known limitations or edge cases. Format it as a clean table.
Enter fullscreen mode Exit fullscreen mode

Python and Automation

26. Automate a recurring data pull

I need to automate a weekly data pull that: connects to [database/API], runs [describe query or request], saves the output as a CSV to [location], and sends a summary email to [recipients]. Write a Python script to do this. I'm using [pandas / SQLAlchemy / psycopg2] for database connections and [smtplib / SendGrid] for email.
Enter fullscreen mode Exit fullscreen mode

27. Clean and standardize messy data

I have a messy pandas DataFrame with these issues: [describe issues, e.g., inconsistent date formats in column X, mixed case text in column Y, duplicate rows based on columns A and B, dollar signs in numeric column Z]. Write Python code to clean and standardize the data, handle edge cases, and print a before/after summary showing how many records were affected by each operation.
Enter fullscreen mode Exit fullscreen mode

28. Build a reusable data pipeline function

Write a reusable Python function that takes a raw CSV file path as input, performs the following transformations: [list transformations], validates that the output meets these conditions: [list validations], logs any rows that failed validation to a separate file, and returns a clean DataFrame. Include docstrings and type hints.
Enter fullscreen mode Exit fullscreen mode

29. Automate a weekly report

Write a Python script that generates a weekly performance report. It should: read data from [source], calculate these metrics: [list metrics], create visualizations for [describe charts], combine everything into a PDF or HTML report using [matplotlib / plotly / jinja2], and save it with a filename that includes the current date. Include error handling.
Enter fullscreen mode Exit fullscreen mode

30. Write unit tests for a data transformation

I have this Python function that transforms data: [paste function]. Write a comprehensive set of unit tests using pytest that cover: normal expected input, edge cases (empty DataFrame, null values, unexpected data types), and known business logic rules the function must satisfy. Include a test fixture with sample data.
Enter fullscreen mode Exit fullscreen mode

Career Development and Skill Building

31. Create a personalized learning plan

I'm a data analyst with [X] years of experience. My current skills include: [list skills]. My goal is to [career goal, e.g., move into a senior analyst role / transition to data science / specialize in product analytics] within [timeframe]. Create a structured 3-month learning plan with specific resources, projects, and milestones to track my progress.
Enter fullscreen mode Exit fullscreen mode

32. Prepare for a data analyst interview

I have an interview for a [level] data analyst role at a [industry] company. The job description mentions: [paste key requirements]. Give me the 10 most likely technical and behavioral interview questions for this role, model answers for the 3 hardest ones, and tips for how to structure a case study answer using the STAR method adapted for data roles.
Enter fullscreen mode Exit fullscreen mode

33. Get feedback on a portfolio project

I'm building a portfolio project to showcase my data analyst skills. The project is: [describe project — dataset, questions explored, methods used, tools]. Give me critical feedback on: whether the project demonstrates valuable skills to employers, gaps in my analysis that I should address, how to present it effectively on GitHub, and what business framing would make it more compelling.
Enter fullscreen mode Exit fullscreen mode

34. Write a data-driven resume bullet

Rewrite these resume bullet points to be more impactful and quantified: [paste your current bullets]. For each one, suggest where I could add metrics or business outcomes, use stronger action verbs, and tighten the language. Also flag any bullets that are too vague and explain what additional context would make them compelling to a hiring manager.
Enter fullscreen mode Exit fullscreen mode

35. Identify your skill gaps and next steps

Here is a job description for the kind of role I want: [paste JD]. Here are my current skills and experience: [describe background]. Do a gap analysis: identify which required skills I'm missing or weak in, rank them by how important they are for the role, and give me a specific action plan for closing the top 3 gaps in the next 90 days.
Enter fullscreen mode Exit fullscreen mode

Get All 35 Prompts in One Place

If these prompts were useful, I've compiled all 35 into a ready-to-use toolkit with bonus prompts and usage notes.

Get the complete AI Prompt Toolkit for this profession →

Works with ChatGPT, Claude, and DeepSeek.

Top comments (0)