DEV Community

Philemon Adaghe
Philemon Adaghe

Posted on

10 Data Analyst Interview Questions You Should Be Ready For (2025) πŸ’ΌπŸ“Š

10 Data Analyst Interview Questions You Should Be Ready For (2025) πŸ’ΌπŸ“Š

1️⃣ Q: Explain the difference between INNER JOIN and LEFT JOIN.

βœ… INNER JOIN returns only matching records between tables.

βœ… LEFT JOIN returns all records from the left table + matched records from the right (NULLs if no match).

Example:

Get all customers and their orders:

  • INNER JOIN β†’ shows only customers who placed orders.
  • LEFT JOIN β†’ shows all customers, even those with no orders.

2️⃣ Q: What are window functions in SQL?

Window functions perform calculations across a set of rows related to the current row.

Example:

Use ROW_NUMBER() to rank sales per region without collapsing data:

sql
SELECT region, sales, ROW_NUMBER() OVER(PARTITION BY region ORDER BY sales DESC)

FROM sales_data;

3️⃣ Q: How do you handle missing or duplicate data in a dataset?

  • For missing data: use imputation (mean/median), drop, or flag it.
  • For duplicates: use .drop_duplicates() in Python or DISTINCT in SQL. Approach depends on context and business impact.

4️⃣ Q: Describe a situation where you derived insights that influenced a business decision.
Example: Analyzed user engagement data β†’ found drop in mobile retention. Suggested UX improvements β†’ increased retention by 15%.

5️⃣ Q: What’s the difference between correlation and causation?

  • Correlation = relationship (e.g., A & B move together)
  • Causation = A directly causes B Always validate with experiments or domain expertise.

6️⃣ Q: How would you optimize a slow SQL query?

  • Check for missing indexes
  • Avoid SELECT *
  • Use EXPLAIN plans
  • Limit subqueries
  • Optimize joins and filters

7️⃣ Q: Explain the use of GROUP BY and HAVING in SQL.

  • GROUP BY aggregates data by columns (e.g., total sales by region)
  • HAVING filters aggregated results Example: sql SELECT region, SUM(sales) FROM orders GROUP BY region HAVING SUM(sales) > 10000;

8️⃣ Q: How do you choose the right chart for a dataset?

  • Line chart β†’ trends
  • Bar chart β†’ comparison
  • Pie chart β†’ proportion (rarely ideal)
  • Scatter plot β†’ relationships Always match chart to message and audience.

9️⃣ Q: What’s the difference between a dashboard and a report?

  • Dashboard β†’ Interactive, real-time, for quick insights
  • Report β†’ Static or scheduled, detailed summaries Dashboards = action; reports = reference.

πŸ”Ÿ Q: Which libraries in Python do you use for data cleaning and analysis?

  • pandas β†’ data manipulation
  • numpy β†’ numerical ops
  • matplotlib / seaborn β†’ visualization
  • sklearn β†’ basic ML and preprocessing

Top comments (0)