Meta's Data Manipulation SQL interviews usually mix product sense with query mechanics. You are rarely solving a pure syntax exercise. The interviewer often wants to see how you define entities, handle messy event data, choose the right grain, and turn business language into clear metrics.
Below are 20 of the most discussed Meta Data Manipulation SQL questions, grouped by theme. If you have a Meta interview coming up, these are a good place to start.
User, shop, and cohort analysis
Determine Old vs. New Users' Shop Visibility Changes (Medium), Onsite
This question checks whether you can define "old" and "new" users correctly before doing any comparison. Interviewers want to see clean cohort logic, date boundaries, and whether you avoid mixing user-level and shop-level grain in the same calculation.
Analyze New Shops' Activity Compared to Existing Ones (Medium), Onsite
A classic Meta pattern is comparing behavior across cohorts with a clear business story. The hard part is usually choosing the shop creation date, joining activity tables without duplication, and making sure "new" versus "existing" is stable for the analysis window.
Analyze Spend Cohort and Source Shifts (Medium), Onsite
This type of problem tests whether you can track users or advertisers across time and detect movement between segments. A good approach is to assign cohort or source labels in one CTE first, then compute transitions instead of trying to do everything in one large query.
Compute Cohort GMV and Payer Rate with Edge Cases (Medium), Technical Screen
This is a metric-definition question disguised as a SQL task. Interviewers are checking if you know how to compute denominators, handle null or zero-spend users, and separate GMV from payer rate without letting edge cases skew cohort results.
Define and Analyze New-vs-Existing Activity (Medium), Onsite
Meta likes questions where the first step is deciding what the metric should mean. Expect follow-ups about whether "new" means first-ever activity, first activity in a product surface, or first activity in a period, because each choice changes the query.
Write SQL for Retention, Conversion, and Churn (Medium), Onsite
This is a broad product analytics problem that tests whether you can define lifecycle metrics at the right level. Strong candidates lay out event dates, cohort entry, follow-up windows, and numerator and denominator logic before writing joins.
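To make the lifecycle pattern concrete, here is a minimal sketch of a Day-7 retention query, run through Python's `sqlite3` so it is self-contained. The `events(user_id, event_date)` table and the data in it are hypothetical, not from any specific interview prompt:

```python
import sqlite3

# Hypothetical single-table event log: one row per user action per day.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (user_id INT, event_date TEXT);
INSERT INTO events VALUES
  (1, '2024-01-01'), (1, '2024-01-08'),   -- active again exactly 7 days later
  (2, '2024-01-01'),                      -- never comes back
  (3, '2024-01-02'), (3, '2024-01-09');   -- active again exactly 7 days later
""")

# Lock the grain first: one row per user with a cohort (first-activity) date,
# then left-join to deduplicated activity exactly 7 days after cohort entry.
day7_retention = conn.execute("""
WITH cohort AS (
    SELECT user_id, MIN(event_date) AS cohort_date
    FROM events
    GROUP BY user_id
)
SELECT
    c.cohort_date,
    COUNT(*) AS cohort_size,
    SUM(CASE WHEN r.user_id IS NOT NULL THEN 1 ELSE 0 END) AS retained_d7
FROM cohort c
LEFT JOIN (SELECT DISTINCT user_id, event_date FROM events) r
  ON r.user_id = c.user_id
 AND r.event_date = DATE(c.cohort_date, '+7 day')
GROUP BY c.cohort_date
ORDER BY c.cohort_date
""").fetchall()

print(day7_retention)  # one row per cohort date: (date, cohort size, retained at day 7)
```

The cohort CTE is what keeps the grain honest: every later join starts from exactly one row per user, so duplicate event rows cannot inflate the denominator.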
Content, feed, and social engagement
Analyze Recent Post Performance Using SQL Queries (Medium), Technical Screen
This question is usually about recent activity windows, per-post metrics, and ranking or aggregation. The interviewer is looking for good filtering by recency, sane handling of missing engagement, and awareness that posts and interactions often live at different grains.
Count Unconnected Posts and Reactions (Medium), Technical Screen
The challenge here is the relationship logic. You need to identify what "unconnected" means, often through friendship or graph tables, then count content and engagement without accidentally including connected pairs through sloppy joins.
Write SQL for Reply-Based Recipient Metrics (Hard), Technical Screen
This is the kind of question that looks simple until directionality matters. It tests whether you can distinguish sender from recipient, define reply chains correctly, and compute recipient-centric metrics over a constrained time window.
Compute Reply-Based User Metrics in 7 Days (Easy), Technical Screen
This is a lighter version of the same family of messaging problems. Interviewers want to see whether you can anchor on a start event, apply a 7-day window cleanly, and count users or actions without off-by-one date mistakes.
Write SQL to Compare Social-Only vs Game-Only Engagement (Medium), Onsite
This question checks segmentation logic more than raw SQL difficulty. You need mutually exclusive buckets, usually from event histories across product types, then an apples-to-apples comparison of engagement metrics across those groups.
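One way to build the mutually exclusive buckets described above is to roll each user's full history up to flags first, then label. A sketch using hypothetical `events(user_id, product, actions)` data:

```python
import sqlite3

# Hypothetical event log with a product surface attached to each event.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (user_id INT, product TEXT, actions INT);
INSERT INTO events VALUES
  (1, 'social', 10), (1, 'social', 5),   -- social-only user
  (2, 'game', 8),                        -- game-only user
  (3, 'social', 2), (3, 'game', 4);      -- mixed: excluded from the comparison
""")

# Derive flags per user from the full history, then bucket, then compare
# engagement with both groups at the same (user-level) grain.
rows = conn.execute("""
WITH user_mix AS (
    SELECT
        user_id,
        MAX(CASE WHEN product = 'social' THEN 1 ELSE 0 END) AS has_social,
        MAX(CASE WHEN product = 'game'   THEN 1 ELSE 0 END) AS has_game,
        SUM(actions) AS total_actions
    FROM events
    GROUP BY user_id
)
SELECT
    CASE
        WHEN has_social = 1 AND has_game = 0 THEN 'social_only'
        ELSE 'game_only'
    END AS bucket,
    COUNT(*) AS users,
    AVG(total_actions) AS avg_actions
FROM user_mix
WHERE has_social + has_game = 1   -- drop mixed users for a clean comparison
GROUP BY bucket
ORDER BY bucket
""").fetchall()

print(rows)
```

Whether mixed users are dropped, or become a third bucket, is exactly the kind of assumption worth stating out loud before you write the query.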
Ads, click, and revenue measurement
Write SQL for Pixel Signal Metrics (Medium), Technical Screen
Pixel questions are about event quality as much as event volume. Interviewers often want to know if you can define valid signals, deduplicate noisy logs, and produce advertiser-facing metrics that tie back to measurement reliability.
Write SQL for CTR and Revenue (Medium), Technical Screen
A common ad analytics prompt, but there are traps. Be careful with impression and click grain, guard against division by zero, and make sure revenue attribution matches the same unit of analysis as the CTR calculation.
Write SQL/pandas for KPI Anomaly (Medium), Onsite
This reflects the real work more closely than a textbook SQL problem. Meta wants to see how you investigate a broken KPI, whether you can narrow the issue by dimension, and how you move between SQL aggregation and pandas validation under time pressure.
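The division-by-zero guard mentioned in the CTR question is worth having at your fingertips. A minimal sketch, assuming a hypothetical `ad_stats` table already rolled up to one row per ad:

```python
import sqlite3

# Hypothetical per-ad rollup: impressions, clicks, revenue at the same grain.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ad_stats (ad_id INT, impressions INT, clicks INT, revenue REAL);
INSERT INTO ad_stats VALUES
  (1, 1000, 50, 20.0),
  (2, 0,    0,  0.0),   -- zero impressions: CTR must not divide by zero
  (3, 400,  8,  3.2);
""")

rows = conn.execute("""
SELECT
    ad_id,
    -- NULLIF turns a zero denominator into NULL, so the ratio is NULL
    -- instead of an error; 1.0 * forces non-integer division
    ROUND(1.0 * clicks / NULLIF(impressions, 0), 4) AS ctr,
    ROUND(revenue / NULLIF(clicks, 0), 4)           AS revenue_per_click
FROM ad_stats
ORDER BY ad_id
""").fetchall()

print(rows)
```

Returning NULL rather than 0 for ads with no impressions is deliberate: a zero CTR and an undefined CTR mean different things when someone averages the column later.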
Calls, communication, and real-time product metrics
Write SQL for video-call recipients and FR activity (Medium), Technical Screen
This combines communication events with downstream user actions, likely friend request activity. The key is sequencing: who received a call, what happened after, and whether your join logic respects the temporal order.
Compute Video-Call SQL Metrics with Edge Cases (Medium), Technical Screen
Meta likes edge-case questions because event logs are messy in production. Expect duplicate calls, failed calls, missing end times, or multiple participants, and be ready to state assumptions before you aggregate.
Write SQL for Call Analytics (Medium), Onsite
This is usually a more open-ended analytics prompt. Interviewers care about whether you can pick useful call metrics, such as completion, duration, or repeat usage, and then structure the query in a way that another analyst could maintain.
Compute High-Call Usage Rates (Hard), Technical Screen
Hard call-usage questions often come down to threshold logic and user-level rollups. A strong answer usually builds per-user usage first, then computes the percentage above the threshold, instead of trying to filter aggregated rows too early.
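The rollup-then-threshold ordering described above can be sketched in a few lines. The `calls` table and the 3-call threshold are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calls (caller_id INT, call_id INT);
INSERT INTO calls VALUES
  (1, 101), (1, 102), (1, 103), (1, 104),  -- heavy caller
  (2, 201),                                -- light caller
  (3, 301), (3, 302);                      -- light caller
""")

# Roll up to one row per user FIRST, then apply the threshold.
# Filtering the raw rows before aggregating would lose the light
# callers from the denominator and overstate the rate.
rate = conn.execute("""
WITH per_user AS (
    SELECT caller_id, COUNT(*) AS n_calls
    FROM calls
    GROUP BY caller_id
)
SELECT 1.0 * SUM(CASE WHEN n_calls >= 3 THEN 1 ELSE 0 END) / COUNT(*)
FROM per_user
""").fetchone()[0]

print(rate)  # share of users at or above the 3-call threshold
```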
App and product usage patterns
Analyze VR App Usage and Engagement Metrics (Medium), Onsite
This question is a good example of Meta's product analytics style. It tests whether you can define engagement for a specific product context, choose a proper active-user denominator, and avoid inflating usage through repeated session or event joins.
Cross-functional practical screens
Tackle Python Tasks Under Time Pressure (Medium), Onsite
This is the outlier on the list, but it fits how Meta interviews often work in practice. Even for SQL-heavy roles, you may be asked to manipulate data in Python, clean inputs quickly, or reproduce a metric outside SQL, so speed and clarity both matter.
A few patterns show up again and again across these questions. First, define the entity and time window before writing SQL. Then lock the grain of every CTE so you do not create duplicate rows by accident. Say your assumptions out loud, especially when the prompt has vague terms like "new," "active," "recipient," or "engaged."
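The "lock the grain" advice is easiest to see in a failing example. Below, a naive join double-counts a purchase because one user has two session rows; collapsing sessions to one row per user first fixes it. Both tables are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sessions  (user_id INT, session_id INT);
CREATE TABLE purchases (user_id INT, amount REAL);
INSERT INTO sessions  VALUES (1, 10), (1, 11), (2, 12);
INSERT INTO purchases VALUES (1, 5.0), (2, 7.0);
""")

# Naive join: user 1's single purchase matches two session rows,
# so its amount is summed twice.
naive = conn.execute("""
SELECT SUM(p.amount)
FROM sessions s
JOIN purchases p ON p.user_id = s.user_id
""").fetchone()[0]

# Grain-locked: collapse sessions to one row per user before joining.
correct = conn.execute("""
WITH user_sessions AS (
    SELECT user_id, COUNT(*) AS n_sessions
    FROM sessions
    GROUP BY user_id
)
SELECT SUM(p.amount)
FROM user_sessions u
JOIN purchases p ON p.user_id = u.user_id
""").fetchone()[0]

print(naive, correct)  # the naive total overstates spend
```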
If you want more practice beyond this list, PracHub has 168+ Meta Data Manipulation SQL questions, including the ones above and many more reported by candidates.