Feng Zhang

Posted on • Originally published at prachub.com

Meta Data Scientist Interview Cheatsheet 2026

If you're preparing for Meta data scientist interviews, one pattern shows up fast: the bar is not "can you compute a metric?" It is "can you define the right metric, design a clean experiment, and explain tradeoffs like an owner?"

This article pulls together the most interview-relevant parts of PracHub's Meta Data Scientist interview prep guide, with a focus on areas candidates often get pressed on: notification analytics, A/B testing, cluster randomization, and SQL event logs.

What Meta interviewers are usually testing

Across technical screens and onsites, the questions often sound broad:

  • "How would you evaluate similar-listing notifications?"
  • "Design an experiment for a new ads ranking model"
  • "Write SQL to compute engagement or call metrics"
  • "What would you do if there is interference between users?"

The underlying skill is the same. You need to move from raw events or product ideas to a decision-ready analysis. That means:

  • defining eligibility
  • choosing a randomization unit
  • picking a primary metric
  • adding guardrails
  • checking whether the observed impact is actually incremental

If your answer stays at the dashboard level, it will usually feel weak.


Notification analytics is a causal question, not a CTR question

A common interview prompt is some variation of push notifications or similar-listing alerts. The mistake many candidates make is optimizing for click-through rate.

That is too shallow.

For notification products, Meta cares about whether the notification creates net value or just interrupts people enough to get clicks. A strong answer breaks the system into a funnel:

  • eligibility
  • send
  • delivery
  • impression or open
  • click
  • landing-page engagement
  • downstream action
  • longer-term retention

For a marketplace notification, downstream actions matter more than raw clicks. Examples from the source include:

  • listing_view
  • save
  • seller_message
  • offer_sent
  • purchase_intent
  • transaction_proxy

If the product goal is better buyer discovery, then a better primary metric than notification_click_rate might be:

  • incremental qualified_listing_views_per_eligible_user
  • buyer_seller_message_threads_per_eligible_user

That framing shows you understand the product mechanism.

Guardrails matter more for notifications than people think

Notifications impose an attention cost. Your answer should include guardrails such as:

  • push_opt_out_rate
  • notification_disable_rate
  • app_uninstall_rate
  • hide_report_rate
  • negative_feedback_rate
  • session_depth
  • 7d_retention
  • total notification volume per user

If you ignore fatigue, your experiment design looks incomplete.

Be precise about eligibility and exposure

Another common failure mode is saying, "compare users who got notifications with users who didn't."

That comparison is biased. Users who receive notifications are often already more active, have permissions enabled, or have more relevant inventory available.

A better answer starts with a fixed eligible population, for example users who:

  • viewed or saved a marketplace item in the last 7 days
  • have push permissions enabled
  • have at least one similar listing available

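Then randomize within that eligible population and analyze intent-to-treat (ITT): compare everyone assigned to treatment against everyone assigned to control, whether or not a notification was actually delivered or opened. You can inspect treatment-on-treated later, but only with the right causal caveats, because delivery and opens are not randomized.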
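
A minimal sketch of that ITT comparison, assuming one row per randomized eligible user with an assignment label and an outcome. The arm labels, the outcome values, and the Welch-style standard error are illustrative assumptions, not something the source specifies.

```python
from math import sqrt
from statistics import mean, variance

def itt_estimate(rows):
    """Difference in mean outcome by ASSIGNED arm (not by delivery or open).

    rows: list of (arm, outcome) with arm in {"treatment", "control"}.
    Returns (difference, standard_error) using a Welch-style standard error.
    """
    treat = [y for arm, y in rows if arm == "treatment"]
    ctrl = [y for arm, y in rows if arm == "control"]
    diff = mean(treat) - mean(ctrl)
    se = sqrt(variance(treat) / len(treat) + variance(ctrl) / len(ctrl))
    return diff, se

# Hypothetical per-user outcomes (for example qualified listing views).
# Users who never received a push still count in their assigned arm.
rows = [
    ("treatment", 3), ("treatment", 0), ("treatment", 5), ("treatment", 2),
    ("control", 1), ("control", 0), ("control", 4), ("control", 1),
]
diff, se = itt_estimate(rows)
```

The key design choice is that the grouping variable is assignment, so the estimate keeps the protection of randomization even when delivery is imperfect.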

Watch for cannibalization and spillovers

A similar-listing notification can shift behavior from search, organic feed, saved items, or other notification channels rather than create new demand. So you should measure total marketplace engagement, not only attributed notification clicks.

If the product has social, household, or marketplace spillovers, say that directly. That is often when an interviewer pushes into cluster randomization.


A/B testing answers need an estimand, not just a p-value

Meta interviewers want to hear that you can design an experiment before data exists, not just analyze one afterward.

Start with the decision and the causal quantity. In plain terms: what launch decision does this test inform, and for whom?

For many interview prompts, your structure can be:

  1. Define the product change and eligible population
  2. Choose a randomization unit
  3. Name the primary metric
  4. Add guardrails
  5. Discuss power, variance, and diagnostics
  6. Explain how you'd interpret null or mixed results

Choose the randomization unit based on interference risk

User-level randomization is often fine for isolated product changes. It is not automatically correct.

If one user's treatment can affect another user's outcome, then the stable unit treatment value assumption (SUTVA) may fail. In Meta-style products, that comes up in:

  • social feeds
  • messaging
  • ads auctions
  • marketplaces
  • creator ecosystems

In those cases, you may need cluster, geo, advertiser, page, or marketplace-level randomization.

If you say, "I'd use user-level randomization if interference is low, otherwise I'd consider cluster or geo designs," that is already much stronger than forcing every problem into a 50/50 user RCT.
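
One way to make the choice of unit concrete is to treat assignment as a deterministic function of whatever id you randomize on. The sketch below is an assumption-laden illustration: the hashing scheme, the experiment name, and the user-to-cluster mapping are all hypothetical.

```python
import hashlib

def assign_arm(unit_id, experiment, treatment_share=0.5):
    """Deterministically map a randomization unit to an arm.

    unit_id is whatever unit you randomize on: a user id for user-level
    tests, or a cluster or geo id when interference is a concern.
    """
    key = f"{experiment}:{unit_id}".encode("utf-8")
    # First 8 hex chars of SHA-256 -> integer in [0, 2**32) -> fraction in [0, 1).
    bucket = int(hashlib.sha256(key).hexdigest()[:8], 16) / 2**32
    return "treatment" if bucket < treatment_share else "control"

# Hypothetical mapping from user to cluster (for example a geo or household).
user_to_cluster = {"u1": "geo_A", "u2": "geo_A", "u3": "geo_B"}

# Cluster-level design: every user inherits the arm of their cluster.
arms = {
    user: assign_arm(cluster, "similar_listing_push_v1")
    for user, cluster in user_to_cluster.items()
}
```

Switching from a user-level to a cluster-level design only changes which id is hashed, which is why the unit should be decided before any analysis code is written.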

Power should be discussed at the right level

For repeated notifications or clustered experiments, observations are correlated. You should talk about power at the user or cluster level, not at the event level.

The source also calls out the design effect for clustered experiments:

DE = 1 + (m - 1)rho

where m is average cluster size and rho is intracluster correlation.

That matters because the effective sample size is roughly the raw sample size divided by DE, so a huge row count can still translate into a much smaller effective sample size.
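
A quick worked example of the design effect, with made-up inputs (1,000,000 rows, 50 rows per cluster, rho = 0.1) chosen only to show the scale of the correction:

```python
def design_effect(avg_cluster_size, icc):
    """DE = 1 + (m - 1) * rho."""
    return 1 + (avg_cluster_size - 1) * icc

def effective_sample_size(n_observations, avg_cluster_size, icc):
    """Approximate number of independent observations after clustering."""
    return n_observations / design_effect(avg_cluster_size, icc)

# Hypothetical inputs, not taken from any real experiment.
de = design_effect(50, 0.1)
n_eff = effective_sample_size(1_000_000, 50, 0.1)
```

With these inputs DE is about 5.9, so one million rows behave like roughly 169,000 independent observations.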

CUPED is worth mentioning if the prompt invites depth

For noisy product metrics, pre-experiment covariates can reduce variance. The source mentions CUPED, which adjusts outcomes using pre-period behavior. You do not need to derive it in every answer, but mentioning it in a Meta interview often signals practical experiment experience.

Use it when the pre-period metric strongly predicts the post-period metric, such as engagement, spend, or retention.
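
A minimal sketch of the CUPED adjustment in its common single-covariate form, y - theta * (x - mean(x)) with theta = cov(x, y) / var(x). The data are invented, and the function name `cuped_adjust` is hypothetical.

```python
from statistics import mean, variance

def cuped_adjust(post, pre):
    """Return CUPED-adjusted outcomes: y - theta * (x - mean(x)).

    post: outcome per user during the experiment.
    pre:  the same metric per user from the pre-experiment period.
    theta = cov(pre, post) / var(pre), estimated on the data passed in.
    """
    x_bar, y_bar = mean(pre), mean(post)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(pre, post)) / (len(pre) - 1)
    theta = cov / variance(pre)
    return [y - theta * (x - x_bar) for x, y in zip(pre, post)]

# Hypothetical users: pre-period sessions and in-experiment sessions.
pre = [2.0, 4.0, 6.0, 8.0, 10.0]
post = [3.0, 5.0, 6.0, 9.0, 12.0]
adjusted = cuped_adjust(post, pre)
```

The adjustment leaves the mean unchanged and shrinks the variance when the pre-period metric is correlated with the outcome. In an experiment you would typically estimate theta on treatment and control pooled, then compare adjusted means across arms; the covariate has to be measured before assignment so the treatment cannot affect it.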


How to answer a "similar-listing notifications" question

A solid answer could sound like this:

First, clarify the product goal. Are you trying to increase discovery, transactions, or re-engagement among users with shopping intent?

Next, define the eligible population: users who recently viewed or saved an item, have push permissions on, and have relevant similar inventory available.

Then propose user-level randomization if interference is limited. Treatment users receive similar-listing pushes; control users stay on the current notification policy.

Pick a primary metric tied to downstream value, like incremental qualified_listing_views_per_eligible_user or buyer_seller_message_threads_per_eligible_user.

Use secondary metrics like:

  • notification_open_rate
  • save_rate
  • return_sessions

Add guardrails:

  • push_opt_out_rate
  • notification_settings_disable_rate
  • hide_report_rate
  • notifications received per user
  • 7d_retention

Then say you'd analyze ITT first, check whether gains are incremental versus cannibalized from existing surfaces, and look at heterogeneous effects by intent, notification sensitivity, and inventory density.

That answer is much closer to what interviewers want than "I'd compare CTR between treatment and control."


SQL event log questions are mostly about grain and joins

The SQL side of the interview is less about syntax tricks and more about getting metric definitions right.

The source's advice is simple and useful:

1) Decide the grain first

Know what one row means before you write code:

  • user-day
  • call-day
  • impression-day
  • country-day

A lot of mistakes come from skipping this step.

2) Be careful with time windows

Use bounded windows like:

  • event_ts >= start
  • event_ts < end

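A half-open window (start included, end excluded) puts an event at exactly midnight in one day only. An inclusive range such as BETWEEN start AND end would count it in both adjacent windows.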
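
The boundary behavior is easy to check with a small sketch. The dates and the helper name `in_window` are illustrative only.

```python
from datetime import datetime, timedelta

def in_window(event_ts, start, end):
    """Half-open window [start, end): start is included, end is excluded."""
    return start <= event_ts < end

day1 = datetime(2026, 1, 1)
day2 = day1 + timedelta(days=1)
day3 = day2 + timedelta(days=1)

midnight_event = datetime(2026, 1, 2, 0, 0, 0)

# The event at exactly midnight belongs to the second day only.
counted_in = [
    in_window(midnight_event, day1, day2),
    in_window(midnight_event, day2, day3),
]
```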

3) Aggregate before joining when needed

Joining raw event tables too early can multiply rows and inflate clicks, responses, revenue, or duration.
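
The fan-out is easy to reproduce. The sketch below uses Python's built-in `sqlite3` with two hypothetical tables, `impressions` and `clicks`, to contrast a raw join with a pre-aggregated one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE impressions (user_id TEXT, impression_id INTEGER);
    CREATE TABLE clicks (user_id TEXT, click_id INTEGER);
    INSERT INTO impressions VALUES ('u1', 1), ('u1', 2), ('u1', 3);
    INSERT INTO clicks VALUES ('u1', 10), ('u1', 11);
""")

# Naive: joining raw events on user_id yields 3 x 2 = 6 rows for u1,
# so the click count is inflated from 2 to 6.
naive = conn.execute("""
    SELECT COUNT(c.click_id)
    FROM impressions i
    JOIN clicks c ON c.user_id = i.user_id
""").fetchone()[0]

# Safer: collapse each table to one row per user, then join.
safe = conn.execute("""
    WITH imp AS (
        SELECT user_id, COUNT(*) AS impressions FROM impressions GROUP BY user_id
    ),
    clk AS (
        SELECT user_id, COUNT(*) AS clicks FROM clicks GROUP BY user_id
    )
    SELECT imp.user_id, imp.impressions, COALESCE(clk.clicks, 0)
    FROM imp
    LEFT JOIN clk ON clk.user_id = imp.user_id
""").fetchall()
```

Here `naive` is 6 while the pre-aggregated query returns 3 impressions and 2 clicks for the same user.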

4) Protect ratio calculations

Use safe denominators and be explicit about what happens when the denominator is zero.
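
A small sketch of an explicit zero-denominator policy. The function name and the choice of `None` as the default are assumptions; in SQL, a common equivalent is dividing by `NULLIF(denominator, 0)`.

```python
def safe_ratio(numerator, denominator, default=None):
    """Return numerator / denominator, or `default` when the denominator is zero.

    Returning None (rather than 0) keeps "no eligible traffic" distinct from
    "traffic with zero conversions" in downstream averages.
    """
    if denominator == 0:
        return default
    return numerator / denominator

click_rate = safe_ratio(5, 10)
no_traffic = safe_ratio(0, 0)
forced_zero = safe_ratio(3, 0, default=0.0)
```

Whichever default you pick, say it out loud in the interview, because it changes how empty segments affect an average of ratios.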

5) Clarify deduplication rules

If the metric requires one valid event per entity or one response per user, say how you would dedupe. A common pattern is ROW_NUMBER() partitioned by the entity and ordered by timestamp, keeping only the first row.
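
A sketch of that dedup pattern, run through Python's `sqlite3` (window functions need SQLite 3.25 or newer). The `responses` table and the "keep the earliest response" rule are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE responses (user_id TEXT, response TEXT, event_ts TEXT);
    INSERT INTO responses VALUES
        ('u1', 'yes', '2026-01-01 10:00:00'),
        ('u1', 'no',  '2026-01-01 09:00:00'),
        ('u2', 'yes', '2026-01-01 11:00:00');
""")

# Keep the earliest response per user: rank rows within each user, keep rank 1.
first_responses = conn.execute("""
    SELECT user_id, response
    FROM (
        SELECT
            user_id,
            response,
            ROW_NUMBER() OVER (
                PARTITION BY user_id
                ORDER BY event_ts ASC
            ) AS rn
        FROM responses
    )
    WHERE rn = 1
    ORDER BY user_id
""").fetchall()
```

If two rows can share the same timestamp, add a tiebreaker column to the ORDER BY so the result is deterministic.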

These are basic ideas, but they come up constantly in product analytics interviews.


What candidates most often miss

From the source, the recurring weak spots are:

  • optimizing for CTR alone
  • being vague about eligibility or exposure
  • ignoring interference and repeated treatment
  • assuming every experiment should be user-level 50/50
  • treating a null result as proof of no effect
  • skipping diagnostics like sample ratio mismatch, logging sanity, or pre-period balance

If you avoid those, your answers already sound more senior.
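
Of the diagnostics in that list, sample ratio mismatch is the simplest to sketch. The example below assumes a two-arm test and uses a chi-square goodness-of-fit test with one degree of freedom; the counts are invented, and the threshold you alert on is a judgment call rather than something the source specifies.

```python
from math import erfc, sqrt

def srm_p_value(n_control, n_treatment, expected_treatment_share=0.5):
    """Chi-square goodness-of-fit test (1 degree of freedom) on assignment counts."""
    total = n_control + n_treatment
    exp_t = total * expected_treatment_share
    exp_c = total - exp_t
    chi2 = (n_treatment - exp_t) ** 2 / exp_t + (n_control - exp_c) ** 2 / exp_c
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))

# Hypothetical assignment counts for a planned 50/50 split.
balanced = srm_p_value(50_000, 50_000)
skewed = srm_p_value(50_000, 51_500)
```

A very small p-value, as in the skewed case, suggests the split itself is broken (assignment, logging, or filtering), so metric results should not be trusted until the cause is found.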


A better way to use a cheatsheet

Don't memorize lines. Practice turning these patterns into spoken answers.

Take a prompt like notifications, ads ranking, or call metrics, and force yourself to answer in this order:

  • goal
  • population
  • unit of randomization
  • primary metric
  • guardrails
  • power and inference risks
  • interpretation

If you want more realistic drills, PracHub also has interview practice questions.

And if you're preparing specifically for Meta, the full Meta Data Scientist interview prep guide on PracHub is the better reference because it keeps these topics in one place and ties them to actual interview-style prompts.
