Honestly, after the first round, I thought I was done.
During the interview, the interviewer kept interrupting and redirecting me. A few times she had to keep digging before I finally gave the answer she was actually looking for. After the call ended, I literally sat in my chair for ten minutes without moving.
Then three days later, I got the second-round invite.
Round 1: Getting Pressed on Every Detail
The interviewer started with questions about the dashboard project from my first internship and asked how I selected the metrics.
I started explaining a bunch of things, and she immediately cut me off:
"Why these three metrics instead of others?"
I honestly wasn't prepared for that follow-up. I tried to explain, but then she asked:
"If the business goal changes, would your metrics also change?"
That question completely threw me off at first.
Later I realized the point was actually simple: metrics should always follow the business objective. They're not fixed forever.
For the text classification discussion, she asked how to choose between Precision and Recall. I answered that if falsely banning users is costly, then Precision matters more; if missing harmful content is more dangerous, then Recall matters more. She seemed satisfied and moved on.
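To make that tradeoff concrete, here is a small Python sketch. The scores, labels, and thresholds are made-up illustration values, not data from the interview. It shows how a stricter decision threshold favors Precision (fewer false bans), while a more lenient one favors Recall (fewer missed harmful items).

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 = harmful."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def predict(scores, threshold):
    """Flag an item as harmful when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical classifier scores and ground-truth labels (illustration only).
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

# Strict threshold: nothing harmless gets flagged, but one harmful item is missed.
strict = precision_recall(labels, predict(scores, 0.75))
# Lenient threshold: every harmful item is caught, at the cost of one false flag.
lenient = precision_recall(labels, predict(scores, 0.35))

print("strict :", strict)
print("lenient:", lenient)
```

Which threshold is "right" depends on which error is more expensive for the business, which is the same point as the answer above.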
For my second internship, she asked about the most technical part of the ETL pipeline. I explained a deduplication optimization I worked on:
One user could generate multiple logs per day, and running COUNT DISTINCT directly on the raw logs was too expensive. So we first grouped the records by user and day, kept only the earliest event in each group, and then joined the downstream tables.
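The real pipeline was SQL-based ETL, but the idea can be sketched in plain Python. This is only an illustration under my own assumptions: the `(user_id, timestamp)` log shape and the function name `earliest_event_per_user_day` are hypothetical.

```python
from datetime import datetime


def earliest_event_per_user_day(logs):
    """Keep only the earliest timestamp per (user_id, calendar day)."""
    earliest = {}
    for user_id, ts in logs:
        key = (user_id, ts.date())
        if key not in earliest or ts < earliest[key]:
            earliest[key] = ts
    return earliest


# Hypothetical raw logs: u1 shows up twice on the same day.
logs = [
    ("u1", datetime(2024, 1, 1, 9, 0)),
    ("u1", datetime(2024, 1, 1, 8, 30)),
    ("u2", datetime(2024, 1, 1, 12, 0)),
    ("u1", datetime(2024, 1, 2, 7, 45)),
]

deduped = earliest_event_per_user_day(logs)
for (user_id, day), ts in sorted(deduped.items()):
    print(user_id, day, ts.time())
```

After this step there is at most one row per user per day, so a downstream join no longer needs to deduplicate users on the fly.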
Then she asked:
"What if the data volume becomes 10x larger?"
I answered that approximate distinct-count methods like HyperLogLog could cut the computation cost, at the price of a small estimation error.
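For readers who have not seen HyperLogLog, below is a simplified teaching sketch in Python. It is not what a production warehouse runs (most engines ship their own built-in approximate distinct-count function); the class, the choice of SHA-1 as the hash, and the register count are my own assumptions for illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Simplified HyperLogLog with 2**p registers and a 64-bit hash."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # take 64 bits of the hash
        idx = h >> (64 - self.p)               # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)  # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-cardinality correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


# 50,000 hypothetical users, each seen twice: duplicates do not change the sketch.
hll = HyperLogLog()
for _ in range(2):
    for i in range(50_000):
        hll.add(f"user_{i}")

estimate = hll.estimate()
print(f"estimated distinct users: {estimate:.0f} (true value: 50000)")
```

The memory footprint is fixed by the number of registers rather than by the number of users, which is why this kind of method scales to 10x data far better than an exact COUNT DISTINCT.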
The SQL question was calculating DAU and average online duration. I used CTEs to split the logic and then joined the results together.
She followed up with:
"How would you handle overlapping sessions from multiple logins?"
I said that if we wanted accurate session duration, we'd need to merge overlapping intervals first. But if the requirement was only a simplified average, then it wouldn't be necessary.
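A minimal Python sketch of the interval-merging step, assuming each session is a `(start, end)` pair in seconds. The sample sessions and the function names are hypothetical.

```python
def merge_sessions(sessions):
    """Merge overlapping (start, end) intervals into disjoint ones."""
    merged = []
    for start, end in sorted(sessions):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def online_seconds(sessions):
    """Total online time with overlapping sessions counted only once."""
    return sum(end - start for start, end in merge_sessions(sessions))


# Hypothetical user logged in on two devices with overlapping sessions.
sessions = [(0, 600), (300, 900), (1200, 1500)]

naive_total = sum(end - start for start, end in sessions)
merged_total = online_seconds(sessions)
print("naive :", naive_total)
print("merged:", merged_total)
```

The naive sum double-counts the stretch where both sessions were open, so the merged total is the number to use when an accurate per-user duration is required.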
She nodded and said:
"The approach makes sense."
Then she mentioned time was running out and skipped the Python section.
After Round 1, I honestly felt my performance was pretty average.
Round 2: Much Better Flow
The second interviewer had a much more casual style, so the atmosphere felt less stressful.
We started with common topics like data preprocessing and the bias-variance tradeoff. I had already prepared those well, so the conversation went smoothly.
Then came a hypothetical case:
"If a product feature is about to launch, how would you evaluate its effectiveness?"
I had practiced open-ended cases before, so I immediately structured it as an A/B testing problem:
- Define the business objective
- Create hypotheses
- Design the experiment
- Run it for two weeks
- Measure the outcome metrics
He then asked:
"How do you know two weeks is enough?"
I explained that we'd estimate the required sample size from historical data, then back-calculate the experiment duration from daily traffic.
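One common way to do that back-calculation is the standard sample-size formula for comparing two proportions. The sketch below is an assumption about how it could be done, not what I wrote in the interview. The baseline rate, minimum detectable effect, and daily traffic are made-up numbers.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_base, mde_abs, alpha=0.05, power=0.8):
    """Users needed per group for a two-sided two-proportion z-test.

    p_base:  baseline conversion rate estimated from historical data
    mde_abs: minimum detectable effect as an absolute lift
    """
    p_new = p_base + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_new * (1 - p_new)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


def days_needed(n_per_group, daily_users_per_group):
    """Back-calculate the experiment duration from daily traffic."""
    return math.ceil(n_per_group / daily_users_per_group)


# Hypothetical inputs: 10% baseline, +1 point lift, 2,000 users/day per group.
n = sample_size_per_group(p_base=0.10, mde_abs=0.01)
days = days_needed(n, daily_users_per_group=2_000)
print(f"{n} users per group, about {days} days")
```

In practice the duration is usually rounded up to whole weeks so that weekday and weekend behavior are both covered, which is one way to justify a two-week run.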
The SQL question was counting users who registered and posted on the same day.
```sql
SELECT COUNT(DISTINCT u.user_id)
FROM users u
JOIN posts p
  ON u.user_id = p.user_id
WHERE DATE(u.created_at) = DATE(p.created_at);
```
He specifically hinted that one user could create multiple posts.
After I finished, he asked whether DISTINCT was necessary.
I answered yes: the join produces one row per matching post, so without DISTINCT a user with multiple posts on their registration day would be counted more than once.
He replied:
"Correct. A lot of people miss that."
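To see the effect of DISTINCT on actual rows, here is a small check using Python's built-in `sqlite3` module. The table contents are invented for illustration, and the `post_id` column is my own assumption about the schema. User 1 posts twice on their registration day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, created_at TEXT);
CREATE TABLE posts (post_id INTEGER PRIMARY KEY, user_id INTEGER, created_at TEXT);

INSERT INTO users VALUES
  (1, '2024-01-01 09:00:00'),
  (2, '2024-01-01 10:00:00'),
  (3, '2024-01-02 08:00:00');

-- User 1 posts twice on registration day; user 2 posts two days later.
INSERT INTO posts VALUES
  (10, 1, '2024-01-01 09:30:00'),
  (11, 1, '2024-01-01 18:00:00'),
  (12, 2, '2024-01-03 11:00:00'),
  (13, 3, '2024-01-02 08:05:00');
""")

query = """
SELECT COUNT({distinct} u.user_id)
FROM users u
JOIN posts p ON u.user_id = p.user_id
WHERE DATE(u.created_at) = DATE(p.created_at)
"""

with_distinct = conn.execute(query.format(distinct="DISTINCT")).fetchone()[0]
without_distinct = conn.execute(query.format(distinct="")).fetchone()[0]
print("with DISTINCT   :", with_distinct)
print("without DISTINCT:", without_distinct)
```

Only users 1 and 3 posted on their registration day, so the DISTINCT version returns 2, while the plain COUNT returns 3 because user 1 matches two post rows.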
The final discussion was a business case around improving creator engagement based on that metric.
I explained that the metric essentially reflected first-time activation for new users. The issue could come from two directions:
- The onboarding flow does not encourage users to create content
- New users don't receive enough exposure after posting, so they never get positive feedback
Accordingly, the solutions would be:
- Improve onboarding guidance for first posts
- Adjust recommendation logic to give new creators initial exposure
To measure effectiveness, I suggested tracking:
- Time from registration to first post
- 7-day retention for new creators
He said:
"This is a very complete analysis."
Then we moved into the Q&A section.
What Changed Between Round 1 and Round 2?
In Round 1, I kept getting stuck because I focused too much on explaining what I did instead of why I made certain decisions.
Between the two rounds, I worked with ProgramHelp · FAANG Interview Coaching on VO (virtual onsite) preparation, focusing specifically on business cases and analytical thinking.
They designed several realistic TikTok-style DS interview scenarios and gave very direct feedback about where my logic broke down and how interviewers would interpret certain answers.
What helped most was that they didn't just teach generic frameworks. They focused on helping me actually understand how to communicate business reasoning clearly.
The team includes people from Oxford, Princeton, Amazon, and Google with strong DS backgrounds, so they understand how big-tech interviews are evaluated. Several of my friends also used their VO coaching and ended up landing offers.
For me, the improvement between the first and second rounds was huge.
They also help with:
- Resume optimization
- Behavioral interviews
- VO interview coaching
Final Thoughts
TikTok DS interviews are not really about memorizing knowledge.
What they actually test is whether you can use data to clearly explain a business problem.
Ironically, almost failing the first round was exactly what helped me finally understand that.
Feel free to leave questions below.