Human Friendly Data Science Interviews

TL;DR. We focused on a holistic view of our candidates (technical and interpersonal skills) while trying to be fair with everyone’s time and life experiences. We were able to identify both the outstanding candidates and those who weren’t a good fit, and we have had a great experience working with our hires!

After reading “The software industry’s greatest sin: hiring” by Neil Sainsbury and “Take-home vs. whiteboard coding: The problem is bad interviews” by Andrew Rondeau, several critical points about interviewing software developers stood out to me:

- Software developers are usually assessed on technical aspects alone, ignoring their personal and organizational qualities. This might produce technically correct software with good performance that is still far from fulfilling users’ needs.
- Someone can be technically excellent but lack the skills to understand and interact with your users and the rest of the team.
- Someone can be technically excellent but keep using technologies that they find interesting but that are not aligned with the company’s goals.
- There are tradeoffs between whiteboard and take-home questions: time invested by both parties, different development environments and conditions, visibility of the candidate’s technical and personal qualities, and the feedback loop between the examiner and the applicant.
- A key aspect is to plan a good interview with coding assignments that consider the company’s needs and are fair for everyone involved.
- Andrew Rondeau briefly discusses presenting existing code as an alternative to whiteboard and take-home questions.

I am a postdoctoral researcher in a group that explores mobile data’s role in monitoring or supporting people with different health conditions. Broadly speaking, we collect smartphone and wearable sensor data, process it, and use it to create statistical and machine learning models that provide relevant behavioral or clinical insights. This is possible thanks to our team’s multi-disciplinary nature, with expertise in psychology, statistics, computer science, software engineering, and data science.

Recently, we needed to hire a couple of data science interns from the local master’s program, and I was in charge of leading the technical part of the interviews. This was an excellent opportunity for me to pilot the type of technical interview that I’d like to experience based on the points I summarized above and the lab’s needs.

I divided our interviews into two 30-minute stages, one to talk about one of the candidate’s past projects and the other to find out how they would approach a data science problem that represents the kind of work we do.

For the first stage, we asked applicants to submit in advance a past data science project that they would like to discuss with us. I want to clarify that we accepted any industry, school, or hobby code repository and did not judge its purpose or complexity. We don’t expect that everyone will have the time to work on side projects in their free time or be able to disclose code from a previous employer. However, their sample project allowed us to understand some of the person’s experience with data science and software engineering practices like data cleaning, modeling, documentation, version control, variable and function naming, code comments, code formatting, and code refactoring. If any of these aspects were missing or seemed unsatisfactory, we made a note and asked about them during the interview.

When the time came for the first part of our face-to-face chat (where we talked about their chosen project), we focused on their hard skills (technical expertise, domain knowledge, and problem-solving abilities), soft skills (communication, multi-disciplinary collaboration, feedback reception), and traits like proactiveness, enthusiasm, motivation, clarity of thought, independence, and attention to detail. This is our take on what Neil refers to as a “holistic” view of a candidate. Crucially, having technical and non-technical members from our team present made it easier to discuss and evaluate our candidates. More specifically, we asked people about their role in previous teams (if any), their approach to learning, and their thought process for choosing the best tool for the job. We also prompted them to explain complex non-technical concepts to everyone on the interview panel and to talk more about their experience interacting with past “clients” (teachers, fellow students, or any other stakeholders for those with experience in industry). One of the advantages of this setup was that it allowed everyone to interact in a work environment very similar to what we experience every day while planning, implementing, executing, analyzing, and publishing a health intervention or monitoring study.

In the second part of the interview, we asked participants the following question: how would you implement a sleep classifier based on smartphone and Fitbit data? Even if this problem appears simple at first sight, it involves numerous decisions and considerations along the way. For example, we can talk about missing data, feature engineering, data resampling, data imputation, class imbalance, type of model (population or individual), hyper-parameter tuning, model choice, baselines, cross-validation, evaluation metrics, etc. To foster the discussion, we always offered clues, clarifications, and follow-up questions.
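To make a few of those considerations concrete, here is a minimal Python sketch that touches on three of them: cross-validation for a population model, a baseline, and an evaluation metric that copes with class imbalance. It is not our lab's pipeline. The toy labels, the participant IDs, and every function name are made up for illustration, and it uses only the standard library.

```python
from collections import Counter, defaultdict


def leave_one_participant_out(participant_ids):
    """Yield (held_out, train_idx, test_idx); no participant is in both sets."""
    for held_out in sorted(set(participant_ids)):
        train = [i for i, p in enumerate(participant_ids) if p != held_out]
        test = [i for i, p in enumerate(participant_ids) if p == held_out]
        yield held_out, train, test


def majority_baseline(train_labels):
    """Return the most frequent label seen in the training data."""
    return Counter(train_labels).most_common(1)[0][0]


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall, which is less flattering than plain accuracy
    when one class (here, "awake") dominates."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical toy data: one label per time window, four windows per person.
participants = ["p1"] * 4 + ["p2"] * 4 + ["p3"] * 4
labels = [
    "awake", "awake", "awake", "asleep",   # p1
    "awake", "awake", "asleep", "awake",   # p2
    "awake", "asleep", "awake", "awake",   # p3
]

scores = {}
for held_out, train_idx, test_idx in leave_one_participant_out(participants):
    guess = majority_baseline([labels[i] for i in train_idx])
    y_true = [labels[i] for i in test_idx]
    y_pred = [guess] * len(y_true)  # the baseline predicts one label for everything
    scores[held_out] = balanced_accuracy(y_true, y_pred)

print(scores)
```

With this toy data, the baseline that always answers "awake" is right on 75% of the windows but only reaches a balanced accuracy of 0.5 for each held-out participant. That gap is the kind of observation we hoped candidates would bring up when talking about baselines and metrics.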

We did not expect our interviewees to reach a comprehensive solution or write any code (it took us weeks to finish a publishable solution, and the first part of the interview would already have given us an idea of their programming skills). Instead, we wanted to know more about their thinking process. We paid particular attention to the candidate’s understanding of the problem (do they ask relevant questions?), creativity (how do they suggest tackling this problem?), experience (are they leveraging solutions to past problems?), technical expertise (what programming language, libraries, or methods would they like to use?), and communication skills (can they engage the whole team in the discussion?).

This process fits well within our workflow and our team’s characteristics, and we hope that by sharing it, you can adapt it to your needs and provide a better experience for your candidates.
