As I mentioned in my first post, I have just finished an extensive tech job search, which featured eight on-sites, along with countless phone screens and informal chats. I was interviewing for a combination of data science and software engineering (machine learning) positions, and I got a pretty good sense of what those interviews are like.

During the interview phase of the process, your recruiter is on your side and can usually tell you what types of interviews you’ll have. Even if the recruiter is reluctant to share that, common practices in the industry are a good guide to what you’re likely to see.

Data science roles generally fall into two broad areas of focus: statistics and machine learning. I only applied to the latter category, so that’s the type of position discussed in this post.

You will encounter a similar set of interviews for a machine learning software engineering position, though more of the questions will fall in the coding category.

figuring out which product to recommend to a user, which users are going to stop using the site, which ad to display, etc.), but can also be a toy example (e.g. recommending board games to a friend). This type of interview doesn’t depend on much background knowledge, other than having a general understanding of machine learning concepts (see below).

In the latter case, you’ll generally have a discussion with the interviewer about some plausible definitions (e.g., what does it mean for a user to “stop using the site”?).
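To make that kind of definition concrete, here is a minimal sketch of one plausible way to label a user as having “stopped using the site”. The 30-day inactivity window, the `is_churned` helper, and the sample users are all assumptions for illustration; in an interview, the right cutoff is exactly what you’d discuss with the interviewer.

```python
from datetime import date, timedelta


def is_churned(last_active, as_of, inactivity_days=30):
    """Label a user as churned if their last activity is more than
    `inactivity_days` before `as_of`. The 30-day default is an assumed
    cutoff, not an industry standard."""
    return (as_of - last_active) > timedelta(days=inactivity_days)


# Hypothetical last-activity dates for two users.
last_seen = {
    "alice": date(2024, 1, 2),
    "bob": date(2024, 2, 25),
}
as_of = date(2024, 3, 1)

# Apply the same definition to every user to get the target variable.
labels = {user: is_churned(seen, as_of) for user, seen in last_seen.items()}
```

Changing `inactivity_days` changes who counts as churned, which is why it’s worth agreeing on the definition before talking about features or models.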

Think about what might be predictive of the variable you are trying to predict, and what information you would actually have available. I’ve found it helpful to give context around what I’m trying to capture, and to what extent the features I’m proposing reflect that information.

But maybe some purchases were mistakes, and you vowed to never buy a book like that again. Well, Amazon knows how you’ve interacted with your Kindle books. If there’s a book you started but never finished, it might be a positive signal for general areas you’re interested in, but a negative signal for the particular author.
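As a rough sketch of how that intuition could turn into features, the snippet below counts a started book as a positive signal for its genre, and scores the author up or down depending on whether the book was finished. The `ReadingRecord` fields, the 0.9 “finished” threshold, and the +1/-1 scoring are all my own assumptions for illustration; I have no knowledge of how Amazon actually models this.

```python
from dataclasses import dataclass


@dataclass
class ReadingRecord:
    genre: str
    author: str
    fraction_read: float  # 0.0 (never opened) to 1.0 (finished)


def reading_signals(records, finished_threshold=0.9):
    """Turn reading progress into two hypothetical feature maps:
    genre interest counts and per-author scores."""
    genre_interest = {}
    author_score = {}
    for r in records:
        if r.fraction_read <= 0:
            continue  # never opened: no signal either way in this sketch
        # Starting a book at all counts as interest in the general area.
        genre_interest[r.genre] = genre_interest.get(r.genre, 0) + 1
        # Finishing is a positive signal for the author, abandoning a negative one.
        finished = r.fraction_read >= finished_threshold
        author_score[r.author] = author_score.get(r.author, 0) + (1 if finished else -1)
    return genre_interest, author_score


records = [
    ReadingRecord("sci-fi", "Author A", 1.0),
    ReadingRecord("sci-fi", "Author B", 0.2),
    ReadingRecord("history", "Author C", 0.0),
]
genre_interest, author_score = reading_signals(records)
```

Here both sci-fi books add to the genre count, while only the abandoned one counts against its author. Explaining that kind of reasoning out loud is the context I try to give for each feature I propose.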

The project doesn’t have to be directly related to the position you’re interviewing for (though it can’t hurt), but it needs to be the kind of work you can have an in-depth technical discussion about.

Should I look over the syntax for training a model in scikit-learn?) I also had one recruiter tell me I’d be analyzing “big data”, which was a bit intimidating (am I going to be working with distributed databases or something?) until I discovered at the interview that the “big data”

