Today it’s been a year since I started working as a data scientist.

I wonder what other people who’ve made this sort of transition - from some social science to data science - have learned.

You need to run controlled experiments or convince the reviewers that you chose the right instruments or that you matched on the right confounders.

Yes, in a way, the categories do cause the word counts - “let me be clear” appears a lot in Obama’s speeches because they are Obama’s.

When you remove causality from the equation you suddenly have a lot more time to work on other, potentially more interesting aspects of data analysis.

Which is of course ok - political scientists are trying to understand how the world works - but is also frustrating: debates never end.

That leaves ample room for subjectivity, so it’s never entirely clear who the winners and losers are.

There is less ideological bias

Hopefully I don’t need to convince you that social scientists are overwhelmingly on the left of the ideological spectrum.

If you are in political science you probably lean left, so you’re unlikely to have seen this happen to one of your papers.

Reviewers (ideally) assess theory consistency, of course, but then we’re back to the subjectivity and bias problems I discussed before.

Programming matters

Political scientists’ code is rigid: each script is meant to produce a pre-determined set of estimates, using a pre-determined dataset.

It doesn’t matter much if the code could be faster or prettier: as long as it replicates what you did in your paper you have fulfilled your duty.

Data scientists’ code is flexible: your code needs to accept a variety of inputs and return a variety of outputs.

What if the regression tool you’re using under the hood (say, R’s lm() function) returns an error because the chosen subset is too small?
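
To make the "flexible code" point concrete, here is a minimal sketch of a fitting function that anticipates a too-small subset instead of crashing. It is a Python stand-in, not R’s lm(); the function name `fit_line`, the `min_obs` threshold, and the dictionary it returns are all assumptions made for illustration.

```python
from statistics import mean


def fit_line(xs, ys, min_obs=3):
    """Fit y = intercept + slope * x by ordinary least squares.

    Hypothetical helper: instead of letting the caller hit an opaque
    error, it reports why a fit was not possible.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")

    # Guard 1: the chosen subset may simply be too small.
    if len(xs) < min_obs:
        return {"ok": False,
                "reason": f"need at least {min_obs} observations, got {len(xs)}"}

    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)

    # Guard 2: a subset where x never varies cannot identify a slope.
    if sxx == 0:
        return {"ok": False, "reason": "x has no variation in this subset"}

    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return {"ok": True, "intercept": y_bar - slope * x_bar, "slope": slope}
```

A caller can then branch on `result["ok"]` and show the user a readable message rather than a stack trace, whatever subset they happened to pick.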

It’s about both depth and breadth: you need a firmer grasp of basic programming concepts, like conditionals and functions (that’s the depth), and you need to learn web development, SQL, NoSQL, database administration, messaging, server maintenance, security, and testing (that’s the breadth).

You survived Social Origins of Dictatorship and Democracy’s 592-page discussion of regime change, class, and modernization - surely you’re destined for higher purposes.

But if you don’t mind getting your hands dirty then programming can be a lot of fun - and at least as intellectually rewarding as political science.

As in political science, it’s largely about hypothesis testing - if the code isn’t working because of XYZ then if I try ABC I should get this result, otherwise I should get that other result.

Except that there is a finish line: you’ll never really know what makes democracy work but you’ll eventually figure out which regex matches the string you’re looking for.

And you will get to the finish line or die trying - you can’t just declare regex to be a meaningless social construct and move on.
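
The hypothesis-testing loop described above can be sketched with a regex that fails to find a phrase. The sample text below is made up for illustration; only the phrase “let me be clear” comes from this post. Each step tests one guess about why the pattern misses, and the match count tells you whether the guess was right.

```python
import re

# Hypothetical sample text: note the capital "L" and the double space.
text = "Let me be clear: we will act. And let  me be clear, we must."

# Starting point: the naive pattern finds nothing.
naive = re.findall(r"let me be clear", text)

# Hypothesis 1: capitalisation is the problem.
# If so, ignoring case should recover at least one match.
case_fixed = re.findall(r"let me be clear", text, flags=re.IGNORECASE)

# Hypothesis 2: irregular whitespace hides the remaining occurrence.
# If so, allowing one or more whitespace characters should recover it.
both_fixed = re.findall(r"let\s+me\s+be\s+clear", text, flags=re.IGNORECASE)

print(len(naive), len(case_fixed), len(both_fixed))
```

Unlike a debate about what makes democracy work, each hypothesis here is confirmed or rejected the moment the script runs.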

You lose freedom

Wait, don’t quit grad school just yet - there’s a lot to miss about academia.

Don’t get the wrong idea: I thoroughly enjoy what I’m doing right now (I help automate cartel detection;

