# AI News, Data science: how is it different to statistics?

I believe that statistics is a crucial part of data science, but at the same time, most statistics departments are at grave risk of becoming irrelevant.

In this first column, I’ll discuss why I think data science isn’t just statistics, and highlight important parts of data science that are typically considered to be out of bounds for statistics research.

I think there are three main steps in a data science project: you collect data (and questions), analyze it (using visualization and models), then communicate the results.

It’s rare to walk this process in one direction: often your analysis will reveal that you need new or different data, or when presenting results you’ll discover a flaw in your model.

Good questions are crucial for good analysis, but there is little research in statistics about how to solicit and polish good questions, and it’s a skill rarely taught in core PhD curricula.

Organizing data into the right ‘shape’ is essential for fluent data analysis: if it’s in the wrong shape you’ll spend the majority of your time fighting your tools, not questioning the data.
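To make the idea of ‘shape’ concrete, here is a minimal sketch in plain Python. The table, the column names and the `to_long` helper are all invented for illustration; they are assumptions, not something from a particular library. It takes a "wide" table (one column per year) and reshapes it into a "long" one (one observation per row), after which a question about the data becomes a one-line filter.

```python
# Hypothetical "wide" table: one row per country, one column per year.
wide = [
    {"country": "A", "2019": 10, "2020": 12},
    {"country": "B", "2019": 7, "2020": 9},
]


def to_long(rows, id_key, var_name="variable", value_name="value"):
    """Reshape wide rows into long rows: one observation per row.

    Every key other than `id_key` is treated as a measured variable.
    """
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_key:
                continue
            long_rows.append(
                {id_key: row[id_key], var_name: key, value_name: value}
            )
    return long_rows


long = to_long(wide, id_key="country", var_name="year", value_name="cases")

# In the long shape, a question such as "how many cases in 2020?"
# is a simple filter and sum rather than a hunt for the right column.
total_2020 = sum(r["cases"] for r in long if r["year"] == "2020")
```

In the wide shape the year is hidden in the column names, so any question that involves the year means writing code about columns rather than about data. In practice you would likely reach for an existing reshaping tool instead of a hand-written helper like this one.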

Communication is not a mainstream thread of statistics research (if you attend the Joint Statistical Meetings (JSM), it’s easy to come to the conclusion that some academic statisticians couldn’t care less about the communication of results).

Statistics research focuses on data collection and modelling, and there is little work on developing good questions, thinking about the shape of data, communicating results or building data products.

