AI News, BOOK REVIEW: Three Phases for Assessing Data Science Use Cases

Three Phases for Assessing Data Science Use Cases

In this post, Don Miner covers how to identify, evaluate, prioritize, and pick which data science problems to work on next.

Thoughtful planning is incredibly important in data science, because data science brings its own unique challenges that the business world is still struggling to fit into its existing processes.

In this post, I’ll share some thoughts on how to decide which data science use cases to work on first, or next, based on what has been successful for me as a data science consultant helping companies, from Fortune 500s to startups.

The three phases I’ll be talking about in more depth in this post are: building a list of potential use cases, evaluating the value, effort, and risk of each one, and prioritizing them. The process is designed to be flexible enough to apply to whatever situation you and your business are in.

The process uses a square chart that plots each use case’s risk against its level of effort, with the size of each bubble representing its value.
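
As a rough sketch of how such a chart could be prepared, the snippet below turns a list of use cases into bubble coordinates. The `UseCase` record, the 1-10 value scale, the choice of effort on the horizontal axis, the log scale, and the example entries are all assumptions made for illustration; the post does not prescribe them.

```python
import math
from typing import NamedTuple


class UseCase(NamedTuple):
    name: str
    effort_days: float  # rough effort for one data scientist (assumed unit)
    risk_pct: float     # chance of something going wrong, 0-100
    value: float        # relative value score (assumed 1-10 scale)


def bubble_specs(use_cases, max_area=2000.0):
    """Return (name, x, y, area) tuples for a risk-versus-effort bubble chart.

    x is log10 of effort so that orders of magnitude are evenly spaced,
    y is risk in percent, and bubble area is proportional to value.
    """
    top_value = max(uc.value for uc in use_cases)
    specs = []
    for uc in use_cases:
        x = math.log10(uc.effort_days)
        area = max_area * uc.value / top_value
        specs.append((uc.name, x, uc.risk_pct, area))
    return specs


# Hypothetical use cases, for illustration only.
cases = [
    UseCase("churn model", effort_days=100, risk_pct=60, value=9),
    UseCase("sales dashboard", effort_days=10, risk_pct=10, value=3),
]
specs = bubble_specs(cases)
```

The resulting tuples can be handed to any scatter-plot tool; matplotlib’s `scatter`, for example, accepts marker areas through its `s` argument.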

Unfortunately for us, usually the highest value targets are high effort and high risk, so it’s a matter of deciding how much effort you can put into something, what your tolerance for risk is, and how much value you need today.

First, I’ll talk about building a good list of use cases, then I’ll talk about how to measure effort, value, and risk for each, and then all that’s left is making our plot!

Building a healthy list puts everything on the table, gets everyone on the same page, and lets you compare and contrast the value, risk, and level of effort of different kinds of use cases.

The best way to overcome these biases is collaborative communication between everyone: data scientists, DBAs of the legacy systems, line-of-business owners, and end users.

The DBAs probably have a good sense of “unsolved problems” that have been computationally impossible, like a massive join between two data sets, or “just not a good fit” for database technologies, like natural language processing use cases.

In software engineering, you start with an idea, you have a pretty good sense of what you want to build, you know how to build it, and you take a guess at how long it’ll take.

Planning data science is more about putting forth the appropriate amount of effort and doing the best that can be done in a specific time frame, rather than reaching some sort of goal.

At the end of this phase of the exercise, you’ll have assigned each use case a value score, an order of magnitude of effort, and a risk score expressed as the percentage chance of something bad happening.

So to figure out the value of a data science use case, you’ll need to consider a few questions. Think about all of them for each use case and sum the answers up into a general sense of how valuable gaining knowledge of this particular problem will be.

I’ve found that estimating the order of magnitude of one data scientist’s time is much easier than producing a precise estimate, while still being useful for explaining to everyone in the organization how hard it will be to get the outcome.
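
One way to make “order of magnitude” concrete is to snap a rough hour estimate to the nearest named bucket. The sketch below is only an illustration: the bucket names, the working-hour figures behind them (8-hour days, 5-day weeks, 4-week months), and the `effort_bucket` helper are assumptions, not something the post defines.

```python
import math

# Assumed buckets, in working hours for one data scientist.
BUCKET_HOURS = {
    "hour": 1,
    "day": 8,
    "week": 40,
    "month": 160,
    "quarter": 480,
    "year": 1920,
}


def effort_bucket(hours):
    """Return the bucket closest to `hours` on a logarithmic scale."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return min(
        BUCKET_HOURS,
        key=lambda name: abs(math.log(hours) - math.log(BUCKET_HOURS[name])),
    )


# A guess of "about 30 hours" is reported simply as a week of effort.
rough_guess = effort_bucket(30)
```

Comparing on a log scale means a guess only has to be right to within a factor of two or three to land in the right bucket, which matches the spirit of estimating magnitude rather than exact duration.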

Again, as with measuring value, an experienced data scientist or some outside help will go a long way here in appropriately assessing risk.

I like to assign risk a rough % score, which tells us roughly what the chances are of the whole use case not going well (i.e., we don’t get a valuable outcome).

In general, we should move towards high value, low effort, and low risk, but unfortunately for us these factors are typically related.

A lot of the time, we’ll need to accept risk to get a very valuable outcome, or pick a low effort and low value use case just to show some incremental value and prove out a concept. There are some obvious choices in which use cases should be tackled first.

Think about what’s important to your organization right now in terms of its ability to handle risk, its need to drive value, and how many people you have to throw at the problem.
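
Those three constraints can be sketched as a simple filter: drop anything above your risk tolerance or effort budget, then look at what remains in order of value. The field names, the thresholds, and the example use cases below are hypothetical, and sorting by value is just one reasonable tie-breaker, not a rule from the post.

```python
from typing import NamedTuple


class Scored(NamedTuple):
    name: str
    value: int          # assumed 1-10 value score
    effort_days: float  # rough effort for one data scientist (assumed unit)
    risk_pct: int       # chance of something going wrong, 0-100


def shortlist(use_cases, max_risk_pct, max_effort_days):
    """Keep use cases within risk tolerance and effort budget, highest value first."""
    eligible = [
        uc for uc in use_cases
        if uc.risk_pct <= max_risk_pct and uc.effort_days <= max_effort_days
    ]
    # Highest value first; lower risk breaks ties.
    return sorted(eligible, key=lambda uc: (-uc.value, uc.risk_pct))


# Hypothetical scored use cases, for illustration only.
candidates = [
    Scored("recommendation engine", value=9, effort_days=250, risk_pct=70),
    Scored("churn report", value=4, effort_days=5, risk_pct=10),
    Scored("demand forecast", value=7, effort_days=20, risk_pct=30),
]
picked = shortlist(candidates, max_risk_pct=40, max_effort_days=30)
names = [uc.name for uc in picked]
```

Loosening `max_risk_pct` or `max_effort_days` brings the high value, high risk candidates back into view, which is the trade-off described above.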

I hope that by reading this post you have learned that it is possible to think rationally about which data science use cases to tackle next, with or without my particular process.
