On Monday I humbly joined a group of NYC's most sophisticated thinkers on all things data for a half-day unconference to help O'Reilly organize their upcoming Strata conference.
One of the best sessions I attended focused on issues related to teaching data science, which inevitably led to a discussion on the skills needed to be a fully competent data scientist.
The difficulty in defining these skills is that the split between substance and methodology is ambiguous, and as such it is unclear how to distinguish among hackers, statisticians, subject matter experts, their overlaps and where data science fits.
4 Ways to Spot a Fake Data Scientist
Editor’s Note, 2016: Even a year after publication, this blog post continues to generate a fair amount of traffic and discussion, with many asking what our complete criteria are for discerning who is a data scientist.
With the media inflating data scientists’ already-high salaries (I’ve yet to see a newly graduated data scientist making $300,000+, as has been reported), data scientists have captured the imagination of job seekers who think that they can write Hadoop on their resume and get a 50% raise.
Although it is becoming slightly more common for data scientists to have a quantitative Bachelor’s or Master’s layered with a top-tier bootcamp instead of a PhD, without a strong foundation in a technically rigorous program, it is very difficult to master all of the statistical and computer science concepts and skills necessary to be hired as a data scientist.
If a professional cannot provide clear examples of their experience with unstructured data, or mentions data science projects but keeps their involvement very vague, then they are probably not a data scientist.
Purely academic or research background – Now, this is not to say that someone with a stellar academic or research background won’t make a great corporate data scientist, but a key component to being a data scientist in a corporate setting is business acumen.
List of basic business skills – If I see a list of tools on a “data scientist” resume like Omniture, Google Analytics, SPSS, Excel, or any other Microsoft Office tool, you can be sure that I will take a harder look at whether or not this professional makes the grade.
What Data Visualization Should Do: Simple Small Truth
Yesterday the good folks at IA Ventures asked me to lead off the discussion of data visualization at their Big Data Conference.
I was rather misplaced among the high-profile venture capitalists and technologists in the room, but I welcome any opportunity to wax philosophical about the power and danger of conveying information visually.
I began my talk by referencing the infamous Afghanistan war PowerPoint slide because I believe it is a great example of spectacularly bad visualization, and of how good intentions can lead to disastrous results.
Sticking with that theme, yesterday I focused on three key things—I think—data visualization should do: The emphasis is added to highlight the goal of all data visualization;
This is a network hairball, and while it is possible to observe some structural properties in this example, many more subtle aspects of the data are lost in the mess.
Next, I used the weighted in-degree of each node as a color scale for the edges, i.e., the darker the blue, the higher the in-degree of the node the edges connect to.
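For readers who want to try this kind of edge coloring, here is a minimal sketch in plain Python. It is not the code behind the figure from the talk. The function names, the toy edge list, and the two endpoint colors are all assumptions made for illustration. It sums incoming edge weights per node, then shades each edge between a light and a dark blue according to the score of the node it points to.

```python
from collections import defaultdict


def weighted_in_degree(edges):
    """Sum the weights of incoming edges for every node.

    `edges` is a list of (source, target, weight) tuples.
    """
    totals = defaultdict(float)
    for source, target, weight in edges:
        totals[target] += weight
        totals.setdefault(source, 0.0)  # nodes with no incoming edges score 0
    return dict(totals)


def edge_blues(edges, light=(222, 235, 247), dark=(8, 48, 107)):
    """Map each edge to a hex color: the higher the weighted in-degree
    of the edge's target node, the darker the blue.

    The `light` and `dark` RGB endpoints are arbitrary choices.
    """
    scores = weighted_in_degree(edges)
    low, high = min(scores.values()), max(scores.values())
    span = (high - low) or 1.0  # avoid dividing by zero when all scores match
    colors = {}
    for source, target, _ in edges:
        t = (scores[target] - low) / span  # 0.0 = lightest, 1.0 = darkest
        rgb = tuple(round(l + t * (d - l)) for l, d in zip(light, dark))
        colors[(source, target)] = "#{:02x}{:02x}{:02x}".format(*rgb)
    return colors


# Hypothetical toy network, not the data shown in the talk.
toy_edges = [("a", "b", 1.0), ("c", "b", 2.0), ("a", "c", 1.0)]
toy_colors = edge_blues(toy_edges)
```

In this toy network, node `b` receives the most incoming weight, so both edges pointing at it get the darkest blue, while the edge into `c` gets a lighter shade.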
The original visualization is on the left, and while many people found it useful, its primary weakness is the inability to distinguish among the various attack types represented on the map.
Finally, to show the danger of data deception I replicated a chart published at the Monkey Cage a few months ago on the sagging job market for political science professors.
As you can see from the visualization on the right, by scaling the y-axis from zero, the decline is much less dramatic, though still relatively troubling for those of us who will be going on the job market in the not-too-distant future.
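The effect of the axis baseline can be put in numbers. The sketch below is only an illustration: the function name and the counts are hypothetical and are not taken from the Monkey Cage chart. It computes what fraction of the plot's height a decline occupies for a given y-axis range.

```python
def apparent_drop(start, end, y_min, y_max):
    """Fraction of the plot's height spanned by the fall from `start` to `end`."""
    if y_max <= y_min:
        raise ValueError("y_max must be greater than y_min")
    if not y_min <= min(start, end) <= max(start, end) <= y_max:
        raise ValueError("both values must lie inside the axis range")
    return (start - end) / (y_max - y_min)


# Hypothetical counts, NOT the figures from the original chart.
start, end = 1000, 800

# Axis cut off just below the lowest value: the drop fills most of the plot.
truncated = apparent_drop(start, end, y_min=750, y_max=1000)

# Axis starting at zero: the same drop fills a fifth of the plot.
zero_based = apparent_drop(start, end, y_min=0, y_max=1000)
```

With these made-up numbers, the same 20% decline takes up 80% of the plot's height on the truncated axis and 20% on the zero-based one, which is why the two charts leave such different impressions.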
Chapter 6. Clinical Reasoning, Decisionmaking, and Action: Thinking Critically and Clinically
One of the hallmark studies in nursing providing keen insight into understanding the influence of experience was a qualitative study of adult, pediatric, and neonatal intensive care unit (ICU) nurses, where the nurses were clustered into advanced beginner, intermediate, and expert level of practice categories.
Aristotle linked experiential learning to the development of character and moral sensitivities of a person learning a practice.50 New nurses/new graduates have limited work experience and must experience continuing learning until they have reached an acceptable level of performance.51 After that, further improvements are not predictable, and years of experience are an inadequate predictor of expertise.52 The most effective knower and developer of practical knowledge creates an ongoing dialogue and connection between lessons of the day and experiential learning over time.
In doing so, the nurse thinks reflectively, rather than merely accepting statements and performing procedures without significant understanding and evaluation.34 Expert nurses do not rely on rules and logical thought processes in problem-solving and decisionmaking.39 Instead, they use abstract principles, can see the situation as a complex whole, perceive situations comprehensively, and can be fully involved in the situation.48 Expert nurses can perform high-level care without conscious awareness of the knowledge they are using,39, 58 and they are able to provide that care with flexibility and speed.
Through a combination of knowledge and skills gained from a range of theoretical and experiential sources, expert nurses also provide holistic care.39 Thus, the best care comes from the combination of theoretical, tacit, and experiential knowledge.59, 60 Experts are thought to eventually develop the ability to intuitively know what to do and to quickly recognize critical aspects of the situation.22 Some have proposed that expert nurses provide high-quality patient care,61, 62 but that is not consistently documented—particularly in consideration of patient outcomes—and a full understanding between the differential impact of care rendered by an “expert”
In fact, several studies have found that length of professional experience is often unrelated and even negatively related to performance measures and outcomes.63, 64 In a review of the literature on expertise in nursing, Ericsson and colleagues65 found that focusing on challenging, less-frequent situations would reveal individual performance differences on tasks that require speed and flexibility, such as that experienced during a code or an adverse event.