AI News

The best books on Computer Science for Data Scientists

First let’s talk about your journey to data and computer science.

It started off by going straight from high school to medical school in New Zealand, and realizing after a year that I didn’t want to be a doctor.

I spent some time in high school running databases for people, and also learned some PHP to create websites.

I was surprised by the gap between the very theoretical aspect of a computer science degree, and what I had actually experienced when I programmed.

For example, I took a class on algorithms, in which I learned that doubling the speed of a program wasn’t particularly interesting, which I thought was a ridiculous thing to say (ironically this is one of the few courses that is still useful to me today).

Usually you’re assigned a position as a teaching assistant for a course in your department, but instead I got a consulting position where I helped students from other departments to do statistical analysis.

What was hard was getting data organised in a way that made sense, instead of constantly fighting to get it in the right form, and then visualising it to understand what was going on.

It became obvious that my repeated efforts to reshape and visualise data could be wrapped up in useful packages.

Your most famous contribution to data science, beyond the world of R, is what’s known as ‘tidy data’.

The idea of tidy data is to get people to store data in a consistent way, so that all of their tools can work with it efficiently, without having to wrangle and reshape it every time.

The basic concept is very simple: when you’re working with data, make sure that each column is a variable and each row is an observation.
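
As a concrete illustration, here is a sketch in Python/pandas rather than R, with made-up column names and counts: tidying turns a one-column-per-year table into one with explicit country, year and cases variables.

```python
# A minimal sketch of tidying data with pandas (the interview discusses R;
# the column names and numbers below are illustrative, not from the source).
import pandas as pd

# "Messy" layout: one row per country, one column per year.
messy = pd.DataFrame({
    "country": ["Afghanistan", "Brazil", "China"],
    "1999": [745, 37737, 212258],
    "2000": [2666, 80488, 213766],
})

# Tidy layout: each column is a variable (country, year, cases),
# each row is one observation.
tidy = messy.melt(id_vars="country", var_name="year", value_name="cases")
tidy["year"] = tidy["year"].astype(int)
print(tidy)
```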

It’s a rephrasing of the second and third normal forms in relational databases, which were my original programming background.

They sort of make sense if you’ve spent years working on databases, but most people simply won’t understand them.

Obviously your biggest contributions have been in R, but you’ve also worked on projects that try to bridge R and Python, and a lot of code you write behind the scenes uses C++.

If data scientists want to build solid computer science fundamentals for themselves, do you think that they should learn another general-purpose programming language beyond R?

As a programmer, I think it’s intellectually satisfying to learn about programming languages, and see how other languages think about instructions and data.

But many data science courses nowadays will try to teach Bash, SQL, Python and R all in the same course, which I think is a bad idea.

Pragmatically, if you’re a data scientist, learning the basics of SQL is really important.

Then I think you’re better off specializing in one of the two (R or Python) and getting really good at it, rather than spreading yourself too thin and being mediocre at several languages.

It needs some, but, by and large, what it needs is engineers who know how to use programming languages to achieve a goal, rather than thinking about the atomic constituents of computer science.

They’re incredibly useful languages, but ones that computer scientists generally disdain, because they’re not theoretically pure or beautiful.

For example, R does a lot of things that are very unusual among programming languages, and some of them could be considered mistakes, but a lot of them exist because R is trying to achieve a particular objective, and was thus designed following specific and sensible constraints.

You should not make that decision based on the technical merits of each language, but instead based on the community of people who use it and are trying to solve problems like yours.

The community of people using Scheme today is small, and somewhat esoteric, but there are interesting ideas to be learned in the language anyway.

When you identify a new problem, it helps you to come up with ideas, for example to use breadth-first search, or a binary tree, etc.
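
For readers who haven’t met it, a minimal breadth-first search looks something like the sketch below (Python, over a hypothetical adjacency-list graph; this is an illustration, not code from the interview).

```python
from collections import deque

def bfs_shortest_path(graph, start, goal):
    """Breadth-first search: explore the graph level by level and
    return the first path found, which is shortest in edge count."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no path exists

# Hypothetical graph: adjacency lists keyed by node name.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
print(bfs_shortest_path(graph, "A", "E"))  # ['A', 'B', 'D', 'E']
```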

Similarly, is it important for data scientists to study those topics, not necessarily because they’ll need to use them often, but in order to acquire the intuition that something requiring n log n computations is preferable to something requiring n² ones?

A lot of statistical theory is about measuring what happens to mathematical properties when the sample size n goes to infinity, without thinking about what then happens to computational properties.

But if your algorithm needs n² computations, it doesn’t matter if n goes to infinity, because you’ll never be able to compute that.
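
A quick way to feel that intuition is to compare a quadratic and an n log n answer to the same question, such as “does this list contain a duplicate?” The sketch below is illustrative Python, not something from the interview.

```python
def has_duplicates_quadratic(xs):
    """Compare every pair: roughly n**2 / 2 comparisons."""
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            if xs[i] == xs[j]:
                return True
    return False

def has_duplicates_nlogn(xs):
    """Sort first (n log n), then scan adjacent items once (n)."""
    xs = sorted(xs)
    return any(a == b for a, b in zip(xs, xs[1:]))

# On a million items the quadratic version needs about 5 * 10**11 comparisons,
# while the sort-based one needs on the order of 2 * 10**7 operations.
```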

It really helped me on my journey as a software engineer, to be able to write quality code day in and day out, and be confident that it’s going to work correctly.

It’s something that we never really talked about in my computer science education, and it’s certainly something that statisticians rarely think about.

The idea of unit tests is the same as double-entry bookkeeping: if you record everything in two places, the chances of you making a mistake in both places on the same item are very low.
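
In Python’s built-in unittest, the “second entry in the ledger” might look like the sketch below; rescale is a hypothetical helper invented for illustration.

```python
import unittest

def rescale(xs):
    """Hypothetical helper: linearly rescale values to the range [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

class TestRescale(unittest.TestCase):
    # The tests record the expected behaviour in a second place,
    # like the second column of a double-entry ledger.
    def test_endpoints(self):
        self.assertEqual(rescale([2, 4, 6]), [0.0, 0.5, 1.0])

    def test_constant_input_raises(self):
        # All-equal input makes hi - lo zero, which this helper does not handle.
        with self.assertRaises(ZeroDivisionError):
            rescale([5, 5, 5])

if __name__ == "__main__":
    unittest.main()
```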

So the easier it is to read your code and understand what’s going on, the easier it will be to add new features in the future.

The third part would be to make sure that it’s fast enough, so that it doesn’t become a bottleneck.

It can be easy (and fun!) to get carried away with this, and obsess over writing code that’s exponentially faster.

The important thing is to make sure that nothing is overly slowing down execution, to the point of interrupting the flow of your analysis, or meaning that your program has to run overnight.

This book gave me the tools to analyze a text and identify the reasons why it doesn’t work, for example stating the topic of a paragraph only in the middle of it.

“Writing well and describing things well is very valuable to a good programmer, and even more to a data scientist.” Knowing how to write clearly helps you to write code clearly, and it also helps you to write good documentation and explain the intent of what you’re doing.

It doesn’t matter how wonderful your data analysis is, if you can’t explain to somebody else what you’ve done, why it makes sense, and what to take away from it.

You make this distinction between writing for computers and writing for humans, but one of the characteristics of your work has been to use elements of style and clarity to enhance the R language.

You often talk about the importance of semantics and grammar in code, for example in ggplot2, your data visualization package that’s based on the Grammar of Graphics.

It’s also visible in the way that the tidyverse has completely changed the way data scientists write code in R, including the iconic ‘pipe’.

It talks about the idea of writing a small language inside another language, to express ideas in a specific domain, and the idea of ‘fluent’ interfaces, that you can read and write as if they were human language.

There have actually been attempts, for example by Apple, to write programming languages that were exactly like human language, which I think is a mistake because human language is terribly inefficient, and relies on things like tone and body language to clarify ambiguity.

It can take simple forms, like thinking of functions as verbs, and objects as nouns, so you can draw on the grammatical intuition that comes from human language.
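
A toy sketch of what that looks like in practice: a fluent interface in Python whose verb-like methods chain on a noun-like object. All of the names below are invented for illustration, not an existing library.

```python
class Plot:
    """Toy fluent interface: each verb returns self, so calls chain
    and read roughly like a sentence (names are invented)."""

    def __init__(self, data):
        self.data = data
        self.steps = []

    def filter(self, predicate):
        self.steps.append(("filter", predicate))
        return self

    def summarise(self, how):
        self.steps.append(("summarise", how))
        return self

    def draw(self):
        print(f"drawing {len(self.steps)} steps over {len(self.data)} rows")
        return self

# Reads as: take the data, filter it, summarise it, draw it.
Plot(data=[1, 2, 3, 4]).filter(lambda x: x > 1).summarise("mean").draw()
```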

Of course it raises many problems, the biggest one being that 75% of the resources available on sites like StackOverflow are in English, so the answers wouldn’t be universal anymore.

I’m very interested to see where that goes, and how useful it can be to aspiring data scientists everywhere, especially when R is quickly democratizing access to the subject, well beyond the academic world.

Soon We Won't Program Computers. We'll Train Them Like Dogs

Before the invention of the computer, most experimental psychologists thought the brain was an unknowable black box.

The so-called cognitive revolution started small, but as computers became standard equipment in psychology labs across the country, it gained broader acceptance.

By the late 1970s, cognitive psychology had overthrown behaviorism, and with the new regime came a whole new language for talking about mental life.

Psychologists began describing thoughts as programs, ordinary people talked about storing facts away in their memory banks, and business gurus fretted about the limits of mental bandwidth and processing power in the modern workplace.

As software has eaten the world, to paraphrase venture capitalist Marc Andreessen, we have surrounded ourselves with machines that convert our actions, thoughts, and emotions into data—raw material for armies of code-wielding engineers to manipulate.

Facebook's Mark Zuckerberg has gone so far as to suggest there might be a “fundamental mathematical law underlying human relationships that governs the balance of who and what we all care about.” In 2013, Craig Venter announced that, a decade after the decoding of the human genome, he had begun to write code that would allow him to create synthetic organisms.

“It is becoming clear,” he said, “that all living cells that we know of on this planet are DNA-software-driven biological machines.” Even self-help literature insists that you can hack your own source code, reprogramming your love life, your sleep routine, and your spending habits.

(In Bloomberg Businessweek, Paul Ford was slightly more circumspect: “If coders don't run the world, they run the things that run the world.” Tomato, tomahto.) But whether you like this state of affairs or hate it—whether you're a member of the coding elite or someone who barely feels competent to futz with the settings on your phone—don't get used to it.

This approach is not new—it's been around for decades—but it has recently become immensely more powerful, thanks in part to the rise of deep neural networks, massively distributed computational systems that mimic the multilayered connections of neurons in the brain.

In February the company replaced its longtime head of search with machine-learning expert John Giannandrea, and it has initiated a major program to retrain its engineers in these new techniques.

“By building learning systems,” Giannandrea told reporters this fall, “we don't have to write these rules anymore.” But here's the thing: With machine learning, the engineer never knows precisely how the computer accomplishes its tasks.

And as these black boxes assume responsibility for more and more of our daily digital tasks, they are not only going to change our relationship to technology—they are going to change how we think about ourselves, our world, and our place within it.

Rubin is excited about the rise of machine learning—his new company, Playground Global, invests in machine-learning startups and is positioning itself to lead the spread of intelligent devices—but it saddens him a little too.

You can't cut your head off and see what you're thinking.” When engineers do peer into a deep neural network, what they see is an ocean of math: a massive, multilayer set of calculus problems that—by constantly deriving the relationship between billions of data points—generate guesses about the world.

They largely ignored, even vilified, early proponents of machine learning, who argued in favor of plying machines with data until they reached their own conclusions.

For the past two decades, learning to code has been one of the surest routes to reliable employment—a fact not lost on all those parents enrolling their kids in after-school code academies.

“I was pointing out how different programming jobs would be by the time all these STEM-educated kids grow up.” Traditional coding won't disappear completely—indeed, O'Reilly predicts that we'll still need coders for a long time yet—but there will likely be less of it, and it will become a meta skill, a way of creating what Oren Etzioni, CEO of the Allen Institute for Artificial Intelligence, calls the “scaffolding” within which machine learning can operate.

Just as Newtonian physics wasn't obviated by the discovery of quantum mechanics, code will remain a powerful, if incomplete, tool set to explore the world.

If the rise of human-written software led to the cult of the engineer, and to the notion that human experience can ultimately be reduced to a series of comprehensible instructions, machine learning kicks the pendulum in the opposite direction.

Over the past few years, as networks have grown more intertwined and their functions more complex, code has come to seem more like an alien force, the ghosts in the machine ever more elusive and ungovernable.

“One can imagine such technology outsmarting financial markets, out-inventing human researchers, out-manipulating human leaders, and developing weapons we cannot even understand,” wrote Stephen Hawking—sentiments echoed by Elon Musk and Bill Gates, among others.

But discoveries in the field of epigenetics suggest that genetic material is not in fact an immutable set of instructions but rather a dynamic set of switches that adjusts depending on the environment and experiences of its host.

Venter may believe cells are DNA-software-driven machines, but epigeneticist Steve Cole suggests a different formulation: “A cell is a machine for turning experience into biology.” And now, 80 years after Alan Turing first sketched his designs for a problem-solving machine, computers are becoming devices for turning experience into technology.

We analyzed thousands of coding interviews. Here’s what we learned.

Either way, the first step in the interviewing process is usually to read a bunch of online interview guides (especially if they’re written by companies you’re interested in) and to chat with friends about their experiences with the interviewing process (both as an interviewer and interviewee).

As a result, we’re able to collect quite a bit of interview data and analyze it to better understand technical interviews, the signal they carry, what works and what doesn’t, and which aspects of an interview might actually matter for the outcome.

Each interview, whether it’s practice or real, starts with the interviewer and interviewee meeting in a collaborative coding environment with voice, text chat, and a whiteboard, at which point they jump right into a technical question.

During these interviews, we collect everything that happens, including audio transcripts, data and metadata describing the code that the interviewee wrote and tried to run, and detailed feedback from both the interviewer and interviewee about how they think the interview went and what they thought of each other.

If you’re curious, you can see what the feedback forms for interviewers and interviewees look like below — in addition to one direct yes/no question, we also ask about a few different aspects of interview performance using a 1–4 scale.

Before getting into the thick of it, it’s worth noting that the conclusions below are based on observational data, which means we can’t make strong causal claims… but we can still share surprising relationships we’ve observed and explain what we found so you can draw your own conclusions.

Although we don’t do a pairwise comparison for every possible pair of languages, the data below suggest that generally speaking, there aren’t statistically significant differences between the success rate when interviews are conducted in different languages.
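
To make the shape of that comparison concrete, here is a sketch, in Python with entirely made-up counts (not interviewing.io’s actual data or code), of how pass rates by language could be checked with a chi-square test of independence.

```python
# Illustrative only: hypothetical pass/fail counts per interview language.
from scipy.stats import chi2_contingency

# Rows are languages, columns are [passed, failed].
observed = [
    [120, 180],  # Python
    [ 80, 130],  # Java
    [ 60,  95],  # C++
    [ 45,  70],  # JavaScript
]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
# A large p-value would be consistent with the article's observation that
# success rates do not differ significantly across languages.
```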

Learnable Programming

Fortunately, there are giant shoulders to stand on here -- programming systems that were carefully and beautifully designed around the way people think and learn.

In Logo, the programmer draws pictures by directing the ‘turtle’, an onscreen character which leaves a trail as it moves. [Video: Seymour Papert explaining the Logo turtle to a group of children.]

For example, to figure out how to draw a circle, a learner will walk around in circles for a bit, and quickly derive a 'circle procedure' of taking a step forward, turning a bit, taking a step forward, turning a bit.

After teaching it to herself, the learner can then teach it to the computer. (Here, the learner has derived and implemented the differential equation for a circle, without knowing what a differential equation is.)
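
Python still ships a descendant of Logo’s turtle in its standard library, so the learner’s step-and-turn recipe translates almost literally. This is a sketch, not code from the essay.

```python
import turtle

t = turtle.Turtle()

# The learner's "circle procedure": take a small step forward, turn a bit,
# and repeat until you have turned all the way around.
for _ in range(360):
    t.forward(1)
    t.left(1)

turtle.done()
```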

The turtle is the in-computer embodiment of the programmer herself, a 'self', like the player-character in a video game, and thereby allows the learner to transfer her knowledge of her own body into knowledge of programming.

It simply evolved as a thin layer over the metaphors used in the underlying machine architecture, such as ‘storing to memory’. (Alan Kay, in The Early History of Smalltalk: ‘Assignment statements -- even abstract ones -- express very low-level goals...’)

In order to program the behavior of an object, the programmer casts herself into the role of that object (to the extent of referring to the object as 'self'!) and thinks of herself as carrying on a conversation with other objects.

Unlike a typical programming language, where an 'object' is an abstract ethereal entity floating inside the computer, every object in HyperCard has a 'physical presence' -- it has a location on a particular card, it can be seen, it can be interacted with.

Everything is visible and tangible -- electricity is not some abstract voltage reading, but can be seen directly as orange fire, flowing through wires.

Because this metaphor carries no computational power (you cannot compute by filling in pixels), all computation occurs outside the bounds of the metaphor.

In this example of a bouncing-ball animation, the simulated properties of the ball (position, velocity) are not associated with the picture of the ball onscreen.

They are computed and stored abstractly as 'numbers' in 'variables', and the ball is merely a shadow that is cast off by this ethereal internal representation.

To draw a face consisting of four circles, we can teach the turtle a subprocedure for drawing a circle, and then apply that subprocedure four times.
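
Continuing the same standard-library turtle sketch, the decomposition reads as one circle subprocedure applied four times at different positions; the positions and sizes below are arbitrary.

```python
import turtle

t = turtle.Turtle()

def circle(steps=72, step_len=3, turn=5):
    """Subprocedure: draw one circle with the step-and-turn recipe."""
    for _ in range(steps):
        t.forward(step_len)
        t.left(turn)

def move_to(x, y):
    """Reposition the turtle without drawing."""
    t.penup()
    t.goto(x, y)
    t.pendown()

# A face as four circles: one big head and three small features
# (the coordinates are arbitrary).
for x, y, size in [(0, -100, 3), (-40, 40, 1), (40, 40, 1), (0, -40, 1)]:
    move_to(x, y)
    circle(step_len=size)

turtle.done()
```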

For the first time I thought of the whole as the entire computer, and wondered why anyone would want to divide it up into weaker things called data structures and procedures.

In his wonderful essay Why Functional Programming Matters, John Hughes argues that decomposition lies at the heart of the power of languages like Haskell: ‘When writing a modular program to solve...’

Because all source code, if any, is embedded in individual objects in the form of scripts, and because scripts use loose, relative references to other objects, groups of related objects can be transplanted much more easily and successfully than in other systems. (HyperCard is seen by some as ‘what the web should have been’.)

Any user can remix their software with copy and paste, thereby subtly transitioning from user to creator, and often eventually from creator to programmer.

The programmer cannot simply grab a friend's bouncing ball and place it alongside her own bouncing ball -- variables must be renamed or manually encapsulated;

You copy some red text from a website, paste it into your email, and everything else in your email turns red. This is exactly what can happen when copying and pasting lines of Processing code, because Processing’s way of handling color is inherently leaky. Experienced programmers might look at this example and consider this a programmer’s error, because this is ‘just how code works.’
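
The leak is easy to reproduce in miniature. The sketch below invents a toy Processing-style drawing API in Python whose current fill colour lives in hidden global state, so a pasted-in snippet silently recolours everything drawn after it; the API is made up for illustration.

```python
# Invented toy drawing API with hidden state: the current fill colour is a
# module-level global, so any pasted-in snippet that sets it changes
# everything drawn afterwards.
_current_fill = "black"

def fill(colour):
    global _current_fill
    _current_fill = colour

def rect(x, y, w, h):
    print(f"rect at ({x},{y}) size {w}x{h} filled {_current_fill}")

# Your code:
rect(0, 0, 10, 10)      # black, as intended

# A friend's snippet, pasted in verbatim:
fill("red")
rect(20, 0, 10, 10)     # red, as the friend intended

# Your next shape now comes out red too -- the paste leaked state:
rect(40, 0, 10, 10)
```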

Designing a system that supports recomposition demands long and careful thought, and design decisions that make programming more convenient for individuals may be detrimental to social creation.

Below are four array methods from Apple’s Cocoa framework, and the equivalent JavaScript methods. Cocoa follows strong grammatical conventions which immediately convey the meanings of methods.

Many verbs, such as ‘fill’ and ‘stroke’, do not. The programmer constructs a color using a noun (‘color’), and constructs an image using a verb (‘createImage’).

The Coming Software Apocalypse

Still, most software, even in the safety-obsessed world of aviation, is made the old-fashioned way, with engineers writing their requirements in prose and programmers coding them up in a programming language like C.

Margaret Hamilton, a celebrated software engineer on the Apollo missions—in fact the coiner of the phrase “software engineering”—told me that during her first year at the Draper lab at MIT, in 1964, she remembers a meeting where one faction was fighting the other about transitioning away from “some very low machine language,” as close to ones and zeros as you could get, to “assembly language.” “The people at the lowest level were fighting to keep it.

No wonder, he said, that “people are not so easily transitioning to model-based software development: They perceive it as another opportunity to lose control, even more than they have already.” The bias against model-based design, sometimes known as model-driven engineering, or MDE, is in fact so ingrained that according to a recent paper, “Some even argue that there is a stronger need to investigate people’s perception of MDE than to research new MDE technologies.” Which sounds almost like a joke, but for proponents of the model-based approach, it’s an important point: We already know how to make complex software reliable, but in so many places, we’re choosing not to.

“Human intuition is poor at estimating the true probability of supposedly ‘extremely rare’ combinations of events in systems operating at a scale of millions of requests per second,” he wrote in a paper.

the code faithfully implements the intended design, but the design fails to correctly handle a particular ‘rare’ scenario.” Newcombe was convinced that the algorithms behind truly critical systems—systems storing a significant portion of the web’s data, for instance—ought to be not just good, but perfect.

This is why he was so intrigued when, in the appendix of a paper he’d been reading, he came across a strange mixture of math and code—or what looked like code—that described an algorithm in something called “TLA+.” The surprising part was that this description was said to be mathematically precise: An algorithm written in TLA+ could in principle be proven correct.

That is, before you write any code, you write a concise outline of your program’s logic, along with the constraints you need it to satisfy (say, if you were programming an ATM, a constraint might be that you can never withdraw the same money twice from your checking account).
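
TLA+ itself is beyond this excerpt, but the flavour of “state the constraint, then check every reachable behaviour” can be sketched in plain Python by brute-forcing the interleavings of two non-atomic withdrawals and checking the ATM-style constraint. Everything below is an illustrative toy, not TLA+ and not how Amazon used it.

```python
from itertools import permutations

INITIAL = 100

def make_steps(pid):
    """Two non-atomic steps for a withdrawal of 100 by process `pid`:
    (1) read the balance, (2) if the read value was sufficient,
    write back read - 100 and record the payout."""
    def read(state):
        state[f"read_{pid}"] = state["balance"]
    def write(state):
        if state[f"read_{pid}"] >= 100:
            state["balance"] = state[f"read_{pid}"] - 100
            state["paid_out"] += 100
    return [(pid, read), (pid, write)]

def interleavings(a, b):
    """All orderings of a + b that keep each process's own steps in order."""
    schedules = set()
    for perm in permutations(a + b):
        if ([s for s in perm if s[0] == "A"] == a and
                [s for s in perm if s[0] == "B"] == b):
            schedules.add(perm)
    return schedules

violations = 0
for schedule in interleavings(make_steps("A"), make_steps("B")):
    state = {"balance": INITIAL, "paid_out": 0}
    for _, step in schedule:
        step(state)
    # The constraint: you can never pay out more money than the account held,
    # i.e. you can never withdraw the same money twice.
    if state["paid_out"] > INITIAL:
        violations += 1

print(f"interleavings violating the constraint: {violations}")
```

Exhaustively checking every ordering is exactly the kind of “supposedly extremely rare combination of events” that human intuition misses and that a model checker enumerates mechanically.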

Code makes you miss the forest for the trees: It draws your attention to the working of individual pieces, rather than to the bigger picture of how your program fits together, or what it’s supposed to do—and whether it actually does what you think.

Because they never learned it.” Lamport sees this failure to think mathematically about what they’re doing as the problem of modern software development in a nutshell: The stakes keep rising, but programmers aren’t stepping up—they haven’t developed the chops required to handle increasingly complex problems.

“In the 15th century,” he said, “people used to build cathedrals without knowing calculus, and nowadays I don’t think you’d allow anyone to build a cathedral without knowing calculus.

And I would hope that after some suitably long period of time, people won’t be allowed to write programs if they don’t understand these simple things.” Newcombe isn’t so sure that it’s the programmer who is to blame.

For one thing, he said that when he was introducing colleagues at Amazon to TLA+ he would avoid telling them what it stood for, because he was afraid the name made it seem unnecessarily forbidding: “Temporal Logic of Actions” has exactly the kind of highfalutin ring to it that plays well in academia, but puts off most practicing programmers.

“They google, and they look on Stack Overflow” (a popular website where programmers answer each other’s technical questions) “and they get snippets of code to solve their tactical concern in this little function, and they glue it together, and iterate.” “And that’s completely fine until you run smack into a real problem.” In the summer of 2015, a pair of American security researchers, Charlie Miller and Chris Valasek, convinced that car manufacturers weren’t taking software flaws seriously enough, demonstrated that a 2014 Jeep Cherokee could be remotely controlled by hackers.

They took advantage of the fact that the car’s entertainment system, which has a cellular connection (so that, for instance, you can start your car with your iPhone), was connected to more central systems, like the one that controls the windshield wipers, steering, acceleration, and brakes (so that, for instance, you can see guidelines on the rearview screen that respond as you turn the wheel).

And while some of this code—for adaptive cruise control, for auto braking and lane assist—has indeed made cars safer (“The safety features on my Jeep have already saved me countless times,” says Miller), it has also created a level of complexity that is entirely new.

“I think the autonomous car might push them,” Ledinot told me—“ISO 26262 and the autonomous car might slowly push them to adopt this kind of approach on critical parts.” (ISO 26262 is a safety standard for cars published in 2011.) Barr said much the same thing: In the world of the self-driving car, software can’t be an afterthought.

Functional programming in Javascript is an antipattern

After a few months writing Clojure I began writing Javascript again.

I don’t think there’s a way to avoid this kind of thinking when writing Javascript using any combination of React, Redux, ImmutableJS, lodash, and functional programming libraries like lodash/fp and ramda.

I need the following in my head at all times. If I manage to keep that in my head, I still run into a tangle of questions like the ones above.

Some functions return new values, while others mutate the existing ones.
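
That split isn’t unique to lodash; Python’s own standard library makes the same distinction, which gives a compact illustration of the mental overhead being described (a sketch, not from the article).

```python
numbers = [3, 1, 2]

# list.sort() mutates in place and returns None:
result = numbers.sort()
print(result)    # None
print(numbers)   # [1, 2, 3]

# sorted() returns a new list and leaves the original alone:
numbers = [3, 1, 2]
result = sorted(numbers)
print(result)    # [1, 2, 3]
print(numbers)   # [3, 1, 2]

# The same split again: list.reverse() mutates, reversed() returns an iterator.
```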

So is the fact that presumably, on top of Javascript, I need to know lodash, its function names, its signatures, its return values.

I have to know more function names, signatures, return values, and import more basic functions.

As a result of learning its API and its overall way of thinking, I can work out how to solve a problem in 2 seconds using Immutable.

As with ramda and lodash, I now have a larger set of functions I need to know about: what they return, their signatures, their names.

I also need to divide all the functions I know about into two categories: ones that work with Immutable, and ones that don’t.

I don’t have exact figures, but I think it’s safe to say I could be more productive if I didn’t have to wonder things like “What function can I use here?” and “Should I mutate this variable?” They have nothing to do with the problem I’m trying to solve, or the feature I’m trying to implement.

The only way I can find to avoid this is to not go down the path in the first place — don’t use ImmutableJS, immutable data structures, immutable data as a concept in Redux/React, or ramda, or lodash.

If you identify or agree at all with what I’ve said (and if you don’t, that’s fine), then I think it’s worth 5 minutes, a day, or even a week to consider: What might be the long-term costs of staying on the Javascript path versus taking a different one?

It was designed from the ground up as a functional programming language that operates on immutable data structures.

And for whatever it’s worth, the 2017 StackOverflow survey found Clojure developers are the highest paid of all languages on average worldwide.

That perception may exist, but there are Clojurescript equivalents to everything in Javascript: re-frame is Redux, reagent is React, figwheel is Webpack/hot reloading, leiningen is yarn/npm, Clojurescript is Underscore/Lodash.

You can add Clojurescript to an existing codebase, rewrite old code one file at a time, and continue to interact with the old code from the new.

Solving Programming Problems

Get the Code Here: To finish off my Java Algorithm tutorial, I thought it would be interesting to cover solving programming problems in ..

Top 5 Programming Languages to Learn to Get a Job at Google, Facebook, Microsoft, etc.

Which programming language to learn first? Watch this video to find out! In this video, I talk about the top 5 programming languages I'd recommend for you to ...

The First Programming Languages: Crash Course Computer Science #11

Get your first two months of CuriosityStream free by going to and using the promo code “crashcourse”. So we ended last ..

Coding is not difficult | Mark Zuckerberg

It's a code.org short film on the need for teaching coding in schools. Listen to big techies like Mark Zuckerberg, Bill Gates, and many other giants explain the importance ...

What Programming Language Should I Learn First?

Start learning python by building projects in under 5 minutes TODAY – Even if you're a complete beginner... READ FULL ..

The basics of BASIC, the programming language of the 1980s.

In this episode, 4 vintage computer enthusiasts ..

Elon Musk On Programming

Elon Musk shares why and how he got into computer programming at a young age.

Syntax Vs Semantics - Programming Languages

This video is part of an online course, Programming Languages.

0.2: How and why should you learn Programming? - Processing Tutorial

This video discusses some reasons why you might want to learn to program the computer, and some ways you might learn.

Which programming language should you learn first?

Software engineer Preethi Kasireddy answers your questions! Question: “I want to get started with programming but I don't know where to start.”