# AI News, The real prerequisite for machine learning isn’t math, it’s data analysis

## The real prerequisite for machine learning isn’t math, it’s data analysis

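When beginners get started with machine learning, the inevitable question is “What are the prerequisites? What do I need to know to get started?” And once they start researching, beginners frequently find well-intentioned but disheartening advice, like the following: you need to master math.

You need all of the following:

- Calculus
- Differential equations
- Mathematical statistics
- Optimization
- Algorithm analysis
- and, and, and ...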

If you’re intimidated by the math, I have some good news for you: in order to get started building machine learning models (as opposed to doing machine learning theory), you need less math background than you think (and almost certainly less math than you’ve been told that you need).

In fact, although you can get by without a masterful understanding of calculus and linear algebra, there are other prerequisites that you absolutely need to know (thankfully, the real prerequisites are much easier to master).

Moreover, the incentives shape the training of people entering academia: students in an academic environment are trained to be productive largely as scholars and researchers.

In an academic environment, individuals are rewarded (largely) for producing novel research, and in the context of ML, that truly does require a deep understanding of the mathematics that underlies machine learning and statistics.

Beginners often imagine that data scientists spend their days pensively standing at a whiteboard, scribbling math equations between sips of coffee.

If we’re talking about entry-level to intermediate-level data scientists, I’d estimate that they spend less than 5% of their time actually doing mathematics.

(And quite frankly, most entry-level data scientists won’t spend much of their time on ML.) When you build a model, you will spend very, very little time doing any math. But most data scientists do spend a huge amount of their time getting data, cleaning data, and exploring data.

If you’re a beginning practitioner (i.e., a hacker, coder, software engineer, or someone working as a data scientist in business and industry), you don’t need to know that much calculus, linear algebra, or other college-level math to get things done.

(Note that as this post continues, I’m going to use the term “data analysis” as a shorthand for “getting data, cleaning data, aggregating data, exploring data, and visualizing data.”)

Although at high levels there are some data scientists who need deep mathematical skill, at a beginning level – I repeat – you do not need to know calculus and linear algebra in order to build a model that makes accurate predictions.

Even if you use “off the shelf” tools like R’s caret and Python’s scikit-learn (tools that do much of the hard math for you), you won’t be able to make these tools work without a solid understanding of exploratory data analysis and data visualization.

It’s common knowledge among data scientists that “80% of your work will be data preparation.” This is true, although I want to clarify what this means. When people say that “80% of your work will be data preparation,” that’s a shorthand way of saying “80% of your work will be getting data (from databases, spreadsheets, flat files), performing exploratory data analysis, reshaping data, and visualizing data to find insights.”

While this figure is about data science in general, it also applies to machine learning specifically: when you’re building machine learning models, 80% of your time will be spent getting data, exploring it, cleaning it, and analyzing results (using data visualization).
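To make the “getting, cleaning, and exploring” loop a little more concrete, here is a minimal sketch using only Python’s standard library. Everything in it is an assumption made up for illustration: the flat-file contents, the column names (`region`, `units`, `price`), and the helper function names. Real projects would more likely use a data frame library, but the shape of the work is the same.

```python
import csv
import io
from collections import defaultdict

# Hypothetical flat-file contents; columns and values are invented for illustration.
RAW_CSV = """region,units,price
north,10,2.50
south,,3.00
north,4,2.75
 South ,6,3.10
east,abc,1.99
"""

def load_rows(text):
    """Get the data: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def clean_rows(rows):
    """Clean the data: normalize labels, drop rows with missing or malformed numbers."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append({
                "region": row["region"].strip().lower(),
                "units": int(row["units"]),
                "price": float(row["price"]),
            })
        except ValueError:
            # A blank or non-numeric field: skip the row rather than guess a value.
            continue
    return cleaned

def revenue_by_region(rows):
    """Explore the data: aggregate revenue (units * price) per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["units"] * row["price"]
    return dict(totals)

rows = load_rows(RAW_CSV)
clean = clean_rows(rows)
summary = revenue_by_region(clean)
print(f"kept {len(clean)} of {len(rows)} rows")
print(summary)
```

Notice that none of this involves calculus or linear algebra, yet a model trained on the uncleaned rows would be fed blank and malformed values.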

To be a little more blunt about it, if you don’t know calculus and linear algebra, you can still build useful models, but if you aren’t really proficient with data analysis, you’re screwed.

Many, if not most, of the best data scientists and model-builders I know at several Fortune 500 companies aren’t particularly masterful at calculus, linear algebra, or advanced math.

In particular, there are people at companies like Google and Facebook who are pushing the boundaries of machine learning – people working on bleeding edge tools.

I’ll write my full advice in another blog post, but I’ll briefly summarize it here: to get started learning practical machine learning, an entry-level data scientist needs to have basic comfort working with numbers, calculating percentages, etc.

However, when people tell you that you absolutely need to know calculus, differential equations, optimization theory, linear algebra, and more just to get started building machine learning models, this is flat out wrong.

If you’re working in R, then I recommend that you learn the following:

- ggplot2 for data visualization, including basic visualizations like scatterplots, histograms, and bar charts
- dplyr for aggregating and reshaping a dataset
- How to use ggplot2 and dplyr together for exploratory data analysis

If you’re working in Python, learn the following:

- Base Python
- Pandas, for aggregating and reshaping your data
- Matplotlib for data visualization. In particular, learn pyplot for basic visualizations, and use Seaborn for more advanced statistical graphics
- How to use Pandas and data visualizations together for exploratory data analysis
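As a rough idea of what “Pandas and data visualization together” can look like, here is a small sketch. The dataset, the column names (`store`, `ad_spend`, `sales`), and the choice of plots are all assumptions invented for illustration; it is one possible starting point, not a prescribed workflow.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical dataset, invented for illustration.
df = pd.DataFrame({
    "store": ["A", "A", "B", "B", "C", "C"],
    "ad_spend": [10.0, 12.0, 20.0, 22.0, 30.0, 33.0],
    "sales": [110.0, 125.0, 190.0, 215.0, 290.0, 310.0],
})

# Aggregate with pandas: mean ad spend and mean sales per store.
per_store = df.groupby("store")[["ad_spend", "sales"]].mean()

# Reshape with pandas: wide to long, one row per (store, metric) pair.
long_df = df.melt(id_vars="store", var_name="metric", value_name="value")

# Visualize with pyplot: a scatterplot and a histogram, side by side.
fig, (ax_scatter, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))
ax_scatter.scatter(df["ad_spend"], df["sales"])
ax_scatter.set_xlabel("ad_spend")
ax_scatter.set_ylabel("sales")
ax_hist.hist(df["sales"], bins=3)
ax_hist.set_xlabel("sales")
fig.tight_layout()

# Write the figure to an in-memory buffer; swap in a file path to keep the image.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")

print(per_store)
```

The scatterplot is where you would first notice whether `sales` moves with `ad_spend`, which is exactly the sort of question worth answering before you reach for a model.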

If you’re a beginner, and you want to get started with machine learning, you can get by without knowing calculus and linear algebra, but you absolutely can’t get by without data analysis.

## The Mathematics of Machine Learning

In the last few months, I have had several people contact me about their enthusiasm for venturing into the world of data science and using Machine Learning (ML) techniques to probe statistical regularities and build impeccable data-driven products.

Machine Learning theory is a field at the intersection of statistics, probability, computer science, and algorithms. It arises from learning iteratively from data and finding hidden insights that can be used to build intelligent applications.

Despite the immense possibilities of Machine and Deep Learning, a thorough mathematical understanding of many of these techniques is necessary for a good grasp of the inner workings of the algorithms and getting good results.

Finally, the main aim of this blog post is to give well-intentioned advice about the importance of Mathematics in Machine Learning, along with the necessary topics and useful resources for mastering them.

## Machine Learning

Supervised machine learning builds a model that makes predictions based on evidence in the presence of uncertainty.

A supervised learning algorithm takes a known set of input data and known responses to the data (output) and trains a model to generate reasonable predictions for the response to new data.

Common algorithms for performing classification include support vector machine (SVM), boosted and bagged decision trees, k-nearest neighbor, Naïve Bayes, discriminant analysis, logistic regression, and neural networks.

Common regression algorithms include linear models, nonlinear models, regularization, stepwise regression, boosted and bagged decision trees, neural networks, and adaptive neuro-fuzzy learning.
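The supervised learning loop described above (known inputs, known responses, then predictions for new data) can be sketched with k-nearest neighbor, one of the classification algorithms listed. This is a minimal standard-library sketch, not a production implementation; the function name `knn_predict`, the training points, and the labels are all hypothetical.

```python
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, new_point, k=3):
    """Predict a label for new_point by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_X, train_y),
        key=lambda pair: dist(pair[0], new_point),  # Euclidean distance to the new point
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical training data: known inputs with known responses.
train_X = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
train_y = ["small", "small", "small", "large", "large", "large"]

# New data the model has not seen before.
new_points = [(1.1, 1.0), (4.9, 5.1)]
predictions = [knn_predict(train_X, train_y, p) for p in new_points]
print(predictions)
```

Here the "training" step is simply storing the labeled examples; other algorithms in the list fit parameters instead, but the interface of learning from labeled data and then predicting on new data is the same.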

Mathematics of Machine Learning

Do you need to know math to do machine learning? Yes! The big 4 math disciplines that make up machine learning are linear algebra, probability theory, ...

How Machines Learn

How do all the algorithms around us learn to do their jobs?

5 Must Have Skills To Become Machine Learning Engineer

Hello Everyone!!! Let's check out what are the 5 must-have skills to become a machine learning engineer. First, let's understand what machine learning is.

Binary Numbers and Base Systems as Fast as Possible

Binary numbers, man... How do they work?

But what *is* a Neural Network? | Chapter 1, deep learning


Predicting Stock Prices - Learn Python for Data Science #4

In this video, we build an Apple Stock Prediction script in 40 lines of Python using the scikit-learn library and plot the graph using the matplotlib library.

How SVM (Support Vector Machine) algorithm works

In this video I explain how SVM (Support Vector Machine) algorithm works to classify a linearly separable binary data set. The original presentation is available ...

Learn Deep Learning in 6 Weeks

Deep Learning is the dark art of our times. Incredibly powerful, mysteriously accurate, and accessible to just about anyone. In this video, i've compiled an open ...

The 7 Steps of Machine Learning

How can we tell if a drink is beer or wine? Machine learning, of course! In this episode of Cloud AI Adventures, Yufeng walks through the 7 steps involved in ...

Intro - The Math of Intelligence

Welcome to The Math of Intelligence! In this 3 month course, we'll cover the most fundamental math concepts in Machine Learning. In this first lesson, we'll go ...