AI News, scikit-learn video #1: Intro to machine learning with scikit-learn

As a data science instructor and the founder of Data School, I spend a lot of my time figuring out how to distill complex topics like 'machine learning' into small, hands-on lessons that aspiring data scientists can use to advance their data science skills.

As a practitioner of machine learning, there's a lot to like about scikit-learn: It provides a robust set of machine learning models with a consistent interface, all of the functionality is thoughtfully designed and organized, and the documentation is thorough and well-written.

However, I personally believe that getting started with machine learning in scikit-learn is more difficult than in a language such as R. In R, getting started with your first model is easy: read your data into a data frame, use a built-in model (such as linear regression) along with R's easy-to-read formula language, and then review the model's summary output.

My primary goal with this video series, 'Introduction to machine learning with scikit-learn', is to help motivated individuals to gain a thorough grasp of both machine learning fundamentals and the scikit-learn workflow.

(The series does presume basic familiarity with Python, though next week I'll suggest some resources for learning Python if you're new to the language.) For those who successfully master the basics (or are already intermediate-level scikit-learn users), my secondary goal is to dive into more advanced functionality later in the series.

Introduction to machine learning in Python with scikit-learn (video series)

In the data science course that I teach for General Assembly, we spend a lot of time using scikit-learn, Python's library for machine learning.

I love teaching scikit-learn, but it has a steep learning curve, and my feeling is that there are not many scikit-learn resources that are targeted towards machine learning beginners.

Thus I decided to create a series of scikit-learn video tutorials, which I launched in April in partnership with Kaggle, the leading online platform for competitive data science!

My goal with this series is to help motivated individuals to gain a thorough grasp of both machine learning fundamentals and the scikit-learn workflow.

And although the series does assume that you have some familiarity with Python, the second video contains my suggested resources for learning Python if you're just getting started with the language.

Introduction to machine learning with scikit-learn

This video series will teach you how to solve machine learning problems using Python's popular scikit-learn library.

The original notebooks (shown in the video) used Python 2.7 and scikit-learn 0.16, and can be downloaded from the archive branch.

Once you complete this video series, I recommend enrolling in my online course, Machine Learning with Text in Python, to gain a deeper understanding of scikit-learn and Natural Language Processing.

Getting started in scikit-learn with the famous iris dataset

We'll explore the famous 'iris' dataset, learn some important machine learning terminology, and discuss the four key requirements for working with data in scikit-learn. This is the third video in the series 'Introduction to machine learning with scikit-learn'.
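As a minimal sketch of what that exploration might look like, the snippet below loads the copy of the iris dataset bundled with scikit-learn and inspects its features and response. The checks shown (separate feature and response objects, numeric values, NumPy arrays, matching shapes) are an assumption about what such requirements typically cover, not a quote from the video.

```python
from sklearn.datasets import load_iris

# Load the copy of the iris dataset that ships with scikit-learn
iris = load_iris()

X = iris.data    # feature matrix: one row per flower, one column per measurement
y = iris.target  # response vector: one integer-coded species per flower

# Features and response are separate NumPy arrays with compatible shapes
print(type(X).__name__, X.shape)
print(type(y).__name__, y.shape)

# Human-readable names for the columns and for the integer class codes
print(iris.feature_names)
print(iris.target_names)
```

The first dimension of `X` and the length of `y` must agree, since each row of features needs exactly one response value.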

22 must watch talks on Python for Deep Learning, Machine Learning & Data Science (from PyData 2017, Amsterdam)

Python is steadily gaining popularity among machine learning and data science communities across the world.

It probably has the most developed ecosystem for deep learning, a collection of excellent libraries such as pandas and scikit-learn, and a strong community.

Speaker: Roelof Pieters Duration: 00:33:45 hrs Roelof talks about the basics of deep learning against the backdrop of an explosion of research and experiments that deal with creativity and artificial intelligence.

He shows examples such as dance moves, freestyle raps, and impressionist paintings, along with some of the exciting possibilities that new technologies offer for creative use and for exploring human-machine interaction, where the main idea is “augmentation, not automation”.

Speaker : Maciej Kula Duration : 00:32:55 hrs Neural networks are increasingly replacing other machine learning algorithms in real-life systems, and recommendation systems are no exception.

In this tutorial, the speaker starts from the advantages of neural networks in recommender systems and goes through various machine learning models used in recommender systems including Factorization models, Bilinear Neural Networks and sampled loss functions.

Speaker : Carsten van Weelden, Beata Nyari Duration : 00:29:42 hrs In this talk, the speakers explain how they solved the problem of classifying job titles into a job ontology with more than 5000 different classes.

They explain in detail the challenges they faced while approaching the problem and the hardware they use, and then define their pipeline technically, end to end.

Speaker : Dafne van Kuppevelt Duration : 00:22:47 hrs Deep learning is a state of the art method for many tasks, such as image classification and object detection.

For researchers who have time series data but are not experts in deep learning, the barrier to start using deep learning can be high.

The speaker then explains mcfly, an open-source Python library that helps machine learning novices explore the value of deep learning for time series data.

Speaker : Maxim Lapan Duration : 00:28:27 hrs In this talk the speaker gives a practical introduction to deep reinforcement learning methods, which are used in complex applications such as control problems in robotics, playing Atari games, self-driving car control and many more.

Deep Reinforcement Learning is a very hot topic, successfully applied in lots of areas which require planning of actions in complex, noisy and partially-observed environments.

Concrete examples range from playing arcade games and navigating websites to helicopter, quadrocopter and car control, protein folding and many others.

He explains different ways to scale your tasks on top of these technologies, such as data munging in Spark and model building in H2O, or using a mix of both for data munging and model building.

“Vaex” can calculate statistics at a rate of a billion samples per second, and “ipyvolume” makes it possible to interactively visualise and explore these billion-sample tables in high-dimensional spaces.

ipyvolume helps visualize higher-dimensional data interactively in the notebook; it can render 3D volumes and up to a million glyphs (scatter plots and quiver) in the (Jupyter) notebook as a widget.

Vaex and ipyvolume can be used together to explore and visualize any large tabular data set, or separately to calculate statistics and to render 3D plots in the notebook and outside.

He also points out that it can be very hard to estimate the conversion just by looking at the numbers, especially when the company is growing exponentially.

The video explains how to build machine learning models using AWS and Python on sensor data after suitable preprocessing; the models can then be used to predict significant information about time series data.

In this tutorial, Gilles Louppe demonstrates Bayesian optimization using a newly built package, Scikit-optimize, which provides an easy-to-use set of tools for the purpose.

Speaker : Giovanni Lanzani Duration : 00:35:13 hrs With the data science and machine learning industry growing at a fast pace and companies incorporating these self-learning tools into their businesses, we always strive to develop the best models with the highest achievable accuracy.

Speaker : Ruben Mak Duration : 00:38:51 hrs A/B testing is a very good way for a business to test which variant of a product performs best and, in turn, to improve the business outcome.

After briefly discussing the frequentist calculations behind an A/B test and their common problems, he uses this to explain Bayesian statistics, and more specifically hierarchical Bayes, as a way to further reduce the probability of making errors in multiple comparisons.
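To give a feel for the Bayesian view of an A/B test, here is a minimal sketch using only the Python standard library. It is a plain Beta-Binomial comparison of two variants with a uniform prior, which is much simpler than the hierarchical Bayes approach discussed in the talk; the function name `prob_b_beats_a` and the conversion counts are hypothetical and chosen only for illustration.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Estimate P(rate_B > rate_A) by sampling from each variant's posterior.

    With a uniform Beta(1, 1) prior, the posterior of a conversion rate after
    observing `conv` conversions in `n` visits is Beta(1 + conv, 1 + n - conv).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws


# Hypothetical counts: variant A converts 120 of 2400 visitors, variant B 150 of 2400
p = prob_b_beats_a(conv_a=120, n_a=2400, conv_b=150, n_b=2400)
print(f"Estimated P(B beats A): {p:.3f}")
```

The result is a direct probability statement about the variants rather than a p-value, which is one reason the Bayesian framing is often easier to communicate.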

This tutorial discusses basic concepts of Natural Language Processing, such as vectorization of words, bag of words, and word count as binomial frequency, and shows how to derive insight from them using an example data set of 200,000 songs.

There are some very simple yet interesting insights about different languages, such as their most commonly used letters or whether a language uses longer or shorter words to express feelings.
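The bag-of-words idea from this talk can be sketched with the standard library alone. The three short "lyrics" below are made-up toy strings, not taken from the 200,000-song data set, and the whitespace tokenizer is a deliberate simplification; a library such as scikit-learn offers more robust vectorizers.

```python
from collections import Counter

# Made-up toy documents standing in for song lyrics
lyrics = [
    "the sun is up",
    "the moon is up and the sun is down",
    "sing sing sing",
]

# Naive tokenization: lowercase and split on whitespace
tokenized = [doc.lower().split() for doc in lyrics]

# The vocabulary fixes the column order of the document-term matrix
vocabulary = sorted({word for doc in tokenized for word in doc})


def vectorize(tokens):
    """Return the word counts of one document, ordered by the vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


# One row per document, one column per vocabulary word
matrix = [vectorize(doc) for doc in tokenized]

print(vocabulary)
for row in matrix:
    print(row)
```

Word order is discarded: each document is represented only by how often each vocabulary word occurs in it.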

Scikit Learn Tutorial | Machine Learning with Python | Python for Data Science Training | Edureka

Python Certification Training for Data Science : This Edureka video on "Scikit-learn Tutorial" introduces you to machine learning ..

Scikit Learn Machine Learning SVM Tutorial with Python p. 2 - Example

In this machine learning tutorial, we cover a very basic, yet powerful example of machine learning for image recognition. The point of this video is to get you ...

Data science in Python: pandas, seaborn, scikit-learn

In this video, we'll cover the data science pipeline from data ingestion (with pandas) to data visualization (with seaborn) to machine learning (with scikit-learn).

Classification using Pandas and Scikit-Learn

Skipper Seabold: This will be a tutorial-style talk demonstrating how to use ..

Introduction - Learn Python for Data Science #1

Welcome to the 1st Episode of Learn Python for Data Science! This series will teach you Python and Data Science at the same time! In this video we install ...

Machine Learning with Text in scikit-learn (PyData DC 2016)

Although numeric data is easy to work with in Python, most knowledge created by humans is actually raw, unstructured text. By learning how to transform text ...

Training a machine learning model with scikit-learn

Now that we're familiar with the famous iris dataset, let's actually use a classification model in scikit-learn to predict the species of an iris! We'll learn how the ...
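A minimal sketch of that workflow, assuming a K-nearest neighbors classifier as the model (the snippet above does not say which model the video uses): import the estimator, instantiate it, fit it to the iris data, and predict the species of a new observation. The measurements in `new_flower` are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Instantiate the estimator; n_neighbors=1 is an arbitrary choice for illustration
knn = KNeighborsClassifier(n_neighbors=1)

# Learn the relationship between the measurements and the species
knn.fit(X, y)

# Hypothetical measurements for one unseen flower, in the same
# column order as iris.feature_names
new_flower = [[3.0, 5.0, 4.0, 2.0]]
pred = knn.predict(new_flower)

# Map the integer class code back to a species name
print(iris.target_names[pred[0]])
```

The same import, instantiate, fit, predict pattern applies to other scikit-learn classifiers, which is what the "consistent interface" praised earlier refers to.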

Machine Learning for Time Series Data in Python | SciPy 2016 | Brett Naul

The analysis of time series data is a fundamental part of many scientific disciplines, but there are few resources meant to help domain scientists to easily explore ...

Regression Training and Testing - Practical Machine Learning Tutorial with Python p.4

Welcome to part four of the Machine Learning with Python tutorial series. In the previous tutorials, we got our initial data, we transformed and manipulated it a bit ...
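As a rough illustration of regression training and testing, the sketch below fits a linear regression on one part of a data set and scores it on a held-out part. The data is synthetic (a noisy straight line generated with NumPy), not the data used in the tutorial series, and the 80/20 split is an arbitrary assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: y = 3x + 2 plus Gaussian noise (for illustration only)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1.0, size=200)

# Hold out 20% of the rows so the model is scored on data it has not seen
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Train on the training split only
model = LinearRegression().fit(X_train, y_train)

# score() returns the coefficient of determination (R^2) on the test split
r2 = model.score(X_test, y_test)
print(f"R^2 on held-out data: {r2:.3f}")
print(f"Estimated slope: {model.coef_[0]:.2f}, intercept: {model.intercept_:.2f}")
```

Scoring on held-out rows gives a more honest estimate of how the model will behave on new data than scoring on the rows it was trained on.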