AI News, 13 Machine Learning Data Set Collections

With this tool, you can create a map display of scene locations with markers that show each scene’s metadata.

NASA NEX is a collaboration and analytical platform that combines state-of-the-art supercomputing, Earth system modeling, workflow management and NASA remote-sensing data.

Through NEX, users can explore and analyze large Earth science data sets, run and share modeling algorithms, collaborate on new or existing projects and exchange workflows and results within and among other science communities.

The 1000 Genomes Project is an international collaboration which has established the most detailed catalogue of human genetic variation, including SNPs, structural variants, and their haplotype context.

The final phase of the project sequenced more than 2,500 individuals from 26 different populations around the world and produced an integrated set of phased haplotypes with more than 80 million variants for these individuals.

The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples.

It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting.
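As a rough illustration of how little formatting work is involved, the sketch below parses the IDX binary layout in which the MNIST files are distributed: a big-endian header (magic number 2051 for images, 2049 for labels, followed by the dimension sizes) and then raw unsigned bytes. The function names `parse_idx_images` and `parse_idx_labels` are hypothetical, and the demo uses a tiny synthetic buffer as a stand-in, since the real files would first have to be downloaded and decompressed.

```python
import struct


def parse_idx_images(buf: bytes):
    """Parse an IDX image buffer into a list of images (each a list of pixel rows)."""
    # Header: magic number, image count, rows, columns (all big-endian uint32).
    magic, count, rows, cols = struct.unpack(">IIII", buf[:16])
    if magic != 2051:
        raise ValueError(f"unexpected image magic number: {magic}")
    pixels = buf[16:]
    if len(pixels) != count * rows * cols:
        raise ValueError("payload size does not match header")
    size = rows * cols
    images = []
    for i in range(count):
        flat = pixels[i * size:(i + 1) * size]
        images.append([list(flat[r * cols:(r + 1) * cols]) for r in range(rows)])
    return images


def parse_idx_labels(buf: bytes):
    """Parse an IDX label buffer into a list of integer labels."""
    # Header: magic number, label count (both big-endian uint32).
    magic, count = struct.unpack(">II", buf[:8])
    if magic != 2049:
        raise ValueError(f"unexpected label magic number: {magic}")
    labels = list(buf[8:])
    if len(labels) != count:
        raise ValueError("payload size does not match header")
    return labels


# Synthetic stand-in data (an assumption for the demo): two 2x2 "images" and two labels.
demo_image_bytes = struct.pack(">IIII", 2051, 2, 2, 2) + bytes([0, 255, 128, 64, 1, 2, 3, 4])
demo_label_bytes = struct.pack(">II", 2049, 2) + bytes([7, 3])

images = parse_idx_images(demo_image_bytes)
labels = parse_idx_labels(demo_label_bytes)
print(images)
print(labels)
```

With the real files, the same functions would be given the decompressed bytes of the image and label files in place of the synthetic buffers.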

When benchmarking an algorithm, it is advisable to use a standard test data set so that researchers can directly compare results.

The Best Way to Prepare a Dataset Easily

In this video, I go over the 3 steps you need to prepare a dataset to be fed into a machine learning model: selecting the data, processing it, and transforming it.
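The three steps can be sketched in plain Python. This is only one possible reading of them, not the video's own code: here "selecting" keeps the feature columns, "processing" drops rows with missing values, and "transforming" min-max scales each column to the range [0, 1]. The sample rows, column names, and function names are all made up for illustration.

```python
# Hypothetical raw records; the columns and values are invented for this sketch.
raw_rows = [
    {"id": 1, "height_cm": 170.0, "weight_kg": 65.0, "note": "ok"},
    {"id": 2, "height_cm": None, "weight_kg": 80.0, "note": "missing height"},
    {"id": 3, "height_cm": 180.0, "weight_kg": 90.0, "note": "ok"},
    {"id": 4, "height_cm": 160.0, "weight_kg": 50.0, "note": "ok"},
]


def select(rows, columns):
    """Step 1: keep only the columns that will be used as features."""
    return [{c: row[c] for c in columns} for row in rows]


def process(rows):
    """Step 2: clean the data, here by dropping rows with any missing value."""
    return [row for row in rows if all(v is not None for v in row.values())]


def transform(rows, columns):
    """Step 3: rescale each column to [0, 1] with min-max scaling."""
    scaled = [dict(row) for row in rows]
    for c in columns:
        values = [row[c] for row in rows]
        lo, hi = min(values), max(values)
        span = hi - lo
        for row in scaled:
            # A constant column has no spread, so map it to 0.0.
            row[c] = (row[c] - lo) / span if span else 0.0
    return scaled


features = ["height_cm", "weight_kg"]
prepared = transform(process(select(raw_rows, features)), features)
for row in prepared:
    print(row)
```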

Natalie Hockham: Machine learning with imbalanced data sets

Classification algorithms tend to perform poorly when data is skewed towards one class, as is often the case when tackling real-world problems such as fraud ...
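One common mitigation, which is not necessarily the approach taken in the talk, is to weight each class inversely to its frequency so that the rare class contributes as much to the training loss as the common one. The sketch below computes such weights with the formula n_samples / (n_classes * class_count); the function name `balanced_class_weights` and the 95/5 fraud split are assumptions made for the example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class that is inversely proportional to its frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical skewed data set: 95 legitimate transactions, 5 fraudulent ones.
labels = ["legit"] * 95 + ["fraud"] * 5
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, the 5 fraud examples carry the same total weight as the 95 legitimate ones, so a weighted loss no longer rewards a model for simply predicting the majority class.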

Prepare your dataset for machine learning (Coding TensorFlow)

Interested in learning how to use JavaScript in the browser? In the last episode of Coding TensorFlow, we showed you a very basic ML scenario in the browser ...

How to get/download datasets to process the data in Spark or big data?

Sample Datasets in R||R Tutorials

When we want to do exercises or practice R commands, we often look for sample data, and it can be hard to find. To address this, ...

Building dataset - p.4 Data Analysis with Python and Pandas Tutorial

In this part of Data Analysis with Python and Pandas tutorial series, we're going to expand things a bit. Let's consider that we're multi-billionaires, ...

Google Dataset Search for machine learning and AI researchers

To help researchers in machine learning and AI, #Google has released a new indexing system, i.e. a search engine for finding datasets. #dataset in an important ...

What is a dataset?

Scientists collect all sorts of information in all different kinds of ways. One of the most common ways scientists collect information is through the Rectangular ...

Introduction to Data Mining: Basic Data Types

Continuing with data fundamentals, we introduce you to the three data set types, Record, Ordered, and Graph. -- At Data Science Dojo, we're extremely ...

Getting started in scikit-learn with the famous iris dataset

Now that we've set up Python for machine learning, let's get started by loading an example dataset into scikit-learn! We'll explore the famous "iris" dataset, learn ...
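A minimal sketch of that first step, assuming scikit-learn is installed, loads the copy of the iris dataset bundled with the library through `sklearn.datasets.load_iris` and inspects its shape. The variable names `X` and `y` are a common convention, not something taken from the video.

```python
from sklearn.datasets import load_iris

# load_iris returns a Bunch object holding the data and its metadata.
iris = load_iris()

X = iris.data    # feature matrix: one row per flower, one column per measurement
y = iris.target  # integer class label for each flower

print(X.shape, y.shape)
print(iris.feature_names)
print(list(iris.target_names))
```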