AI News, Setting up a Machine Learning Farm in the Cloud with Spot Instances + Auto Scaling

If you have a problem like this, you’re already familiar with some of the details, such as how much RAM is needed to compute a feature, whether the machine learning algorithm chosen can vectorize each instance independently, or whether it needs to process multiple instances.

So for now, let’s say we have a problem where each instance can be vectorized independently of all the other instances (and all of the features can be computed within the RAM limits of a single machine). We’ll choose a typical master-worker coordination pattern in which a master node queues instances and worker nodes pull from that queue and vectorize them.
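To make the pattern concrete, here is a minimal single-machine sketch in Python, with threads standing in for worker nodes and an in-process queue standing in for the master's queue. The `vectorize` function and its toy features are hypothetical placeholders, not the feature computation discussed in this article.

```python
import queue
import threading


def vectorize(instance):
    """Hypothetical feature extractor: token count and total token length."""
    tokens = instance.split()
    return [len(tokens), sum(len(t) for t in tokens)]


def worker(work_queue, results, lock):
    """Worker role: pull instances until the queue is empty, vectorizing each one."""
    while True:
        try:
            index, instance = work_queue.get_nowait()
        except queue.Empty:
            return
        vector = vectorize(instance)  # depends on this instance only
        with lock:
            results[index] = vector
        work_queue.task_done()


def run_farm(instances, num_workers=3):
    """Master role: enqueue every instance, start workers, collect results in order."""
    work_queue = queue.Queue()
    for item in enumerate(instances):
        work_queue.put(item)
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(work_queue, results, lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [results[i] for i in range(len(instances))]
```

Because each instance is vectorized independently, the number of workers can change without touching the master's logic, which is what makes this pattern a natural fit for auto scaling.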

Before I start setting up auto-scaling, I’ll create AWS images for the master and worker nodes. Then I’ll test the application on the image and configure the worker node so that it pulls a unit of work from the master on boot.

Creating an auto-scaled spot instance cluster

Alright, we’ve figured out how we plan to distribute the computation of our problem over a cluster of machines. We’ve chosen an instance type that meets our price/performance requirements, tested out code on it, and created Amazon Machine Images for each of the roles. We’ll boot up our master server, and the idea is that the workers will be auto-scaled up.

Amazon lets you specify policies that scale by an absolute number of machines or by a relative amount (a percentage of the current cluster size). How fast machines are added to or removed from the cluster will be determined by what you choose in the next step.
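The difference between absolute and relative policies is easiest to see with numbers. The sketch below is a plain Python model, not a call into AWS. The adjustment type names follow the ones AWS Auto Scaling uses, while the rounding of fractional percentage changes and the `min_size`/`max_size` defaults are assumptions made for illustration and should be checked against the current AWS documentation.

```python
import math


def apply_scaling_policy(current, adjustment_type, adjustment,
                         min_size=0, max_size=100):
    """Return the new desired capacity after one policy execution (simplified model)."""
    if adjustment_type == "ExactCapacity":
        desired = adjustment
    elif adjustment_type == "ChangeInCapacity":
        # Absolute: add or remove a fixed number of machines.
        desired = current + adjustment
    elif adjustment_type == "PercentChangeInCapacity":
        # Relative: a percentage of the current cluster size.
        delta = current * adjustment / 100.0
        if 0 < delta < 1:
            delta = 1            # assumed: small positive changes still add one machine
        elif -1 < delta < 0:
            delta = -1           # assumed: small negative changes still remove one
        else:
            delta = math.trunc(delta)  # assumed: otherwise round toward zero
        desired = current + delta
    else:
        raise ValueError(f"unknown adjustment type: {adjustment_type}")
    # The group's size limits always win over the policy.
    return max(min_size, min(max_size, desired))
```

With 10 workers running, an absolute policy of +2 and a relative policy of +25% both yield 12 machines, but on a cluster of 40 the relative policy would add 10 while the absolute one still adds 2.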

Alarms are events that trigger an action. There’s a single command to both define the alarm condition and associate it with the scaling policy we just created (using the super long policy ID from the previous step).
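As a rough illustration of how an alarm ties a condition to an action, the following Python sketch models an alarm that fires once a metric has stayed above a threshold for a number of consecutive periods. It is a simplified stand-in, not the CloudWatch implementation; the `Alarm` class, the queue-depth metric, and the `action` callback (which would execute the scaling policy) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Alarm:
    threshold: float                 # metric value that counts as a breach
    evaluation_periods: int          # consecutive breaches needed to fire
    action: Callable[[], None]       # e.g. a function that runs the scaling policy

    def __post_init__(self):
        self._breaches = 0
        self.in_alarm = False

    def observe(self, value):
        """Feed one period's metric value; return True while the alarm is active."""
        if value > self.threshold:
            self._breaches += 1
        else:
            self._breaches = 0
            self.in_alarm = False
        if self._breaches >= self.evaluation_periods and not self.in_alarm:
            self.in_alarm = True
            self.action()            # fire once per transition into the alarm state
        return self.in_alarm
```

A usage sketch, assuming queue depth is the metric being watched:

```python
fired = []
alarm = Alarm(threshold=100, evaluation_periods=2,
              action=lambda: fired.append("scale-up"))
for depth in [50, 150, 150, 150]:
    alarm.observe(depth)
# fired now holds a single "scale-up" entry
```

Requiring more consecutive periods makes the cluster slower to react but less likely to scale on a momentary spike.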

If you’re running an online learning system and want to reduce system latency, then you want to scale up quickly and scale down slowly (and pay the cost of over-provisioning) to reduce the queue size. The lower bound on responsiveness to load increases is going to be the boot time of the system —

(This happens rarely, but if you can’t be interrupted, it’s something you have to plan for.) A simple script can monitor the spot price and, if it’s above the on-demand price, replace your launch configuration with on-demand instances.
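The decision logic of such a monitoring script might look like the Python sketch below. The price lookup and the launch-configuration switch are passed in as functions because the real calls depend on your AWS tooling; the configuration names `workers-spot` and `workers-on-demand` are hypothetical.

```python
def choose_launch_configuration(spot_price, on_demand_price,
                                spot_config="workers-spot",
                                on_demand_config="workers-on-demand"):
    """Pick on-demand only when spot has become the more expensive option."""
    if spot_price > on_demand_price:
        return on_demand_config
    return spot_config


def monitor_once(get_spot_price, on_demand_price, current_config, switch):
    """Run one check: look up the spot price and switch configurations if needed.

    get_spot_price: function returning the current spot price (assumed to wrap an AWS query)
    switch: function that points the auto scaling group at a launch configuration
    Returns the configuration that should be active after this check.
    """
    target = choose_launch_configuration(get_spot_price(), on_demand_price)
    if target != current_config:
        switch(target)
    return target
```

Running `monitor_once` on a timer, and feeding its return value back in as `current_config`, also moves the cluster back to spot instances once the spot price drops below the on-demand price again.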
