AI News, How to automate creating high end virtual machines on AWS for data science projects


You can either spend hours waiting for an algorithm to finish on a regular PC or laptop, spend about $1,000 on a high-end PC, or rent a VM from a cloud provider.

AWS has a web console interface for creating resources, but using it for one-off machines quickly becomes time-consuming and repetitive.

Furthermore, there’s the hassle of installing the required software packages every time (a process called configuration) and of retrieving the machine details (public DNS name, public IP, etc.). I’ll be using the Terraform orchestration tool to set up and configure the required virtual machine server as quickly as possible, to minimize time lost on trivial activities and maximize the value of the money paid for the server.

You can provision VMs, create subnets, assign security groups and pretty much perform any action that any cloud provider allows.

We just need to add it to the PATH variable (instructions for Linux/macOS can be found here and for Windows here) so that it is accessible from any path on our system.
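On Linux/macOS, this step might look like the following sketch, assuming the terraform binary was unzipped to a hypothetical ~/terraform directory (adjust the path to wherever you placed it):

```shell
# make the terraform binary reachable from the current shell
# (~/terraform is an assumed unzip location)
export PATH="$PATH:$HOME/terraform"

# persist the change for future shells
echo 'export PATH="$PATH:$HOME/terraform"' >> ~/.bashrc
```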

Once this has finished, we can confirm that Terraform is ready to use by running the terraform command, and we should get something like the following. Now we can move on to using the tool.

For the purposes of this project we will give the AdministratorAccess permission to this user; however, in a professional setting it is advised to grant only the permissions a user needs (such as AmazonEC2FullAccess if the user will only be creating EC2 instances).

We add the following to the credentials file after replacing ACCESS_KEY and SECRET_KEY, then save it. We also restrict access to this file so that only the current user can read it. The next step is to create a key pair so that Terraform can access the newly created VMs.
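These two steps might look like the following sketch, which writes the standard AWS shared-credentials file (the ACCESS_KEY and SECRET_KEY placeholders are kept as-is for you to replace):

```shell
# create the shared credentials file under the default profile
mkdir -p ~/.aws
cat > ~/.aws/credentials <<'EOF'
[default]
aws_access_key_id = ACCESS_KEY
aws_secret_access_key = SECRET_KEY
EOF

# restrict access to the current user only
chmod 600 ~/.aws/credentials
```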

The Amazon credentials are for accessing and allowing the AWS service to create the resources required, while this key pair will be used for accessing the newly created Virtual Machines.

Then restrict the permissions. Now we are ready to use this key pair, either for a direct ssh to our instances or for Terraform to connect to the instances and run some scripts.
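Generating the key pair and restricting its permissions might look like this sketch (the file name terraform_key is an assumption; any name works as long as you reference it consistently):

```shell
# generate a 4096-bit RSA key pair with no passphrase
# (~/.ssh/terraform_key is an assumed file name)
mkdir -p ~/.ssh
ssh-keygen -t rsa -b 4096 -f ~/.ssh/terraform_key -N "" -q

# restrict permissions so ssh will accept the private key
chmod 400 ~/.ssh/terraform_key
```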

We have allowed incoming traffic (ingress) from any IP address to ports 22 and 8888, which are used for ssh access and by Jupyter Notebook respectively.

Note that, as mentioned in the previous paragraph, the block declaration starts with the resource keyword, followed by the type of resource the block defines (here aws_security_group) and the name we give to this resource (jupyter_notebook_sg).
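A sketch of what such a block might look like, based on the description above (the egress block is an assumption to allow the instance outbound internet access for package installation):

```hcl
resource "aws_security_group" "jupyter_notebook_sg" {
  # ssh access from any IP address
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # jupyter notebook access from any IP address
  ingress {
    from_port   = 8888
    to_port     = 8888
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # allow all outbound traffic (assumed; needed for package downloads)
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```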

The one I picked here is the m4.xlarge instance, which has 4 virtual CPUs and 16 GB of RAM and costs about $0.2 per hour at the time of writing; note that this price usually varies by region.

Tags

The first provisioner is a file provisioner, which copies files to the resource (we use it to copy the script we created), while the second one, remote-exec, runs shell commands on the VM once it has been created (we make the script executable and then run it).
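Put together, the instance resource with its two provisioners might look like the following sketch; the AMI id is a placeholder, and the key pair name, script name, and ubuntu user are assumptions:

```hcl
resource "aws_instance" "dsvm" {
  ami             = "ami-XXXXXXXX"   # placeholder AMI id
  instance_type   = "m4.xlarge"      # 4 vCPUs, 16 GB RAM
  key_name        = "terraform_key"  # assumed key pair name
  security_groups = [aws_security_group.jupyter_notebook_sg.name]

  # how the provisioners connect to the VM (assumed Ubuntu AMI)
  connection {
    type        = "ssh"
    user        = "ubuntu"
    private_key = file("~/.ssh/terraform_key")
    host        = self.public_ip
  }

  # copy the configuration script to the VM
  provisioner "file" {
    source      = "configure.sh"     # assumed script name
    destination = "/home/ubuntu/configure.sh"
  }

  # make the script executable and run it
  provisioner "remote-exec" {
    inline = [
      "chmod +x /home/ubuntu/configure.sh",
      "/home/ubuntu/configure.sh",
    ]
  }
}
```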

The configuration script is really basic: we run a system update, then install git, vim, python3, python3-pip, and Jupyter Notebook.

The last 3 lines are of interest, as they create the Jupyter Notebook-specific configuration file and assign values to allow_origin and ip, which allow access to the notebook from any server.
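A sketch of what such a script might contain, assuming an Ubuntu AMI with apt and the script name configure.sh (the snippet below writes the script locally so you can inspect it before Terraform uploads it):

```shell
# write the configuration script locally (configure.sh is an assumed name)
cat > configure.sh <<'EOF'
#!/bin/bash
# system update and the packages mentioned above
sudo apt-get update
sudo apt-get install -y git vim python3 python3-pip
pip3 install jupyter
# last 3 lines: create the jupyter config file and open it to any origin/IP
jupyter notebook --generate-config
echo "c.NotebookApp.allow_origin = '*'" >> ~/.jupyter/jupyter_notebook_config.py
echo "c.NotebookApp.ip = '0.0.0.0'" >> ~/.jupyter/jupyter_notebook_config.py
EOF
chmod +x configure.sh
```

Note that opening the notebook to any IP is convenient for a short-lived machine but should be tightened for anything long-running.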

Then we actually run terraform apply and the resource creation begins. After a while the resources have been created, and we can see that Terraform has provided the public DNS name as mentioned, so we can ssh to the machine. We are now ready to start the Jupyter Notebook. Once it has started, Jupyter provides a URL to access it; however, we need to substitute the wildcard IP with the machine's DNS name, so in this case the URL to access the notebook will be
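This final sequence of commands might look like the following sketch (the DNS name is a placeholder, and the key path and ubuntu user are assumptions carried over from the earlier steps):

```shell
terraform init    # one-time: download the AWS provider plugins
terraform apply   # create the resources; confirm when prompted

# ssh to the machine using the public DNS name from Terraform's output
ssh -i ~/.ssh/terraform_key ubuntu@<public-dns-name>

# on the instance, start the notebook
jupyter notebook
```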

Creating a single machine is a very simple case, but once you get the hang of it you can create multi-node architectures with big data systems like Hadoop, Spark, etc.

Create VPC in Google Cloud using Terraform

Create VPC in Google Cloud using Terraform. Link to files on GitHub:

OpenDev 10.2017 | Reproducible infrastructure with Terraform and Microsoft Azure

Nic Jackson, Developer Advocate at HashiCorp, speaks at the 2nd edition of Azure OpenDev, a live community-focused series of technical demonstrations ...

Terraform in your delivery pipeline (Anton Babenko)

Talk given at Full Stack Fest 2017: Do the last step(s) and bring infrastructure management into CI/CD processes. We know what to do ..

Infrastructure as code: Leverage Ansible and Terraform on Microsoft Azure - BRK2199

Explore the different patterns of mutable and immutable infrastructure as code in Azure. Terraform is a leading DevOps tool that leverages infrastructure as code ...

Hybrid multi-cloud strategies using Terraform OSS with Azure : Build 2018

85% of enterprises have a multi-cloud strategy. Terraform is emerging as the open source standard to provision infrastructure as code across Azure and ...

INFRASTRUCTURE & OPERATIONS - Seamlessly migrating your networks to GCP

Recorded on Mar 23 2016 at GCP NEXT 2016 in San Francisco. Seamlessly migrate, instantiate and connect your enterprise networks to the cloud with Google ...

Secure Access to AWS Services Using AWS Identity and Access Management (IAM) Roles

To get started with creating AWS IAM role and attaching to EC2 instances, navigate to the AWS IAM console - Learn how to use AWS ..

Accessing Kubernetes API on Google Container Engine

Future, Faster: Unlock the Power of Web Components with Polymer (Google I/O '17)

It took a few years and a couple of trips around the block, but Web Components have truly arrived: they're natively supported in Chrome and Safari today, with ...

Partnering on open source: Vagrant and GCP

GCP Blog kicking off HashiCorp mini-series: **See more resource links below** HashiCorp's main site: Kelsey .