AI News, Revolutions

Revolutions

This is an eclectic collection of interesting blog posts, software announcements and data applications I've noted over the past month or so.

The ecosystem for ONNX (the open standard for exchange of neural network models) expands, with official support for Core ML, NVIDIA TensorRT 4, and the Snapdragon Neural Processing Engine.

Oracle acquires machine learning platform Datascience.com, a cloud workspace platform for data science projects and workloads.

Deep neural networks encoded by FPGAs (field-programmable gate arrays) can dramatically reduce the time to classify images (as just one application example).

The TWIML AI podcast series on differential privacy, a technique for collecting data in such a way that it reduces the impact on privacy even in the event of a leak.
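To make the idea concrete, here is a minimal sketch of randomized response, one classic way of collecting data with a differential-privacy-style guarantee. It is an illustration only, not necessarily the approach discussed in the podcast; the function names and the 50% truth probability are assumptions chosen for the example. Each stored answer is noisy, so a leaked record reveals little about one person, yet the overall rate can still be estimated.

```python
import random

def randomized_response(truth, rng, p_truth=0.5):
    """Report the true answer with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5

def estimate_true_rate(reports, p_truth=0.5):
    """Invert the noise: E[reported yes] = p_truth * rate + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth

rng = random.Random(0)                            # fixed seed for repeatability
true_answers = [i < 3000 for i in range(10000)]   # 30% of people would say "yes"
reports = [randomized_response(t, rng) for t in true_answers]
estimate = estimate_true_rate(reports)
print("estimated yes-rate: %.3f" % estimate)
```

Lowering `p_truth` adds more noise per answer (more privacy) at the cost of a less precise estimate.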

This hands-on lab walks through the process of building an image recognizer using transfer learning with MobileNetV1.

How to Develop a Currency Detection Model using Azure Machine Learning, with details on how the real-time banknote recognition capability of the Seeing AI application was implemented in Core ML.

AI news from Microsoft’s Build developers conference

At Microsoft’s Build developers conference in Seattle this week, the company is unveiling a series of new and updated tools that will help developers incorporate artificial intelligence into their processes and applications, regardless of their background and training in the fast-emerging field of AI.

A limited preview will allow customers to bring Project Brainwave to the edge, meaning customers could take advantage of that computing speed in their own businesses and facilities, even if their systems lack a network or Internet connection.

Azure Machine Learning is also announcing new Azure Machine Learning Packages, which are sets of algorithms that enable data scientists to easily build, train, fine-tune and deploy highly accurate and efficient models for computer vision, text analytics and financial forecasting.

The improvements include the ability to customize models for specific speaking styles and the vocabulary of an industry, and to create a unique brand voice, for example for an interactive bot on a customer’s e-commerce website.

For example, a new feature in preview allows users to train models to identify objects within images – to pick out the trained object and show its location within the image.

Project Personality Chat makes intelligent agents more complete and conversational by handling small talk in a consistent tone and reducing fallback responses such as “I don’t understand.” Project Personality Chat also allows developers to give their agents a personality, from professional to humorous, that aligns with a brand voice.

In addition to all these new AI tools for developers, Microsoft also announced AI Lab, a new collection of AI projects designed to enable developers to get started with AI by exploring, experiencing, learning and coding the latest Microsoft AI technology innovations.

All the Build announcements illustrate how new technologies are rapidly changing the way people live, learn and work – progress that comes with an opportunity and responsibility to make sure technology is used well.

Diving deep into what’s new with Azure Machine Learning

Earlier today, we disclosed a set of major updates to Azure Machine Learning designed for data scientists to build, deploy, manage, and monitor models at any scale.

This post covers the learnings we’ve had with Azure Machine Learning so far, the trends we’re seeing from our customers today, and the key design points we’ve considered in building these new features, and then dives into the new capabilities.

Before the term “serverless” was in use, we enabled serverless training of experiments composed graphically from a rich set of modules, which could then be deployed as a web service with the push of a button.

It has been incredibly rewarding to see how our customers have used the service. Over time we’ve worked with many customers who are looking for the next level of power and control, and the capabilities we announced today address those desires.

This demand for AI by developers will only increase, further pushing organizations to provide easy-to-consume AI built on their data, as the way we write software evolves around these new capabilities.

While we see customers consolidating large amounts of data into data lakes and using tools like Spark for preparing and analyzing their data, the models they produce need to be deployed to a variety of form factors.

Whether looking to address latency or the ability to support scoring even while disconnected, customers want the flexibility to train a model anywhere but control their deployment to place scoring as close to the event as possible.

Over time, this innovation will result in mature toolchains, all the way down to the hardware level, that are optimized for specific workloads, letting customers tune and trade off cost, latency, and control.

Given the learnings we’ve had, we’ve anchored our design on the following four points to shape these new capabilities. The services and tools we build must operate at scale, and we see customers encountering challenges with at least five different dimensions of “scale.”

For us, this means embracing container-based deployment of models to give customers fine-grained control, as well as being able to use services such as Azure Container Service to provide a scalable hosting layer in Azure.

Much as source control systems have evolved for software development to flexibly support a variety of teams and processes, our system needs to support the AI development lifecycle as teams continue to grow.

Any service and tool that we build needs to enable data scientists to pick and choose from the ecosystem and use those tools, and we must build it in a way that provides a consistent experience for training, deployment, and management as these evolve.

When we look at the key areas of friction for data science teams, we consistently hear about challenges at several steps of their workflow. We believe that by eliminating the friction in each of these steps, and between these steps, teams will be able to increase their rate of experimentation.

It’s critical that our customers have flexibility in their deployment form factor. Given these design points, we’ve released the following new capabilities for Azure Machine Learning. The Azure Machine Learning Experimentation service allows developers and data scientists to increase their rate of experimentation.

With every project backed by a Git repository, and with a simple command line tool for managing experimentation and training runs, every execution can track the code, configuration, and data that’s used for the run.

More importantly, the outputs of that experiment, from model files to log output and key metrics, are tracked, giving you a powerful repository with the history of how your model evolves over time.
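The run-tracking idea can be sketched in a few lines of Python. This is a hypothetical illustration, not the Experimentation service's command line tool or API: `record_run`, `best_run` and the record fields are invented for the example, and content hashes stand in for the Git history the service uses.

```python
import hashlib
import json
import time

def fingerprint(payload):
    """Short content hash so identical inputs get identical fingerprints."""
    return hashlib.sha256(payload).hexdigest()[:12]

def record_run(history, code, config, data, metrics):
    """Append one run record capturing what went in and what came out."""
    run = {
        "run_id": len(history) + 1,
        "timestamp": time.time(),
        "code_hash": fingerprint(code.encode("utf-8")),
        "config": dict(config),
        "data_hash": fingerprint(data),
        "metrics": dict(metrics),
    }
    history.append(run)
    return run

def best_run(history, metric):
    """Look back over every recorded run and pick the best by one metric."""
    return max(history, key=lambda run: run["metrics"][metric])

history = []
record_run(history, "train(lr)", {"lr": 0.1}, b"rows-v1", {"accuracy": 0.81})
record_run(history, "train(lr)", {"lr": 0.01}, b"rows-v1", {"accuracy": 0.86})
record_run(history, "train(lr)", {"lr": 0.01}, b"rows-v2", {"accuracy": 0.84})
winner = best_run(history, "accuracy")
print(json.dumps(winner["config"]), winner["data_hash"])
```

Because code, configuration and data are each fingerprinted separately, two runs can be compared to see which of the three actually changed between them.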

Python libraries from Machine Learning Server (revoscalepy and microsoftml) available with Azure Machine Learning include the Pythonic versions of Microsoft’s Parallel External Memory Algorithms (linear and logistic regression, decision tree, boosted tree and random forest) and the battle tested ML algorithms and transforms (deep neural net, one class SVM, fast tree, forest, linear and logistic regressions).

We know that data science isn’t a linear process, and the Experimentation service lets you look back in time to compare experiments that produced the right results.

Models are exposed via web services written in Python, giving you the ability to add more advanced logic, custom logging, state management, or other code into the web service execution pipeline.

When deploying models at scale on an Azure Container Service cluster, we’ve built a hosting infrastructure optimized for model serving, that handles automatic scaling of containers, as well as efficiently routing requests to available containers.
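The post does not describe how the hosting layer scales or routes internally, so the following is only a toy model of the two behaviours it names: sending each request to an available container and scaling out when all of them are busy. The `ContainerPool` class, its capacity limits and the least-busy routing rule are assumptions for this sketch.

```python
class ContainerPool:
    """Toy pool: route to the least-busy container, add one when all are full."""

    def __init__(self, capacity_per_container=2, max_containers=4):
        self.capacity = capacity_per_container
        self.max_containers = max_containers
        self.in_flight = [0]  # start with a single container

    def acquire(self):
        """Return the index of the container that takes the next request."""
        idx = min(range(len(self.in_flight)), key=self.in_flight.__getitem__)
        if self.in_flight[idx] >= self.capacity:
            if len(self.in_flight) >= self.max_containers:
                raise RuntimeError("pool saturated")
            self.in_flight.append(0)  # scale out by one container
            idx = len(self.in_flight) - 1
        self.in_flight[idx] += 1
        return idx

    def release(self, idx):
        """Mark one request on container idx as finished."""
        self.in_flight[idx] -= 1

pool = ContainerPool()
assignments = [pool.acquire() for _ in range(5)]
print(assignments, pool.in_flight)
```

A production system would also scale back in when load drops and would base decisions on measured latency or CPU rather than a fixed request count.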

Retraining scenarios, where a deployed model is monitored and then updated after being trained on new data, are possible, enabling continuous improvement of models based on new data.
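A retraining loop of this kind can be sketched as follows. The example is deliberately tiny and hypothetical: the "model" just predicts the mean of its training data, and `monitor_and_retrain` and the error threshold are names and values chosen for illustration rather than anything provided by the service.

```python
from statistics import mean

def train(values):
    """'Train' a trivial model: predict the mean of the training targets."""
    return mean(values)

def mean_abs_error(model, values):
    """Average absolute gap between the model's prediction and new targets."""
    return mean(abs(v - model) for v in values)

def monitor_and_retrain(model, new_batch, threshold):
    """Retrain on the new batch only when the deployed model's error is too high."""
    error = mean_abs_error(model, new_batch)
    if error > threshold:
        return train(new_batch), True
    return model, False

model = train([10.0, 11.0, 9.0])  # deployed model predicts 10.0
model, retrained_a = monitor_and_retrain(model, [10.5, 9.5, 10.0], threshold=1.0)
model, retrained_b = monitor_and_retrain(model, [14.0, 15.0, 16.0], threshold=1.0)
print(retrained_a, retrained_b, model)
```

The first batch looks like the training data, so the deployed model is kept; the second has drifted, which triggers a retrain.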

The Azure Machine Learning Workbench also hosts Jupyter notebooks that can be configured to target local or remote kernels, enabling iterative development within the notebook on your laptop, or hooked up to a massive Spark cluster running on HDInsight.

We want to reduce the time and effort to acquire data for modeling, and we want to fundamentally change the pace with which data scientists can prepare and understand data, and accelerate the time to get to “doing data science.”

We have combined a variety of techniques, using advanced research from Microsoft Research on program synthesis (PROSE) and data cleaning, to create a data wrangling experience that drastically reduces the time that needs to be spent getting data prepared.
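To give a feel for the by-example idea behind program synthesis, here is a toy sketch: given a few input/output pairs, it searches a small fixed set of candidate transformations for one consistent with all of them. This is not PROSE and does not use its API; the candidate set and the `synthesize` function are invented for illustration, and real synthesizers search a far richer space of programs.

```python
CANDIDATES = {
    "upper": str.upper,
    "strip": str.strip,
    "first_token": lambda s: s.split()[0] if s.split() else "",
    "last_token": lambda s: s.split()[-1] if s.split() else "",
    "initials": lambda s: "".join(w[0] for w in s.split()),
}

def synthesize(examples):
    """Return names of candidate programs consistent with every (input, output) pair."""
    return [
        name
        for name, fn in CANDIDATES.items()
        if all(fn(inp) == out for inp, out in examples)
    ]

# Two examples provided by the user are enough to pin down the intent here.
examples = [("Ada Lovelace", "Lovelace"), ("Alan Mathison Turing", "Turing")]
programs = synthesize(examples)

# Apply the learned transformation to rows the user never labelled.
cleaned = [CANDIDATES[programs[0]](name) for name in ["Grace Hopper", "Edsger W. Dijkstra"]]
print(programs, cleaned)
```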

With the inclusion of a simple set of libraries for handling data sources, data scientists can focus on their code, not on changing file paths and dependencies when they move between environments.
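One way to picture this is a small lookup layer between the code and the physical data location. The sketch below is hypothetical and is not the library the post refers to: the `DATA_SOURCES` table, the `RUN_ENV` variable and both paths are placeholders made up for the example.

```python
import os

# Hypothetical mapping from a logical data source name to a
# location in each execution environment.
DATA_SOURCES = {
    "local": {"sales": "data/sales_sample.csv"},
    "cluster": {"sales": "hdfs:///datasets/sales/full.csv"},
}

def resolve(name, environment=None):
    """Return the location of a named data source for the given environment."""
    environment = environment or os.environ.get("RUN_ENV", "local")
    try:
        return DATA_SOURCES[environment][name]
    except KeyError:
        raise KeyError("no data source %r in environment %r" % (name, environment))

# Modeling code refers only to the logical name "sales".
print(resolve("sales", "local"))
print(resolve("sales", "cluster"))
```

The modeling code stays the same when it moves from a laptop sample to the full dataset; only the environment selection changes.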

By building these experiences together, the data scientist can leverage the same tools in the small and in the large, as they scale out transparently across our cloud compute engines, simply by choosing target environments for execution.

This extension provides a rich set of capabilities for building models with deep learning frameworks including Microsoft Cognitive Toolkit (CNTK), Google TensorFlow, Theano, Keras and Caffe2, while integrating with the Experimentation service for executing jobs locally and in the cloud, and for deployment with the Model Management services.

Data scientists on our team have put together detailed scenario walkthroughs, complete with sample data, for you to get started on some interesting challenges or adapt their techniques to your next problem. We’re constantly working on and refreshing the documentation; if you have a comment or suggestion, please let us know.

Revolutions

This is the first edition of a monthly roundup of news about Artificial Intelligence, Machine Learning and Data Science.

This is an eclectic collection of interesting blog posts, software announcements, applications and events I've noted over the past month or so.

Google releases DeepLab on GitHub, a trained CNN model to assign semantic labels to every pixel of an image.

Google releases Lucid, a neural-network visualization library designed to help with the interpretability of vision systems.

The Geo AI Data Science VM, including ESRI ArcGIS Pro and a land cover classification tutorial dataset, is now available in the Azure Marketplace.

A podcast interview about Project InnerEye, an innovative machine learning tool that helps radiologists identify and analyze 3-D images of cancerous tumors.

Microsoft Research has developed a system to translate news articles from Chinese to English with the same accuracy as human translators.

An overview of LUIS, Microsoft's cloud-based service for developing language- and speech-based applications.

First look at What’s New in Azure Machine Learning

Take in the huge set of capabilities announced at Ignite for the next generation of the Azure Machine Learning platform. Build and deploy ML applications in the ...

Enabling AI in your solutions with Azure

Watch this complete tour of AI, machine learning and cognitive services in Microsoft Azure. Artificial intelligence is the process of empowering machines to ...

Microsoft Data Platform – SQL Server 2017 and Azure Data Services

All around us, data is driving digital transformation. Companies that invest heavily in cloud, data and AI have nearly double the operating margin of those that do ...

What's new in Microsoft AI with Paige Bailey

Speaker: Paige Bailey (@DynamicWebPaige) In this episode we talk to Paige Bailey about some of the new artificial intelligence stuff that was announced at the ...

AI for Earth : Analyzing Global Data with Azure : Build 2018

Microsoft has publicly committed $50 million over 5 years for artificial intelligence projects that support clean water, agriculture, climate, and biodiversity. But our ...

TWC9: Go SDK for Azure, PWA tips, important AI papers, and more

This week on Channel 9, Christina's out of a cast (yay!) but still wearing a splint (boo!) and back to share the latest developer news, including: * 0:20 [Build ...

Project Hanover: Using Technology to Personalize Cancer Treatment

Cancer experts will sometimes tell you that every patient's cancer is a snowflake, each case unique in its own way. With oncologists needing to consider all the ...

Machine learning at scale : Build 2018

AI/ML expert Paige Bailey takes you on a tour of the powerful services available on Azure. You'll see how to take your predictive model to production, ...

New Smartphones Read Your Mind, Using AI to Predict What you do!

Inside a Google data center

Joe Kava, VP of Google's Data Center Operations, gives a tour inside a Google data center, and shares details about the security, sustainability and the core ...