AI News, QA – How Reliable are your Machine Learning Systems?

In this post, you will learn about different aspects of creating a machine learning system with high reliability.

Let’s look at both aspects in detail. Fault tolerance of an ML system can be defined as how the system behaves when model performance starts to degrade beyond acceptable limits.
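One way to picture this kind of fault tolerance is a thin wrapper that tracks recent prediction outcomes and switches to a simpler fallback when the rolling accuracy drops below a limit. The sketch below is only an illustration under stated assumptions: the class name `GuardedModel`, the 0.8 accuracy limit, and the window size are hypothetical and do not come from the original post.

```python
from collections import deque
from typing import Any, Callable, Deque


class GuardedModel:
    """Serve a primary model, falling back when rolling accuracy degrades.

    All names and default thresholds here are hypothetical.
    """

    def __init__(
        self,
        primary: Callable[[Any], Any],
        fallback: Callable[[Any], Any],
        min_accuracy: float = 0.8,
        window: int = 50,
    ) -> None:
        self.primary = primary
        self.fallback = fallback
        self.min_accuracy = min_accuracy
        # Only the most recent `window` outcomes are kept.
        self.outcomes: Deque[bool] = deque(maxlen=window)

    def record_outcome(self, was_correct: bool) -> None:
        """Store whether a past prediction turned out to be correct."""
        self.outcomes.append(was_correct)

    def rolling_accuracy(self) -> float:
        """Fraction of correct predictions in the current window."""
        if not self.outcomes:
            return 1.0
        return sum(self.outcomes) / len(self.outcomes)

    def is_degraded(self) -> bool:
        """True once a full window of outcomes falls below the limit."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.rolling_accuracy() < self.min_accuracy

    def predict(self, x: Any) -> Any:
        """Route the request to the fallback while the model is degraded."""
        model = self.fallback if self.is_degraded() else self.primary
        return model(x)
```

Waiting for a full window before declaring degradation is a design choice that avoids reacting to a handful of early errors; a real system would likely also raise an alert at that point.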

To keep bad models from moving into production, different forms of quality checks need to be performed on different aspects of the ML models as part of the model training automation process. Container technology, together with workflow tools, can be used to achieve a repeatable model training process.
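A quality gate of this kind can be sketched as a list of named checks that must all pass before a model is promoted. The specific checks the post has in mind are not listed in this summary, so the names used in the usage note (`accuracy`, `latency_ms`) and their thresholds are purely hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Check:
    """One quality check on a candidate model (hypothetical structure)."""

    name: str
    value: float
    threshold: float
    higher_is_better: bool = True

    def passed(self) -> bool:
        # Metrics such as accuracy must reach the threshold;
        # metrics such as latency must stay at or below it.
        if self.higher_is_better:
            return self.value >= self.threshold
        return self.value <= self.threshold


def failed_checks(checks: List[Check]) -> List[str]:
    """Return the names of every check that did not pass."""
    return [check.name for check in checks if not check.passed()]


def can_promote(checks: List[Check]) -> bool:
    """A model may move to production only if no check failed."""
    return not failed_checks(checks)
```

For example, `can_promote([Check("accuracy", 0.91, 0.90), Check("latency_ms", 120.0, 100.0, higher_is_better=False)])` would block promotion, and `failed_checks` would report `latency_ms` as the reason.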

While model reliability in production relates to the fault tolerance and recoverability of models, model training reliability means the repeatability of the ML training/testing process, which comes from automating that process.
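Repeatability can be illustrated on a very small scale: if every source of randomness is driven by a seed stored in the run configuration, two runs with the same configuration should produce the same result, and a fingerprint of the configuration can be logged next to the model. The sketch below is a toy stand-in, not a real training loop; `train`, `run_fingerprint`, and the config keys `seed` and `n_weights` are hypothetical names.

```python
import hashlib
import json
import random
from typing import Dict, List


def run_fingerprint(config: Dict[str, int]) -> str:
    """Hash the configuration so identical runs share one identifier."""
    # sort_keys makes the hash independent of key order.
    payload = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def train(config: Dict[str, int]) -> List[float]:
    """Toy 'training' step: draw weights from a seeded generator."""
    # A private Random instance avoids depending on global state.
    rng = random.Random(config["seed"])
    return [rng.gauss(0.0, 1.0) for _ in range(config["n_weights"])]
```

In practice the same idea extends to pinning library versions and data snapshots, which is where containers and workflow tools come in.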

How Machine Learning is Impacting Oil and Gas (Cloud Next '18)

New advances in machine learning are helping organizations optimize operations, maximize output, and automate manual processes. However, the scalability ...

Case Study: How a Large Brewery Uses Machine Learning for Preventive Maintenance (Cloud Next '18)

Learn how machine learning is used to optimize the beer manufacturing process. This use case has a direct impact on the production line and identifying ...

Lifecycle of a machine learning model (Google Cloud Next '17)

In this video, you'll hear lessons learned from our experience with machine learning and how having a great model is only a part of the story. You'll see how ...

How to Build Flexible, Portable ML Stacks with Kubeflow and Elastifile (Cloud Next '18)

Building any production-ready machine learning system involves various components, often mixing vendors, and hand-rolled solutions. Connecting and ...

Crossing the Chasm: Patterns to Develop, Operationalize, and Maintain ML Models (Cloud Next '18)

In many organizations, data scientists develop machine learning models and data/ML engineers put them into production. The chasm between the two roles ...

Leveraging Machine Learning for Fraud Analytics (Cloud Next '18)

We will showcase how we can build advanced accelerators for Fraud Analytics solutions leveraging Google Stack. We will demonstrate how these accelerators fill ...

Introducing ML.NET : Build 2018

ML.NET is aimed at providing a first class experience for Machine Learning in .NET. Using ML.NET, .NET developers can develop and infuse custom AI into ...

RGA 10 Quick Start Guide Chapter 3: Crow‐AMSAA (NHPP) Model

In this video, you'll analyze the data from the last chapter, but using the Crow-AMSAA model instead. Then you'll use reports and overlay plots to compare the ...

Safe Reinforcement Learning in Robotics with Bayesian Models | Felix Berkenkamp

Felix Berkenkamp presents work on safe exploration of parameter space in robotics using Gaussian Processes. Reinforcement learning is a powerful paradigm ...

Reliability 4 - Markov chains and Petri nets

This part of the presentation describes the mathematical models that can be used for reliability analysis: Markov chains and Petri nets. See also my blog: ...