AI News, How VW Predicts Churn with GPU-Accelerated Machine Learning and Visual Analytics
- On Sunday, June 3, 2018
The reason for this is that, while each technology in the process leverages GPUs well on its own, data that has to leave the GPU to move to the next system in the process can incur significant latency.
So, keeping the data in a GPU buffer through exploration, extraction, preprocessing, model training, validation, and prediction makes the workflow much faster and simpler.
MapD and Anaconda, another GOAi founding member, are involved in the development of Pythonic clients such as pymapd (an interface to MapD's SQL engine supporting DB-API 2.0) and pygdf (a Python interface to access and manipulate the GPU Dataframe), along with our core platform modules: the MapD Core SQL engine and MapD Immerse, a visual analytics tool.
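Because pymapd follows the Python DB-API 2.0 specification, code written against it has the familiar connect, cursor, execute, and fetch shape. The sketch below shows that shape using the standard-library `sqlite3` module as a stand-in, so it runs without a MapD server. The table `churn_demo`, its columns, and its rows are invented for illustration and are not the Volkswagen dataset.

```python
import sqlite3

# sqlite3 is a stand-in here: like pymapd, it implements DB-API 2.0,
# so the connect / cursor / execute / fetch pattern has the same shape.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical toy table; names and values are invented for illustration.
cur.execute("CREATE TABLE churn_demo (car_model INTEGER, churned INTEGER)")
cur.executemany(
    "INSERT INTO churn_demo VALUES (?, ?)",
    [(8, 1), (8, 1), (8, 0), (10, 1), (10, 0), (3, 0), (3, 0)],
)

# Churn rate per model: the kind of aggregate a dashboard chart is built on.
cur.execute(
    "SELECT car_model, AVG(churned) AS churn_rate, COUNT(*) AS n "
    "FROM churn_demo GROUP BY car_model ORDER BY car_model"
)
rows = cur.fetchall()
for car_model, churn_rate, n in rows:
    print(f"model {car_model}: churn rate {churn_rate:.2f} over {n} customers")
conn.close()
```

With a real MapD server, only the connection step would differ; the cursor and fetch calls are what DB-API 2.0 standardizes.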
With the help of Apache Arrow, an efficient data interchange is created between MapD and pygdf, which makes it possible to leverage machine learning tools such as h2o.ai, PyTorch, and others.
The example used in this post, the Customer Automotive Churn dataset (which focuses on the real-world problem of customer vehicle churn), was obtained from Volkswagen as a result of our joint collaboration on implementing an analytics workflow on GPUs.
Assuming that you have loaded the churn dataset into MapD, let’s start to build some charts in MapD Immerse, which by default starts on https://localhost:9092 .
The capability used in this post to display charts from different tables in one dashboard is limited to MapD Immerse Enterprise edition, but one can use the Community Edition to create a separate dashboard for each chart.
It can be observed that car models produced in early years, especially models 8, 10, and 11, are more prone to churn.
Each of the two queries had 21 feature columns; combined, they returned 1.7 million data points, and extracting the data with both queries took just 0.45 seconds.
This is because, with the help of Arrow, pointers to the memory buffers holding the data on the GPUs are passed from MapD to pandas, which gives us back a pygdf dataframe.
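The difference between passing a pointer to a buffer and copying the buffer can be illustrated on the CPU with Python's buffer protocol. This is only an analogy under stated assumptions: it uses the standard-library `array` and `memoryview` types rather than Arrow, MapD, or GPU memory, and the values are made up.

```python
from array import array

# A block of numeric data owned by one "system" (a CPU stand-in for a GPU buffer).
source = array("d", [0.1, 0.7, 0.3, 0.9])

# Zero-copy handoff: the view references the same memory, no bytes are moved.
shared = memoryview(source)

# Copying handoff: a second, independent buffer is allocated and filled.
copied = array("d", source)

# A write by the producer is visible through the shared view but not the copy.
source[0] = 42.0
print(shared[0])   # reads the producer's memory
print(copied[0])   # still holds the value from before the write
```

The shared view costs the same to create regardless of how large the buffer is, whereas the copy grows with the data, which is the intuition behind handing over buffer pointers instead of moving rows.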
So, it took roughly 1.3 seconds to train the model on approximately 1.7 million data points, 14 seconds to cross-validate, and just 0.3 seconds to copy the data into a pandas dataframe.
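Stage timings like these can be collected by wrapping each step in a wall-clock timer. The following is a minimal sketch, not the workflow used in the post: the data is synthetic, the "model" is a hypothetical one-feature threshold classifier, and the names `train`, `accuracy`, and `cross_validate` are invented for illustration. It runs on the CPU with the standard library only.

```python
import random
import time

random.seed(0)

# Synthetic stand-in data: one feature, label loosely tied to it. Not the VW dataset.
X = [random.random() for _ in range(10_000)]
y = [1 if x + random.gauss(0, 0.2) > 0.5 else 0 for x in X]

def train(xs, ys):
    """Hypothetical 'model': threshold at the midpoint of the two class means."""
    pos = [x for x, t in zip(xs, ys) if t == 1]
    neg = [x for x, t in zip(xs, ys) if t == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def accuracy(threshold, xs, ys):
    """Fraction of rows where 'x above threshold' matches the label."""
    return sum((x > threshold) == bool(t) for x, t in zip(xs, ys)) / len(ys)

def cross_validate(xs, ys, k=5):
    """Plain k-fold: hold out every k-th row in turn, train on the rest."""
    scores = []
    for fold in range(k):
        tr_x = [x for i, x in enumerate(xs) if i % k != fold]
        tr_y = [t for i, t in enumerate(ys) if i % k != fold]
        scores.append(accuracy(train(tr_x, tr_y), xs[fold::k], ys[fold::k]))
    return scores

t0 = time.perf_counter()
model = train(X, y)
train_seconds = time.perf_counter() - t0

t0 = time.perf_counter()
scores = cross_validate(X, y)
cv_seconds = time.perf_counter() - t0

print(f"train: {train_seconds:.4f}s, 5-fold CV: {cv_seconds:.4f}s")
print(f"mean CV accuracy: {sum(scores) / len(scores):.3f}")
```

Cross-validation fits one model per fold, which is one plausible reason it takes several times longer than a single training run.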
We can repeat this process multiple times, and the best part is that we get results within a few minutes, compared with the hours spent assessing and assembling data by traditional means.