AI News, Machine Learning Goes Open Source
Machine Learning Goes Open Source
Two years back GigaOm’s Derrick Harris opined that “it’s difficult to imagine a new tech company launching that doesn’t at least consider using machine learning models to make its product or service more intelligent.”
But engineers at Google, Twitter and new startups have largely been forced to roll their own machine learning libraries and systems.
What’s been missing are open-source projects that provide essential building blocks for easily embedding machine learning into applications.
The Apache Software Foundation has sought to change this with Apache Mahout, and now PredictionIO has just raised $2.5 million in an effort to take open-source machine learning even further.
ReadWrite sat down with PredictionIO founder Simon Chan to better understand the market and why open source matters in the complex world of machine learning.
RW: Machine learning sounds great, but historically it hasn’t worked as advertised, or it has required extensive engineering resources to pull off.
PredictionIO is showcased by GitHub as one of the most popular open source machine learning projects in the world, with thousands of developers engaged in making it better.
Currently contributions include SDKs (e.g., for iOS, .NET, Node.js) and plugins (e.g., for Magento and Drupal), but we’re also seeing new engines and algorithms run on top of our infrastructure.
We’re also working on some exciting projects yet to be announced in domains such as mobile health and gaming with applications that include churn analysis and trend detection.
The PredictionIO Machine Learning Project Propels Intelligent Development
Like many such projects, PredictionIO, one of the latest Apache Software Foundation projects elevated to top-level status, grew out of frustration.
“Every time we needed to build something intelligent — doing a news recommendation or resume analysis, product matching, something like that — we had to build the machine learning tech stack from scratch every time,” said Simon Chan, co-founder of PredictionIO, along with Donald Szeto, Kenneth Chan and Thomas Stone.
Companies building a similar application can simply download your template and run it as is or modify some of the components, said Chan, now senior director of product management for Salesforce Einstein.
Building a recommendation engine, a project that could typically take a team of expert data scientists months, can be accomplished in a couple of weeks by one or two engineers using PredictionIO, he said in a blog post.
While the ASF has machine learning libraries such as Mahout and MLlib, machine learning remains in its early stages and needs contributions from researchers, developers and companies to succeed, according to Chan.
At the recent Dreamforce ’17 conference, for instance, staff from Ulster Bank in Scotland talked about how the bank uses PredictionIO and Einstein for its Next Best Offer predictions, using past customer behavior to decide which banking services to suggest next.
Airbnb announces Aerosolve, an open-source machine learning software package
Home-rental company Airbnb today introduced a new open-source project called Aerosolve.
The new tool, announced at Airbnb’s 2015 OpenAir developer conference in San Francisco, powers new pricing tips for hosts, a feature that was also announced today.
“This library is meant to be used with sparse, interpretable features such as those that commonly occur in search (search keywords, filters) or pricing (number of rooms, location, price),”
“It is not as interpretable with problems with very dense non-human interpretable features such as raw pixels or audio samples.”
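The distinction drawn in the quotes above can be illustrated with a small sketch. This is not Aerosolve's API. It is a plain Python illustration, with made-up feature names and weights, of why sparse, named features are easy to interpret: each feature's contribution to a score can be read off directly.

```python
from typing import Dict, List, Tuple

# Hypothetical weights for named, sparse features (illustrative values only,
# not taken from Aerosolve or from any Airbnb model).
WEIGHTS: Dict[str, float] = {
    "rooms=2": 0.4,
    "location=SF": 0.9,
    "filter=entire_home": 0.3,
}


def score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear score over only the features that are present."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def explain(features: Dict[str, float], weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Per-feature contributions, largest absolute contribution first."""
    contributions = [
        (name, weights.get(name, 0.0) * value) for name, value in features.items()
    ]
    return sorted(contributions, key=lambda item: abs(item[1]), reverse=True)


# A listing described by two human-readable features; absent features are omitted.
listing = {"rooms=2": 1.0, "location=SF": 1.0}

if __name__ == "__main__":
    print(score(listing, WEIGHTS))
    for name, contribution in explain(listing, WEIGHTS):
        print(name, contribution)
```

With dense inputs such as raw pixels, the same per-feature breakdown would list thousands of unnamed values, which is why that kind of readout stops being useful to a human.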
The rapid evolution of open-source machine learning
When millions of people across the world tuned in to watch DeepMind’s machine beat the human Go world champion Lee Sedol, they also witnessed a historic victory for open-source.
Christopher Bishop published a book in 1995 called Neural Networks for Pattern Recognition that presented the corpus of techniques that took machine learning from a statistical science to one inspired by the biological networks in our brains.
Geoffrey Hinton noted in the foreword that Bishop “has wisely avoided the temptation to try to cover everything and has omitted interesting topics such as reinforcement learning, Hopfield Networks and Boltzmann machines in order to focus on the types of neural networks that are most widely used in practical applications”.
By mid–2014, PredictionIO had released an open-source machine learning server — the first to provide a full stack solution with tools to build, deploy and optimise machine learning models.
Google received the most press and developer attention with TensorFlow, which comes as no surprise as machine learning has been at the core of the search and advertising products that enabled Google to build one of the most innovative and successful technology companies in history.
By the time the year drew to a close, an A-list of Silicon Valley entrepreneurs, investors and world-class researchers had formed OpenAI, a nonprofit with an ultimate aim of solving AI in a way that addresses Elon Musk’s well-cited fear of strong AI under the control of bad actors.
From my perspective, there are three reasons for tech giants to release open-source machine learning projects. When a startup releases an open-source deep tech project, it generates awareness, some of which converts into paid customers and recruitment.
With the commoditization of the full AI technology stack, the focus shifts from core machine learning technologies to building the best models, which requires a vast amount of data and domain experts to create and train the models.
While banks are now moving toward using open-source technologies, it will still take a while for the culture to change to the extent that they also contribute directly to projects in the same way that employees of technology companies now regularly do.
One of our core principles is to give developers and data scientists the best tools for the job, regardless of whether the component technologies are built by Seldon engineers or come as part of another project.
- On Wednesday, January 16, 2019
sfspark.org: Peng Ye, Building Machine Learning Pipeline Using Aerosolve
Aerosolve is an open source machine learning library built by machine learning engineers on the Pricing and Availability team at Airbnb. From the project's ...
Intro to Azure ML & Cloud Computing
Azure Machine Learning Studio is a fully featured graphical data science tool in the cloud. You will learn how to upload, analyze, visualize, manipulate, and ...
Architectures for Big Data Analytics and Data Mining Platforms- Panel Discussion, 20120827
How are cutting edge, stellar companies advancing the state of the art ..