AI News, O'Reilly

Singh is CEO and co-founder of Ayasdi, a company that leverages machine intelligence software to automate and accelerate discovery of data insights.

Author of numerous patents and publications in top mathematics and computer science journals, Singh has developed key mathematical and machine learning algorithms for topological data analysis.

While at TI, I got to work on a project using clusters of specialized chips called Digital Signal Processors (DSPs) to solve computationally hard math problems.

There, I was able to apply some of my DSP work to solving partial differential equations and demonstrate that a fluid dynamics researcher need not buy a supercomputer anymore;

I then spent some time in mechanical engineering building similar GPU-based partial differential equation solvers for mechanical systems.

DB: Can you tell us about the evolution of topology, broadly speaking, and share some insights as to why it is so useful for unifying disparate areas in machine intelligence?

For example, if you have the equation for a circle, topology is the area of math that allows you to say, “Oh, a circle is a single connected thing; and it has a simple connectivity structure.”

Over the course of its development over the last 300 years, it has become the study of mapping one space into another.

Furthermore, within supervised algorithms, there are two types: algorithms that take an input vector to predict a number, and algorithms that take a vector to produce a class label.

What unifies these four distinct functions is they all produce functional mappings from an input space to an output space.
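To make the "functional mapping" framing concrete, here is a minimal sketch of the two supervised types described above, written as functions over the same input space. The type aliases, the nearest-neighbor rule, and the toy data are all assumptions chosen for illustration, not anything taken from the interview.

```python
from typing import Callable, Sequence
import math

Vector = Sequence[float]
Regressor = Callable[[Vector], float]   # input vector -> number
Classifier = Callable[[Vector], str]    # input vector -> class label

def nearest(x, examples):
    """Return the target of the training example closest to x."""
    return min(examples, key=lambda ex: math.dist(x, ex[0]))[1]

def make_regressor(examples) -> Regressor:
    # Hypothetical 1-nearest-neighbor regressor.
    return lambda x: float(nearest(x, examples))

def make_classifier(examples) -> Classifier:
    # Hypothetical 1-nearest-neighbor classifier.
    return lambda x: str(nearest(x, examples))

# Toy training data (illustrative only): (input vector, target).
prices = [((1.0, 1.0), 10.0), ((4.0, 4.0), 40.0)]
labels = [((1.0, 1.0), "small"), ((4.0, 4.0), "large")]

predict_price = make_regressor(prices)
predict_label = make_classifier(labels)

query = (1.2, 0.9)
price = predict_price(query)   # a number
label = predict_label(query)   # a class label
```

Both predictors share the same domain and differ only in their output space, which is the sense in which each can be read as a mapping from an input space to an output space.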

So, if you want to combine the results of these various learning algorithms together, topology allows you to do that, while still maintaining guarantees about the underlying shape or distributions.

Topology, even though it utilizes all these machine learning algorithms under the covers, allows you to discover the underlying shape of the data so that you don’t have to assume it.

Machine learning algorithms were developed as a methodology to extract value from increasingly large and complex data sets.

Topology addresses this issue of increasing data complexity through the comprehensive investigation of your data set with any algorithm or combination of algorithms, and presents an objective result (i.e., no information loss).

It constructs a network in which every node contains a subset of your data, and two nodes are connected to each other if they share some data.

If you think about it from a tabular perspective, you feed it your table, and the output is this graph representation in which every node is a subset of the rows.
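The construction described here (nodes as subsets of rows, edges where two subsets share rows) can be sketched in a few lines. This is a minimal illustration in the spirit of the Mapper construction from the TDA literature, not Ayasdi's implementation. The lens (first coordinate), the overlapping interval cover, the `eps` clustering threshold, and every function name are assumptions made for the example.

```python
from itertools import combinations
import math

def lens(point):
    """Hypothetical lens: project each row onto its first coordinate."""
    return point[0]

def cover(lo, hi, n_intervals, overlap):
    """Split [lo, hi] into equal intervals, each padded on both sides
    by `overlap` times its width so that neighbors overlap."""
    width = (hi - lo) / n_intervals
    pad = width * overlap
    return [(lo + i * width - pad, lo + (i + 1) * width + pad)
            for i in range(n_intervals)]

def cluster(indices, points, eps):
    """Single-linkage clustering: connected components of the
    'within distance eps' relation among the given row indices."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in remaining
                    if math.dist(points[i], points[j]) <= eps}
            remaining -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters

def mapper_graph(points, n_intervals=4, overlap=0.25, eps=0.6):
    """Return (nodes, edges): each node is a set of row indices, and
    two nodes are joined when they share at least one row."""
    values = [lens(p) for p in points]
    nodes = []
    for lo, hi in cover(min(values), max(values), n_intervals, overlap):
        members = [i for i, v in enumerate(values) if lo <= v <= hi]
        nodes.extend(cluster(members, points, eps))
    edges = [(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]]
    return nodes, edges

# Toy table: twelve rows sampled evenly from the unit circle.
circle = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
          for k in range(12)]
nodes, edges = mapper_graph(circle)
```

On this toy input the sketch yields six nodes joined in a single loop, so the graph recovers the circle's connectivity without that shape having been assumed up front.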

The first is that irrespective of the underlying machine learning algorithms that have been combined in a particular investigation, the output will always look like this graph.

The second is that this network form is very computable — i.e., you can easily build things on top of it: recommender systems, piecewise linear models, gradient operators, and so on.

DB: Is it fair to generalize that when performing a topological investigation, the first order of business is using some form of dimensionality reduction algorithm?

Once you reduce the data, compact it, and get the benefit of being cognizant of the topology, you’re able to maintain the shape while uncovering relationships in and among the data.

DB: Given you can map the same data into different views/representations, is there a single view that’s analytically superior toward the goal of understanding any particular problem?

You somehow have to say, “Okay, this is the set of parameters that I’m going to choose.” You can imagine in all of those settings, it’s very helpful to have tools that tell you the stability range of these parameters.
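One simple way to picture a "stability range" is to sweep a parameter and record where the result stops changing. The sketch below does this for a single scale parameter `eps` in one-dimensional single-linkage clustering. The data, the grid, and the function names are hypothetical, and this is only an illustration of the idea, not the tooling referred to in the interview.

```python
def count_clusters(values, eps):
    """Number of single-linkage clusters of 1-D values at scale eps:
    a new cluster starts wherever the gap between sorted neighbors
    exceeds eps."""
    ordered = sorted(values)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > eps)

def stable_ranges(values, eps_grid):
    """Group consecutive eps values that give the same cluster count.
    Returns a list of (count, first_eps, last_eps) tuples."""
    ranges = []
    for eps in eps_grid:
        k = count_clusters(values, eps)
        if ranges and ranges[-1][0] == k:
            ranges[-1] = (k, ranges[-1][1], eps)
        else:
            ranges.append((k, eps, eps))
    return ranges

# Toy data (illustrative only): two well-separated groups of three points.
data = [0.0, 0.12, 0.26, 5.0, 5.13, 5.31]
grid = [i / 20 for i in range(1, 121)]   # eps from 0.05 to 6.0

ranges = stable_ranges(data, grid)
widest = max(ranges, key=lambda r: r[2] - r[1])
```

Here the count stays at two clusters across most of the sweep, which is the kind of wide, stable range a parameter-selection tool would be expected to surface.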

Topological data analysis as a framework for machine intelligence

Gunnar Carlsson is a professor of mathematics (emeritus) at Stanford University and is cofounder and president at Ayasdi, which is commercializing products based on machine intelligence and topological data analysis.

Originally, his work focused on the pure aspects of the field, but in 2000 he began work on the applications of topology to the analysis of large and complex datasets, which led to a number of projects, notably a multi-university initiative funded by the Defense Advanced Research Projects Agency.

Modern data analysis presents a variety of challenges, including the size, the dimensionality, the complexity, and the multiple-modality of the data.

In topological data analysis, one leverages the fact that the shape of the data often reflects important and interpretable patterns within, although topological techniques alone typically cannot match the predictive power of machine learning.

Days 1-3: Introductory tutorial on applied topology and machine learning

The first three days of the bootcamp will include an introductory tutorial on applied topology, on machine learning, and on the marriage between the two.

The featured topic from applied topology will be persistent homology, and the featured topic from machine learning will be classical algorithms such as clustering, support vector machines (SVM), and random forests.
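For readers unfamiliar with persistent homology, the simplest case (dimension zero, which tracks connected components) can be sketched with a union-find structure. The example below is an illustrative assumption and not bootcamp material: every point is born as its own component at scale 0, and a component "dies" at the edge length where it merges into another. The function name and toy points are hypothetical.

```python
from itertools import combinations
import math

def zero_dim_persistence(points):
    """Return (birth, death) bars for connected components as the
    distance threshold grows from 0."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, shortest first.
    edges = sorted((math.dist(points[a], points[b]), a, b)
                   for a, b in combinations(range(len(points)), 2))
    bars = []
    for length, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb             # two components merge here
            bars.append((0.0, length))  # one of them dies at this scale
    if points:
        bars.append((0.0, math.inf))    # the last component never dies
    return bars

# Toy input: two nearby points and one far away.
bars = zero_dim_persistence([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)])
```

Long bars correspond to components that stay separate over a wide range of scales, while short bars are usually read as noise.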

Days 4-5: Research conference on topology and machine learning

The final two days of the bootcamp will feature a research conference on current trends in topology and machine learning.

#ODSC Meetup | Topological Data Analysis: New Perspectives on Machine Learning - by Jesse Johnson

Abstract: When looked at from the right perspective, many of the ideas and algorithms involved in machine learning/data science can be thought of as ...

"Topological Data Analysis for the Working Data Scientist" - Anthony Bak @ Trulia

Abstract: This meetup is a continuation of the two Introduction to Topological Data Analysis (TDA) meetups done last year. Anthony will begin with a short review ...

Topological Data Analysis: potential applications to computer vision

Topological Data Analysis quantifies hidden topological structures in big raw noisy data. The flagship tool (persistent homology) summarises the underlying ...

Allison Gilmore, Data Scientist, Ayasdi @ MLconf SF

A Role for Topology in Data Science: The mathematical discipline of topology offers a new approach to data analysis that is especially important in today's world ...

Using Topological Data Analysis on your BigData

Synopsis: Topological Data Analysis (TDA) is a framework for data analysis and machine learning and represents a breakthrough in how to effectively use ...

Exploring Generative 3D Shapes Using Autoencoder Networks

We propose a new algorithm for converting unstructured triangle meshes into ones with a consistent topology for machine learning applications. We combine the ...

Autodesk Generative Design

What if you can come up with thousands of options for a single design without drawing? This is generative design - harnessing massive computing power, ...

Introduction to topological data analysis (English audio)

Michael Lesnick describes the incursion of modern algebraic topology in the analysis of databases. He also explains the development of tools for studying the ...

Tim Poston - Keynote: Data Comes in Shapes

Data comes in shapes. The study of shape is geometry, in as many dimensions as you have variables. You can't visualise them all, but you can see in 2D and ...