Facebook's AI chief researching new breed of semiconductor

(Bloomberg) -- Facebook Inc.’s chief AI researcher has suggested the company is working on a new class of semiconductor that would work very differently from most existing designs.

Yann LeCun said that future chips used for training deep learning algorithms, which underpin most of the recent progress in artificial intelligence, would need to be able to manipulate data without having to break it up into multiple batches.

In April, Bloomberg reported that Facebook was hiring a hardware team to build its own chips for a variety of applications, including artificial intelligence as well as managing the complex workloads of the company’s vast datacenters.

For the moment, the most commonly used chips for training neural networks -- a kind of software loosely based on the way the human brain works -- are graphics processing units from companies such as Nvidia Corp., originally designed to handle the computing-intensive workloads of rendering images for video games.

LeCun said that for the moment, GPUs would remain important for deep learning research, but the chips were ill-suited for running the AI algorithms once they were trained, whether that was in datacenters or on devices like mobile phones or home digital assistants.

Facebook is Working on Its Own Custom AI Silicon

Facebook’s chief AI researcher, Yann LeCun, has stated that the company is working on its own custom AI silicon, with the goal of building far more efficient methods of processing neural networks in hardware and improving performance, energy efficiency, and the range of problems that can be addressed.

Its themes call for expanding the role of AI from language translation to content policing, creating smarter devices that can differentiate between, say, weeds and roses, and giving computers what we typically call “common sense.”

According to its reporting, LeCun is focused on creating chips that don’t have to break data sets into small batches for processing but can instead work with larger amounts of information without this step.
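To make the batching step concrete, here is a minimal Python sketch of what breaking a data set into small batches looks like in a conventional processing loop. The function name `iter_batches`, the ten-sample data set, and the batch size of 4 are illustrative assumptions; nothing here reflects how Facebook's hardware would actually be designed.

```python
from typing import Iterator, List, Sequence


def iter_batches(samples: Sequence[float], batch_size: int) -> Iterator[List[float]]:
    """Yield consecutive slices of `samples`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(samples), batch_size):
        yield list(samples[start:start + batch_size])


# Hypothetical data set of ten samples (an assumption for illustration).
data = [float(i) for i in range(10)]

# Conventional approach: the data is split into small chunks that are
# processed one after another.
batches = list(iter_batches(data, batch_size=4))

# Skipping the splitting step, modelled here (as an assumption) by handing
# over the whole data set as a single piece.
whole = list(iter_batches(data, batch_size=len(data)))

print(len(batches), "batches vs", len(whole), "piece")
```

In this sketch the ten samples become three batches of sizes 4, 4, and 2, while the unsplit version is a single piece of size 10. The splitting loop is the step that, per the reporting, LeCun wants future chips to avoid.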

If you want to mow an area (or vacuum a carpet), you don’t need to teach the device how to differentiate between what to mow or clean nearly as much as you’d have to teach it if you wanted it to specifically avoid non-weed plants.

The problem with specialty microprocessor architectures, historically speaking, is that even if you had an idea for a particularly clever way to execute a specific type of instruction, the speed of general-purpose computation was accelerating quickly enough to eat most of your market advantage before your product could be built.

If it took three years to bring your part to market, you were up against the 66 MHz Pentium, a CPU more than 2x faster, thanks to clock and instruction set improvements, than your initial comparison point.

This one-two punch of unbeatable economy of scale and rapid-fire compute improvements explains why general-purpose computation took over the market from specialty architectures and why it’s maintained its lock on the market ever since.

The reason GPUs are such an exception is that the nature of a graphics workload is so different from a general-purpose computational workload that you’d never build a GPU to handle the tasks of a serial CPU or vice versa.

So long as Intel (or AMD, IBM, or any other general-purpose CPU vendor) could kick out double-digit performance improvements every 12-18 months, the effort of investing in a 3-5 year architectural research project was too uncertain to justify.
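A rough back-of-the-envelope calculation shows why that bet looked so risky. The sketch below compounds a fixed per-generation gain over the length of a research project. The specific figures (a 10 percent gain every 18 months over 3 years, and a 50 percent gain every 12 months over 5 years) are assumptions chosen to bracket the ranges mentioned above, not measured data, and `compounded_speedup` is a hypothetical helper.

```python
def compounded_speedup(gain_per_step: float,
                       months_per_step: float,
                       project_months: float) -> float:
    """Cumulative general-purpose CPU speedup over a project's duration.

    gain_per_step is a fraction (0.10 means a 10% improvement per step).
    """
    steps = project_months / months_per_step
    return (1.0 + gain_per_step) ** steps


# Assumed low end: 10% every 18 months across a 3-year project.
slow_case = compounded_speedup(0.10, 18, 36)

# Assumed high end: 50% every 12 months across a 5-year project.
fast_case = compounded_speedup(0.50, 12, 60)

print(f"low end:  {slow_case:.2f}x")
print(f"high end: {fast_case:.2f}x")
```

Under these assumptions, general-purpose CPUs improve by roughly 1.2x at the low end and about 7.6x at the high end while the specialty chip is still in development, so a custom design has to beat a moving target, not the CPU it was originally benchmarked against.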

