
Chasing complexity

In his junior year of high school, Ryan Williams transferred from the public school near his hometown of Somerville, Alabama — “essentially a courthouse and a couple gas stations,” as he describes it — to the Alabama School of Math and Science in Mobile.

Eventually, his frustrated teacher pulled a heavy white book off of a shelf, dumped it dramatically on Williams’s desk, and told him to look up the problem described in the final chapter.

That may sound obscure, but when he published his result, in 2010, the complexity theorist Scott Aaronson — then at MIT, now at the University of Texas — wrote on his blog, “The result is one of the most spectacular of the decade.” “We all knew the goal was to walk on the moon (i.e., prove P≠NP and related separations),” Aaronson later added, “and what Ryan has done is to build a model rocket that gets a couple hundred feet into the air, whereas all the previous rockets had suffered from long-identified technical limitations that had kept them from getting up more than 50 feet.

… It’s entirely plausible that those improvements really are a nontrivial step on the way to the moon.”

Basic principles

Williams is the son of a mother who taught grade school and a father who ran his own construction firm, and his family indoctrinated him into one side of a deep Alabamian social divide — the side that roots for Auburn in the annual Auburn-Alabama football game.

He would say, ‘Point the level here and see if it’s grade.’”

In first grade, having scored highly on a standardized test the year before, Williams began taking a class one day a week at a school for gifted children on the opposite side of the county.

Within weeks, he had proven that calculating optimal k-anonymity — the minimum number of redactions necessary to protect the privacy of someone’s personal data — was an NP-complete problem, meaning that it was (unless someone proves that P equals NP) prohibitively time-consuming to compute.
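To make the distinction concrete, here is a minimal Python sketch (an illustration, not Williams’s actual construction): checking whether an already-redacted table satisfies k-anonymity is easy, while the optimization he analyzed, choosing the fewest redactions that achieve it, is the part that turns out to be NP-complete.

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k rows of the (possibly redacted) table."""
    counts = Counter(tuple(row[col] for col in quasi_identifiers) for row in rows)
    return all(count >= k for count in counts.values())

# Tiny example: '*' marks a suppressed (redacted) cell.
# Verifying a given redaction is fast; finding the minimum number of
# suppressions that makes the table k-anonymous is the hard problem.
table = [
    {"zip": "35670", "age": "*",  "diagnosis": "flu"},
    {"zip": "35670", "age": "*",  "diagnosis": "asthma"},
    {"zip": "36604", "age": "24", "diagnosis": "flu"},
    {"zip": "36604", "age": "24", "diagnosis": "cold"},
]
print(is_k_anonymous(table, ["zip", "age"], k=2))  # True: each (zip, age) pair occurs twice
```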

But in fact, Williams argues, the problems are more symmetric than they first appear, because establishing an algorithm’s minimum running time requires generalizing about every possible instance of a particular problem that it will ever have to face.

“Reasoning about lower bounds just seems really hard, but yet, when it comes to designing algorithms to solve the problem, it’s somehow just more natural for people to think about,” Williams says.

Maybe if you phrased the problem the right way, it would become an algorithmic problem.”

Computational jiu-jitsu

Any NP-complete problem can be represented as a logic circuit — a combination of the elementary computing elements that underlie all real-world computing.
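As a rough illustration (assuming nothing beyond standard definitions), the sketch below wires a few AND, OR, and NOT gates into a tiny circuit. Evaluating it on a given input is trivial; asking whether any input makes it output true is circuit satisfiability, the canonical NP-complete problem.

```python
from itertools import product

# A tiny circuit over inputs x1, x2, x3: output = (x1 OR x2) AND ((NOT x2) OR x3)
def circuit(x1, x2, x3):
    g1 = x1 or x2          # OR gate
    g2 = (not x2) or x3    # OR gate fed by a NOT gate
    return g1 and g2       # AND gate at the output

# Evaluating the circuit on one input is easy; deciding whether ANY input
# makes it output True is circuit satisfiability. The brute-force search
# below takes 2^n evaluations for n inputs.
satisfying = [bits for bits in product([False, True], repeat=3) if circuit(*bits)]
print(satisfying[:3])
```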

“We were gradually building up some steam with slightly better, slightly better lower bounds, but it completely stopped in its tracks because of this one pesky little class that nobody could get a handle on.”

Since Williams’s breakthrough paper, both he and other complexity theorists have used his technique for translating between algorithms and lower bounds to prove results about other classes of problems.

arXiv:1402.0054 [cs.DS], “Popular conjectures imply strong lower bounds for dynamic problems,” Amir Abboud and Virginia Vassilevska Williams (submitted 1 Feb 2014). From the abstract: “We consider several well-studied problems in dynamic algorithms and prove that sufficient progress on any of them would imply a breakthrough on one of five major open problems in the theory of algorithms. … The problems we consider include dynamic versions of bipartite perfect matching, bipartite maximum weight matching, single source reachability, single source shortest paths, strong connectivity, subgraph connectivity, diameter approximation and some nongraph problems such as Pagh’s problem defined in a recent paper by Patrascu [STOC 2010].”

A New Map Traces the Limits of Computation

For more than 40 years, researchers had been trying to find a better way to compare two arbitrary strings of characters, such as the long strings of chemical letters within DNA molecules.

If a better method to calculate this “edit distance” could be found, researchers would be able to quickly compare full genomes or large data sets, and computer scientists would have a powerful new tool with which they could attempt to solve additional problems in the field.
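The longstanding method the article alludes to is essentially the textbook quadratic dynamic program (often credited to Wagner and Fischer). The sketch below is that standard approach, not the researchers’ own code.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic program: O(len(a) * len(b)) time, one row of the table at a time."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))  # distances between a[:0] and each prefix of b
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # delete from a
                          curr[j - 1] + 1,     # insert into a
                          prev[j - 1] + cost)  # substitute (or match)
        prev = curr
    return prev[n]

print(edit_distance("book", "back"))  # 2: substitute 'o'->'a' and 'o'->'c'
```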

Yet in a paper presented at the ACM Symposium on Theory of Computing, two researchers from the Massachusetts Institute of Technology put forth a mathematical proof that the current best algorithm was “optimal” — in other words, that finding a more efficient way to compute edit distance was mathematically impossible.

The media’s confusion about edit distance reflects a murkiness in the depths of complexity theory itself, where mathematicians and computer scientists attempt to map out what is and is not feasible to compute as though they were deep-sea explorers charting the bottom of an ocean trench.

This algorithmic terrain is just as vast — and poorly understood — as the real seafloor, said Russell Impagliazzo, a complexity theorist who first formulated the exponential-time hypothesis with Ramamohan Paturi in 1999.

“If a funding agency read that [Boston Globe headline] and took it to heart, then I don’t see any reason why they’d ever fund work on edit distance again,” he said.

The ambiguity over whether the hypothesis is true or false also reveals the basic practices of theoretical computer science, in which math and logic often marshal “strong evidence,” rather than proof, of how algorithms behave at a fundamental level.

Whether by assuming SETH’s validity or, in Williams’ case, trying to refute it, complexity theorists are using this arcane hypothesis to explore two different versions of our universe: one in which precise answers to tough problems stay forever buried like needles within a vast computational haystack, and one in which it may be possible to speed up the search for knowledge ever so slightly.

This relationship between NP-complete problems is central to the “P versus NP” conjecture, the most famous unsolved problem in computer science, which seeks to define the limits of computation in mathematical terms.

(The informal version: if P equals NP, we could quickly compute the true answer to almost any question we wanted, as long as we knew how to describe what we wanted to find and could easily recognize it once we saw it, much like a finished jigsaw puzzle.

The vast majority of computer scientists believe that P does not equal NP.)

The P versus NP problem also helps draw an informal line between tractable (“easy”) and intractable (“hard”) computational procedures.
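The jigsaw-puzzle intuition corresponds to the formal idea of a polynomial-time verifier: a candidate solution can be checked quickly even when no fast way to find one is known. A small illustrative sketch, using subset sum as the example NP problem:

```python
# NP in the "jigsaw puzzle" sense: solutions may be hard to find but are easy
# to check. Verifying a proposed subset takes linear time, while the naive
# search over all 2^n subsets takes exponential time.

def verify_subset_sum(numbers, target, candidate_indices):
    """Polynomial-time check of a proposed certificate."""
    return sum(numbers[i] for i in candidate_indices) == target

numbers = [3, 34, 4, 12, 5, 2]
print(verify_subset_sum(numbers, target=9, candidate_indices=[2, 4]))  # 4 + 5 == 9 -> True
```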

By assuming that certain problems are computationally intractable under precise constraints, researchers can make airtight inferences about the properties of other problems, even ones that look unrelated at first.

SETH speaks directly about the hardness of NP-complete problems, but some surprising reductions have connected it to important problems in the complexity class P — the territory of so-called easy or efficiently solvable problems.
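For reference, here is a standard statement of SETH and the general shape of such a reduction; the notation is a common textbook formulation, not the article’s own.

```latex
% SETH (standard formulation): for every eps > 0 there is a clause width k
% such that k-SAT on n variables has no algorithm running in time 2^{(1-eps)n}.
\[
\forall \varepsilon > 0 \;\; \exists k \;:\;\;
k\text{-SAT on } n \text{ variables} \;\notin\; \mathrm{TIME}\!\left(2^{(1-\varepsilon)n}\right)
\]

% Shape of a SETH-based lower bound inside P (e.g., for edit distance):
% a truly subquadratic algorithm would yield a fast CNF-SAT algorithm, refuting SETH.
\[
\exists \delta > 0 :\; \text{edit distance} \in O\!\left(n^{2-\delta}\right)
\;\Longrightarrow\;
\exists \varepsilon > 0 :\; \text{CNF-SAT} \in O\!\left(2^{(1-\varepsilon)\,n}\right)
\]
```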

The result was striking because edit distance, while theoretically an easy problem in the complexity class P, has a best known algorithm that would take perhaps 1,000 years to run on real-world tasks like comparing genomes, where the number of symbols is in the billions (as opposed to short words like “book” and “back”).
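A back-of-envelope check of that figure, assuming genome-scale inputs and a processing rate of a few hundred million table cells per second (the rate is an assumption, not a number from the article):

```latex
% Two strings of roughly n = 3 x 10^9 symbols under the quadratic algorithm,
% at an assumed 3 x 10^8 table cells per second:
\[
n^{2} \approx \left(3\times 10^{9}\right)^{2} = 9\times 10^{18} \text{ cells},
\qquad
\frac{9\times 10^{18}\ \text{cells}}{3\times 10^{8}\ \text{cells/s}}
= 3\times 10^{10}\ \text{s} \approx 950\ \text{years}.
\]
```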

The key word, of course, is “if.” Indyk readily concedes that their result is not an unconditional impossibility proof, which is “the holy grail of theoretical computer science,” he said.

Two years ago, he took a tactic that he had used to try to refute SETH and applied it to the “all-pairs shortest paths” problem, a classic optimization task “taught in every undergraduate computer science curriculum,” he said.
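For context, the undergraduate version of that task is the cubic-time Floyd-Warshall algorithm sketched below; this is the familiar classroom baseline, not Williams’s own work on the problem.

```python
INF = float("inf")

def all_pairs_shortest_paths(weights):
    """Floyd-Warshall: the cubic textbook algorithm for all-pairs shortest paths.
    weights[i][j] is the edge weight from i to j (INF if there is no edge)."""
    n = len(weights)
    dist = [row[:] for row in weights]
    for v in range(n):
        dist[v][v] = min(dist[v][v], 0)   # distance from a vertex to itself
    for k in range(n):                     # allow intermediate vertices 0..k
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

graph = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
print(all_pairs_shortest_paths(graph)[0][2])  # shortest 0 -> 2 path costs 3 + 2 = 5
```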

Lance Fortnow, a complexity theorist and chairman of the Georgia Institute of Technology’s School of Computer Science, called Williams’ proof “the best progress in circuit lower bounds in nearly a quarter century.”

The Map and the Territory

In addition to these peripheral benefits, attacking SETH head-on helps researchers like Williams make progress in one of the central tasks of theoretical computer science: mapping the territory.

While the MIT researchers didn’t prove that edit distance is impossible to solve more efficiently, they did prove that this theoretically tractable problem is fundamentally connected to the intrinsic hardness of NP-complete problems.