AI News

Unpredictability of Artificial Intelligence?

With increases in the capabilities of artificial intelligence over the last decade, a significant number of researchers have realized the importance of not only creating capable intelligent systems, but also of making them safe and secure [1-6].

In this post, we concentrate on the poorly understood concept of unpredictability of intelligent systems [17], which limits our ability to understand the impact of the intelligent systems we are developing, and which poses a challenge for software verification, intelligent system control, and AI Safety in general.

In theoretical computer science, and in software development in general, many well-known impossibility results are well-established, and some of them are strongly related to the subject of this blog post.

For example, Rice’s Theorem states that no computationally effective method can decide if a program will exhibit a particular non-trivial behavior, such as producing a specific output [18].
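Rice's Theorem rules out a general decision procedure, but a bounded, best-effort check by execution is still possible. The sketch below (all names are hypothetical, and programs are modeled as generators for simplicity) runs a program under a step budget to test the non-trivial property "produces the output 42"; when the budget runs out, the question stays undecided, which is exactly the gap the theorem guarantees:

```python
# Heuristic check for the non-trivial property "produces the output 42".
# Rice's Theorem says no procedure can decide this for ALL programs;
# running the program with a step budget gives only a partial answer.

def produces_42(program, fuel=10_000):
    """Return True/False if the program finishes within `fuel` steps,
    or None when the budget runs out (undecided)."""
    gen = program()              # program is modeled as a generator
    result = None
    for _ in range(fuel):
        try:
            result = next(gen)   # advance one "step"
        except StopIteration:
            return result == 42  # program halted: property decidable here
    return None                  # out of fuel: cannot decide

def halting_program():
    yield 42                     # terminates immediately with 42

def looping_program():
    while True:
        yield 0                  # never terminates
```

For `halting_program` the checker answers definitively; for `looping_program` it can only give up, and no choice of `fuel` removes that limitation in general.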

Unpredictability of AI, one of many impossibility results in AI Safety, also known as Unknowability [22] or Cognitive Uncontainability [23], is defined as our inability to precisely and consistently predict what specific actions an intelligent system will take to achieve its objectives, even if we know the terminal goals of the system.

It simply points out a general limitation on how well such efforts can perform, one that is particularly pronounced for advanced generally intelligent systems (superintelligence) in novel domains.

Suppose, for contradiction, that people could consistently predict the specific decisions of a superintelligence. That means they could make the same decisions as the superintelligence, which would make them as smart as the superintelligence; but that is a contradiction, as superintelligence is defined as a system smarter than any person.

The amount of unpredictability can be formally measured via the theory of Bayesian surprise, which measures the difference between posterior and prior beliefs of the predicting agent [24-27].
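For discrete beliefs, Bayesian surprise is just the Kullback-Leibler divergence from the prior to the posterior. A minimal sketch (the distributions are made-up illustrative numbers, not from the cited works):

```python
import math

def bayesian_surprise(prior, posterior):
    """KL divergence D(posterior || prior) in bits: how much observing
    the agent's action shifted the predictor's beliefs."""
    return sum(q * math.log2(q / p)
               for p, q in zip(prior, posterior) if q > 0)

# Predictor's beliefs over three possible moves of the observed agent.
prior     = [0.70, 0.20, 0.10]   # predictor expected the "obvious" move
posterior = [0.05, 0.15, 0.80]   # the agent chose the unexpected one
```

If the posterior equals the prior the surprise is zero; a large belief shift, as above, yields a surprise of several bits.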

A simple heuristic is to estimate the amount of surprise as proportional to the difference in intelligence between the predictor and the predicted agent.

Developers of famous intelligent systems such as Deep Blue (Chess) [31, 32], IBM Watson (Jeopardy) [33], and AlphaZero (Go) [34, 35] did not know what specific decisions their AI would make on each turn.

They may know the ultimate goals of their system, but they do not know the actual step-by-step plan it will execute, which of course has serious consequences for AI Safety [36-39].

In harder cases, including most real-world ones, even the overall goal of the system may not be precisely known, or may be known only in abstract terms such as "make the world better."

While in some cases the terminal goal(s) could be learned, and even if you can learn to predict the overall outcome with some statistical certainty, you cannot learn to predict all the steps a system of superior intelligence would take toward that goal.
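The gap between a predictable outcome and unpredictable steps can be illustrated with a toy randomized search (a made-up example, not a model of any real system): every run provably reaches the goal, yet the route taken varies with the random seed.

```python
import random

def climb(seed, target=20):
    """Randomized hill-climb toward `target`: the outcome is certain,
    but the sequence of intermediate steps depends on the seed."""
    rng = random.Random(seed)
    x, path = 0, []
    while x < target:
        step = rng.choice([1, 1, 2, 3])   # random, but always forward
        x = min(x + step, target)
        path.append(x)
    return x, path

# Every run ends at the target (predictable outcome),
# but different seeds trace different paths (unpredictable steps).
runs = [climb(seed) for seed in range(10)]
```

A predictor can be confident about where the system ends up while having no reliable model of how it gets there.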

"Vinge's Principle implies that when an agent is designing another agent (or modifying its own code), it needs to approve the other agent's design without knowing the other agent's exact future actions."

We can usually predict the outcome of common physical processes without knowing the specific behavior of particular atoms, just as we can typically predict the overall behavior of an intelligent system without knowing its specific intermediate steps.
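The physical analogy is essentially the law of large numbers: individual random events are unpredictable, but their aggregate is tightly constrained. A minimal sketch with simulated coin flips standing in for the "atoms":

```python
import random

def sample_mean(n, seed=0):
    """Mean of n fair coin flips: each flip is unpredictable,
    but the aggregate concentrates tightly around 0.5."""
    rng = random.Random(seed)
    return sum(rng.random() < 0.5 for _ in range(n)) / n
```

With 100,000 flips the mean lands very close to 0.5 even though no single flip can be predicted, mirroring how macro-level behavior can be forecast without micro-level knowledge.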

Complex AI agents often exhibit inherent unpredictability: they demonstrate emergent behaviors that are impossible to predict with precision, even by their own programmers.

In fact, Alan Turing and Alonzo Church showed the fundamental impossibility of ensuring that an algorithm fulfills certain properties without actually running it.

There are fundamental theoretical limits to our ability to verify that a particular piece of code will always satisfy desirable properties, unless we execute the code and observe its behavior.
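One practical response to this limit is runtime monitoring: instead of proving a property in advance, check it on every execution. A minimal sketch (the `plan_step` function and its invariant are hypothetical placeholders, not from the original text):

```python
def monitored(fn, invariant):
    """Wrap fn so every call checks a desired property at run time --
    the practical fallback when static verification is impossible."""
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        if not invariant(result):
            raise RuntimeError(f"property violated: {result!r}")
        return result
    return wrapper

# Hypothetical planner whose outputs must stay within a safety budget.
plan_step = monitored(lambda x: x * 2, invariant=lambda r: r <= 100)
```

Runtime checks only catch violations as they happen, which is exactly the compromise the Turing/Church results force on us.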

Schmidhuber, J., Simple algorithmic theory of subjective beauty, novelty, surprise, interestingness, attention, curiosity, creativity, art, science, music, jokes.
