Introduction to Computers/AI
There is no single unifying theory of artificial intelligence; however, a number of approaches are described below.
'The phrase “The Turing Test” is most properly used to refer to a proposal made by Turing (1950) as a way of dealing with the question whether machines can think.'
Many problems arise when probability is used to rank and solve problems; most of them stem from the fact that the rankings are often subjective or relate to moral choices.
An important concept of 'Weak AI' is brute force: a technique for solving a complex problem by using a computer's fast processing capability to repeat a simple procedure many times. For example, a chess playing program will calculate all the possible moves that can apply to a given situation and then choose the best one.
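The brute-force idea can be sketched on a game far smaller than chess. The example below assumes a hypothetical subtraction game (players alternately remove 1, 2 or 3 stones, and whoever takes the last stone wins); the rules and function names are illustrative and do not come from the text. It tries every legal move in every position, caching results only to avoid repeating work.

```python
from functools import lru_cache

MOVES = (1, 2, 3)  # assumed rule: remove 1, 2 or 3 stones per turn

@lru_cache(maxsize=None)
def can_win(stones: int) -> bool:
    """True if the player to move can force a win (taking the last stone wins)."""
    # Try every legal move; a move wins if it leaves the opponent
    # in a position from which they cannot force a win.
    return any(m <= stones and not can_win(stones - m) for m in MOVES)

def best_move(stones: int):
    """Return a winning move if one exists, otherwise None."""
    for m in MOVES:
        if m <= stones and not can_win(stones - m):
            return m
    return None

if __name__ == "__main__":
    for n in range(1, 9):
        print(n, best_move(n))
```

The same exhaustive procedure applied to chess would need to examine far more positions than is practical, which is why real chess programs limit the search depth.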
Strong AI (artificial intelligence) claims that computers can be made to think on a level that is at least equal to humans and possibly even be conscious of themselves.
A more or less flexible or efficient approach can be taken depending on the requirements established, which influences how artificial the intelligent behaviour appears.
Applications of automata theory have included imitating human comprehension and reasoning skills using computer programs, duplicating the function of muscles and tendons by hydraulic systems or electric motors, and reproducing sensory organs by electronic sensors such as smoke detectors.
Example: when I write an E with a stylus on my Palm, the Palm recognizes the drawing as an E by noticing its similarity to, though not exact match with, the shape of an E.
in facial recognition to identify faces, and in hand-writing recognition, fingerprint identification, robot vision, and automatic voice recognition.
This capacity to learn from experience, analytical observation, and other means results in a system that can continuously improve itself and thereby offer increased efficiency and effectiveness.
(Example: when I write an E with a stylus on my Palm, the Palm recognizes the drawing as an E by noticing how a specific user always writes an E and adding this extra information to the heuristics already provided.)
This serves as a means to write natural and expert systems programs which use AI to rank possible answers to users' questions.
Speech recognition is the process of converting a speech signal to a sequence of words in the form of digital data, by means of an algorithm implemented as a computer program.
'Speaker Recognition' is a bit different than speech recognition in that it looks to decipher the voice, tone, pitch, etc. of a person rather than the actual words that the person is saying.
In computer science and linguistics, parsing (more formally called 'syntactic analysis') is the process of analyzing a sequence of words (tokens) to determine its grammatical structure with respect to a given formal grammar.
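As a rough illustration of parsing, the sketch below tokenizes a string and builds a nested structure for a toy arithmetic grammar. The grammar, the function names and the tuple format are assumptions chosen for brevity; real parsers for natural or programming languages are considerably more elaborate.

```python
import re

def tokenize(text):
    """Split input into number, operator and parenthesis tokens.
    Characters outside this toy alphabet are silently skipped."""
    return re.findall(r"\d+|[+\-()]", text)

def parse(tokens):
    """Parse tokens under the toy grammar
         expr := term (('+' | '-') term)*
         term := NUMBER | '(' expr ')'
    and return a nested tuple describing the structure."""
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return node
        if tok.isdigit():
            return ("num", int(tok))
        raise SyntaxError(f"unexpected token {tok!r}")

    def expr():
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            node = (op, node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

if __name__ == "__main__":
    print(parse(tokenize("1+(2-3)")))
```

The returned tuple makes the grammatical structure explicit: each operator node holds its left and right operands, so the grouping implied by the parentheses is preserved.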
The muscle force pushes air out of the lungs (shown schematically as a piston pushing up within a cylinder) and through the trachea.
When the vocal cords are tensed, the air flow causes them to vibrate, producing so-called voiced speech sounds.
When the vocal cords are relaxed, in order to produce a sound, the air flow either must pass through a constriction in the vocal tract and thereby become turbulent, producing so-called unvoiced sounds, or it can build up pressure behind a point of total closure within the vocal tract; when the closure is opened, the pressure is suddenly released, causing a brief transient sound.
A software agent is a piece of software that acts on your behalf, with the implied authority to decide whether an action is appropriate and to activate itself accordingly.
Agents can collect information from the web according to topics you select, and in the future they may even make dentist appointments for you according to openings in your calendar and the dentist's calendar.
A system that allows the user to inquire about a topic and produces results based on facts and rules. It queries the relationships between knowledge items to find the user's most likely answer.
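A system that answers from facts and rules can be sketched with forward chaining, one common inference strategy. The rule format, the function name and the medical-sounding facts below are invented for illustration only and are not taken from any particular expert system.

```python
def forward_chain(facts, rules):
    """Apply if-then rules repeatedly until no new facts can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                changed = True
    return known

# Hypothetical rules: (conditions, conclusion)
RULES = [
    (("has_fever", "has_cough"), "possible_flu"),
    (("possible_flu", "short_of_breath"), "see_doctor"),
]

if __name__ == "__main__":
    print(forward_chain({"has_fever", "has_cough", "short_of_breath"}, RULES))
```

Here the second rule can only fire after the first has added its conclusion, which is the sense in which the system follows relationships between knowledge items.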
The driverless car concept embraces an emerging family of highly automated cognitive and control technologies, ultimately aimed at a full 'taxi-like' experience for car users, but without a human driver.
It can be achieved in various ways, either through dedicated routes of travel that are pre-planned, or through vehicles that are capable of precisely recognizing and executing drive commands.
There are four key areas to focus design on: sensors, navigation, motion planning and control of the vehicle itself.
Artificial intelligence (AI), sometimes called machine intelligence, is intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans and other animals.
In computer science AI research is defined as the study of 'intelligent agents': any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals.
Colloquially, the term 'artificial intelligence' is applied when a machine mimics 'cognitive' functions that humans associate with other human minds, such as 'learning' and 'problem solving'.
The scope of AI is disputed: as machines become increasingly capable, tasks considered as requiring 'intelligence' are often removed from the definition, a phenomenon known as the AI effect, leading to the quip, 'AI is whatever hasn't been done yet.'
The traditional problems (or goals) of AI research include reasoning, knowledge representation, planning, learning, natural language processing, perception and the ability to move and manipulate objects.
Many tools are used in AI, including versions of search and mathematical optimization, artificial neural networks, and methods based on statistics, probability and economics.
This raises philosophical arguments about the nature of the mind and the ethics of creating artificial beings endowed with human-like intelligence, issues that have been explored by myth, fiction and philosophy since antiquity.
In the twenty-first century, AI techniques have experienced a resurgence following concurrent advances in computer power, large amounts of data, and theoretical understanding;
and AI techniques have become an essential part of the technology industry, helping to solve many challenging problems in computer science, software engineering and operations research.
The study of mathematical logic led directly to Alan Turing's theory of computation, which suggested that a machine, by shuffling symbols as simple as '0' and '1', could simulate any conceivable act of mathematical deduction.
The success was due to increasing computational power (see Moore's law), greater emphasis on solving specific problems, new ties between AI and other fields (such as statistics, economics and mathematics), and a commitment by researchers to mathematical methods and scientific standards.
According to Bloomberg's Jack Clark, 2015 was a landmark year for artificial intelligence, with the number of software projects that use AI within Google increasing from 'sporadic usage' in 2012 to more than 2,700 projects.
He attributes this to an increase in affordable neural networks, due to a rise in cloud computing infrastructure and to an increase in research tools and datasets.
An AI's intended goal function can be simple ('1 if the AI wins a game of Go, 0 otherwise') or complex ('Do actions mathematically similar to the actions that got you rewards in the past').
this is similar to how animals evolved to innately desire certain goals such as finding food, or how dogs can be bred via artificial selection to possess desired traits.
Some of the 'learners' described below, including Bayesian networks, decision trees, and nearest-neighbor, could theoretically, if given infinite data, time, and memory, learn to approximate any function, including whatever combination of mathematical functions would best describe the entire world.
In practice, it is almost never possible to consider every possibility, because of the phenomenon of 'combinatorial explosion', where the amount of time needed to solve a problem grows exponentially.
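A quick calculation shows how fast the number of possibilities grows. The sketch assumes a search in which every position offers the same number of choices; the figure of 35 choices per position is a commonly quoted rough average for chess and is used here only as an illustration.

```python
def sequences_to_consider(branching_factor: int, depth: int) -> int:
    """Number of distinct move sequences of the given length
    when each step offers `branching_factor` choices."""
    return branching_factor ** depth

if __name__ == "__main__":
    for depth in (2, 4, 6, 8, 10):
        print(depth, sequences_to_consider(35, depth))
```

Each extra step multiplies the count by the branching factor, so looking only a few steps further ahead can turn a feasible search into an infeasible one.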
The third major approach, extremely popular in routine business AI applications, is the use of analogizers such as SVM and nearest-neighbor: 'After examining the records of known past patients whose temperature, symptoms, age, and other factors mostly match the current patient, X% of those patients turned out to have influenza'.
A fourth approach is harder to intuitively understand, but is inspired by how the brain's machinery works: the artificial neural network approach uses artificial 'neurons' that can learn by comparing the network's output to the desired output and altering the strengths of the connections between its internal neurons to 'reinforce' connections that seemed to be useful.
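The patient example can be sketched as a k-nearest-neighbor lookup. The records, the two features (temperature and age) and the choice of k are all made up for illustration, and the features are left unscaled for brevity, which a practical system would normally not do.

```python
from collections import Counter
import math

def knn_predict(records, query, k=3):
    """records: list of (feature_vector, label) pairs.
    Returns the majority label among the k nearest records and its share."""
    nearest = sorted(records, key=lambda r: math.dist(r[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    return label, count / len(nearest)

# Hypothetical past patients: ((temperature in Celsius, age), diagnosis)
PATIENTS = [
    ((39.5, 30), "flu"),
    ((39.0, 35), "flu"),
    ((36.8, 32), "healthy"),
    ((36.6, 60), "healthy"),
    ((38.9, 28), "flu"),
]

if __name__ == "__main__":
    label, share = knn_predict(PATIENTS, (39.2, 31), k=3)
    print(f"{share:.0%} of the most similar past patients had: {label}")
```

The returned share plays the role of the 'X%' in the quoted sentence: it is the fraction of the most similar past records that carry the winning label.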
Therefore, to be successful, a learner must be designed such that it prefers simpler theories to complex theories, except in cases where the complex theory is proven substantially better.
Many systems attempt to reduce overfitting by rewarding a theory in accordance with how well it fits the data, but penalizing the theory in accordance with how complex the theory is.
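The trade-off between fit and complexity can be written as a single penalized score. The sketch below assumes a simple additive penalty and uses invented error and complexity numbers; actual systems use many different penalty forms.

```python
def penalized_score(fit_error, complexity, penalty=1.0):
    """Lower is better: data misfit plus a charge per unit of complexity."""
    return fit_error + penalty * complexity

def select_theory(candidates, penalty=1.0):
    """candidates: iterable of (name, fit_error, complexity) tuples."""
    return min(candidates, key=lambda c: penalized_score(c[1], c[2], penalty))[0]

# Hypothetical candidates: (name, error on the training data, number of parameters)
CANDIDATES = [
    ("line", 4.0, 2),
    ("cubic", 3.5, 4),
    ("degree-9 polynomial", 0.1, 10),
]

if __name__ == "__main__":
    print(select_theory(CANDIDATES, penalty=1.0))  # simpler theory is preferred
    print(select_theory(CANDIDATES, penalty=0.0))  # no penalty: best raw fit wins
```

With the penalty switched off, the most complex candidate wins purely because it fits the training data best, which is the behaviour the penalty is meant to discourage.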
A toy example is that an image classifier trained only on pictures of brown horses and black cats might conclude that all brown patches are likely to be horses.
instead, they learn abstract patterns of pixels that humans are oblivious to, but that linearly correlate with images of certain types of real objects.
Humans also have a powerful mechanism of 'folk psychology' that helps them to interpret natural-language sentences such as 'The city councilmen refused the demonstrators a permit because they advocated violence'.
For example, existing self-driving cars cannot reason about the location nor the intentions of pedestrians in the exact way that humans do, and instead must use non-human modes of reasoning to avoid accidents.
By the late 1980s and 1990s, AI research had developed methods for dealing with uncertain or incomplete information, employing concepts from probability and economics.
These algorithms proved to be insufficient for solving large reasoning problems, because they experienced a 'combinatorial explosion': they became exponentially slower as the problems grew larger.
In addition, some projects attempt to gather the 'commonsense knowledge' known to the average person into a database containing extensive knowledge about the world.
by acting as mediators between domain ontologies that cover specific knowledge about a particular knowledge domain (field of interest or area of concern).
They need a way to visualize the future—a representation of the state of the world and be able to make predictions about how their actions will change it—and be able to make choices that maximize the utility (or 'value') of available choices.
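Choosing the action that maximizes utility can be sketched as an expected-utility calculation. The actions, probabilities and utility values below are invented for illustration; the sketch also assumes the agent already has a model of how likely each outcome is.

```python
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

def best_action(model):
    """model: dict mapping each action to its list of (probability, utility)."""
    return max(model, key=lambda action: expected_utility(model[action]))

# Hypothetical model: 30% chance of rain, utilities on an arbitrary 0-10 scale
MODEL = {
    "take_umbrella": [(0.3, 8), (0.7, 6)],
    "leave_umbrella": [(0.3, 0), (0.7, 10)],
}

if __name__ == "__main__":
    for action, outcomes in MODEL.items():
        print(action, expected_utility(outcomes))
    print("chosen:", best_action(MODEL))
```

Changing the rain probability in the model changes the predicted consequences of each action, and with them the action the agent prefers.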
A sufficiently powerful natural language processing system would enable natural-language user interfaces and the acquisition of knowledge directly from human-written sources, such as newswire texts.
Modern statistical NLP approaches can combine all these strategies as well as others, and often achieve acceptable accuracy at the page or paragraph level, but continue to lack the semantic understanding required to classify isolated sentences well.
Besides the usual difficulties with encoding semantic commonsense knowledge, existing semantic NLP sometimes scales too poorly to be viable in business applications.
is the ability to use input from sensors (such as cameras (visible spectrum or infrared), microphones, wireless signals, and active lidar, sonar, radar, and tactile sensors) to deduce aspects of the world.
a giant, fifty-meter-tall pedestrian far away may produce exactly the same pixels as a nearby normal-sized pedestrian, requiring the AI to judge the relative likelihood and reasonableness of different interpretations, for example by using its 'object model' to assess that fifty-meter pedestrians do not exist.
Advanced robotic arms and other industrial robots, widely used in modern factories, can learn from experience how to move efficiently despite the presence of friction and gear slippage.
the paradox is named after Hans Moravec, who stated in 1988 that 'it is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility'.
Moderate successes related to affective computing include textual sentiment analysis and, more recently, multimodal affect analysis (see multimodal sentiment analysis), wherein AI classifies the affects displayed by a videotaped subject.
Some computer systems mimic human emotion and expressions to appear more sensitive to the emotional dynamics of human interaction, or to otherwise facilitate human–computer interaction.
These early projects failed to escape the limitations of non-quantitative symbolic logic models and, in retrospect, greatly underestimated the difficulty of cross-domain AI.
Many researchers predict that such 'narrow AI' work in different individual domains will eventually be incorporated into a machine with artificial general intelligence (AGI), combining most of the narrow skills mentioned in this article and at some point even exceeding human ability in most or all these areas.
One high-profile example is that DeepMind in the 2010s developed a 'generalized artificial intelligence' that could learn many diverse Atari games on its own, and later developed a variant of the system which succeeds at sequential learning.
hypothetical AGI breakthroughs could include the development of reflective architectures that can engage in decision-theoretic metareasoning, and figuring out how to 'slurp up' a comprehensive knowledge base from the entire unstructured Web.
Finally, a few 'emergent' approaches look to simulating human intelligence extremely closely, and believe that anthropomorphic features like an artificial brain or simulated child development may someday reach a critical point where general intelligence emerges.
For example, even specific straightforward tasks, like machine translation, require that a machine read and write in both languages (NLP), follow the author's argument (reason), know what is being talked about (knowledge), and faithfully reproduce the author's original intent (social intelligence).
A problem like machine translation is considered 'AI-complete', because all of these problems need to be solved simultaneously in order to reach human-level machine performance.
When access to digital computers became possible in the middle 1950s, AI research began to explore the possibility that human intelligence could be reduced to symbol manipulation.
in the 1960s and the 1970s were convinced that symbolic approaches would eventually succeed in creating a machine with artificial general intelligence and considered this the goal of their field.
Economist Herbert Simon and Allen Newell studied human problem-solving skills and attempted to formalize them, and their work laid the foundations of the field of artificial intelligence, as well as cognitive science, operations research and management science.
Their research team used the results of psychological experiments to develop programs that simulated the techniques that people used to solve problems.
Unlike Simon and Newell, John McCarthy felt that machines did not need to simulate human thought, but should instead try to find the essence of abstract reasoning and problem-solving, regardless of whether people used the same algorithms.
His laboratory at Stanford (SAIL) focused on using formal logic to solve a wide variety of problems, including knowledge representation, planning and learning.
found that solving difficult problems in vision and natural language processing required ad-hoc solutions – they argued that there was no simple and general principle (like logic) that would capture all the aspects of intelligent behavior.
When computers with large memories became available around 1970, researchers from all three traditions began to build knowledge into AI applications.
By the 1980s, progress in symbolic AI seemed to stall and many believed that symbolic systems would never be able to imitate all the processes of human cognition, especially perception, robotics, learning and pattern recognition.
This coincided with the development of the embodied mind thesis in the related field of cognitive science: the idea that aspects of the body (such as movement, perception and visualization) are required for higher intelligence.
Within developmental robotics, developmental learning approaches are elaborated upon to allow robots to accumulate repertoires of novel skills through autonomous self-exploration, social interaction with human teachers, and the use of guidance mechanisms (active learning, maturation, motor synergies, etc.).
Artificial neural networks are an example of soft computing: they are solutions to problems which cannot be solved with complete logical certainty, and where an approximate solution is often sufficient.
Much of traditional GOFAI got bogged down on ad hoc patches to symbolic computation that worked on their own toy models but failed to generalize to real-world results.
However, around the 1990s, AI researchers adopted sophisticated mathematical tools, such as hidden Markov models (HMM), information theory, and normative Bayesian decision theory to compare or to unify competing architectures.
The shared mathematical language permitted a high level of collaboration with more established fields (like mathematics, economics or operations research).
Compared with GOFAI, new 'statistical learning' techniques such as HMM and neural networks were gaining higher levels of accuracy in many practical domains such as data mining, without necessarily acquiring semantic understanding of the datasets.
The increased successes with real-world data led to increasing emphasis on comparing different approaches against shared test data to see which approach performed best in a broader context than that provided by idiosyncratic toy models;
In AGI research, some scholars caution against over-reliance on statistical learning, and argue that continuing research into GOFAI will still be necessary to attain general intelligence.
These algorithms can be visualized as blind hill climbing: we begin the search at a random point on the landscape, and then, by jumps or steps, we keep moving our guess uphill, until we reach the top.
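Blind hill climbing can be sketched in a few lines for a one-dimensional landscape. The step size, iteration count and example landscape are assumptions for illustration; the starting point is passed in explicitly so that runs are repeatable, although it could equally be drawn at random.

```python
import random

def hill_climb(score, start, step=0.1, max_iters=10_000, rng=None):
    """Repeatedly try a small random step and keep it only if the score improves."""
    rng = rng or random.Random()
    x = start
    for _ in range(max_iters):
        candidate = x + rng.uniform(-step, step)
        if score(candidate) > score(x):  # move only uphill
            x = candidate
    return x

if __name__ == "__main__":
    # Hypothetical landscape with a single peak at x = 3
    peak = hill_climb(lambda x: -(x - 3) ** 2, start=0.0, rng=random.Random(0))
    print(round(peak, 3))
```

On a landscape with several peaks the same procedure can stop on a local peak that is not the highest one, since it never accepts a downhill step.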
Two popular swarm algorithms used in search are particle swarm optimization (inspired by bird flocking) and ant colony optimization (inspired by ant trails).
Fuzzy set theory assigns a 'degree of truth' (between 0 and 1) to vague statements such as 'Alice is old' (or rich, or tall, or hungry) that are too linguistically imprecise to be completely true or false.
Fuzzy logic is successfully used in control systems to allow experts to contribute vague rules such as 'if you are close to the destination station and moving fast, increase the train's brake pressure';
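The train-braking rule can be sketched with two membership functions and a fuzzy AND. The linear membership shapes, the 500 m and 20 to 80 km/h thresholds, and the use of the minimum for AND are assumptions made for this example; a real controller would be tuned by domain experts.

```python
def close_to_station(distance_m):
    """Degree of truth of 'close': 1 at 0 m, falling linearly to 0 at 500 m."""
    return max(0.0, min(1.0, 1.0 - distance_m / 500.0))

def moving_fast(speed_kmh):
    """Degree of truth of 'fast': 0 at or below 20 km/h, 1 at or above 80 km/h."""
    return max(0.0, min(1.0, (speed_kmh - 20.0) / 60.0))

def brake_increase(distance_m, speed_kmh):
    """Strength of the rule 'IF close AND fast THEN increase brake pressure'.
    Fuzzy AND is taken as the minimum of the two degrees of truth."""
    return min(close_to_station(distance_m), moving_fast(speed_kmh))

if __name__ == "__main__":
    print(brake_increase(100, 65))
    print(brake_increase(600, 80))
```

The result is itself a degree between 0 and 1, so the controller can apply the rule partially rather than switching the brakes fully on or off.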
Probabilistic algorithms can also be used for filtering, prediction, smoothing and finding explanations for streams of data, helping perception systems to analyze processes that occur over time (e.g., hidden Markov models or Kalman filters).
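As a minimal sketch of filtering a data stream, the following one-dimensional Kalman filter estimates a quantity assumed to stay roughly constant from noisy readings. The noise variances, the initial values and the constant-state model are assumptions chosen for illustration.

```python
def kalman_1d(measurements, process_var=1e-3, meas_var=0.25, x0=0.0, p0=1.0):
    """Return the running estimate after each measurement.
    x is the state estimate, p is its variance (uncertainty)."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + process_var      # predict: state assumed constant, uncertainty grows
        k = p / (p + meas_var)   # Kalman gain: how much to trust the new reading
        x = x + k * (z - x)      # update the estimate toward the measurement
        p = (1.0 - k) * p        # update: uncertainty shrinks after the reading
        estimates.append(x)
    return estimates

if __name__ == "__main__":
    readings = [5.2, 4.7, 5.1, 4.9, 5.3, 4.8]  # hypothetical noisy sensor values
    print([round(e, 2) for e in kalman_1d(readings)])
```

Early readings move the estimate a lot because the initial uncertainty is high; later readings move it less as the filter grows more confident.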
Complicated graphs with diamonds or other 'loops' (undirected cycles) can require a sophisticated method such as Markov Chain Monte Carlo, which spreads an ensemble of random walkers throughout the Bayesian network and attempts to converge to an assessment of the conditional probabilities.
Otherwise, if no matching model is available, and if accuracy (rather than speed or scalability) is the sole concern, conventional wisdom is that discriminative classifiers (especially SVM) tend to be more accurate than model-based classifiers such as 'naive Bayes' on most practical data sets.
A simple 'neuron' N accepts input from multiple other neurons, each of which, when activated (or 'fired'), casts a weighted 'vote' for or against whether neuron N should itself activate.
one simple algorithm (dubbed 'fire together, wire together') is to increase the weight between two connected neurons when the activation of one triggers the successful activation of another.
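The weighted vote and the 'fire together, wire together' rule can be sketched as follows. The threshold, the learning rate and the binary activations are assumptions for illustration; this is a simplified Hebbian-style update rather than a full training algorithm.

```python
def neuron_fires(inputs, weights, threshold=1.0):
    """inputs: 0/1 activations of upstream neurons; weights: signed vote strengths.
    The neuron fires when the weighted votes reach the threshold."""
    return sum(i * w for i, w in zip(inputs, weights)) >= threshold

def hebbian_update(inputs, weights, fired, rate=0.1):
    """Strengthen the weight of every input that was active when the neuron fired."""
    if not fired:
        return list(weights)
    return [w + rate * i for i, w in zip(inputs, weights)]

if __name__ == "__main__":
    inputs, weights = [1, 0, 1], [0.6, 0.9, 0.5]
    fired = neuron_fires(inputs, weights)
    print(fired, hebbian_update(inputs, weights, fired))
```

Only the connections from neurons that were active are strengthened; the weight of the silent middle input is left unchanged.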
In the 2010s, advances in neural networks using deep learning thrust AI into widespread public consciousness and contributed to an enormous upshift in corporate AI spending;
The main categories of networks are acyclic or feedforward neural networks (where the signal passes in only one direction) and recurrent neural networks (which allow feedback and short-term memories of previous input events).
Neural networks can be applied to the problem of intelligent control (for robotics) or learning, using such techniques as Hebbian learning ('fire together, wire together'), GMDH or competitive learning.
However, some research groups, such as Uber, argue that simple neuroevolution to mutate new neural network topologies and weights may be competitive with sophisticated gradient descent approaches.
For example, a feedforward network with six hidden layers can learn a seven-link causal chain (six hidden layers + output layer) and has a 'credit assignment path' (CAP) depth of seven.
Deep learning has transformed many important subfields of artificial intelligence, including computer vision, speech recognition, natural language processing and others.
In 2006, a publication by Geoffrey Hinton and Ruslan Salakhutdinov introduced another way of pre-training many-layered feedforward neural networks (FNNs) one layer at a time, treating each layer in turn as an unsupervised restricted Boltzmann machine, then using supervised backpropagation for fine-tuning.
Over the last few years, advances in both machine learning algorithms and computer hardware have led to more efficient methods for training deep neural networks that contain many layers of non-linear hidden units and a very large output layer.
In 1992, it was shown that unsupervised pre-training of a stack of recurrent neural networks can speed up subsequent supervised learning of deep sequential problems.
The main areas of competition include general machine intelligence, conversational behavior, data-mining, robotic cars, and robot soccer as well as conventional games.
The 'imitation game' (an interpretation of the 1950 Turing test that assesses whether a computer can imitate a human) is nowadays considered too exploitable to be a meaningful benchmark.
High-profile examples of AI include autonomous vehicles (such as drones and self-driving cars), medical diagnosis, creating art (such as poetry), proving mathematical theorems, playing games (such as Chess or Go), search engines (such as Google search), online assistants (such as Siri), image recognition in photographs, spam filtering, prediction of judicial decisions
With social media sites overtaking TV as a source for news for young people and news organisations increasingly reliant on social media platforms for generating distribution,
In 2016, a groundbreaking study in California found that a mathematical formula developed with the help of AI correctly determined the dose of immunosuppressant drugs to give to organ patients.
Another study is using artificial intelligence to try to monitor multiple high-risk patients by asking each patient numerous questions based on data acquired from live doctor-to-patient interactions.
The team supervised the robot while it performed soft-tissue surgery, stitching together a pig's bowel during open surgery, and doing so better than a human surgeon, the team claimed.
However, Google has been working on an algorithm with the purpose of eliminating the need for pre-programmed maps and instead, creating a device that would be able to adjust to a variety of new surroundings.
Some self-driving cars are not equipped with steering wheels or brake pedals, so there has also been research focused on creating an algorithm that is capable of maintaining a safe environment for the passengers in the vehicle through awareness of speed and driving conditions.
Financial institutions have long used artificial neural network systems to detect charges or claims outside of the norm, flagging these for human investigation.
For example, AI-based buying and selling platforms have changed the law of supply and demand in that it is now possible to easily estimate individualized demand and supply curves and thus individualized pricing.
Other theories where AI has had impact include rational choice, rational expectations, game theory, Lewis turning point, portfolio optimization and counterfactual thinking.
A report by the Guardian newspaper in the UK in 2018 found that online gambling companies were using AI to predict the behavior of customers in order to target them with personalized promotions.
Developers of commercial AI platforms are also beginning to appeal more directly to casino operators, offering a range of existing and potential services to help them boost their profits and expand their customer base.
He argues that sufficiently intelligent AI, if it chooses actions based on achieving some goal, will exhibit convergent behavior such as acquiring resources or protecting itself from being shut down.
If this AI's goals do not reflect humanity's – one example is an AI told to compute as many digits of pi as possible – it might harm humanity in order to acquire more resources or prevent itself from being shut down, ultimately to better achieve its goal.
For this danger to be realized, the hypothetical AI would have to overpower or out-think all of humanity, which a minority of experts argue is a possibility far enough in the future to not be worth researching.
Jobs at extreme risk range from paralegals to fast food cooks, while job demand is likely to increase for care-related professions ranging from personal healthcare to the clergy.
The field of machine ethics is concerned with giving machines ethical principles, or a procedure for discovering a way to resolve the ethical dilemmas they might encounter, enabling them to function in an ethically responsible manner through their own ethical decision making.
The field was delineated in the AAAI Fall 2005 Symposium on Machine Ethics: 'Past research concerning the relationship between technology and ethics has largely focused on responsible and irresponsible use of technology by human beings, with a few people being interested in how human beings ought to treat machines.
In contrast to computer hacking, software property issues, privacy issues and other topics normally ascribed to computer ethics, machine ethics is concerned with the behavior of machines towards human users and other machines.
Research in machine ethics is key to alleviating concerns with autonomous systems—it could be argued that the notion of autonomous machines without such a dimension is at the root of all fear concerning machine intelligence.
Humans should not assume machines or robots would treat us favorably because there is no a priori reason to believe that they would be sympathetic to our system of morality, which has evolved along with our particular biology (which AIs would not share).
I think the worry stems from a fundamental error in not distinguishing the difference between the very real recent advances in a particular aspect of AI, and the enormity and complexity of building sentient volitional intelligence.'
The philosophical position that John Searle has named 'strong AI' states: 'The appropriately programmed computer with the right inputs and outputs would thereby have a mind in exactly the same sense human beings have minds.'
Technological singularity is when accelerating progress in technologies will cause a runaway effect wherein artificial intelligence will exceed human intellectual capacity and control, thus radically changing or even ending civilization.
Ray Kurzweil has used Moore's law (which describes the relentless exponential improvement in digital technology) to calculate that desktop computers will have the same processing power as human brains by the year 2029, and predicts that the singularity will occur in 2045.
Invisible, this auxiliary lobe answers your questions with information beyond the realm of your own memory, suggests plausible courses of action, and asks questions that help bring out relevant facts.
In the 1980s, artist Hajime Sorayama's Sexy Robots series was painted and published in Japan, depicting the actual organic human form with lifelike muscular metallic skins; the book 'the Gynoids' followed later and was used by or influenced movie makers including George Lucas and other creatives.
Sorayama never considered these organic robots to be a real part of nature but always an unnatural product of the human mind, a fantasy existing in the mind even when realized in actual form.
Chapter 10 - Human-Computer Interaction
This chapter draws heavily from the topic of human–computer interactions (HCIs) and its relation to artificial intelligence (AI).
It grew out of the work on human factors or ergonomics with the intellectual aim of analyzing tasks that people perform with computers and the practical concerns of designing more usable and reliable computer systems.
HCI can provide techniques to model people's interactions with computers, guidelines for software design, methods to compare the usability of computer systems, and ways to study the effect of introducing new technology into organizations.
AI research follows two distinct, and to some extent competing, methods: the symbolic (or “top-down”) approach and the connectionist (or “bottom-up”) approach.
The top-down approach seeks to replicate intelligence by analyzing cognition independent of the biological structure of the brain, in terms of the processing of symbols—whence the symbolic label.
(Tuning adjusts the responsiveness of different neural pathways to different stimuli.) In contrast, a top-down approach typically involves writing a computer program that compares each letter with geometric descriptions.
In The Organization of Behavior (1949), Donald Hebb, a psychologist at McGill University, Montreal, Canada, suggested that learning specifically involves strengthening certain patterns of neural activity by increasing the probability (weight) of induced neuron firing between the associated connections.
This hypothesis states that processing structures of symbols is sufficient, in principle, to produce artificial intelligence in a digital computer and that, moreover, human intelligence is the result of the same type of symbolic manipulations.
Using Artificial Intelligence to Augment Human Intelligence
By creating user interfaces which let us work with the
representations inside machine learning models, we can give
different visions of computing – have helped inspire and
determine the computing systems humanity has ultimately
world’s first general-purpose electronic computer, was
commissioned to compute artillery firing tables for the United
numerical problems, such as simulating nuclear explosions,
predicting the weather, and planning the motion of rockets.
machines operated in a batch mode, using crude input and output
of computers as number-crunching machines, used to speed up
calculations that would formerly have taken weeks, months, or more
In the 1950s a different vision of what computers are for began to
real-time interactive systems, with rich inputs and outputs, that
humans could work with to support and expand their own
(IA) deeply influenced many others, including researchers such as
and led to many of the key ideas of modern computing systems.
ideas have also deeply influenced digital art and music, and
fields such as interaction design, data visualization,
artificial intelligence (AI): competition for funding, competition
on building systems which put humans and machines to work
together, while AI has focused on complete outsourcing of
often framed in terms of matching or surpassing human performance:
This essay describes a new field, emerging today out of a
name artificial intelligence augmentation (AIA): the use
of AI systems to help develop new methods for intelligence
This new field introduces important new fundamental
questions, questions not associated with either parent field.
Our essay begins with a survey of recent technical work hinting at
artificial intelligence augmentation, including work
which can be used to explore and visualize generative machine
Such interfaces develop a kind of cartography of
generative models, ways for humans to explore and make meaning
extent are these new tools enabling creativity?
to generate ideas which are truly surprising and new, or are the
ideas cliches, based on trivial recombinations of existing ideas?
Can such systems be used to develop fundamental new interface
How will those new primitives change and expand the
Let’s look at an example where a machine learning model makes a
imagine you’re a type designer, working on creating a new
wish to experiment with bold, italic, and condensed variations.
Let’s examine a tool to generate and explore such variations, from
Of course, varying the bolding (i.e., the weight), italicization
instead of building specialized tools, users could build their own
tool merely by choosing examples of existing fonts.
suppose you wanted to vary the degree of serifing on a font.
the following, please select 5 to 10 sans-serif fonts from the top
machine learning model running in your browser will automatically
infer from these examples how to interpolate your starting font in
In fact, we used this same technique to build the earlier bolding
following examples of bold and non-bold fonts, of italic and
To build these tools, we used what’s called a generative
instance, if the font is 64 by 64 pixels, then we’d expect
to need 64 × 64 = 4,096 parameters to describe a single
We do this by building a neural network which takes a small number
of input variables, called latent variables, and produces
have 40 latent space dimensions, and map that into the
In other words, the idea is to map a low-dimensional space into a
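The mapping from a low-dimensional latent space to a high-dimensional pixel space can be sketched in a few lines. This is a toy stand-in, not the model described in the essay: the single random linear layer, the sigmoid squashing, and the names `make_toy_decoder` and `decode` are all assumptions chosen for illustration. Only the sizes (40 latent dimensions, 64 by 64 pixels) come from the text.

```python
import math
import random

LATENT_DIM = 40          # number of latent variables, as in the text
SIDE = 64                # glyph is SIDE x SIDE pixels
N_PIXELS = SIDE * SIDE   # 4,096 values describe one glyph


def make_toy_decoder(seed=0):
    """Build a toy decoder: one random linear layer followed by a sigmoid.

    A trained generative model would learn these weights from fonts;
    here they are random, so the output is not a real glyph.
    """
    rng = random.Random(seed)
    weights = [
        [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
        for _ in range(N_PIXELS)
    ]

    def decode(latent):
        if len(latent) != LATENT_DIM:
            raise ValueError("expected %d latent variables" % LATENT_DIM)
        pixels = []
        for row in weights:
            activation = sum(w * z for w, z in zip(row, latent))
            # Sigmoid written with tanh, which cannot overflow.
            pixels.append(0.5 * (1.0 + math.tanh(activation / 2.0)))
        return pixels

    return decode


decode = make_toy_decoder()
glyph = decode([0.0] * LATENT_DIM)   # 40 numbers in, 4,096 pixel values out
```

Changing any of the 40 inputs changes all 4,096 outputs at once, which is the sense in which a small number of latent variables controls a whole glyph.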
The generative model we use is a type of neural network known as
changing the latent variables used as input, it’s possible to get
apparent complexity in a glyph, which originally required 4,096
The generative model we use is learnt from a training set of more
a close approximation to any desired font from the training set,
sense, the model is learning a highly compressed representation of
In fact, the model doesn’t just reproduce the training fonts.
examples, the neural net learns an abstract, higher-level model of
generalize beyond the training examples already seen, to produce
small number of training examples, and use that exposure to
That is, for any conceivable font – whether existing or
to find latent variables corresponding exactly to that font.
course, the model we’re using falls far short of this ideal
it’s useful to keep in mind what an ideal generative model would
description of what appear to be complex phenomena, reducing large
scientific theories sometimes enable us to generalize to discover
laws of thermodynamics and statistical mechanics enable us to find
possible to generalize, predicting unexpected new phases of
For example, in 1924, physicists used thermodynamics and
statistical mechanics to predict a remarkable new phase of matter,
occupy identical quantum states, leading to surprising large-scale
that instance, we take the average of all the latent vectors for
given font bolder, we simply add a little of the bolding vector to
the corresponding latent vector, with the amount of bolding vector
some generative models the latent vectors satisfy some constraints
a bolding vector, an italicizing vector, a condensing vector, and
example, where we start with an example glyph, in the middle, and
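The attribute-vector idea described above can be sketched as follows. The surviving text only says that latent vectors are averaged and that a little of the bolding vector is added to a font's latent vector; this sketch assumes the bolding vector is the difference between the average latent vector of bold fonts and that of non-bold fonts. The function names and the three-dimensional toy vectors are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(components) / n for components in zip(*vectors)]


def attribute_vector(with_attribute, without_attribute):
    """Assumed construction: mean(with attribute) - mean(without attribute)."""
    mean_with = mean_vector(with_attribute)
    mean_without = mean_vector(without_attribute)
    return [a - b for a, b in zip(mean_with, mean_without)]


def apply_attribute(latent, attribute, amount):
    """Move a latent vector along an attribute direction by `amount`."""
    return [z + amount * a for z, a in zip(latent, attribute)]


# Toy 3-dimensional latent vectors (a real model would use more dimensions).
bold_latents = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
regular_latents = [[0.0, 2.0, 0.0], [2.0, 2.0, 0.0]]

bolding = attribute_vector(bold_latents, regular_latents)
bolder = apply_attribute([0.5, 0.5, 0.5], bolding, amount=0.5)
```

The `amount` argument plays the role of a slider: larger values add more of the bolding vector, and negative values would move the font in the opposite direction.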
To understand these benefits, consider a naive approach to
bolding, in which we simply add some extra pixels around a glyph’s
non-expert’s way of thinking about type design, an expert does
results of this naive thickening procedure versus what is actually
left stroke is only changed slightly by bolding, while the right
fonts, bolding doesn’t change the height of the font, while the
For example, a naive bolding tool would rapidly fill in the
enclosed negative space in the enclosed upper region of the letter
to some trouble to preserve the enclosed negative space, moving
The heuristic of preserving enclosed negative space is not a
what’s going on isn’t just a thickening of the font, but rather
Thus, the tool expands ordinary people’s ability to
The font tool is an example of a kind of cognitive technology.
The ideas shown in the font tool can be extended to other domains.
manipulate images of human faces using qualities such as
Such generative interfaces provide a kind of cartography of
generative models, ways for humans to explore and make meaning
We saw earlier that the font model automatically infers relatively
deep principles about font design, and makes them available to
More broadly, we can ask why attribute vectors work, when they
For the attribute vector to work requires that taking any starting
font, we can construct the corresponding bold version by adding
priori there is no reason using a single constant vector to
(unbold, bold) we could train a machine learning algorithm to take
as input the latent vector for the unbolded version and output the
data about font weights, the machine learning algorithm could
are just an extremely simple approach to doing these kinds of
For these reasons, it seems unlikely that attribute vectors will
we can still expect interfaces offering operations broadly similar
to those sketched above, allowing access to high-level and
Let’s look at another example using machine learning models to
an interface to generate images of consumer products such as
programmer to write a program containing a great deal of knowledge
this, Zhu et al train a generative model using 50
roughly sketch the shape of a shoe, the sole, the laces, and so
The visual quality is low, in part because the generative model
Zhu et al used is outdated by modern (2017) standards
– with more modern models, the visual quality would be much
overall shape of the shoe changes considerably when the sole is
white sole, and the red coloring filled in everywhere on the
from the underlying generative model, in a way we’ll describe
case it becomes possible to sketch in just the colors associated
grass, the outline of a mountain, some blue sky, and snow on the
still to find a low-dimensional latent space which can be used to
represent (say) all landscape images, and map that latent space to
latent space as a compact way of describing landscape images.
stroke as a constraint on the image, picking out a subspace of the
latent space, consisting of all points in the latent space whose
The way the interface works is to find a point in the latent space
the latent space to move the image around in meaningful ways.
the font tool provide ways of understanding and navigating a
high-dimensional space, keeping us on the natural space of fonts
can internalize the interface operations as new primitive elements
learn to think in terms of the difference they want to apply,
richer than the traditional way non-experts think about shoes
they get little practice in thinking this way, or seeing the
enables easier exploration, the ability to develop idioms and the
ability to plan, to swap ideas with friends, and so on.
artillery shell in such-and-such a wind [and so on]?”;
This is a conception common to both the early view of computers as
view of an AI as an oracle, able to solve some large class of
they think using language, forming chains of words in their heads,
mathematics into their thinking, using algebraic expressions or
In each case, we’re thinking using representations invented by
other people: words, graphs, maps, algebra, mathematical diagrams,
explaining how to represent geometric ideas using algebra, and
This enabled a radical change and expansion in how we think about
have formerly impossible thoughts such as: “let’s apply the
instance of a more general class of thought: “computer, [new
type of action] this [new type of representation for a newly
systems can enable the creation of new cognitive technologies.
Things like the font tool aren’t just oracles to be consulted when
discover, to provide new representations and operations, which can
possible, one where it helps us invent new cognitive technologies
In this essay we’ve focused on a small number of examples, mostly
users to rapidly build new musical instruments and artistic
We’ve argued that machine learning systems can help create
representations and operations which serve as new primitives in
Historically, important new media forms often seem strange when
Of course, strangeness for strangeness’s sake alone is not
music: all revealed genuinely new ways of making meaning.
representations sharpen up such insights, eliding the familiar to
strange: it shows relationships you’ve never seen before.
“user friendly”, i.e., simple and immediately useable
using a cliched interface may be easy and fun, it’s an ease
that is fine, but for deeper tasks, and for the longer term, you
underlying a subject, revealing a new world to the user.
appeared strange can instead become comfortable and familiar,
particularly rich source of insights when reified in an
Aspirationally, as we’ve seen, our machine learning models will
help us build interfaces which reify deep principles in ways
discover deep principles about the world, recognize those
sometimes discover relatively deep principles, like the
preservation of enclosed negative space when bolding a font.
which takes advantage of such principles, it’d be better if the
model automatically inferred the important principles learned, and
explore only the natural space of images, does that mean we’re
generating anything truly new, from doing truly creative work?
To answer these questions, it’s helpful to identify two different
creativity doesn’t fit so neatly into two distinct categories.
the model nonetheless clarifies the role of new interfaces in
designer, for example, consists of competent recombination of the
creative choices to meet the intended design goals, but not
For such work, the generative interfaces we’ve been discussing are
but models soon appeared that were better adapted to
unfair to single out any small set of papers, and to omit the many
plausible these generative interfaces will become powerful tools
The second mode of creativity aims toward developing new
new principles which enabled people to see in new ways.
natural images, or natural fonts, and thus actively prevent us
Such a model can’t directly generate an image based
on new fundamental principles, because such an image wouldn’t look
discovered by the model may contain ideas going beyond what humans
generative models, but seems a worthwhile aspiration for future
system is trained on pairs of images, e.g., pairs showing the
it can be shown a set of edges and asked to generate an image for
called, confusingly, a generator – this is not meant in the
same sense as our earlier generative models – that takes as
input the constraint image, and produces as output the filled-in
network, whose job is to distinguish between pairs of images
generated from real data, and pairs of images generated by the
crucial difference: there is no latent vector input to the
al experimented with adding such a latent vector to
in training, the network is forced to improvise, doing the best it
merger of knowledge inferred from the training data, together with
relatively simple ideas – like the bread- and beholder-cats
– can result in striking new types of images, images not
underestimate the depth of interface design, often regarding it as
handed off to others, while the hard work is to train some machine
developing the fundamental primitives human beings think and
As discussed earlier, in one common view of AI our computers will
continue to get better at solving problems, but human beings will
neural interfaces, or indirectly through whole brain emulation.
humanity, helping us invent new cognitive technologies, which
cognitive technologies will, in turn, speed up the development of
can help develop more powerful ways of thinking, but there’s at
Of course, over the long run it’s possible that machines will
case, cognitive transformation will still be a valuable end, worth
The interface-oriented work we’ve discussed is outside the
narrative used to judge most existing work in artificial
impressive feats like beating human champions at games such as
difficult-to-measure criterion: is it helping humans think and
long-term test of success will be the development of tools which
Diagrams and text are licensed under Creative Commons Attribution CC-BY 4.0 with the source available on GitHub, unless noted otherwise.
The figures that have been reused from other sources don’t fall under this license and can be recognized by a note in their caption: “Figure from …”.
- On Tuesday, June 25, 2019
What is Artificial Intelligence Exactly?
Google's Deep Mind Explained! - Self Learning A.I.
Artificial Intelligence in Recruiting
A whistlestop tour of the artificial intelligence products used in recruiting.
Things You Should Know About Artificial Intelligence (A.I)
Robots are becoming more advanced every day, and there may come a point when robots are granted rights similar to those of humans.
How Will Artificial Intelligence Affect Your Life | Jeff Dean | TEDxLA
In the last five years, significant advances were made in the fields of computer vision, speech recognition, and language understanding. In this talk, Jeff Dean ...
Top 10 Hottest Artificial Intelligence Technologies 2017
Natural Language Generation: Producing text from computer data. Currently used in customer service, report generation, and summarizing business intelligence ...
What is Artificial Intelligence (or Machine Learning)?
What is AI? What is machine learning and how does it work? You've probably heard the buzz. The age of artificial intelligence has arrived. But that doesn't mean ...
What is Artificial Intelligence?
Artificial intelligence, popularly known as AI, is a branch of computer science in which machines or software are used to simulate human ..
Create Artificial Intelligence - EPIC HOW TO
Artificial intelligence & the future of education systems | Bernhard Schindlholzer | TEDxFHKufstein
Dr. Bernhard Schindlholzer is a technology manager working on Machine Learning and E-commerce. In this talk he gave at TEDx FHKufstein, Bernhard ...