Hutter and Legg – A Formal Measure of Machine Intelligence
14 April 2006
work which we use to construct our formal measure of intelligence. This framework is formally defined in Section 4. In Section 5 we use our developed formalism to produce a formal definition of intelligence. Section 7 closes with a short summary.

A preliminary sketch of the ideas in this paper appeared in the poster [LH05]. It can be shown that the intelligence measure presented here is in fact a variant of the Intelligence Order Relation that appears in the theory of AIXI, the provably optimal universal agent [Hut04]. A long journal version of this paper is being written in which we give the proposed measure of machine intelligence and its relation to other such tests a much more comprehensive treatment.

Naturally, we expect such a bold initiative to be met with resistance. However, we hope that the reader will appreciate the value of our approach: With a formally precise definition put forward we aim to better our understanding of what is a notoriously subjective and slippery concept.

2 The concept of intelligence

Although definitions of human intelligence given by experts in the field vary, most of their views cluster around a few common perspectives. Perhaps the most common perspective, roughly stated, is to think of intelligence as being the ability to successfully operate in uncertain environments by learning and adapting based on experience. The following often quoted definitions, which can be found in [Ste00], [Wec58], [Bin37] and [Got97], all express this notion of intelligence but with different emphasis in each case:

• “The capacity to learn or to profit by experience.” – W. F. Dearborn

• “Ability to adapt oneself adequately to relatively new situations in life.” – R. Pinter

• “A person possesses intelligence insofar as he has learned, or can learn, to adjust himself to his environment.” – S. S. Colvin

• “We shall use the term ‘intelligence’ to mean the ability of an organism to solve new problems. . . .” – W. V. Bingham

• “A global concept that involves an individual’s ability to act purposefully, think rationally, and deal effectively with the environment.” – D. Wechsler

• “Intelligence is a very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly and learn from experience.” – L. S. Gottfredson and 52 expert signatories

These definitions have certain common features; in some cases they are explicitly stated, while in others they are more implicit. Perhaps the most elementary feature is that intelligence is seen as a property of an entity which is interacting with an external environment, problem or situation. Indeed this much is common to practically all proposed definitions of intelligence. As we will be referring back to these concepts regularly, we will refer to the entity whose intelligence is in question as the agent, and the external environment, problem or situation that it faces as the environment. An environment could be a large complex world in which the agent exists, similar to the usual meaning, or something as narrow as a game of tic-tac-toe.

The second common feature of these definitions is that an agent’s intelligence is related to its ability to succeed in an environment. This implies that the agent has some kind of an objective. Perhaps we could consider an agent intelligent, in an abstract sense, without having any objective. However, without any objective whatsoever, the agent’s intelligence would have no observable consequences. Intelligence then, at least the concrete kind that interests us, comes into effect when an agent has an objective to apply its intelligence to. Here we will refer to this as its goal.

The emphasis on learning, adaptation and experience in these definitions implies that the environment is not fully known to the agent and may contain surprises and new situations which could not have been anticipated in advance. Thus intelligence is not the ability to deal with one fixed and known environment, but rather the ability to deal with some range of possibilities which cannot be wholly anticipated. This means that an intelligent agent may not be the best possible in any specific environment, particularly before it has had sufficient time to learn. What is important is that the agent is able to learn and adapt so as to perform well over a wide range of specific environments.

Although there is a great deal more to this topic than we have presented here, the above brief analysis gives us the necessary building blocks for our informal working definition of intelligence:

    Intelligence measures an agent’s ability to achieve goals in a wide range of environments.

We realise that some researchers who study intelligence will take issue with this definition. Given the diversity of views on the nature of intelligence,
a debate which is still being fought, this is unavoidable. Nevertheless, we are confident that our proposed informal working definition is fairly mainstream. We also believe that our definition captures what we are interested in achieving in machines: A very general and flexible capacity to succeed when faced with a wide range of problems and situations. Even those who subscribe to different perspectives on the nature and correct definition of intelligence will surely agree that this is a central objective for anyone wishing to extend the power and usefulness of machines. It is also a definition that can be successfully formalised.

Figure 1: The agent and the environment interact by sending action, observation and reward signals to each other.
3 The agent-environment framework

In the previous section we identified three essential components for our model of intelligence: An agent, an environment, and a goal. Clearly, the agent and the environment must be able to interact with each other; specifically, the agent needs to be able to send signals to the environment and also receive signals being sent from the environment. Similarly the environment must be able to receive and send signals to the agent. In our terminology we will adopt the agent’s perspective on these communications and refer to the signals from the agent to the environment as actions, and the signals from the environment as perceptions.

What is missing from this setup is the goal. As discussed in the previous section, our definition of an agent’s intelligence requires there to be some kind of goal for the agent to try to achieve. This implies that the agent somehow knows what the goal is. One possibility would be for the goal to be known in advance and for this knowledge to be built into the agent. The problem with this however is that it limits each agent to just one goal. We need to allow agents which are more flexible than this.

If the goal is not known in advance, the other alternative is to somehow inform the agent of what the goal is. For humans this is easily done using language. In general however, the possession of a sufficiently high level of language is too strong an assumption to make about the agent. Indeed, even for something as intelligent as a dog or a cat, direct explanation will obviously not work.

Fortunately there is another possibility. We can define an additional communication channel with the simplest possible semantics: A signal that indicates how good the agent’s current situation is. We will call this signal the reward. The agent’s goal is then simply to maximise the amount of reward it receives, so in a sense its goal is fixed. This is not limiting though as we have not said anything about what causes different levels of reward to occur. In a complex setting the agent might be rewarded for winning a game or solving a difficult puzzle. From a broad perspective then, the goal is flexible. If the agent is to succeed in its environment, that is, receive a lot of reward, it must learn about the structure of the environment and in particular what it needs to do in order to get reward.

Not surprisingly, this is exactly the way in which we condition an animal to achieve a goal: by selectively rewarding certain behaviours. In a narrow sense the animal’s goal is fixed, perhaps to get more treats to eat, but in a broader sense this may require doing a trick or solving a puzzle.

In our framework we will include the reward signal as a part of the perception generated by the environment. The perceptions also contain a non-reward part, which we will refer to as observations. This now gives us the complete system of interacting agent and environment in Figure 1. The goal, in the broad flexible sense, is implicitly defined by the environment as this is what defines when rewards are generated. Thus in this framework, to test an agent in any given way, it is sufficient to fully define the environment.

In artificial intelligence, this framework is used in the area of reinforcement learning [SB98]. By appropriately renaming things, it also describes the controller-plant framework used in control theory. It is a widely used and very general structure that can describe seemingly any kind of learning or control problem. The interesting point for us is that this type of framework follows naturally from our informal definition of intelligence. The only difficulty was how to deal with the notion of success, or profit. This requires the existence of some kind of objective or goal, and the most flexible and elegant way to bring this into our framework is by using a simple reward signal.
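To make the interaction protocol concrete, the sketch below shows one way the turn-based loop could be coded. It is a minimal illustration under our own assumptions: the class names, the toy environment (which simply pays back the previous action as reward) and the random agent are stand-ins, not part of the formal framework.

```python
import random

class ToyEnvironment:
    """Hypothetical environment: the reward r_k simply repeats the previous action a_{k-1}."""
    def perceive(self, history):
        last_action = history[-1] if history else 0   # a_{k-1}, defaulting to 0 on cycle 1
        observation = ""                               # no informative observation here
        reward = float(last_action)
        return observation, reward

class RandomAgent:
    """Agent that ignores the history and acts uniformly at random over {0, 1}."""
    def act(self, history):
        return random.choice([0, 1])

def interact(agent, env, cycles):
    """One turn-based run: the environment sends (o, r), the agent replies with a."""
    history, total_reward = [], 0.0
    for _ in range(cycles):
        o, r = env.perceive(history)   # environment moves first each cycle
        history += [o, r]              # history now reads o1 r1 a1 o2 r2 a2 ...
        a = agent.act(history)
        history.append(a)
        total_reward += r
    return total_reward

print(interact(RandomAgent(), ToyEnvironment(), 100))
```

The essential point is only the alternation: in each cycle the environment sends an observation together with a reward, and the agent replies with an action.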
4 A formal framework for intelligence

Having made the basic framework explicit, we can now formalise things. See [Hut04] for a more complete technical description along with many more example agents and environments.

The agent sends information to the environment by sending symbols from some finite set, for example, A := {left, right, forwards, backwards}. We will call this set the action space and denote it by A. Similarly, the environment sends signals to the agent with symbols from a finite set called the perception space, which we will denote P. The reward space, denoted by R, will always be a finite subset of the rational unit interval [0, 1] ∩ Q. Every perception consists of two separate parts; an observation and a reward. For example, we might have P := {(cold, 0.0), (warm, 1.0), (hot, 0.3), (roasting, 0.0)}.

To denote symbols being sent we will use the lower case variable names a, o and r for actions, observations and rewards respectively. We will also index these in the order in which they occur, thus a_1 is the agent’s first action, a_2 is the second action and so on. The agent and the environment will take turns at sending symbols, starting with the environment. This produces a history of observations, rewards and actions which we will denote by o_1 r_1 a_1 o_2 r_2 a_2 o_3 r_3 a_3 o_4 . . . . Our restriction to finite action and perception spaces is deliberate, as an agent should not be able to receive or generate information without bound in a single cycle in time. Of course, the action and perception spaces can still be extremely large, if required.

Formally, the agent is a function, denoted by π, which takes the current history as input and chooses the next action as output. A convenient way of representing the agent is as a probability measure over actions conditioned on the current history. Thus π(a_3 | o_1 r_1 a_1 o_2 r_2) is the probability of action a_3 in the third cycle, given that the current history is o_1 r_1 a_1 o_2 r_2. A deterministic agent is simply one that always assigns a probability of 1 to some action for any given history. How the agent produces the distribution over actions for any given history is left completely open. Of course in artificial intelligence the agent will be a machine and so π will be a computable function.

The environment, denoted µ, is defined in a similar way. Specifically, for any k ∈ N the probability of o_k r_k, given the current history o_1 r_1 a_1 . . . o_{k−1} r_{k−1} a_{k−1}, is µ(o_k r_k | o_1 r_1 a_1 . . . o_{k−1} r_{k−1} a_{k−1}). For the moment we will not place any further restrictions on the environment.

Our next task is to formalise the idea of “profit” or “success” for an agent. Informally, we know that the agent must try to maximise the amount of reward it receives, however this could mean several different things.

Example. Define the reward space R := {0, 1}, an action space A := {0, 1} and an observation space that just contains the null string, O := {ε}. Now define a simple environment,

    µ(r_k | o_1 . . . a_{k−1}) := 1 − |r_k − a_{k−1}|.

As the agent always gets a reward equal to its previous action, the optimal agent for this environment is clearly π^opt(a_k | o_1 . . . r_k) := a_k. Consider now two other agents for this environment, π_1(a_k | o_1 . . . r_k) = 1/2 and

    π_2(a_k | o_1 . . . r_k) :=
        1    for a_k = 0 ∧ k ≤ 100,
        1    for a_k = 1 ∧ 100 < k ≤ 5000,
        1/2  for 5000 < k,
        0    otherwise.

For 1 ≤ k ≤ 100 the expected reward per cycle for π_1 is higher than it is for π_2. Thus in the short term π_1 is the most successful. On the other hand, for 100 < k ≤ 5000, π_2 has switched to the optimal strategy of always choosing action 1, while π_1 has not. Thus in the medium term π_2 is more successful. Finally, for k > 5000, both agents use random actions and thus in the limit they are equally successful.

Which is the better agent? If you want to maximise short term rewards, it is agent π_1. If you want to maximise medium term rewards, then it is agent π_2. And if you only care about the long run, both agents are equally successful. Which agent you prefer depends on your temporal preferences, something which is currently outside of our formulation.
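The behaviour described in this example is easy to check numerically. The sketch below is our own illustrative code, not part of the formal definition: it simulates the environment r_k = a_{k−1} and prints the average reward per cycle of π_1 and π_2 over short, medium and long horizons; as the horizon grows the two averages converge.

```python
import random

def pi1(k):
    """pi_1: choose action 0 or 1 uniformly at random, independent of the history."""
    return random.choice([0, 1])

def pi2(k):
    """pi_2: action 0 for the first 100 cycles, action 1 up to cycle 5000, then random."""
    if k <= 100:
        return 0
    if k <= 5000:
        return 1
    return random.choice([0, 1])

def average_reward(policy, cycles):
    """In this environment the reward in cycle k equals the action taken in cycle k-1."""
    prev_action, total = 0, 0.0
    for k in range(1, cycles + 1):
        total += prev_action          # r_k = a_{k-1}
        prev_action = policy(k)
    return total / cycles

for horizon in (100, 5000, 100000):
    print(horizon, average_reward(pi1, horizon), average_reward(pi2, horizon))
```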
The standard way of formalising this in reinforcement learning is to assume that the value of rewards decays geometrically into the future at a rate given by a discount parameter γ ∈ (0, 1), that is,

    V_µ^π(γ) := (1/Γ) E( ∑_{i=1}^∞ γ^i r_i )        (1)

where r_i is the reward in cycle i of a given history, the normalising constant is Γ := ∑_{i=1}^∞ γ^i, and the expected value is taken over all histories of π and µ interacting. By increasing γ towards 1 we weight long term rewards more heavily, conversely by reducing it we balance the weighting towards short term rewards.
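For a single sampled history the inner sum of Equation 1 is straightforward to compute; the expectation would then be estimated by averaging over many interaction runs. A minimal sketch, where the example reward sequence is our own and chosen to resemble π_2 above:

```python
def discounted_value(rewards, gamma):
    """(1/Gamma) * sum_i gamma^i * r_i for one sampled reward sequence r_1, r_2, ...
    Gamma = sum_i gamma^i = gamma / (1 - gamma) normalises the value into [0, 1]."""
    norm = gamma / (1.0 - gamma)
    return sum(gamma ** i * r for i, r in enumerate(rewards, start=1)) / norm

# A reward sequence resembling pi_2 above: 0 for 100 cycles, then 1 thereafter.
rewards = [0.0] * 100 + [1.0] * 4900
print(discounted_value(rewards, gamma=0.9))    # early, unrewarded cycles dominate: value is tiny
print(discounted_value(rewards, gamma=0.999))  # later rewards carry most of the weight
```

With γ = 0.9 the unrewarded first 100 cycles dominate and the value is close to zero; with γ = 0.999 the later rewards carry most of the weight and the value is close to one.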
Of course this has not actually answered the question of how to weight near term rewards versus longer term rewards. Rather it has simply expressed this weighting as a parameter. While that is adequate for some purposes, what we would like is a single test of intelligence for machines, not a range of tests that vary according to some free parameter. That is, we would like the temporal preferences to be included in the model, not external to it.

One possibility might be to use harmonic discounting, γ_t := 1/t². This has some nice properties, in particular the agent needs to look forward into the future in a way that is proportional to its current age [Hut04]. However an even more elegant solution is possible.

If we look at the value function in Equation 1, we see that geometric discounting plays two roles. Firstly, it normalises the total reward received, which makes the sum finite, in this case with a maximum value of 1. Secondly, it weights the reward at different points in the future, which in effect defines a temporal preference. We can solve both of these problems, without needing an external parameter, by simply requiring that the total reward returned by the environment cannot exceed 1. For a reward summable environment µ we can now define the value function to be simply,

    V_µ^π := E( ∑_{i=1}^∞ r_i ) ≤ 1.        (2)

One way of viewing this is that the rewards returned by the environment now have the temporal preference factored in and thus we do not need to add this. The cost is that this is an additional condition that we place on the environments. Previously we required that each reward signal was in a finite subset of [0, 1] ∩ Q, now we have the additional constraint that the sum is bounded.
It may seem that there is a philosophical problem here. If an environment µ is an artificial game, like chess, then it seems fairly natural for µ to meet any requirements in its definition, such as having a bounded reward sum. However if we think of the environment µ as being “the universe” in which the agent lives, then it seems unreasonable to expect that it should be required to respect such a bound. The flaw in this argument is that a “universe” does not have any notion of reward for particular agents. Strictly speaking, reward is an interpretation of the state of the environment. In humans this is built in, for example, the pain that is experienced when you touch something hot. In which case, maybe it should really be a part of the agent rather than the environment? If we gave the agent complete control over rewards then our framework would become meaningless: The perfect agent could simply give itself constant maximum reward. Indeed humans cannot easily do this either, at least not without taking drugs designed to interfere with their pleasure-pain mechanism.

Thus the most accurate framework would consist of an agent, an environment and a separate goal system that interpreted the state of the environment and rewarded the agent appropriately. In such a setup the bounded rewards restriction would be a part of the goal system and thus the above philosophical problem does not occur. However for our current purposes it seems sufficient just to fold this goal mechanism into the environment and add an easily implemented constraint to how the environment may generate rewards.
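As an aside, such a constraint is indeed easy to implement. One hypothetical way, a sketch of our own rather than a construction from the paper, is for the environment to scale the k-th raw reward by 2^−k, so that whatever it emits the running total can never exceed ∑_k 2^−k = 1:

```python
class SummableRewards:
    """Wraps raw per-cycle rewards in [0, 1] so that their total can never exceed 1.
    Here the k-th raw reward is simply scaled by 2^(-k), since sum_k 2^(-k) = 1."""
    def __init__(self):
        self.k = 0
    def emit(self, raw_reward):
        self.k += 1
        assert 0.0 <= raw_reward <= 1.0
        return raw_reward * 2.0 ** (-self.k)

wrapper = SummableRewards()
total = sum(wrapper.emit(1.0) for _ in range(50))   # even constant maximal raw reward
print(total)                                         # stays below 1
```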
5 A formal measure of intelligence

We have now formally defined the space of agents, how they interact with each other, and how we measure the performance of an agent in any specific environment. Before we can put all this together into a single performance measure, we firstly need to define what we mean by “a wide range of environments.”

As our goal is to produce a measure of intelligence that is as broad and encompassing as possible, the space of environments used in our definition should be as large as possible. Given that our environment is a probability measure with a certain structure, an obvious possibility would be to consider the space of all probability measures of this form. Unfortunately, this extremely broad class of environments causes problems. As the space of all probability measures is uncountably infinite, we cannot list the members of this set, nor can we always describe environments in a finite way.

The solution is to require the environmental measures to be computable. Not only is this necessary if we are to have an effective measure of intelligence, it is also not all that restrictive. There are an infinite number of environments in this set, with no upper bound on their complexity. Furthermore, it is only the measure which describes the environment that must be computable. For example, although a typical sequence of 1’s and 0’s generated by flipping a coin is not computable, the probability measure which describes this process is computable. Thus, even environments which behave randomly are included in our space of environments. This appears to be the largest reasonable space of environments. Indeed, no physical system has ever been shown to lie outside of this set. If such a physical system was found, it would overturn the Church-Turing thesis and alter our view of the universe.

How can we combine the agent’s performance over all these environments? As there are an infinite number of environments, we cannot simply take a uniform distribution over them. Mathematically, we must weight some environments more highly than others. If we consider the agent’s perspective on the problem, this question is the same as asking: Given several different hypotheses which are consistent with the data, which hypothesis should be considered the most likely? This is a frequently occurring problem in inductive inference where we must employ a philosophical principle to decide which hypothesis is the most likely. The most successful approach is to invoke the principle of Occam’s razor: Given multiple hypotheses which are consistent with the data, the simplest should be preferred. This is generally considered the rational and intelligent thing to do.

Consider for example the following type of question which commonly appears in intelligence tests. There is a sequence such as 2, 4, 6, 8, and the test subject needs to predict the next number. Of course the pattern is immediately clear: The numbers are increasing by 2 each time. An intelligent person would easily identify this pattern and predict the next number to be 10. However, the polynomial 2k⁴ − 20k³ + 70k² − 98k + 48 is also consistent with the data, in which case the next number in the sequence would be 58. Why then do we consider the first answer to be more likely? It is because we use, perhaps unconsciously, the principle of Occam’s razor. Furthermore, the fact that the test defines this as the correct answer shows that it too embodies the concept of Occam’s razor. Thus, although we don’t usually mention Occam’s razor when defining intelligence, the ability to effectively use Occam’s razor is clearly a part of intelligent behaviour.

Our formal measure of intelligence needs to reflect this. Specifically, we need to test the agents in such a way that they are, at least on average, rewarded for correctly applying Occam’s razor. Formally, this means that our a priori distribution over environments should be weighted towards simpler environments. The problem now becomes: How should we measure the complexity of environments? As each environment is computable, it can be represented by a program, or more formally, a binary string p ∈ B* on some prefix universal Turing machine U. Thus we can use Kolmogorov complexity to measure the complexity of an environment µ ∈ E,

    K(µ) := min_{p ∈ B*} { |p| : U(p) computes µ }.

This measure is independent of the choice of U up to an additive constant that is independent of µ, thus, we simply pick one universal Turing machine U and fix it. The correct way to turn this into a prior distribution is by taking 2^−K(µ). This is known as the algorithmic probability distribution and it has a number of important properties, particularly in the context of universally optimal learning agents. See [LV97] or [Hut04] for an overview of Kolmogorov complexity and universal prior distributions.

Putting this all together, we can now define our formal measure of intelligence for arbitrary systems. Let E be the space of all programs that compute environmental measures of summable reward with respect to a prefix universal Turing machine U, and let K be the Kolmogorov complexity function. The intelligence of an agent π is defined as,

    Υ(π) := ∑_{µ∈E} 2^−K(µ) V_µ^π = V_ξ^π,

where ξ := ∑_{µ∈E} 2^−K(µ) µ due to the linearity of V. ξ is the Solomonoff-Levin universal a priori distribution generalised to reactive environments.
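Υ cannot be computed exactly (K is incomputable and E is infinite), but its structure is easy to illustrate. The sketch below is purely illustrative and entirely our own: it replaces E with two hand-written toy environments, replaces K(µ) with a made-up description length for each, and estimates V_µ^π by Monte Carlo simulation.

```python
import random

def estimate_value(agent, env_step, cycles=30, runs=200):
    """Monte Carlo estimate of the expected total reward (Equation 2 style, no discounting)."""
    total = 0.0
    for _ in range(runs):
        prev_action, ret = 0, 0.0
        for k in range(1, cycles + 1):
            ret += env_step(k, prev_action)     # reward for cycle k
            prev_action = agent(k)
        total += ret
    return total / runs

# Toy "environment programs": (name, made-up description length in bits, reward rule).
# Rewards are scaled by 2^-k so each environment is reward-summable (total <= 1).
environments = [
    ("reward previous action", 3, lambda k, a: a * 2.0 ** (-k)),
    ("reward action 0",        5, lambda k, a: (1 - a) * 2.0 ** (-k)),
]

def upsilon(agent):
    """Crude finite stand-in for Upsilon: sum of 2^(-length) * estimated value."""
    return sum(2.0 ** (-length) * estimate_value(agent, step)
               for _, length, step in environments)

def random_agent(k): return random.choice([0, 1])
def always_one(k): return 1

print(upsilon(random_agent), upsilon(always_one))
```

A real test along these lines would need a principled enumeration of environments and a tractable complexity measure, which is one of the issues discussed below.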
6 Properties of the intelligence measure

To better understand the performance of this measure consider some example agents.

A random agent. The agent with the lowest intelligence, at least among those that are not actively trying to perform badly, would be one that makes uniformly random actions. We will call this π^rand. In general such an agent will not be very successful as it will fail to exploit any regularities in the environment, no matter how simple they are. It follows then that the values of V_µ^{π^rand} will typically be low compared to other agents, and thus Υ(π^rand) will be low.

A very specialised agent. From the equation for Υ, we see that an agent could have very low intelligence but still perform extremely well at a few very specific and complex tasks. Consider, for example, IBM’s Deep Blue chess supercomputer, which we will represent by π^dblue. When µ^chess describes the game of chess, V_{µ^chess}^{π^dblue} is very high. However 2^−K(µ^chess) is small, and for µ ≠ µ^chess the value function will be low relative to other agents as π^dblue only plays chess. Therefore, the value of Υ(π^dblue) will be very low. Intuitively, this is because Deep Blue is too inflexible and narrow to have general intelligence.

A general but simple agent. Imagine an agent that does very basic learning by building up a table of observation and action pairs and keeping statistics on the rewards that follow. Each time an observation that has been seen before occurs, the agent takes the action with highest estimated expected reward in the next cycle with 90% probability, or a random action with 10% probability. We will call this agent π^basic. It is immediately clear that many environments, both complex and very simple, will have at least some structure that such an agent would take advantage of. Thus for almost all µ we will have V_µ^{π^basic} > V_µ^{π^rand} and so Υ(π^basic) > Υ(π^rand). Intuitively, this is what we would expect as π^basic, while very simplistic, is surely more intelligent than π^rand.
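A sketch of how such a π^basic could be implemented is given below. It is our own illustration of the idea: the table keys on the most recent observation only, the 90/10 split follows the description above, and the way rewards are credited to the previous (observation, action) pair is an implementation choice.

```python
import random
from collections import defaultdict

class BasicAgent:
    """pi_basic: tabular agent keyed on the last observation, random action with probability 0.1."""
    def __init__(self, actions=(0, 1)):
        self.actions = actions
        self.sum_reward = defaultdict(float)   # (observation, action) -> summed next-cycle reward
        self.count = defaultdict(int)          # (observation, action) -> number of samples
        self.last_key = None                   # (observation, action) awaiting its reward

    def act(self, observation, reward):
        # Credit the reward just received to the previous (observation, action) pair.
        if self.last_key is not None:
            self.sum_reward[self.last_key] += reward
            self.count[self.last_key] += 1
        # Explore with 10% probability, or when the observation has never been seen before;
        # otherwise take the action with the highest estimated expected reward.
        unseen = all(self.count[(observation, a)] == 0 for a in self.actions)
        if unseen or random.random() < 0.1:
            action = random.choice(self.actions)
        else:
            action = max(self.actions,
                         key=lambda a: self.sum_reward[(observation, a)] /
                                       max(self.count[(observation, a)], 1))
        self.last_key = (observation, action)
        return action
```

Hooked into an interaction loop like the one sketched in Section 3, act would be called once per cycle with the latest observation and reward.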
A simple agent with more history. A natural extension of π^basic is to use a longer history of actions, observations and rewards in its internal table. Let π^2back be the agent that builds a table of statistics for the expected reward conditioned on the last two actions, rewards and observations. It is immediately clear that π^2back is a generalisation of π^basic by definition and thus will adapt to any regularity that π^basic can adapt to. It follows then that in general V_µ^{π^2back} > V_µ^{π^basic} and so Υ(π^2back) > Υ(π^basic), as we would intuitively expect.

In a similar way agents of increasing complexity and adaptability can be defined which will have still greater intelligence. However with more complex agents it is usually difficult to theoretically establish whether one agent has more or less intelligence than another. Nevertheless, it is hopefully clear from these simple examples that the more flexible and powerful an agent is, the higher its machine intelligence.

A human. For extremely simple environments, a human should be able to identify their simple structure and exploit this to maximise reward. For more complex environments however it is hard to know how well a human would perform without experimental results.

Super-human intelligence. It can be easily proven that the theoretical AIXI agent [Hut04] is the maximally intelligent agent with respect to Υ. AIXI has been proven to have many universal optimality properties, including being Pareto optimal and self-optimising in any environment in which this is possible for a general agent. Thus it is clear that agents with very high Υ must be extremely powerful.

In addition to sensibly ordering many simple learning agents, this formal definition has many significant and desirable properties:

Valid. The most important property of a measure of intelligence is that it does indeed measure “intelligence”. As Υ formalises a mainstream informal definition, we believe that it is a valid measure.

Meaningful. An agent with a high Υ value must perform well over a very wide range of environments, in particular it must perform well in almost all simple environments. If such an agent existed, it would clearly be very powerful and practically useful. It also sensibly orders the intelligence of simple learning agents.

Repeatable. We can test an agent using Υ repeatedly without problem. This is because it is defined across all well defined environments, not just a specific test subset which an agent might adapt to.

Absolute. Υ gives us a single real absolute value, unlike the pass-fail Turing test [Tur50]. This is important if we want to make distinctions between similar learning algorithms that are not close to human level intelligence.

Wide range. As we have seen, Υ can measure performance from extremely simple agents right up to the super powerful AIXI agent. Other tests cannot handle such an enormous range.

General. The test is clearly non-specific to the implementation of the agent as the inner workings of the agent are left completely undefined. It is also very general in terms of what senses or actuators the agent might have as all information exchanged between the agent and the environment takes place over basic Shannon-like communication channels.

Dynamic. One aspect of our test of intelligence is that it is, in the terminology of intelligence testing, a highly dynamic test [SG02]. Normally intelligence tests for humans only test the ability to solve one-off problems. There are no dynamic aspects to the test where the test subject has to interact with something and learn and adapt their behaviour accordingly. This makes it very hard to test things like the individual’s ability to quickly pick up new skills and adapt to new situations. One way to overcome these problems is to use more sophisticated dynamic tests. In these tests there is an active tester who constantly interacts with the test subject, much like what happens in our formal intelligence measure.

Unbiased. The test is not weighted towards ability in certain specific kinds of areas or problems, rather it is simply weighted towards simpler environments no matter what they are.

Fundamental. The test is based on the theory of information, Turing computation and complexity theory. These are all fundamental ideas which are likely to remain very stable over time irrespective of changes in technology.

Formal. Unlike many tests of intelligence, Υ is completely formally, mathematically, specified.

Objective. Unlike the Turing test, which requires a panel of judges to decide if an agent is intelligent or not, Υ is free of such subjectivity.

Our definition of intelligence also has some weaknesses. One is the fact that the environmental distribution 2^−K(µ) that we have used is invariant, up to a multiplicative constant, to changes in the reference machine U. While this affords us some protection, it still means that the relative intelligence of agents can change if we change our reference machine. One approach to this problem might be to limit the complexity of the reference machine, for example by limiting its state-symbol complexity. We expect that for highly intelligent machines that can deal with a wide range of environments of varying complexity, the effect of changing from one simple reference machine to another will be minor. For agents which are less complex than the reference machine however, such a change could be significant.

A theoretical problem is that our distribution over environments is not computable. While this is fine for a theoretical definition of intelligence, it makes the measure impossible to directly implement. The solution is to use a more tractable measure of complexity such as Levin’s Kt complexity [Lev73], or Schmidhuber’s Speed Prior [Sch02]. Both of these consider the complexity of an algorithm to be determined by both its description length and running time. Intuitively this also makes good sense, because we would not usually consider a very short algorithm that takes an enormous amount of time to compute to be a particularly simple one.

The only closely related work to ours is the C-Test [HO00]. While our intelligence measure is fully dynamic and interactive, the C-Test is a purely static sequence prediction test similar to standard IQ tests for humans. The C-Test always ensures that each question has an unambiguous answer in the sense that there is always one consistent hypothesis with significantly lower complexity than the alternatives. Perhaps this is useful for some kinds of tests, but we believe that it is unrealistic and limiting. Like our intelligence test, the C-Test also has to deal with the problem of the incomputability of Kolmogorov complexity. By using Levin’s Kt complexity, the C-Test was able to compute a number of test problems which were used to test humans. The “compression test” [Mah99] for machine intelligence is similarly restricted to sequence prediction. We consider the linguistic complexity tests of Treister-Goren et al. to be far too narrow. The psychometric approach of Bringsjord and Schimanski is only appropriate if the machine has a sufficiently human-like intelligence.

7 Conclusions

Given the obvious significance of formal definitions of intelligence for research, and calls for more direct measures of machine intelligence to replace the problematic Turing test and other imitation based tests [Joh92], very little work has been done in this area. In this paper we have attempted to tackle this problem head on. Although the test has a few weaknesses, it also has many unique strengths. In particular, we believe that it expresses the essentials of machine intelligence in an elegant and powerful way. Furthermore, more tractable measures of complexity should lead to practical tests based on this theoretical model.

Acknowledgments

This work was supported by SNF grant 200020-107616.
References

[Bin37] W. V. Bingham. Aptitudes and aptitude testing. Harper & Brothers, New York, 1937.

[Got97] L. S. Gottfredson. Mainstream science on intelligence: An editorial with 52 signatories, history, and bibliography. Intelligence, 24(1):13–23, 1997.

[HO00] J. Hernández-Orallo. Beyond the Turing test. Journal of Logic, Language and Information, 9(4):447–466, 2000.

[Hut04] M. Hutter. Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Springer, Berlin, 2004. 300 pages, http://www.idsia.ch/~marcus/ai/uaibook.htm.

[Joh92] W. L. Johnson. Needed: A new test of intelligence. SIGART Newsletter (ACM Special Interest Group on Artificial Intelligence), 3, 1992.

[Lev73] L. A. Levin. Universal sequential search problems. Problems of Information Transmission, 9:265–266, 1973.

[LH05] S. Legg and M. Hutter. A universal measure of intelligence for artificial agents. In Proc. 19th International Joint Conf. on Artificial Intelligence (IJCAI-2005), number IDSIA-04-05, pages 1509–1510, Edinburgh, 2005.

[LV97] M. Li and P. M. B. Vitányi. An introduction to Kolmogorov complexity and its applications. Springer, 2nd edition, 1997.

[Mah99] M. V. Mahoney. Text compression as a test for artificial intelligence. In AAAI/IAAI, 1999.

[SB98] R. Sutton and A. Barto. Reinforcement learning: An introduction. MIT Press, Cambridge, MA, 1998.

[Sch02] J. Schmidhuber. The Speed Prior: a new simplicity measure yielding near-optimal computable predictions. In Proc. 15th Annual Conference on Computational Learning Theory (COLT 2002), Lecture Notes in Artificial Intelligence, pages 216–228, Sydney, Australia, July 2002. Springer.

[SG02] R. J. Sternberg and E. L. Grigorenko, editors. Dynamic Testing: The nature and measurement of learning potential. Cambridge University Press, 2002.

[Ste00] R. J. Sternberg, editor. Handbook of Intelligence. Cambridge University Press, 2000.

[Tur50] A. M. Turing. Computing machinery and intelligence. Mind, October 1950.

[Wec58] D. Wechsler. The measurement and appraisal of adult intelligence. Williams & Wilkins, Baltimore, 4th edition, 1958.