Artificial Stupidity and the Mechanistic Fallacy

Zoubin Ghahramani has pointed out the odd nature of the term “Artificial Intelligence”. As an example he says it would be non-sensical to talk about artificial flying. Things either fly or they don’t.

There is a danger to worrying too much about names, after a while they are just labels, they loose their constructive meaning (astrology anyone?). So it’s perhaps not a very worthy exercise to expand on this idea. However, a question at the ICML Deep Learning panel was about “Artificial Curiosity”. And whether it was a worthy subject for study. Opinions varied, my answer was that curiosity should be a consequence of doing the right thing, in pursuit of intelligence, rather than an aim in itself. In some sense pursuing “artificial curiosity” would be using “artificial” correctly, the curiosity would not arise as a consequence of creating an effective “intelligent” agent, but it would arise through a teleology: it comes into being artificially through design rather than as a consequence of the true goal.

Let’s pause and make a quick definition of intelligence. I like the following definition. Intelligence is the process of using computation to reduce energy consumption. Here by energy I mean “potential energy”, or low entropy energy. I think that’s a simple idea, but it becomes very complex quite quickly because in any environment that evolves intelligent agents the most complex thing within that environment becomes those agents. Saving energy ultimately involves a complex system of second guessing (and 3rd guessing etc) because to save energy optimally you need to predict how the environment will evolve. Agents (including youreself0 are inseparable from that environment. In the end you evolve strategies either to collaborate or compete, but any intelligent strategy would require you to model a highly complex environment to make any prediction (yes, now it starts to make sense the amount of brain-cycles people seem to spend on social and or political interaction, that then may be why many of us retreat to the simpler world of mathematics).

What saves us in the end? If the environment becomes more and more complex, then surely each agent needs to do more modelling to capture the complexity and improve its capacity to predict. In the Hitchiker’s Guide to the Galaxy Douglas Adams created a computer: “Deep Thought” … perhaps the first deep learning algorithm … ironically Adams created the name as a jokey reference to the film “Deep Throat”. The computer was the size of a city but was only able to give the answer, not the question itself. Instead Deep Thought offered to design the Earth to finally resolve what the question was. In my own mind when I originally read the book as a teenager, this seemed totally logical, you need a computer the size of life itself in order to understand life.

Of course, I was making what has more latterly become known as the “HBP fallacy”. The idea that if you simulate every aspect of something, then you understand it. This is somehow the opposite of the teleological fallacy associated with “artificial curiosity”. Its somehow a mechanistic fallacy.

To summarize, curiosity should be emergent in a complex system: directly simulating it is a teleological fallacy, but on the other hand trying to construct the entire intelligence by simulation would be a mechanistic fallacy. How would you even identify which part of it represented ‘curiosity’.

It seems people seem to struggle with the the challenges of the mechanistic fallacy, because they continue to re-visit it. How could such a fallacy exist? If I know each of the low level causes of a phenomena do I not understand the entire system? Well yes and no, you understand it at a particular (low) level of granularity, but you may well be interested in is more global properties of the system. For example, we don’t normally worry about the location and velocity of every molecule in a room, but we often care about their summary statistics: the temperature and pressure. Worrying about the molecules themselves is worrying at the wrong level of granularity when trying to decide whether to put a sweater on or not. If you started to simmulate the room in its detail to answer the question, then the resulting power requirements for your computation would probably raise the temperature of the room by a few degrees, rendering the need for a sweater unnecessary. Viewed as a whole it would not been intelligent because it would not have saved energy.

So how do we resolve this dichotomy? We do this by modelling. A model is an abstraction of a system. Correctly applied models answer questions at the appropriate level of abstraction to match the granularity required. The challenge of modern modelling is choosing how to model at the correct scale.

I believe that aspects like curiosity are emerging somewhere in the middle of these scales. To analogize to software systems they are properties of the ‘middleware’ of intelligence. Perhaps the same is true of aspects such as our sense of self and our belief in free will, maybe even conciousness itself (steady!). None of these concepts are precisely defined, but they characterise aspects that may be necessary in an evolved intelligent entity. Sense of self is needed for projecting our actions and their consequences into future imagined scenarios, similarly free will is a precondition for any speculative planning. Maybe all these properties are emergent when we seek sensible (evolved) strategies to collaborating and competing in complex environments.

Similar modelling challenges also apply when we try and perform science on complex systems. Modelling of brains (and other biological systems) presents questions at multiple scales. They are particularly challenging because there is a number where something important may happen. There are many scales between the molecular and the macroscopic levels (nano, micro, milli). At each of these scales we want to predict, we need models that are appropriate and interchangeable.

Imagine you are asked to find out how the internet works, but you only have access to YouTube and/or the ability to measure signals on the wires leaving the house. Alternatively let’s pretend you have to understand how a computer works, but you only get to measure voltages on the motherboard or click on the mouse and observe the screen.

Doing science in these areas is a major challenge, not least because there are many different potential viable paths between a computer screen and its processor. This is broadly speaking the challenge we face when trying to reverse engineer intelligence or biological systems.

This all sounds a little hopeless, but fortunately mathematics comes to our aid: there are effects, such as the law of large numbers and the central limit theorem that mean that in complex systems there will be phenomena that emerge at different scales. In our abstractions, summarizing these phenomena simply (in a similar way to temperature and pressure) is sufficient to develop models at a particular level of granularity. Developing such models at different scales is how our true understanding of the system evolves. We need to create models that abstract different parts of the system to answer different questions. These abstractions will occur at different scales.

Attempts to simulate curiosity which do not account for the need for, and utility of, curiosity within the intelligent systems could indeed be thought of as ‘artificial curiosity’. True curiosity would be an emergent phenomenon that arises from the challenges of operating in a highly complex environment with limited computational capability and a requirement for quick decision making.

This idea of a “mechanistic fallacy” is not new. In a cosmic joke worthy of Douglas Adams himself Pierre Simon Laplace, one of the greatest scientists to grace our planet, has become know for his “determinism”.

We ought then to regard the present state of the universe as the effect of its anterior state and the cause of the one which is to follow. Given for one instant an intelligence which could comprehend all the forces by which nature is nanimated and the respective situation of the beings who compose it — an intelligence sufficiently vast to submit these data to analysis — it would embrace in teh same formula the movements of the greatest bodies of the universe and those of the lightest atom; for it, nothing would be uncertain and the future, as the past, would be present to its eyes. Pierre Simon Laplace, A Philosophical Essay on Probabilities pg 4

Indeed, this quote (or a version of it) forms the mainstay of philosophical discussion on mechanism (scroll down to Universal Mechanism). But the statement is from a book on probability, those that bother to read two pages further on will note that he is actually describing the mechanistic fallacy. Pause and give it a moment’s thought: why would he motivate determinism in a book about probability? He is describing what he felt was a straw man, but one which we still seem to instinctively seek out even in our modelling today. Read just two pages further on:

The curve described by a simple molecule of air or vapor is regulated in a manner just as certain as the planetary orbits; the only difference between them is that which comes from our ignorance.

Probability is relative, in part to this ignorance, in part to our knowledge. Pierre Simon Laplace, A Philosophical Essay on Probabilities pg 6

He is describing the mechanistic fallacy and its solution.

Stupidity could be seen as the opposite of intelligence, a failure to make the right intelligent decision given the evidence. This might be due to limitations in modelling or computational capability. A mere failure of our agent to comprehend. So what could artificial stupidity be?

Imagine if you had access to the tools of probability that Laplace was so keen to persuade his contemporaries about (his Philosophical Essay on Probabilities continues for another 190 pages!). Imagine if you had that, and an additional 200 years of experience after Laplace published. Imagine if there were a myriad of applications that could only be solved by propagation of uncertainty. Imagine if an entire field had evolved to try and counteract our inability to naturally and conciously deal with uncertainty. Then imagine if you designed an intelligent system that made no attempt to propagate uncertainty through its own analysis. Such a system might be said to be artificially stupid. And I’m pretty sure that Zoubin wouldn’t quibble with that description.