On 15th September 1830, the Liverpool and Manchester Railway opened. When it was proposed that people should travel by rail, fears were voiced that the high speeds would cause passengers’ heads to explode as soon as those speeds were reached. Heady speeds of 50 km/h were unheard of at the time.

When it came to it, the first death on our railways actually occurred when the Member of Parliament for Liverpool disembarked from a stationary train and, while attempting to greet the Duke of Wellington, was run over by George Stephenson’s Rocket.

No exploding heads, but a severed leg and a death in a local hospital.

Nowadays, we do not disembark when the train has a problem.

But then again, who could actually predict what would really happen when we’d never been there before? Indeed, as the MP for Liverpool found out, being rapidly accelerated (or decelerated) by impact with an object of high relative momentum is more than enough to destabilise your physical shell to such an extent that your internal spirit is extinguished.

At the moment, the debate over what our AI future holds sometimes has the feeling of exploding heads about it.

Fears about AI as an existential threat were triggered by Elon Musk, an investor in DeepMind, who warned that with artificial intelligence humanity was ‘summoning the demon’.

The previous post in this series parodied activities in the field as a Homeric Greek tragi-comedy; in this post I’ll try to give some perspective on the evolving futures debate.

As well as Homer, the Greeks were famous for politics and philosophy. Our idea of philosophy comes from Plato’s tales of Socrates, the man who talked all ideas through. He didn’t believe in the written word; he believed in discussion and persuasion, often by the ‘ironic’ argument.

The first challenge to understanding someone’s point of view is actually listening to what they are saying. The next challenge is in unpicking what their words actually mean.

When you are trying to unpick such challenges across academic boundaries, it is particularly difficult: firstly because, as academics, we are often more fond of talking than listening, but secondly because in each of our subfields there is a Babel-like propagation of terminology, where each word means a different thing to each of us. Ethics, models, morals, noise, generalisation, mechanism, even probability. There are many barriers to communication, and Google Translate will not do the job for us. Overcoming them requires patience and understanding.

A large part of me does believe that we are seeing something of great drama emerging in machine learning and AI. And I think many of the ideas that are being debated are extremely important, both for researchers and the public.

Reactions of the wider machine learning community, including my own, to our new-found notoriety have varied between incredulity and ridicule, but we need to take very seriously the way we are perceived and the impact we have on the society around us.

In January, the NYU Center for Data Science convened a meeting to bring the experts together: not just those who were worrying, but developers of AI systems, economists, futurologists, philosophers, psychologists and roboticists. I was party to a few of the conversations that led to the formation of the meeting.

The last week has been mainly focussed on the societal implications of machine learning. First, Yann Le Cun convened that meeting at NYU on the “Future of AI”, with a particular remit to be cross-disciplinary, covering the potential and the pitfalls of our AI solutions. The first day focussed on the potential of AI and what a core part it forms of the futures of companies such as Google, Facebook, Microsoft, Mobileye and Nvidia. The speaker list was impressive, each company typically represented by a CEO or a CTO, but for me things got really interesting on the second and third days, when the sessions were more cross-disciplinary and focussed, including economists, philosophers, psychologists, cognitive scientists and futurologists. The aim was to stimulate a wide-ranging debate about our AI future and its effect on society.

The discussions took place under the Chatham House Rule, meaning that we can summarise the essence of what was said, but not attribute it, unless the speaker gave explicit permission. The rule is necessary because an open debate on the pitfalls requires people to be secure that their words won’t be sensationalised, particularly in the instances where they may have been playing devil’s advocate.

Speakers included Erik Brynjolfsson, Nick Bostrom and Max Tegmark, as well as the CEOs and CTOs of the companies developing AI. (The last few days have been a little hectic: a visit to the Max Planck Society, followed by the four days in New York at the symposium.)

These events are vital for opening up the debate and achieving a shared set of objectives. However, some critical things were missing from the NYU debate; most notably, there was no real focus on data.

There is a dangerous tendency to separate our discussion of AI from our discussion of data. Machine learning is not just the principal technology underpinning the recent success stories in AI; it is, along with statistics, the principal technology driving our agenda in data science.

The last session in New York was on AI Safety. This seems an emotive term, but we had Nick Bostrom attending, and his entire book “Superintelligence” is keyed to such fears; I’ll return to both below.

On return to the UK, I went straight to the Royal Society in London to participate in evidence gathering for the Royal Society’s working group on Machine Learning. Our remit is focussed on the next five to ten years, and data is featuring very prominently in our discussions. The sessions themselves could not have been more different from New York: small-group evidence gathering, with particular questions targeted at invited experts who had responded to an earlier call for written evidence.

I can’t say too much about the details of either session, firstly because of the Chatham House Rule and secondly because there will be a report arising from the Royal Society working group that I should not prejudge. However, it did feel rather extraordinary to go, in a single 24-hour period, from chatting to Daniel Kahneman to interviewing Baroness O’Neill.

There are many unknowns for AI, in particular because the technology is still so far away from us. I remain very unpersuaded by most of what Nick Bostrom has to say in his book “Superintelligence”. To my mind, it falls mainly in the regime of ‘exploding heads’. There is a pseudo-plausibility about some of the arguments, but when subjected to deeper scrutiny they collapse like a house of cards. This may be unfair, because I appreciate that Nick is looking further into the future than I would normally be comfortable with, but he does so by attempting to construct narrative threads that build on causal chains of events which seem to me particularly fragile. There are also some significant omissions.

In particular, when we look to the past, we see that people were normally overly optimistic about how rapidly new advances would be assimilated. Xerox PARC focussed on the idea that the office of the future would be paperless: a sensible projection, but before it came about (indeed, it’s not quite here yet) there was an enormous proliferation in the use of paper, so demand increased. In a similar way, researchers in computational biology have suggested that computational techniques would obviate the need for biological experiment, whereas the reality is that predictions have required validation. As the complexity of predictions increases, we become more and more reliant on advances on the experimental side to verify them.

Something very similar is likely to happen with artificial intelligence technologies. As we develop them further, we will likely require more sophistication on the human side. For example, we won’t be able to replace doctors, but we will need doctors who have a more sophisticated understanding of the interpretation of high-resolution genetic testing, and an ability to assimilate that understanding with their other knowledge.

One term that is regularly discussed is “AI Safety”. It seems to be quite an emotive term, with fears about embodied intelligences with their own independent ‘final goals’, and the consequences of extrapolating those goals, dominating Bostrom’s thinking.

Bostrom offers us a form of technopop philosophy, which builds on popular ideas (such as those explored by Asimov or Clarke) and combines them with a superficial technical basis. But that technical basis is often misdeployed, or sometimes not deployed at all, according to the convenience of the argument.

Let me give you an example: Bostrom hardly makes use of uncertainty in describing intelligence. In my own approaches, uncertainty, and its correct handling, is critical. By ignoring it, Bostrom can give the impression that a superintelligence would act with unnerving confidence. The only point where I recollect uncertainty really being mentioned is when Bostrom refers to how he thinks a rational Bayesian agent would respond to being given a goal. Bostrom suggests that, due to uncertainty, it would believe it had never achieved its goal and would continue to consume the world’s resources in an effort to do so.

This idea of an idiot savant is a convenient combination of aspects of human and machine to bring about a terrifying consequence. It provides an interesting narrative and, in the manner of an Ian Fleming novel, it’s littered with technical detail to increase its plausibility for the reader. However, in the same way that so many of Blofeld’s schemes are quite fragile when exposed to deeper analysis, many of Bostrom’s ideas are as well.1

I don’t think Bostrom ever gives a satisfactory definition of intelligence. Superintelligence is defined as outperforming humans in every intelligent capability, a property we would refer to as dominance in multi-objective optimisation. So a superintelligence is a dominating solution over human intelligence. However, if human intelligence isn’t defined, then that definition is a little diffuse.
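
To make the term concrete, here is a minimal Python sketch of what dominance means in the multi-objective sense. The capability axes and scores are entirely hypothetical, chosen only for illustration:

```python
def dominates(a, b):
    """True if `a` Pareto-dominates `b`: at least as good on every
    objective and strictly better on at least one (higher is better)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better

# Hypothetical capability profiles: (reasoning, perception, planning).
human = (0.8, 0.9, 0.7)
broad_ai = (0.9, 0.95, 0.8)   # better everywhere: a dominating solution
narrow_ai = (0.99, 0.2, 0.1)  # better on one axis only: not dominating

print(dominates(broad_ai, human))   # True
print(dominates(narrow_ai, human))  # False
```

The difficulty shows up even in the sketch: until you fix the list of objectives that constitutes human intelligence, the dominance check is not well posed.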

I like the following definition of intelligence: “Use of information to take decisions which save energy”. Here, by information I might mean data or facts or rules; and by saving energy I mean saving ‘free’ energy.2

Accepting the lack of a definition of intelligence, we can still consider the routes to superintelligence that Bostrom proposes. He is talking about 30-year timescales, which are difficult to predict over. I think it is admirable that Nick is trying to address this, but I’m also keen to ensure that ideas which seem very implausible don’t become memes in this debate.

Indeed, I think it’s worse than that, because many of the ideas, if not all, are implausible, and implausible at different levels. Let me start with a detailed criticism, which highlights the way in which some of the threads of his narratives fall apart, and then turn to a more general criticism which I think does a lot to deflate the whole volume.

I’ve chosen one detailed criticism as an exemplar, although I think there could be many more. Bostrom dismisses hybrid human-computer systems while taking very seriously the idea of full emulation of the brain by computer (and achieving superintelligence by speeding up the emulation or running multiple copies).

If we had the level of understanding we would need to fully emulate the brain, then we would be a long way towards being able to interface directly with the brain. It appears to me more likely, particularly given the existing applications in patients with spinal or motor neurone problems, that we will have developed hybrid systems that interface directly with the brain long before we have managed a full emulation of it.

This type of naive idea comes from a lack of understanding of what an emulation would involve. It would not involve an exact simulation of each neuron in the brain down to the quantum level (and if it did, it would be far more computationally demanding than is suggested). It would instead involve some level of abstraction: abstraction as to what is important in generating our intelligence, replacing mechanism with the summary measures the brain finds useful. An understanding of this sort of abstraction is totally missing from the entire text, but it is vital in modelling and, I believe, in intelligence. Such abstractions require a deep understanding of how the brain is working, and such understandings are exactly what Bostrom says are impossible to determine for hybrid systems. So the argument starts by chasing its tail and ends up biting its own arse.

Hybrid systems are important because they would change the nature of the way society evolves. If we had such hybrid systems there would certainly be many social issues, but the threats that Bostrom talks about would be largely diminished. For one thing, it’s not human versus computer, but augmented human versus computer (my understanding is that a skilled human with a computer can still beat the best chess computers at chess).

The arguments are also woolly because the lack of a definition of intelligence means that at any given point he will anthropomorphise the intelligence, or embody it, to make a particular scenario appear more menacing, and yet later exploit the interconnectivity of the intelligence and the power that comes with it. These interchanges may not be purposeful, but they reflect a lack of clarity in the thinking in the book.

As I noted above, the only real consideration of uncertainty comes when Bostrom proposes that the intelligence may not even stop when its goal is complete (better make more paperclips, just in case I haven’t really succeeded).
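
To see the shape of that argument, here is a toy sketch of my own construction (not Bostrom’s formulation): an agent that only ever receives noisy evidence of its goal’s completion can never push its posterior belief all the way to one.

```python
def belief_goal_complete(prior, n_obs, p_hit=0.99, p_false_alarm=0.05):
    """Posterior probability that the goal is complete after n_obs noisy
    'looks complete' observations, by repeated Bayesian updating."""
    p = prior
    for _ in range(n_obs):
        evidence = p_hit * p
        p = evidence / (evidence + p_false_alarm * (1 - p))
    return p

for n in (1, 5, 10, 50):
    print(n, belief_goal_complete(0.5, n))
# The belief approaches 1 but never reaches it. On Bostrom's reading, the
# agent keeps acting (making more paperclips) to chase the residual doubt.
```

The fragile step, to my mind, is the jump from ‘the posterior never reaches one’ to ‘the agent pays unbounded cost for a vanishing sliver of probability’; any sensible decision rule trades expected gain against cost.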

This simplistic thinking may come from a lack of experience in deploying systems in practice. While most of our machine learning systems have objective functions, these do not map nicely onto the idea of a ‘final goal’ and are only really effective for simplistic tasks, such as classification. Perhaps reinforcement learning is closer, with its mechanism of reward for doing well, but it also does not map cleanly onto a ‘final goal’.

My own belief is that if we are goal-driven in our intelligence, then it is by sophisticated goals (akin to multi-objective optimisation), and each of us weights those goals according to sets of values that may themselves evolve. We are sophisticated in this way because our environment itself is evolving, and our ways of behaving need to evolve with it.
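
As a minimal sketch of that idea, consider a weighted trade-off between two objectives where the weighting itself drifts over time. The objectives and the drift are invented purely for illustration:

```python
# Two hypothetical objectives a decision-maker trades off (higher is better).
def comfort(x):
    return -(x - 0.2) ** 2

def safety(x):
    return -(x - 0.8) ** 2

def best_action(w_comfort, w_safety, actions):
    """Choose the action maximising the weighted sum of the objectives."""
    return max(actions, key=lambda x: w_comfort * comfort(x) + w_safety * safety(x))

actions = [i / 100 for i in range(101)]
for t in range(5):
    w = 0.2 * t  # values drift: safety gradually matters more
    print(t, best_action(1 - w, w, actions))
# The chosen action shifts smoothly from 0.2 towards 0.8 as the values
# evolve: there is no fixed 'final goal' to extrapolate, only a moving
# compromise between objectives.
```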

Another aspect that isn’t explored is whether there is a fundamental limit to intelligence. Singularities are often unsustainable in practice because the mechanisms they exploit are rapidly exhausted after the initial launch. My own belief is that we became intelligent through a need to model each other, and ourselves, to perform better planning. That would have evolved into collaborative planning and complex social interactions. The human social system became continually more complex as we introduced more and more intelligence into the social group, and as it becomes more complex it becomes difficult to compute. To project further ahead we need to compress that complexity and abstract it, and that may well be going on at some level. But those who have performed time series prediction will know how quickly uncertainty accumulates as you try to look forward. There is often a timeframe beyond which things become too misty to compute any more. Further computational power doesn’t help you in this instance; uncertainties in the system dominate. If intelligence is viewed as predictive, then this places limits on how much computation is worth doing.
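
Anyone who has propagated even the simplest autoregressive model forward will recognise the effect. A minimal sketch, with arbitrary process parameters:

```python
# k-step-ahead forecast standard deviation for an AR(1) process
# x[t+1] = a * x[t] + noise, with noise ~ N(0, sigma^2).
a, sigma = 0.95, 1.0

def forecast_std(k):
    # Forecast variance accumulates with the horizon k:
    # sigma^2 * (1 + a^2 + a^4 + ... + a^(2(k-1))).
    return sigma * ((1 - a ** (2 * k)) / (1 - a ** 2)) ** 0.5

for k in (1, 5, 20, 100):
    print(k, round(forecast_std(k), 2))
# 1 -> 1.0, 5 -> 2.03, 20 -> 2.99, 100 -> 3.2: the uncertainty saturates at
# the marginal standard deviation of the process. Beyond that horizon extra
# compute buys nothing; the forecast is no better than the long-run average.
```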

Bostrom assumes that speeding up intelligences will necessarily make them well beyond our comprehension. But this may not be the case: for example, IBM Watson’s Jeopardy win came from storing a lot more knowledge than we can imagine storing, and then using some simplistic techniques from language processing to recover those facts. That is not beyond our comprehension; it is simply beyond our capacity.

My own belief is that something similar may well be the case for humans’ ability to predict each other, given our data constraints and our storage constraints.

Some other highlights and fragments from the New York meeting, in note form:

- AI safety, ML safety, statistical safety: to a large extent we’ve heard these ideas before. Validation of systems by statistics; compare the Google driverless cars with the evolving deployment of Tesla’s systems. Fairness and trust.
- Eric Schmidt: we want to solve hard problems like climate change. Thomas Dietterich in discussion: deep learning isn’t going to solve climate change.
- Magpies (fighting over shiny objects) versus crows (sitting intelligently on fences).
- Sebastian Seung on the connectome and uploading.
- Murray Shanahan: disagreement over the quality of the SR talk. Murray has read Nick Bostrom, and thought that he got Yann to concede a few points.
- A Gaussian diagram: things you might expect to hear versus things you actually hear.
- Bertrand from Brown on the inaccessibility of the value function (over lunch).
- Kelly from IBM harping on about Watson (see the FT article).
- Nick Bostrom: speeding up intelligence is what led to convolutional and recurrent nets working. Early examples of recurrent nets on audio, convolutional nets on images and even text-to-speech: all our big successes are just more compute and more data.
- Monologues at NIPS, and a lack of debate in the futures session: Yann on unsupervised learning, Rich Sutton speaking up for reinforcement learning.
- Doughnut centres! Not just in big data at Sheffield, but maybe also with the hype: it may knock out the jammy centre (a nod to MIT and Harvard as Boston Kreme).

The need for data scientists will go up in the early stages of AI development, just as the demand for paper went up with the arrival of computer printers (Xerox), and the demand for biological experiment went up with the arrival of simulation. We generate more hypotheses, which need more testing; we generate more analyses, which will still need interpretation.

Finally, some notes, in the form of a SWOT analysis, on how two of the major players are positioned.

IBM

Strengths: a well-regarded company with a large customer base, and a history of achievement in AI with Deep Blue and Watson’s win at Jeopardy.

Weaknesses: an overactive marketing campaign for Watson that makes it sound monolithic and implausible (see also this FT article). This makes them difficult to collaborate with and makes it hard for customers to understand what they are buying into. Customers who are data-aware will also be worried about loss of control.

Opportunities: there are many companies that would rather ignore the data revolution and have someone else take care of it on their behalf. IBM are probably the leaders in providing that service. HP’s struggles to be seen as a competitor in that domain are well documented.

Threats: in the longer term, the business strategy of farming out data may prove to be a foolish one, and the marketplace might shrink.

Apple

Strengths: Apple combines the trick of being a prestige brand with being a best seller, which means they have a large amount of cash available. They seem to be positioning themselves as uninterested in your data: an interesting bet in an era where many of their competitors are busy exploiting it, and one that could prove to be another feather in the prestige cap if the public becomes more sensitive to how their data is being used. Despite struggling to recruit directly from machine learning, due to a lack of investment in conferences and a very closed approach to their technology, they have enough money to buy the pick of the available start-ups, as evidenced by their recent purchases of VocalIQ and Emotient. The recent addition of ad-blocking capability to their phones may also indicate that they see direct marketing through data farming as less a part of their future, and that they are happy to spoil it for the other players.

Weaknesses: the current generation of artificial intelligence methods is based on machine learning and is almost entirely data-dependent. They are seen as a closed company that has only recently started officially appearing at the major conferences, although there are rumours they’ve been attending incognito for years.

One of their prestige offerings is that they don’t want your data: the next strand they are adding to their bow is the prestige of privacy.

Opportunities: Large

  1. Actually, my favourite technical plot mechanisms are when unforeseen twists save the day, like the bacteria attacking the Martians in H.G. Wells’s “War of the Worlds”.

  2. Heat engines are systems for converting heat into useful work. By my definition, intelligence is conducted through an inference engine that takes information and uses it to conserve work (by avoiding things we didn’t need to do).