I’m planning to put this on arXiv, but thought I’d share it via a blog post first in case there’s any feedback.

TL;DR Variation is important in decision making.


It should be remembered that just as the Declaration of Independence promises the pursuit of happiness rather than happiness itself, so the iterative scientific model building process offers only the pursuit of the perfect model. For even when we feel we have carried the model building process to a conclusion some new initiative may make further improvement possible. Fortunately to be useful a model does not have to be perfect.

George E. Box (1979)

In the book “Justice: What’s The Right Thing to Do?” (Sandel 2010) Michael Sandel aims to help us answer questions about how to do the right thing by giving some context and background in moral philosophy. Sandel is a philosopher based at Harvard University who is renowned for his popular treatments of the subject. He starts by illustrating decision making through the ‘trolley’ problem.

Trolley problems seem to be a mainstay of moral philosophy: in the original variant (Foot 1967) there is a runaway trolley1 rolling at speed down a track. It is approaching a set of points beyond which is a group of five workers. The workers do not have time to get out of the way of the trolley. They will be killed. You have the opportunity to switch the track, saving the five workers, but the trolley will run onto another track, killing a single worker. You could shout a warning, but somehow the workers wouldn’t hear you, or if they did hear they wouldn’t have time to get out of the way. The moral question is: should you switch the track? You would kill one worker, but save five. Apparently most of us think that in that situation we’d pull the switch.2

Sandel starts his book by looking at utilitarianism. Utilitarianism says that we should make decisions that lead to the greatest benefit to humanity. Under utility theory, Sandel explains, we pull the lever because we expect fewer people will be unhappy when one person dies than when five people die. Of course we can debate that, and we will come back to that in a moment.

Utility Theory and Utilitarianism

In moral philosophy utilitarianism is the idea that when considering a choice of actions we should choose the one that brings the most benefit to the population.

Utilitarianism is a philosophical variation of utility theory. It is due to Jeremy Bentham and John Stuart Mill. For their time the ideas in utilitarianism were very advanced. But it needs to be borne in mind that their era was far more limited mathematically and scientifically than ours. Bentham was born in 1748, only 21 years after Newton’s death and a year before Laplace was born. In Bentham’s time differential calculus was only taught in advanced degrees in the most sophisticated universities. Probability was only just beginning to be understood.

Utility Theory

Utilitarianism is an example of a philosophy that suggests that the result of a decision can be evaluated mathematically. By quantifying the benefits of the decision and weighing them against the downsides, the idea is that decision making can be rendered mathematical. Such mathematical functions are known as utility functions. The basic principle is that we can encode the good and the bad in a mathematical formula.

Since Bentham and Stuart Mill, the idea of framing our actions by formulating them mathematically has become inextricably interlinked with the domain of mathematical modelling. One way we have of quantifying the value of a mathematical model is its ‘goodness of fit’. In particular, we validate our models by evaluating how well the model does at predicting a quantity of interest. Sometimes that prediction can be associated with a direct monetary gain (such as in a stock market) or sometimes that prediction is associated with an improvement in the quality or length of life (such as in the diagnosis of disease).

Sensitivity and Specificity

Consider a model that tries to predict, on the basis of a test, whether an individual has cancer or not. Most, if not all, clinical tests are imperfect, i.e. a positive test does not always mean that the disease is present, and a negative test does not always mean that it is absent. The proportion of people with the disease who test positive is known as the sensitivity of the test. It is the count of those people that are correctly diagnosed with the disease divided by the number who really have the disease. A test with 50% sensitivity will correctly diagnose 50% of the people who really have the disease.

You might think it is a good thing to have a highly sensitive test, and you’d be right. But actually it is easy to increase the sensitivity of a test. Imagine a lazy lab technician, who rather than carrying out the test, just answers ‘disease present’ for every sample sent. The test will now have 100% sensitivity! Unfortunately all those who don’t have the disease will also be categorised as having the disease. The technician has rendered the test non-specific. To account for a test that is weak in this way we also measure the quality of a test by its specificity: the proportion of patients who don’t have the disease that are correctly identified as disease free. For the lazy technician the specificity is 0%.

For clinical tests there is a trade-off between sensitivity and specificity. A test that is highly sensitive (captures all the diseased population) will often be non-specific (it will incorrectly diagnose a large portion of the non-diseased population).
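To make the two definitions concrete, here is a minimal sketch that computes both quantities from counts of correct and incorrect diagnoses. The cohort sizes are hypothetical and chosen only for illustration; the function names are ours.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of people who really have the disease that the test flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of people without the disease that the test clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical cohort: 100 people with the disease, 900 without.
diseased, healthy = 100, 900

# The lazy technician answers 'disease present' for every sample,
# so nobody with the disease is missed and nobody healthy is cleared.
lazy_sens = sensitivity(true_pos=diseased, false_neg=0)
lazy_spec = specificity(true_neg=0, false_pos=healthy)

print(lazy_sens, lazy_spec)  # 1.0 0.0
```

The lazy technician scores perfectly on one measure and as badly as possible on the other, which is why neither number is meaningful on its own.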

When considering which tests to use we necessarily have to decide what the effect of wrong decisions is on individual people. If a test isn’t very sensitive, then patients will be given the all clear, when they actually have a disease. But if a test is highly sensitive but non-specific, many patients will be faced with unnecessary worry that they have the disease when none is present. The number of people involved will also depend on the base rates of the disease in the population. There will also be monetary cost associated with processing the tests and those who have a positive result.

We want to know the utility of the test, and Utility Theory is one approach we have to assessing that. If we decide that a particular sensitivity and specificity is acceptable for a test, and if we make judgments about the amount of worry and the number of unnecessary trips to the doctor that patients must endure, then we are placing a value on these aspects of life.

To make a single decision we must weigh up these different aspects against one another. We can then decide which outcomes we value the most. Even if we don’t write it all down explicitly, by our actions we can see that we are making decisions that define a utility function: a mathematical function that weights how we value the competing factors. Utility functions are so common that they receive many different names: objective function, cost function, error function, fitness function, risk function.3 The motivation for any given utility function can be different but the end result is the same, a mathematical formula by which we can quantify the relative value of a set of predictions or decisions.
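As a rough illustration of what writing such a function down might look like, the sketch below scores a test by its expected cost per person screened. Every number here (base rate, costs, sensitivities, specificities) is a hypothetical assumption, as is the simple additive form of the function. The point is only that the preferred test changes once we change the value we place on a missed diagnosis.

```python
def expected_cost(sens, spec, base_rate, cost_missed, cost_false_alarm, cost_test):
    """Expected cost per person screened (lower is better). Illustrative only."""
    p_missed = base_rate * (1.0 - sens)               # has disease, given all clear
    p_false_alarm = (1.0 - base_rate) * (1.0 - spec)  # healthy, told diseased
    return cost_test + p_missed * cost_missed + p_false_alarm * cost_false_alarm


# Two hypothetical tests: one tuned for sensitivity, one for specificity.
sensitive_test = dict(sens=0.99, spec=0.80)
specific_test = dict(sens=0.80, spec=0.99)

# Hypothetical setting shared by both scenarios.
common = dict(base_rate=0.01, cost_false_alarm=10.0, cost_test=1.0)

# Scenario A: a missed diagnosis is valued as very costly.
a_sensitive = expected_cost(cost_missed=5000.0, **sensitive_test, **common)
a_specific = expected_cost(cost_missed=5000.0, **specific_test, **common)

# Scenario B: a missed diagnosis is valued as far less costly.
b_sensitive = expected_cost(cost_missed=100.0, **sensitive_test, **common)
b_specific = expected_cost(cost_missed=100.0, **specific_test, **common)

print("A prefers:", "sensitive" if a_sensitive < a_specific else "specific")
print("B prefers:", "sensitive" if b_sensitive < b_specific else "specific")
```

Choosing between the two tests is therefore equivalent to choosing the relative weights, whether or not we ever state them explicitly.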

The more we desire accountability for our decisions, the more that we require that they are rationalised, the greater our tendency to be explicit about the mathematical form of our utility function. The more explicit we become the easier it is to see how much value we are associating with aspects of our lives that we might instinctively feel cannot be priced. Aspects that cannot be bought or sold: our health or our life itself.

How can we reconcile this drive for accountability with our natural instinct that we should not be placing a price on such things?

The first thing we have to do is acknowledge the application of ideas about utility is much more sophisticated than is sometimes acknowledged in popular treatments of the subject. In Sandel’s book he criticises utilitarianism by framing it in the context of the historical arguments given in a specific era. But he doesn’t place those writings in context. The utility functions of Jeremy Bentham and John Stuart Mill provide a nascent theory, but there are a number of ways in which those ideas can be updated given the knowledge we have gained in the intervening 250 years.

The Push and the Trolley

In his book, Sandel follows up his initial trolley example with a more complex one.4 This time you are on a bridge. There is a rail line going under the bridge and there are workers on the line. This time there is a trolley going towards the five workers on the far side of the bridge. The workers will be killed by the runaway trolley. There is a man on the bridge with you, but there is nothing else to hand. As before you could shout, but the workers can’t get out of the way in time.

Apparently your only opportunity is to push the other man off the bridge onto the rails, thereby deflecting the trolley and saving the five men working on the line. You could jump onto the line yourself to deflect the trolley, but the man on the bridge is heavier than you, and you would be too light to deflect the trolley. Your only chance is to push the other man, the ‘fat man’, off.

It seems most people would choose not to push the fat man off. Sandel finds this difficult to explain from the perspective of utilitarianism and becomes dismissive of utilitarianism as a result.

We may be doing Sandel an injustice but it seems there is a significant oversimplification in this scenario. In the attempt to construct a counterexample to illustrate a point, a key aspect has been introduced into the second example that is far less present in the first. It’s an aspect that humans are faced with every day, but we are not entirely sure how we deal with it. It is a domain where humans can outperform machines. That aspect is uncertainty.

In an effort to define a situation that primes us for decision making the trolley scenarios tell stories. The story is far more complex in the second example than the first. We are unable to help but imagine the scenario. We might picture that the bridge is a viaduct. We might picture that it is made of sandstone. We might imagine that the fat man is wearing a blue shirt. We can picture what it means to push him off. One of Sandel’s arguments is that the difference between the two scenarios is that people are unable to kill a man through direct interaction, and this aspect isn’t explained by utilitarianism. That is certainly a plausible explanation, but we don’t have to leave utilitarianism behind so quickly.

Evolution and Utilitarianism

Bentham was not aware of Darwin’s principle of natural selection: he predates Darwin. Bentham believed that we should take an action if it maximised the happiness of the population (thereby minimising pain). So his utility function was the sum of the happiness of the population. For its time this is an interesting concept, but Bentham’s disciple, John Stuart Mill, struggled with the idea that a night of debauchery was the same form of happiness as, for example, staying in and reading a good book.5 He argued that these different happinesses should not be valued the same. This is a clear flaw in the idea of a single utility function which Bentham had proposed. However, we should probably be allowed to update Bentham’s ideas a little in the light of what we’ve discovered since.

Natural Selection

Darwin’s idea was that species evolve through natural selection. Natural selection is a relatively simple concept, but it has some complex consequences. Natural selection just suggests that successful strategies will prevail. This is a somewhat self-verifying statement. The measure of success is that the strategy did prevail. However, there are some complex consequences. For a species to prevail it needs to survive. Natural selection forces us to think about why our behaviour might be helpful in determining our survival. Bentham’s idea was that we should maximize happiness. But given knowledge of Darwin, it may be that we’d prefer to identify happiness as an intermediate reward. Natural selection implies that one of our longer term goals is the survival of our species, with intermediate implications for ourselves, our societies and our ways of life.

Most of us wouldn’t accept that our happiness at any precise moment is an absolute assessment of where we feel we are in life. In fact, we often find our state of happiness much more sensitive to changes in our circumstances than any particular absolute measurement we can make. If our circumstances are good in the absolute sense, but not improving, then we may have to consciously remind ourselves of how lucky we are to gain pleasure from our position.

The children’s book “A Squash and a Squeeze” (Donaldson and Scheffler 2004) illustrates this idea nicely with a rendering of an old folk tale. An old lady lives in a house which she finds too small. She asks the advice of a “wise old man” who suggests she adds a chicken, a pig, a goat and a cow to her living quarters. The lady does as instructed, finding her accommodation increasingly cramped as she does so. Finally the man tells the lady to let all the animals out again. The lady is now happy, finding her house to be far more spacious without the animals. She is happy at that moment, despite her absolute circumstances not having changed since she first went to the wise old man for advice. This folk tale may not provide a practical solution to the housing crisis, but its moral is a caricature of our sensibilities: our happiness is more sensitive to our changes of circumstance than our absolute positioning. From an evolutionary perspective this makes sense. If as organisms we became satiated at a particular stage of achievement then our species or societies could become complacent.

The mathematical foundations of the study of change are given by Newton and Leibniz’s work on differential calculus. In Bentham’s time these ideas were taught only in the most advanced University courses. The argument above suggests that actually happiness is some (monotonic) function of the gradient of whatever personal utility function we have. Not the absolute value. This idea also may deal nicely with the issue of different types of happiness. A night of debauchery may make us instantaneously very happy, implying a high rate of change. But it is over quickly, implying that the absolute change in our circumstances is small (the absolute improvement in the utility would be the level of happiness multiplied by the time we were happy for). Of course this improvement may be offset by whatever the consequences of the debauchery are the next day. John Stuart Mill’s variation on utilitarianism considered ‘higher pleasures’, such as the pleasure gained from literature and learning. See also Eleni Vasilaki’s perspective on Epicurus (Vasilaki 2017). Instantaneously we may experience less happiness when engaged in these activities compared to whatever our night of debauchery involved. However, these activities can be sustained for a very long period and allow us to achieve more. The absolute improvement in our circumstance would be given by the period of study multiplied by the pleasure given.

In the terminology of differential calculus, our happiness must be integrated to form our utility. Not instantaneously measured. It may be that Stuart Mill’s differentiating of happiness could have been reconciled with utility theory by realising he was actually distinguishing between sustainable forms of happiness and non-sustainable. Unfortunately we can’t revisit him to ask, but we can at least do him the justice of giving him the benefit of the doubt.
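As a rough sketch of this argument in symbols (the notation is ours, not Bentham’s or Mill’s): let U(t) denote a personal utility at time t and h(t) the happiness felt at that moment, and assume the simplest monotonic relationship, in which happiness equals the rate of change of utility.

```latex
h(t) = \frac{\mathrm{d}U(t)}{\mathrm{d}t},
\qquad
U(T) - U(0) = \int_0^T h(t)\,\mathrm{d}t .

% For a constant level of happiness h sustained over a period \Delta t:
\Delta U = h\,\Delta t .
```

Under this assumption a brief, intense pleasure (large h, small Δt) can improve our circumstances less than a milder pleasure that is sustained for much longer (smaller h, large Δt).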

Daniel Kahneman’s Nobel Memorial Prize in Economics was awarded for the idea of prospect theory. Kahneman describes the theory and its background in his book, “Thinking, Fast and Slow” (Kahneman 2011). Prospect theory is not a mathematical theory, but a theory in behavioral economics based on empirical observations of how we value different alternatives that involve risk. A key observation of Kahneman’s is in line with our analysis above: people are responsive to change in circumstance, not absolute circumstance. Prospect theory goes on to identify asymmetries in our sensitivities. A negative change in circumstance weighs upon us more heavily than the equivalent positive change in circumstance.

Subjective Utility

Bentham’s ideas focussed around the idea of a global utility, maximisation of happiness across the population. Darwin’s principle of natural selection actually insists that there must be variation in the population, and therefore variation in our perception of our circumstances.

Natural selection relies on variation, because if there is no variation, then there can be no separation between effective and ineffective strategies. Different strategies arise from different value systems. If all organisms were to pursue the same strategy, then the circumstantial judgment of the selection process would fall upon all members of the species simultaneously: they would die or survive together.

A Cognitive Bias towards Variance

One of the themes that Kahneman explores is the tendency of humans to produce, through their System 2 thought processes (their ‘slow thinking brains’), overcomplicated explanations of observed data. There’s a tendency for people to focus on a detailed narrative as if it was pre-determined and within the control of all participants. In practice we cannot control events in such a regulated manner.

To predict we need data, a model and computation. Data is the information we are given, the model is our belief in the way the world works and computation is required to assimilate the two. This is true for humans and computers. Ignoring the quality of the data for a moment, and focussing on the model, our predictive system can fail in one of two ways. It can either over simplify or it can over complicate.

This binary choice may seem obvious, but it has some significant consequences. The phenomenon was studied in machine learning by Geman et al. (Geman, Bienenstock, and Doursat 1992) who referred to it as the ‘bias variance dilemma’. They decomposed errors into those due to oversimplification (the bias error) and those due to insufficient data to underpin a complex model (the variance error).

Bias errors are errors that arise when your model is not rich enough to capture all the nuances of the world around you. Bias errors occur when the rich underlying phenomena underpinning an observation are ignored and a simpler model explanation is given. An example of a bias error would be one that arises from the simple model “home teams always win sports games”. There is some truth to the home advantage: this model will do better than 50/50 guessing. But it is an oversimplification. It is biased. However, because we have a lot of data about sports games, two experts using this rule to predict outcomes would make consistent predictions.

An error due to variance is one which occurs when we go too far the other way. There are a myriad of factors that could affect the outcome of a sports event: the weather, balloons on the pitch, the mental and physical fitness of each of the players, the quality of the pitch. If we take them all into account we might hope for better predictions. But in reality, we haven’t seen enough data to determine how each of these factors affects the outcome. Badly determined parameters lead to high variance error. Two experts see slightly different data and weight these complex factors according to their perspective. As a result their predictions can vary, leading to variance error.

The important point is that these errors are fundamentally different in their characteristics. The error due to bias, the simplification error, comes about by not taking all the factors into account. In statistical models bias errors are very common: indeed they are often preferred because they are associated with simpler models and the parameters often have some explanatory power.

The type of error Kahneman is describing in human explanations would be termed an error due to variance. A variance error is different from a bias error. In a variance error you may have a model that is sufficient to describe the underlying system, but you don’t have enough data or information to pin down exactly how your observed outcome came to be. A characteristic of error due to variance is that different observers may have highly rich, but conflicting, explanations of what brought about the phenomenon. The sort of conflicting explanations that bring about lively debate in television studios during half-time breaks in football matches.
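For squared error, the distinction between the two kinds of error can be written down explicitly. The form below is the standard textbook statement of the decomposition. The notation is ours: f(x) is the true underlying function, D is a training data set, and f̂(x; D) is the prediction of a model fitted to D. Any irreducible noise in the observations adds a further term that no model can remove.

```latex
\mathbb{E}_{\mathcal{D}}\!\left[\left(\hat{f}(x;\mathcal{D}) - f(x)\right)^2\right]
= \underbrace{\left(\mathbb{E}_{\mathcal{D}}\!\left[\hat{f}(x;\mathcal{D})\right] - f(x)\right)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_{\mathcal{D}}\!\left[\left(\hat{f}(x;\mathcal{D})
  - \mathbb{E}_{\mathcal{D}}\!\left[\hat{f}(x;\mathcal{D})\right]\right)^2\right]}_{\text{variance}}
```

The bias term measures how far the average prediction is from the truth, however much data we see. The variance term measures how much the prediction changes when a different data set, or a different expert’s experience, is used.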

The bias-variance dilemma is a major challenge in machine learning. One widely accepted solution to the dilemma is that we choose a model which exhibits larger bias because even though it is known to be incorrect (too simple) for the data we have available it will make better predictions. Many of Kahneman’s mechanisms and solutions for human irrationality actually do introduce simple statistical models to improve the quality of prediction. Kahneman relates how decision making can be rendered more consistent and higher quality in this manner.

An alternative and widely used solution in machine learning is to develop large families of complex models that exhibit variance, just as individual humans do. However, once these models are trained they are not relied on individually but they are combined in ‘ensembles’ to predict together. They vote on the solution or their average prediction is taken. This idea is very similar to the ‘wisdom of the crowds’. Seen from this context there are very good reasons why a ‘population’ of intelligent beings should exhibit variance-error instead of bias-error. A characteristic of bias-error would be that we would all be consistent in our predictions. This is good for accounting for behaviour, but it is a serious problem if we are all consistently wrong. Variance-error implies that we all come up with different, over-complicated, reasons why events transpire as they did. Taken as a population we cover a wide range of alternatives. As a result we act in different ways, and make different decisions. It is clear that we will never be all correct, but when it comes to evolution, the important thing is that we are never all wrong.

Decision Making and Bias-Variance

A further advantage of choosing variance-error over bias-error for a population is that a consistent and robust prediction can always be achieved by averaging outcome. In machine learning approaches such as bagging and boosting (Breiman 1996) can be used to reduce variance-error in a population of models.6 Advocates of the “Wisdom of the Crowds” propose the same principle. By preferring bias-error in our population we have no recourse, but models that exhibit variance-error can always be combined to create a more stable prediction from the population as a whole: democratic decision making is one way to achieve this.
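A toy simulation makes the contrast concrete. It uses plain averaging of independent noisy predictions rather than bagging proper, which resamples the training data. The true value, noise level, shared bias and population size are all hypothetical choices made for illustration.

```python
import random
import statistics

random.seed(0)

truth = 10.0       # the quantity everyone is trying to predict (hypothetical)
n_experts = 1000

# Variance-error population: each expert is individually noisy but unbiased.
noisy = [truth + random.gauss(0.0, 3.0) for _ in range(n_experts)]

# Bias-error population: every expert applies the same oversimplified rule,
# so they all agree with each other and are all wrong by the same amount.
biased = [truth + 2.0 for _ in range(n_experts)]

# How wrong a typical noisy expert is when relied on individually.
typical_individual_error = statistics.mean([abs(p - truth) for p in noisy])

# How wrong each population is once its predictions are averaged.
crowd_error_noisy = abs(statistics.mean(noisy) - truth)
crowd_error_biased = abs(statistics.mean(biased) - truth)

print(f"typical noisy individual: {typical_individual_error:.2f}")
print(f"averaged noisy crowd:     {crowd_error_noisy:.2f}")
print(f"averaged biased crowd:    {crowd_error_biased:.2f}")
```

Averaging shrinks the error of the noisy population well below that of a typical member, but it leaves the shared error of the biased population untouched.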

So the ‘rational’ behaviour of a population under natural selection is to sustain a variety of approaches to life. It follows then, that if there is to be natural selection within our species, our ideas of achievement should vary. Our individual utility should be subjective. What brings me happiness, may not bring you happiness. We probably have different ideas of debauchery, literature and learning. While we can disagree with each other’s tastes, natural selection tells us that our species is more robust if there is a diversity of approaches. Darwin’s principle tells us we should be like this.

When we see half-time football pundits debating their convoluted explanations of the way the match is evolving we should remember that their arguments are all important and each one may have some validity. They are the result of overly complex models being applied to little data. The game of football is fundamentally stochastic, but the analysts treat it as deterministic.

This phenomenon is not new, and it is not confined to football punditry. In 1954 the psychologist Paul Meehl wrote a book about how clinical experts can be outperformed by simple statistical models (Meehl 1954). Meehl suggested they ‘try to be clever and think outside the box’. Kahneman addresses this challenge in Chapter 21 of his book.

complexity may work in the odd case, but more often than not it reduces validity

going on to say

humans are incorrigibly inconsistent in making summary judgments of complex information. When asked to evaluate the same information twice, they frequently give different answers.


Unreliable judgements cannot be predictors of anything.

The two approaches Kahneman proposes for dealing with this in human society are:

  1. Replace human punditry with simple statistical models. This also makes sense from a statistics point of view: if we desire consistency we should replace the models that exhibit variance-error (the humans) with models that exhibit bias-error (simple statistical formulae).

  2. Exploit wisdom of the crowds. Wisdom of the crowds is a proposal that human opinion should be aggregated to improve predictions. This is consistent with the idea that humans tend to make variance-errors.

Punditry is widespread, and the bias-variance analysis shows us that there are good reasons why, under natural selection, we should value such diversity of opinions. What is also important is that we should develop mechanisms in society for these opinions to be properly represented when making a decision.

Neglecting the value of variation is a dangerous error. It is one of the intellectual failings of those that sought to put ideas from eugenics into political practice. Early philosophies based on natural selection were overly focussed on the average value of the ‘fitness’ of a population. Trying to increase this value whilst simultaneously reducing variation is a very dangerous game. It is artificial selection. It assumes that you have preordained what the future natural circumstances are going to be. It may be OK for race horses, greyhounds, crops, sheep and cows because in those circumstances we are aiming to control their environment. It is not OK for the human race.

We are right to express moral outrage at what the negative eugenicists tried to achieve. But it was also motivated by a flawed understanding of science. Their model was wrong. Populations that are capable of excelling in a particular environment, because they are highly tuned to it, are rapidly extinguished when circumstances change. If the animals in a species become too specialised then they may not be able to respond to changing circumstances. Think of cheetahs and eagles vs rats and pigeons.

Socially the same principle should hold. I may not agree with many people’s subjective approach to life, I may even believe it to be severely sub-optimal. But I should not presume to know better, even if prior experience shows that my own ‘way of being’ is effective. Variation is vitally important for robustness. There may be future circumstances where my approaches fail utterly, and other ways of being are better.

A Universal Utility

The quality of our subjective utilities at any given time is measured by their effectiveness in the world. Survival of the species indicates that it is the sustenance of the entire human species that should concern us in the long run, although there will be many intermediate effects that we will be looking to achieve in the medium term. Indeed, we may even question if there are circumstances under which we would not wish the human species to survive.7 The universal utility by which we are judged is therefore difficult to define. We can pin down aspects of it, but perhaps the best we can do is seek compromise between our individual utilities, while maintaining awareness that there may be outside forces, for example climate change, that will have such a detrimental effect on all our lives that it is worth investing significant time and effort as a society to try and reduce our exposure.

Let’s get back to the trolleys.

The Real Ethical Dilemma

The trolley problem is an oversimplification, and one which is not useful in characterizing the moral dilemmas we are faced with in the modern era of computer decision making. Instead of the trolley problem, let’s propose a new dilemma, and let’s focus on driverless cars.

Most arguments for driverless cars focus on the overall reduction in human death rates we expect to result from their adoption. For example, if we introduce driverless cars and bring about a 90% reduction in deaths on the road, then that surely is a good thing. But what if the remaining 10% of deaths are focussed on a particular section of the population? For example, what if we find that the only people those cars do continue to kill are cyclists?

Now there are ethical and moral questions about what we have developed. Even if we have reduced the total number of cyclist deaths8 is it fair to disproportionately affect one section of the population? A simplistic utilitarian perspective would say yes, please proceed, although we individually might be uncomfortable with this. Let’s explore further.
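To see how a headline reduction can hide a shift in who bears the risk, consider the following arithmetic. The annual figures are entirely hypothetical and are chosen only so that total deaths fall by 90% while every remaining death is a cyclist.

```python
# Hypothetical annual road deaths before driverless cars are introduced.
deaths_before = {"cyclists": 150, "other road users": 850}
total_before = sum(deaths_before.values())

# A 90% reduction overall, with every remaining death being a cyclist.
total_after = total_before - total_before * 90 // 100
deaths_after = {"cyclists": total_after, "other road users": 0}

# Share of all road deaths borne by cyclists, before and after.
share_before = deaths_before["cyclists"] / total_before
share_after = deaths_after["cyclists"] / total_after

print(f"total deaths: {total_before} -> {total_after}")
print(f"cyclist deaths: {deaths_before['cyclists']} -> {deaths_after['cyclists']}")
print(f"cyclist share of deaths: {share_before:.0%} -> {share_after:.0%}")
```

In this made-up case fewer cyclists die than before, yet cyclists go from bearing a minority of the risk to bearing all of it. A utility that only sums deaths cannot see that change.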

Utilitarianism appears to be telling us to favour a set up that could disproportionately affect a minority: in this case cyclists. Let’s develop a more sophisticated view of the right utility and see how it might affect our conclusions.

Uncertainty: The Tyger that Burns Bright

There are two principles we should take into account when considering how we should aim to affect the evolution of our society. The first we have discussed above, uncertainty. The second is Darwin’s principle of natural selection.

Natural selection is a simple idea although it’s had a difficult history. Unfortunately, when applied on its own it leads to some unsophisticated principles that don’t work in practice. First of all, we might naively assume from natural selection that there is a single best way of doing things. This idea is embedded in the statement “survival of the fittest”. But that is a naive reinterpretation of natural selection that gives the wrong connotation. The phrase is not due to Darwin, but due to Herbert Spencer, a Victorian philosopher who applied principles of evolution to sociology. The phrase encourages the idea that one individual will survive, the one that has the best approach. This is a damaging idea, and one that should be put to bed.

The marvel of evolution is its responsiveness. Success is defined by the environment, both at the individual level and the level of communities and species. But the environment itself is complex and evolving. It is dynamic. Strategies for success are therefore not static. The criteria for success are also uncertain. While we might accept that at any given moment there is a ‘formula for success’, the dynamic nature of our environment means that that formula for success is evolving.

Such a formula for success is equivalent to our utility function, but we are now placed in the circumstance where there is uncertainty around the utility function itself, because it is rapidly evolving and we don’t have access to it.

How quickly is that utility evolving? In the modern world, for humans, and animals, very quickly. Species are becoming extinct at an alarming rate, our world is warming, potentially changing our climate from our current relatively benign climate closer to the more uncomfortable climates of the past. This means the utility is evolving. The criteria for success for a lion born 200 years ago are very different from those for a lion born today. How would we redesign the lion to survive today?

In a rapidly evolving environment, the species that are most vulnerable are those that are specialised to fragile niche environments that will disappear as our global environments evolve. As humans we are lucky in that we can consciously change our practices to react to our evolving environment. But one thing is clear: the idea that there is a single particular solution, an ‘absolute’ principle by which we should progress a society is seriously flawed.

In fact, that is not quite true. There is one absolute policy we should follow. That policy is: “There will be no single absolute policy that should be followed slavishly in all circumstances”. Our environment is evolving, and will continue to evolve through our lives. Perhaps more so now than at any other time in human history. I’m not even referring to our climate, the temperature changes we might expect from global warming. I’m referring to our social circumstances, the rate at which new technologies are emerging, such that within one individual’s lifespan our modes of intercommunication in our society are changing so as to be almost unrecognisable, and certainly unenvisageable, from decade to decade.

In these periods uncertainty about the right thing to do dominates. And the correct form of response to uncertainty is to value diversity. Every form of extremism that forms a threat to the aspects of life I value can be characterised as absolutist. Whether communist, fascist, islamist or christianist. The malignant characteristic of their extremist forms prohibits any other strategy for society. You don’t need the existence of gods to interpret this as extreme folly and hubris. Ironically these absolutist philosophies are the only behaviours that we can absolutely exclude.

Tigers and Trolleys

From this perspective Sandel’s use of the second trolley example takes on a new light. The first example, that of a simple switch in the points, is about as deterministic and mechanistic as a situation can get. You pull the lever, you expect the points to change. Even if they don’t, there is no downside associated with the consequences, other than a possibly angry railworker who sees you just tried to send a quarter ton railway wagon down his throat. The second example is largely contrived, and riddled with uncertainty. The story suggests that we know that the man we want to push off the bridge is heavy enough to divert the trolley and yet we ourselves aren’t. That strikes me as an absurd notion. We’ve had moments to assess the situation, and yet we can judge this with confidence? Next we have to heave this heavier man over the barrier at the side of the bridge (I’m assuming it has a barrier). Assuming we achieve this, we now have to ensure our aim is accurate enough to land the gentleman fully on the track. We’ve already determined his general heft is only just sufficient to divert the trolley, so we know he’ll have to lie very squarely on the track, and remain there. Finally, who’s to say what the path of the trolley will be after hitting him? The endangered workmen are clearly close enough to not be able to clear the track in time, so is it not possible that the diverted trolley will hurtle into them anyway?

My own belief is that, perhaps at a subconscious level, humans are very sensitive to the uncertainty in this scenario. We can vividly imagine attempting to explain our actions after the event, and I think the reaction of anyone we explained them to would be utter incredulity: “So the trolley was hurtling towards the men, and instead of yelling to them you attempted to push another man off the bridge?”. Only the most trivial simplification of the situation combined with a naive interpretation of utility theory would conclude that pushing the man was the correct action. Indeed, if it had not been phrased as a decision process, it would not even have entered our minds. Why then is it a mainstay of moral philosophy? This may be an example of what Daniel Kahneman refers to as theory-induced blindness, but what I think would be more correctly referred to as model-induced blindness. Here the model of reasoning is one that ignores uncertainty and our ability to subconsciously quantify it. That’s a good model for the original trolley example but an extremely poor one for the second. The model is wrong.

This brings to mind again George Box’s quote: after all, the model may be wrong, but is it useful? My colleague Richard Wilkinson pointed out to me that a better quote for the modern era (from the same paper) might be

Since all models are wrong the scientist must be alert to what is importantly wrong. It is inappropriate to be concerned about mice when there are tigers abroad.

George E. P. Box (1976)

I use this quote in talks.9 Its intent is that it is important to worry about the manner in which the model is wrong. Models are abstractions, and it is important when modelling to decide upon the correct level of granularity for our abstraction. Unfortunately, it seems that there is a widespread propensity to take such a model (as applied to the first trolley question) and unquestioningly apply it to the next similar-seeming problem on the test paper. Kahneman also has something to say about this type of behaviour from a behavioural psychologist’s perspective: he categorises it as a consequence of the laziness of System 2 and the resulting dominance of System 1. Or to put it in plain terms: very often people don’t stop and think. And in this case it seems to me that failure to stop and think means that the moral philosopher was eaten by a tiger before he had a chance to get anywhere near the fat man.

Apologies for being trite and rather polemical, but these points emerge from an ongoing frustration about the extent to which our theorising about decision making ignores the almost omnipresent effects of uncertainty: uncertainty about our individual values, our values as a society, our future circumstances, our present circumstances. Uncertainty is endemic, and explicitly accounting for it induces robust behaviour in the presence of changing circumstance. Paramount among such robust behaviours is respect for diversity: diversity of opinion and behaviour. This is the reason for our cognitive bias towards variance: the importance of diversity in dealing with an uncertain environment.


Uncertainty about the correct utility and our wider values means that, in practice, the ‘correct’ decision is difficult to produce or verify. In these circumstances, there has been a tendency to seek proxies such as consistency in decision making. However, a consistent decision making algorithm will tend to err on the side of over-simplifying circumstances, potentially missing particular nuances and more negatively affecting one sector of society over another.

In this paper we have argued that the right response to uncertainty is diversity. As studied by Paul Meehl and Daniel Kahneman, our own cognitive bias towards overcomplicating situations leads to a diversity of responses by experts when presented with the same data. Both Meehl and Kahneman characterize this diversity of response as a bad thing because, provably, if experts differ in opinion then at least one of them must be ‘wrong’.

However, obsession about consistency between experts, or algorithms, misses the point that such algorithms (or experts) could also be consistently wrong. This is particularly worrisome for algorithmic decision making which can be deployed en masse with rapid detrimental effect over particular sectors of society. Such deployments are encouraged by a naive-utilitarian perspective, but a more sophisticated understanding of societal utility actually suggests that preservation of diversity could be much more important.
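The distinction between being consistent and being right can be made concrete with a toy simulation. The sketch below is only an illustration under strong assumptions that are mine rather than anything established above: each expert's opinion is the truth plus a bias shared by the whole group plus independent Gaussian noise, and the group's decision is the plain average of opinions. The function name `ensemble_error` and all the numbers are hypothetical.

```python
import random
import statistics


def ensemble_error(shared_bias, independent_sd, n_experts,
                   truth=0.0, trials=2000, seed=0):
    """Mean absolute error of the averaged opinion of a group of experts.

    Toy model (an assumption for illustration): every expert reports
    truth + shared_bias + independent Gaussian noise.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        opinions = [truth + shared_bias + rng.gauss(0.0, independent_sd)
                    for _ in range(n_experts)]
        # The group decision is the simple average of the opinions.
        errors.append(abs(statistics.fmean(opinions) - truth))
    return statistics.fmean(errors)


# Consistent group: experts barely disagree, but all share the same bias.
consistent = ensemble_error(shared_bias=1.0, independent_sd=0.05, n_experts=25)

# Diverse group: experts disagree a lot, but their errors are independent.
diverse = ensemble_error(shared_bias=0.0, independent_sd=1.0, n_experts=25)

print(f"consistent group error: {consistent:.2f}")
print(f"diverse group error:    {diverse:.2f}")
```

In this toy setting, averaging cannot remove an error that every expert shares, whereas independent disagreements largely cancel. The example is deliberately stacked to make that point: real experts will sit somewhere between these two extremes, and the diverse group here is assumed to have no shared bias at all.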

In an uncertain environment, we are arguing that society would be more robust if diversity of solutions and opinions is sustained and respected. Consistency of opinion should not be substituted as a proxy for truth, and diversity should be understood as being more robust in an evolving society such as ours, where there is uncertainty around our shared value system and the nature of ‘fitness’ in a rapidly evolving environment.


Box, George E. P. 1976. “Science and Statistics.” Journal of the American Statistical Association 71 (356): 791–99. http://www.jstor.org/stable/2286841.

———. 1979. “Robustness in the Strategy of Scientific Model Building.” In Robustness in Statistics, edited by R. L. Launer and G. N. Wilkinson, 201–36. Academic Press. http://www.dtic.mil/docs/citations/ADA070213.

Breiman, Leo. 1996. “Bagging Predictors.” Machine Learning 24 (2): 123–40. doi:10.1007/BF00058655.

Donaldson, Julia, and Axel Scheffler. 2004. A Squash and a Squeeze.

Foot, Philippa. 1967. “The Problem of Abortion and the Doctrine of the Double Effect.” Oxford Review 5: 5–15. Reprinted in Virtues and Vices. doi:10.1093/0199252866.003.0002.

Geman, Stuart, Elie Bienenstock, and René Doursat. 1992. “Neural Networks and the Bias/Variance Dilemma.” Neural Computation 4 (1): 1–58. doi:10.1162/neco.1992.4.1.1.

Kahneman, Daniel. 2011. Thinking, Fast and Slow.

Meehl, Paul E. 1954. Clinical Versus Statistical Prediction: A Theoretical Analysis and a Review of the Evidence.

“Reported Road Casualties in Great Britain: Main Results 2015.” 2016. UK Department for Transport.

Sandel, Michael. 2010. Justice: What’s the Right Thing to Do?

Thomson, Judith Jarvis. 1976. “Killing, Letting Die, and the Trolley Problem.” The Monist 59 (2): 204–17.

Vasilaki, Eleni. 2017. “Is Epicurus the Father of Reinforcement Learning?” ArXiv E-Prints. https://arxiv.org/abs/1710.04582.

  1. In Philippa Foot’s original version the example refers to a runaway tram and you are the driver, but this has evolved so that it is referred to as a trolley, and you have control of the points.

  2. There is almost palpable excitement in the room among students of humanities when driverless cars are discussed, mainly in anticipation (I believe) of a real world application of the trolley problem. The car has a choice between killing a grandma or a baby: which does it choose? What algorithm does it use? We will discuss a more sophisticated consequence of the algorithm below, but if the car gets in the situation where it’s about to kill someone, something will have gone seriously wrong. There will be a great deal of uncertainty in what happens next, and the main objective should be to reduce the energy of the car, and minimize the result of the impact. In these circumstances, no car manufacturer will be calling upon moral subroutines to determine who should survive.

  3. Actually a risk function normally applies: we are looking at expected utility, i.e. it incorporates some notion of the probability of outcomes.

  4. This example is due to Judith Jarvis Thomson; her original paper (Thomson 1976) is a very interesting read. She introduces three separate variations on Philippa Foot’s trolley scenario and impersonalizes them by associating them with three characters, Edward, Frank and George.

  5. Since I first drafted these ideas, my colleague Eleni Vasilaki began exploring Epicurus’s philosophy as an underpinning of Reinforcement Learning. From a machine learner’s perspective the ideas of Epicurus also seem related to John Stuart Mill (???).

  6. If you have an Xbox at home and use the Kinect, it is using exactly this technique to determine where you are in the video frame. The algorithm is called a “random forest”, and it averages across many ‘decision trees’ to create an output using a voting scheme. A decision tree is an approach to categorisation that is used in machine learning to classify objects. It uses a tree-like structure to decide what an object is according to its attributes. The algorithm typically takes one feature of the data at a time and ‘branches’ according to the value of the attribute. For example, one decision in a ‘vehicle classifier’ might be to go down one branch if the number of wheels is greater than 3 (car or truck) and another if the number of wheels is less than 3 (motorbike or bicycle). A follow-on question might involve engine maximum power, sending it down a different branch. Each tree makes its own, detailed decision. Each individual tree may also not be very good on average. Leo Breiman suggested an approach to aggregating these decisions to give the final answer. Decision trees are also a key component in Facebook’s advertising ranking algorithm.

  7. This idea emerges in the film “The Matrix”.

  8. Cyclist deaths in the UK currently make up around 6% of total road deaths. For example, in 2015, of 1,730 total deaths on the road, 100 were cyclists (6%). Pedestrians made up a further 408 (24%) and motorcyclists 365 (21%) (“Reported Road Casualties in Great Britain: Main Results 2015” 2016).

  9. I did once have to explain that it doesn’t mean the tigers are in foreign countries.