Figure: Sandel's book considers how to do the right thing within the context of moral philosophy (Sandel 2010).
In the book "Justice: What's the Right Thing to Do?" (Sandel 2010), Michael Sandel aims to help us answer questions about how to do the right thing by providing context and background from moral philosophy. Sandel is a philosopher based at Harvard University who is renowned for his popular treatments of the subject. He starts by illustrating decision making through the 'trolley problem'.
Helper function for sampling data from two different classes.
import numpy as np
def create_data(per_class=30):
    """Create a randomly sampled data set.

    :param per_class: number of points in each cluster
    """
    X = []
    y = []
    scale = 3
    prec = 1/(scale*scale)
    pos_mean = [[-1, 0], [0, 0.5], [1, 0]]
    pos_cov = [[prec, 0.], [0., prec]]
    neg_mean = [[0, -0.5], [0, -0.5], [0, -0.5]]
    neg_cov = [[prec, 0.], [0., prec]]
    for mean in pos_mean:
        X.append(np.random.multivariate_normal(mean=mean, cov=pos_cov, size=per_class))
        y.append(np.ones((per_class, 1)))
    for mean in neg_mean:
        X.append(np.random.multivariate_normal(mean=mean, cov=neg_cov, size=per_class))
        y.append(np.zeros((per_class, 1)))
    return np.vstack(X), np.vstack(y).flatten()
Helper function for plotting the decision boundary of the SVM.
def plot_contours(ax, cl, xx, yy, **params):
    """Plot the decision boundaries for a classifier.

    :param ax: matplotlib axes object
    :param cl: a classifier
    :param xx: meshgrid ndarray
    :param yy: meshgrid ndarray
    :param params: dictionary of params to pass to contourf, optional
    """
    Z = cl.decision_function(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    # Plot the margin and decision boundary as contour lines.
    out = ax.contour(xx, yy, Z,
                     levels=[-1., 0., 1.],
                     colors='black',
                     linestyles=['dashed', 'solid', 'dashed'])
    # Shade the two decision regions.
    out = ax.contourf(xx, yy, Z,
                      levels=[Z.min(), 0, Z.max()],
                      colors=[[0.5, 1.0, 0.5], [1.0, 0.5, 0.5]])
    return out
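The `np.c_`/`ravel`/`reshape` round trip used in `plot_contours` can be checked in isolation. A minimal sketch, using a hand-rolled quadratic score as a stand-in for a classifier's `decision_function`:

```python
import numpy as np

# Build a coarse grid over the plane, as decision_boundary_plot does.
xx, yy = np.meshgrid(np.arange(-1.0, 1.0, 0.5),
                     np.arange(-1.0, 1.0, 0.5))

# Stack the grid into an (n_points, 2) array of (x, y) coordinates.
points = np.c_[xx.ravel(), yy.ravel()]

# Stand-in score: positive outside a circle of radius ~0.7, negative inside.
Z = (points**2).sum(axis=1) - 0.5

# Reshape back to the grid shape so contour/contourf can consume it.
Z = Z.reshape(xx.shape)

print(points.shape, Z.shape)  # (16, 2) (4, 4)
```

The flattened `(n_points, 2)` array is what classifiers expect as input; the reshape restores the grid layout that the contour routines require.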
import mlai
import os
def decision_boundary_plot(models, X, y, axs, filename, titles, xlim, ylim):
    """Plot decision boundaries on the given axes.

    :param models: the SVM models to plot
    :param X: input training data
    :param y: target training data
    :param axs: the axes to plot on
    :param filename: file to write the figure to
    :param titles: the titles for each axis
    """
    for ax in axs.flatten():
        ax.clear()
    X0, X1 = X[:, 0], X[:, 1]
    if xlim is None:
        xlim = [X0.min()-1, X0.max()+1]
    if ylim is None:
        ylim = [X1.min()-1, X1.max()+1]
    xx, yy = np.meshgrid(np.arange(xlim[0], xlim[1], 0.02),
                         np.arange(ylim[0], ylim[1], 0.02))
    for cl, title, ax in zip(models, titles, axs.flatten()):
        plot_contours(ax, cl, xx, yy,
                      cmap=plt.cm.coolwarm, alpha=0.8)
        ax.plot(X0[y==1], X1[y==1], 'r.', markersize=10)
        ax.plot(X0[y==0], X1[y==0], 'g.', markersize=10)
        ax.set_xlim(xlim)
        ax.set_ylim(ylim)
        ax.set_xticks(())
        ax.set_yticks(())
        ax.set_title(title)
    mlai.write_figure(os.path.join(filename),
                      figure=fig,
                      transparent=True)
    return xlim, ylim
import matplotlib
font = {'family': 'sans-serif',
        'weight': 'bold',
        'size': 22}
matplotlib.rc('font', **font)
import matplotlib.pyplot as plt
from sklearn import svm
# Create an instance of SVM and fit the data.
C = 100.0  # SVM regularization parameter
gammas = [0.001, 0.01, 0.1, 1]
per_class = 30
num_samps = 20
# Set up a 1x4 grid for plotting.
fig, ax = plt.subplots(1, 4, figsize=(10, 3))
xlim = None
ylim = None
for samp in range(num_samps):
    X, y = create_data(per_class)
    models = []
    titles = []
    for gamma in gammas:
        models.append(svm.SVC(kernel='rbf', gamma=gamma, C=C))
        titles.append(r'$\gamma={}$'.format(gamma))
    models = [cl.fit(X, y) for cl in models]
    xlim, ylim = decision_boundary_plot(models, X, y,
                                        axs=ax,
                                        filename='../slides/diagrams/ml/bias-variance{samp:0>3}.svg'.format(samp=samp),
                                        titles=titles,
                                        xlim=xlim,
                                        ylim=ylim)
Figure: In each figure the simpler model is on the left and the more complex model is on the right. Each fit is made to a different resampling of the data set. The simpler model is more consistent in its errors (bias error), whereas the more complex model varies in its errors (variance error).
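The same effect can be reproduced without SVMs. In this numpy-only sketch, polynomial regression stands in for the RBF models above, with degrees 1 and 9 playing the roles of small and large gamma (a hypothetical substitution for illustration). Fitting both models to many resampled data sets, the simple model's predictions at a test point barely move but are consistently far from the truth, while the complex model's predictions are nearly unbiased but scatter widely:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
x_test = 0.3  # a fixed test input at which we compare predictions

# degree -> predictions at x_test across resampled data sets
preds = {1: [], 9: []}
for _ in range(200):
    # Resample the data: a smooth underlying function plus noise.
    y = np.sin(np.pi * x) + rng.normal(0, 0.3, size=x.shape)
    for degree in preds:
        coeffs = np.polyfit(x, y, degree)
        preds[degree].append(np.polyval(coeffs, x_test))

for degree in preds:
    p = np.array(preds[degree])
    print(degree, 'mean', p.mean().round(3), 'variance', p.var().round(4))
```

The degree-1 predictions have low variance but a large systematic offset from the true value sin(0.3*pi) (bias error); the degree-9 predictions centre near the truth but spread much more (variance error).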
Decision Making and Bias-Variance
In a population, we should prefer variance errors.
Bias errors lead to consistent decision making.
Consistently wrong!
Variance errors can also be averaged out, e.g. through bagging and boosting (Breiman 1996)
Complex explanations such as half-time football punditry.
Also clinical experts (Meehl 1954). Meehl suggested they 'try to be clever and think outside the box'.
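The averaging claim can be sketched numerically. This is a hedged toy, not the bagging algorithm of Breiman (1996) as implemented in practice: the base learner is a deliberately unstable degree-9 polynomial fit, evaluated either from a single bootstrap resample or as a bagged average over 25 resamples. Across many replications the bagged predictions vary less:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 30)
x_test = 0.3

def fit_predict(x_train, y_train, degree=9):
    # A deliberately high-variance base learner.
    return np.polyval(np.polyfit(x_train, y_train, degree), x_test)

single, bagged = [], []
for _ in range(200):
    # A fresh noisy data set for each replication.
    y = np.sin(np.pi * x) + rng.normal(0, 0.3, size=x.shape)
    # One bootstrap fit versus the average of 25 bootstrap fits.
    boot = []
    for _ in range(25):
        idx = rng.integers(0, len(x), size=len(x))
        boot.append(fit_predict(x[idx], y[idx]))
    single.append(boot[0])
    bagged.append(np.mean(boot))

print('single-fit variance', np.var(single).round(4))
print('bagged variance    ', np.var(bagged).round(4))
```

Averaging leaves the systematic (bias) component of the error untouched; only the variance component shrinks, which is why bagging helps unstable, low-bias learners most.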
One Correct Solution
Artificial Selection and Eugenics.
OK for race horses, greyhounds, crops, sheep and cows
Not OK for the human race.
One Correct Solution
Flawed understanding of science
If animals in a species become too specialised, they may not be able to respond to changing circumstances.
Think of cheetahs and eagles vs rats and pigeons.
Similar Ideas Socially
I may not agree with many people's subjective approach to life; I may even believe it to be severely sub-optimal. But I should not presume to know better, even if prior experience shows that my own 'way of being' is effective.
Variation is vitally important for robustness. There may be future circumstances where my approaches fail utterly, and other ways of being are better.
A Universal Utility
The quality of our individual subjective utilities is measured by their effectiveness.
But it is the survival of the entire species that dominates in the long term.
A universal utility by which we are judged is difficult to define.
The Real Ethical Dilemma
The trolley problem is an oversimplification.
Driverless cars:
introduce driverless cars and bring about a 90% reduction in deaths
Arguably there is only one policy we should follow: introduce them.
Yet no single absolute policy should be followed slavishly in all circumstances.
George Box
Since all models are wrong the scientist must be alert to what is importantly wrong. It is inappropriate to be concerned about mice when there are tigers abroad.
George E. P. Box (Box 1976)
Tigers and Trolleys
A simple switch in the points is deterministic/mechanistic.
Figure: The original trolley problem. The decision is deterministic.
The second example is largely contrived and riddled with uncertainty.
Figure: In the situation where you push an overweight gentleman, the decision is riddled with uncertainty. Doubt inevitably creeps in.
References
Box, George E. P. 1976. “Science and Statistics.” Journal of the American Statistical Association 71 (356): 791–99. http://www.jstor.org/stable/2286841.
Breiman, Leo. 1996. “Bagging Predictors.” Machine Learning 24 (2): 123–40.
Geman, Stuart, Elie Bienenstock, and René Doursat. 1992. “Neural Networks and the Bias/Variance Dilemma.” Neural Computation 4 (1): 1–58. doi:10.1162/neco.1992.4.1.1.
Kahneman, Daniel. 2011. Thinking, Fast and Slow.
Meehl, Paul E. 1954. Clinical Versus Statistical Prediction: A Theoretical Analysis and a Review of the Evidence.
Sandel, Michael. 2010. Justice: What’s the Right Thing to Do?