Faith and AI: Introduction to Machine Learning

Neil D. Lawrence

2019-03-29

Faith and AI 2, St George’s House, Windsor

There are three types of lies: lies, damned lies and statistics

Benjamin Disraeli 1804-1881

There are three types of lies: lies, damned lies and ‘big data’

Neil Lawrence 1972-?

Mathematical Statistics

‘Mathematical Data Science’

\[\text{data} + \text{model} \xrightarrow{\text{compute}} \text{prediction}\]
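The formula can be made concrete with a minimal sketch. Here the data are a handful of made-up (x, y) pairs, the model is assumed to be a straight line, the compute step is an ordinary least-squares fit, and the prediction is the fitted line evaluated at a new input. The values and the names `fit_line` and `predict` are illustrative assumptions, not part of the talk.

```python
def fit_line(xs, ys):
    """Compute step: least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(params, x_new):
    """Prediction: evaluate the fitted straight-line model at a new input."""
    slope, intercept = params
    return slope * x_new + intercept


# Data (made up for illustration): points lying on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

params = fit_line(xs, ys)          # data + model, combined through compute
prediction = predict(params, 4.0)  # prediction at an unseen input
```

Swapping the straight line for a richer model changes `fit_line` and `predict`, but the data-model-compute-prediction pattern stays the same.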

Machine Learning

Driver of two different domains:
1. Data Science: arises from the fact that we now capture data by happenstance.
2. Artificial Intelligence: emulation of human behaviour.
Connection: Internet of ~~Things~~ People

Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data (1981/1/28)

What does Machine Learning do?

ML Automates through Data
- Strongly related to statistics.
- Field underpins revolution in data science and AI
With AI:
- logic, robotics, computer vision, speech
With Data Science:
- databases, data mining, statistics, visualization

Embodiment Factors


	machine	human
compute	\[\approx 100 \text{ gigaflops}\]	\[\approx 16 \text{ petaflops}\]
communicate	\[1 \text{ gigabit/s}\]	\[100 \text{ bit/s}\]
(compute/communicate)	\[10^{4}\]	\[10^{14}\]
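
As a check on the last row, the ratio in the right-hand column follows from the two figures above it, taking 1 petaflop as \(10^{15}\) floating point operations per second:

\[\frac{16 \text{ petaflops}}{100 \text{ bit/s}} = \frac{1.6 \times 10^{16}}{10^{2}} = 1.6 \times 10^{14} \approx 10^{14}\]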

See “Living Together: Mind and Machine Intelligence” (Lawrence, 2017)

Evolved Relationship

Information Barons threaten our Privacy

What’s Changed (Changing) for Medical Data?

Try Googling for: “patient data”…

The traction engine was preceded by a boy waving a red flag, and it was restricted to two and a half miles an hour. The boy’s role, however, was to warn oncoming traffic. In practice, drivers need to deal with overtaking traffic as well: mirror, signal, manoeuvre. Legislation evolved as we better understood the way in which we used the road.

See this Guardian article: “Let’s learn the rules of the digital road before talking about a web Magna Carta”

What are the Issues?

Who owns our data?
Is it ‘finders keepers’?
Does ownership proliferate?
What does data protection offer?
Who has the right to share our data?
Can we withdraw this right?

Moral Panics: Perhaps Rightly

What’s Changed (Changing) for Medical Data?

Genotyping.
Epigenotyping.
Transcriptome.
Detailed characterization of phenotype.
- Stratification of patients.
Massive unstructured data sources.

Open Data

Automatic data curation: from curated data to curation of publicly available data.
Open Data: http://www.openstreetmap.org/?lat=53.38086&lon=-1.48545&zoom=17&layers=M.
Social network data, music information (Spotify), exercise.

Africa

Why Africa?

Short circuit the process.
- For UK—infrastructure paralysis.
- For Africa—potential for distributed architectures.
- User-centric models of data management.
Store personal data on mobile phone within control of individual.

citizenme

UK Government Stipulation on Data Availability

Patient Online: Roadmap

PULSE Report

EMIS Patient Access

midata project

Botched

Topol Review

Example: Prediction of Malaria Incidence in Uganda

Work with Ricardo Andrade Pacheco, John Quinn and Martin Mubangizi (Makerere University, Uganda)
See AI-DEV Group.

Malaria Prediction in Uganda

(Andrade-Pacheco et al., 2014; Mubangizi et al., 2014)
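
Mubangizi et al. (2014) describe malaria surveillance with Gaussian process models. As a rough illustration of what such a model produces, and not the authors' implementation, the sketch below runs textbook Gaussian process regression on a synthetic one-dimensional series. The squared-exponential kernel, the hyperparameter values, the synthetic data and the names `rbf_kernel` and `gp_predict` are all assumptions made for the example.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


def gp_predict(x_train, y_train, x_test, noise_var=0.1, **kernel_args):
    """Posterior mean and variance of a zero-mean GP at the inputs x_test."""
    k_train = rbf_kernel(x_train, x_train, **kernel_args)
    k_cross = rbf_kernel(x_train, x_test, **kernel_args)
    k_test = rbf_kernel(x_test, x_test, **kernel_args)
    # Cholesky factor of the noisy training covariance, for a stable solve.
    chol = np.linalg.cholesky(k_train + noise_var * np.eye(len(x_train)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_cross)
    mean = k_cross.T @ alpha
    var = np.diag(k_test) - np.sum(v**2, axis=0)
    return mean, var


# Synthetic stand-in for a weekly time series (NOT real malaria data).
weeks = np.arange(0.0, 20.0)
series = np.sin(weeks / 3.0)

# Predict one week ahead, two weeks ahead, and far beyond the data.
future = np.array([20.0, 21.0, 30.0])
mean, var = gp_predict(weeks, series, future, noise_var=0.01, lengthscale=3.0)
```

The predictive variance grows as the test inputs move away from the observed weeks. That uncertainty estimate is what makes this family of models attractive when records are sparse or noisy.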

Tororo District

Malaria Prediction in Nagongera (Sentinel Site)

Mubende District

Malaria Prediction in Uganda

GP School at Makerere

Kabarole District

Early Warning System

Early Warning Systems

Deep Health

Understanding Patient Data

WannaCry

Bush Pilot Model

The difference between capability and intent.

Motivation

Insidious decision-making that has downstream instrumental effects we don’t control.
A power-asymmetry between data-controllers and data-subjects
A loss of personhood in the re-representation of ourselves in the digital world.
The GDPR’s endeavour to curb contractual freedom cannot by itself reverse the power-asymmetry between data-controllers and data-subjects.

Analogy

Digital Democracy vs Digital Oligarchy (Lawrence, 2015a) or Digital Feudalism (Lawrence, 2015b)
Data subjects, data controllers and data processors.

Legal Mechanism of Trusts

Fiduciary responsibility of Trustees.
Burden of proof in negligence is reversed.
Trustees are data controllers
Beneficiaries are data subjects
Power of data accumulation wielded on the beneficiaries’ behalf
See Edwards (2004), Delacroix and Lawrence (2018) and Lawrence (2016)

Conclusion

Machine learning is the underpinning technology for AI.
Also drives data science.
There are challenges and pitfalls for data and personal privacy.
Promise of data-driven solutions.
Pitfalls of loss of privacy.
Data Trusts as a solution.

Thanks!

twitter: @lawrennd
podcast: The Talking Machines
newspaper: Guardian Profile Page
Blog post on Lies, Damned Lies and Big Data
Blog post on System Zero

References

Andrade-Pacheco, R., Mubangizi, M., Quinn, J., Lawrence, N.D., 2014. Consistent mapping of government malaria records across a changing territory delimitation. Malaria Journal 13. https://doi.org/10.1186/1475-2875-13-S1-P5

Delacroix, S., Lawrence, N.D., 2018. Disturbing the “one size fits all” approach to data governance: Bottom-up data trusts. SSRN. https://doi.org/10.2139/ssrn.3265315

Edwards, L., 2004. The problem with privacy. International Review of Law Computers & Technology 18, 263–294.

Lawrence, N.D., 2017. Living together: Mind and machine intelligence. arXiv.

Lawrence, N.D., 2016. Data trusts could allay our privacy fears.

Lawrence, N.D., 2015a. Beware the rise of the digital oligarchy.

Lawrence, N.D., 2015b. The information barons threaten our autonomy and our privacy.

Mubangizi, M., Andrade-Pacheco, R., Smith, M.T., Quinn, J., Lawrence, N.D., 2014. Malaria surveillance with multiple data sources using Gaussian process models, in: 1st International Conference on the Use of Mobile ICT in Africa.