Towards Machine Learning Systems Design
Lessons from Computational Biology
Neil D. Lawrence
2019-05-14
Mathematical Genomics Away Day
What is Machine Learning?
\[ \text{data} + \text{model} \xrightarrow{\text{compute}} \text{prediction}\]
data: observations, could be actively or passively acquired (meta-data).
model: assumptions, based on previous experience (other data! transfer learning, etc.), or beliefs about the regularities of the universe. Inductive bias.
prediction: an action to be taken or a categorization or a quality score.
To combine data with a model, we need:
a prediction function \(\mappingFunction(\cdot)\), which encodes our beliefs about the regularities of the universe
an objective function \(\errorFunction(\cdot)\), which defines the cost of misprediction.
Machine Learning
Driver of two different domains:
Data Science: arises from the fact that we now capture data by happenstance.
Artificial Intelligence: emulation of human behaviour.
Connection: Internet of Things
Connection: Internet of People
Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data (1981/1/28)
What does Machine Learning do?
ML Automates through Data
Strongly related to statistics.
Field underpins revolution in data science and AI
With AI:
logic, robotics, computer vision, speech
With Data Science:
databases, data mining, statistics, visualization
What does Machine Learning do?
Automation scales by codifying processes and automating them.
Need:
Interconnected components
Compatible components
Early examples:
Codify Through Mathematical Functions
How does machine learning work?
Jumper (jersey/sweater) purchase with logistic regression
\[ \text{odds} = \frac{p(\text{bought})}{p(\text{not bought})} \]
\[ \log \text{odds} = \beta_0 + \beta_1 \text{age} + \beta_2 \text{latitude}.\]
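The log odds can be inverted to give the purchase probability directly. As a short worked step, write \(z\) for the linear predictor (a symbol introduced here only for brevity) and use \(p(\text{not bought}) = 1 - p(\text{bought})\):
\[ z = \beta_0 + \beta_1 \text{age} + \beta_2 \text{latitude}, \qquad \frac{p(\text{bought})}{1 - p(\text{bought})} = \exp(z) \]
\[ p(\text{bought}) = \frac{\exp(z)}{1 + \exp(z)} = \frac{1}{1 + \exp(-z)}.\]
The final expression is the logistic sigmoid applied to \(z\).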
\[ p(\text{bought}) = \sigmoid{\beta_0 + \beta_1 \text{age} + \beta_2 \text{latitude}}.\]
\[ p(\text{bought}) = \sigmoid{\boldsymbol{\beta}^\top \inputVector}.\]
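A minimal Python sketch of the vector form may help make this concrete. The coefficient values and the example customer are hypothetical, chosen only for illustration and not fitted to any data; the input vector is assumed to carry a leading 1 so that the intercept \(\beta_0\) sits inside \(\boldsymbol{\beta}\).

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: maps log odds to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def p_bought(x, beta):
    """Purchase probability sigmoid(beta^T x), with x = [1, age, latitude]."""
    return sigmoid(beta @ x)

# Hypothetical coefficients [beta_0, beta_1, beta_2]; illustrative, not fitted.
beta = np.array([-4.0, 0.02, 0.06])
# Hypothetical customer: leading 1 for the intercept, age 35, latitude 53.4.
x = np.array([1.0, 35.0, 53.4])

p = p_bought(x, beta)
odds = p / (1.0 - p)  # p(bought) / p(not bought)
print(f"p(bought) = {p:.3f}, odds = {odds:.3f}")
```

Taking the logarithm of `odds` recovers the linear predictor `beta @ x`, which is the log odds form of the model.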
\[ \dataScalar = \mappingFunction\left(\inputVector, \boldsymbol{\beta}\right).\]
We call \(\mappingFunction(\cdot)\) the prediction function.
Fit to Data
Use an objective function
\[\errorFunction(\boldsymbol{\beta}, \dataMatrix, \inputMatrix)\]
E.g. least squares \[\errorFunction(\boldsymbol{\beta}, \dataMatrix, \inputMatrix) = \sum_{i=1}^\numData \left(\dataScalar_i - \mappingFunction(\inputVector_i, \boldsymbol{\beta})\right)^2.\]
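As a sketch of how the least squares objective is evaluated, the Python below scores two parameter settings on a tiny invented data set. The linear form of the prediction function `f` is an assumption made for the example; the objective itself only requires that some prediction function is supplied.

```python
import numpy as np

def f(X, beta):
    """Hypothetical linear prediction function: one prediction per row of X."""
    return X @ beta

def least_squares(beta, y, X):
    """Sum of squared differences between observations and predictions."""
    residuals = y - f(X, beta)
    return float(np.sum(residuals ** 2))

# Toy data (invented): a column of ones for the intercept plus one input.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])  # generated by y = 1 + 2x

error_good = least_squares(np.array([1.0, 2.0]), y, X)  # the generating parameters
error_bad = least_squares(np.array([0.0, 0.0]), y, X)   # predicts zero everywhere
print(error_good, error_bad)
```

The parameters that generated the data incur no error, while the all-zero parameters are penalized by the squared size of every observation.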
Two Components
Prediction function, \(\mappingFunction(\cdot)\)
Objective function, \(\errorFunction(\cdot)\)
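The two components meet when the model is fitted: the parameters are adjusted until the objective is small. The sketch below uses plain gradient descent on a sum of squares objective with a linear prediction function. Gradient descent, the learning rate, the iteration count and the synthetic noise-free data are all choices made for this illustration, not something prescribed by the talk.

```python
import numpy as np

def f(X, beta):
    """Prediction function (assumed linear for this sketch)."""
    return X @ beta

def objective(beta, y, X):
    """Least squares objective: sum of squared residuals."""
    return float(np.sum((y - f(X, beta)) ** 2))

def gradient(beta, y, X):
    """Gradient of the least squares objective with respect to beta."""
    return -2.0 * X.T @ (y - f(X, beta))

# Synthetic, noise-free data from known (hypothetical) parameters.
x = np.linspace(-1.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])  # intercept column plus one input
beta_true = np.array([0.5, -1.5])
y = f(X, beta_true)

# Gradient descent: repeatedly step against the gradient of the objective.
beta = np.zeros(2)
learning_rate = 0.02  # small enough for stable steps on this data set
for _ in range(200):
    beta = beta - learning_rate * gradient(beta, y, X)

print(beta, objective(beta, y, X))
```

Swapping in a different prediction function or objective changes `f` and `objective` (and the matching gradient) while the fitting loop stays the same.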
Machine Learning in Supply Chain
Supply chain: Large Automated Decision Making Network
Major Challenge:
We have a mechanistic understanding of supply chain.
Machine learning is a data driven technology.
Deploying Artificial Intelligence
Challenges in deploying AI.
Currently this is in the form of “machine learning systems”
Internet of People
Fog computing: barrier between cloud and device blurring.
Complex feedback between algorithm and implementation
Deploying ML in Real World: Machine Learning Systems Design
Major new challenge for systems designers.
Internet of Intelligence but currently:
Machine Learning Systems Design
Fragility of AI Systems
They are built componentwise from ML capabilities.
Each capability is independently constructed and verified.
Pedestrian detection
Road line detection
Important for verification purposes.
Robust
Need to move beyond pigeonholing tasks.
Need new approaches to both the design of the individual components, and the combination of components within our AI systems.
Rapid Reimplementation
Whole systems are being deployed.
But they change their environment.
They experience evolved adversarial behaviour.
Machine Learning Systems Design
Adversaries
Stuxnet
Mischievous-Adversarial
An Intelligent System
Joint work with M. Milo
Peppercorns
A new name for system failures which aren’t bugs.
Difference between finding a fly in your soup vs a peppercorn in your soup.
Turnaround And Update
There is a massive need for turnaround and update.
A redeploy of the entire system.
This involves changing the way we design and deploy.
Interface between security engineering and machine learning.
Conclusion
The Cell is a Micro Supply Chain.
Analyzing cell data has a lot in common with analyzing supply chain data.
In Biology you are fortunate to have many cells (destructive testing).
In Supply Chain we find it easier to deploy modifications to the system.
Downstream effects are complex and need monitoring.
Life is really good at dealing with evolving environments … our designs not so much.