
# Towards Machine Learning Systems Design

Lessons from Computational Biology

Mathematical Genomics Away Day

### What is Machine Learning?

$\text{data} + \text{model} \xrightarrow{\text{compute}} \text{prediction}$

• data : observations, could be actively or passively acquired (meta-data).
• model : assumptions, based on previous experience (other data! transfer learning etc), or beliefs about the regularities of the universe. Inductive bias.
• prediction : an action to be taken or a categorization or a quality score.

### What is Machine Learning?

$\text{data} + \text{model} \xrightarrow{\text{compute}} \text{prediction}$

• To combine data with a model we need:
• a prediction function $f(\cdot)$, which encodes our beliefs about the regularities of the universe
• an objective function $E(\cdot)$, which defines the cost of misprediction.

### Machine Learning

• Driver of two different domains:
1. Data Science: arises from the fact that we now capture data by happenstance.
2. Artificial Intelligence: emulation of human behaviour.
• Connection: Internet of Things

### Machine Learning

• Driver of two different domains:
1. Data Science: arises from the fact that we now capture data by happenstance.
2. Artificial Intelligence: emulation of human behaviour.
• Connection: Internet of People
Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data (1981/1/28)

### What does Machine Learning do?

• ML Automates through Data
• Strongly related to statistics.
• Field underpins revolution in data science and AI
• With AI:
• logic, robotics, computer vision, speech
• With Data Science:
• databases, data mining, statistics, visualization

### What does Machine Learning do?

• Automation scales by codifying processes and automating them.
• Need:
• Interconnected components
• Compatible components
• Early examples:
• cf. Colt 45, Ford Model T

### Codify Through Mathematical Functions

• How does machine learning work?
• Jumper (jersey/sweater) purchase with logistic regression

$\text{odds} = \frac{p(\text{bought})}{p(\text{not bought})}$

$\log \text{odds} = \beta_0 + \beta_1 \text{age} + \beta_2 \text{latitude}.$

### Codify Through Mathematical Functions

• How does machine learning work?
• Jumper (jersey/sweater) purchase with logistic regression

$p(\text{bought}) = \sigma\left(\beta_0 + \beta_1 \text{age} + \beta_2 \text{latitude}\right).$
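The step from the log-odds form to the probability form can be checked numerically. The sketch below is illustrative only: the coefficient values and the customer's age and latitude are hypothetical, chosen just to show that applying the sigmoid and then taking the log of the odds recovers the linear term.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: maps a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def p_bought(age, latitude, beta0, beta1, beta2):
    """Probability of purchase under the logistic regression on the slide."""
    return sigmoid(beta0 + beta1 * age + beta2 * latitude)


# Hypothetical coefficients and customer, not taken from any fitted model.
p = p_bought(age=40.0, latitude=53.0, beta0=-6.0, beta1=0.02, beta2=0.1)

# The log of the odds p / (1 - p) should recover the linear term.
recovered_log_odds = math.log(p / (1.0 - p))
linear_term = -6.0 + 0.02 * 40.0 + 0.1 * 53.0
```

Here `recovered_log_odds` and `linear_term` agree up to floating point error, which is the sense in which the two forms of the model are the same statement.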

### Codify Through Mathematical Functions

• How does machine learning work?
• Jumper (jersey/sweater) purchase with logistic regression

$p(\text{bought}) = \sigma\left(\boldsymbol{\beta}^\top \mathbf{x}\right).$

### Codify Through Mathematical Functions

• How does machine learning work?
• Jumper (jersey/sweater) purchase with logistic regression

$y = f\left(\mathbf{x}, \boldsymbol{\beta}\right).$

We call $f(\cdot)$ the prediction function.

### Fit to Data

• Use an objective function

$E(\boldsymbol{\beta}, \mathbf{Y}, \mathbf{X})$

• E.g. least squares $E(\boldsymbol{\beta}, \mathbf{Y}, \mathbf{X}) = \sum_{i=1}^{n} \left(y_i - f(\mathbf{x}_i, \boldsymbol{\beta})\right)^2.$
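As a minimal sketch of the least squares objective, the code below sums squared differences between observations and predictions. It assumes the logistic prediction function from the jumper example, with a leading 1 in each input vector standing in for the intercept; the three data rows and both parameter vectors are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def prediction_function(x, beta):
    """Sigmoid of the inner product of beta and x; x[0] is assumed to be 1 (intercept)."""
    return sigmoid(sum(b * x_j for b, x_j in zip(beta, x)))


def least_squares_error(beta, y, X, f=prediction_function):
    """Sum over data points of (y_i - f(x_i, beta))^2."""
    return sum((y_i - f(x_i, beta)) ** 2 for y_i, x_i in zip(y, X))


# Hypothetical toy data: rows are [1, age, latitude]; y is 1 if bought, else 0.
X = [[1.0, 25.0, 35.0], [1.0, 40.0, 53.0], [1.0, 60.0, 60.0]]
y = [0.0, 1.0, 1.0]

error_zero = least_squares_error([0.0, 0.0, 0.0], y, X)    # every prediction is 0.5
error_guess = least_squares_error([-6.0, 0.02, 0.1], y, X)  # a hand-picked guess
```

With all parameters at zero every prediction is 0.5, so the error is 3 × 0.25. The hand-picked guess gives a lower error, which is all the objective is for: ranking candidate parameters by how badly they mispredict.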

### Two Components

• Prediction function, $f(\cdot)$
• Objective function, $E(\cdot)$
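One way to see how the two components fit together is a small fitting loop: the objective scores the parameters, and an optimiser adjusts them to reduce that score. The sketch below makes several assumptions that are not in the slides: a linear prediction function (to keep the gradient simple), plain gradient descent as the optimiser, and a hand-made data set generated from y = 1 + 2x. The step size and iteration count are illustrative choices.

```python
def prediction_function(x, beta):
    """Linear prediction: inner product of beta and x; x[0] is assumed to be 1."""
    return sum(b * x_j for b, x_j in zip(beta, x))


def objective_function(beta, y, X):
    """Least squares error between observations y and predictions."""
    return sum((y_i - prediction_function(x_i, beta)) ** 2
               for y_i, x_i in zip(y, X))


def gradient(beta, y, X):
    """Gradient of the least squares objective with respect to beta."""
    grad = [0.0] * len(beta)
    for y_i, x_i in zip(y, X):
        residual = y_i - prediction_function(x_i, beta)
        for j, x_ij in enumerate(x_i):
            grad[j] -= 2.0 * residual * x_ij
    return grad


def fit(y, X, learning_rate=0.02, num_steps=2000):
    """Plain gradient descent starting from beta = 0."""
    beta = [0.0] * len(X[0])
    for _ in range(num_steps):
        grad = gradient(beta, y, X)
        beta = [b - learning_rate * g for b, g in zip(beta, grad)]
    return beta


# Hypothetical noise-free data generated from y = 1 + 2 * x.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]

beta_hat = fit(y, X)
final_error = objective_function(beta_hat, y, X)
```

On this toy problem `beta_hat` approaches the generating values of 1 and 2 and `final_error` approaches zero. Swapping in a different prediction function or objective changes the model without changing the overall shape of the loop.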

### Machine Learning in Supply Chain

• Supply chain: Large Automated Decision Making Network
• Major Challenge:
• We have a mechanistic understanding of supply chain.
• Machine learning is a data driven technology.

### Deploying Artificial Intelligence

• Challenges in deploying AI.
• Currently this is in the form of “machine learning systems”

### Internet of People

• Fog computing: barrier between cloud and device blurring.
• Computing on the Edge
• Complex feedback between algorithm and implementation

### Deploying ML in Real World: Machine Learning Systems Design

• Major new challenge for systems designers.
• Internet of Intelligence but currently:
• AI systems are fragile

### Fragility of AI Systems

• They are built componentwise from ML capabilities.
• Each capability is independently constructed and verified.
• e.g. pedestrian detection
• Important for verification purposes.

### Robust

• Need to move beyond pigeonholing tasks.
• Need new approaches to both the design of the individual components, and the combination of components within our AI systems.

### Rapid Reimplementation

• Whole systems are being deployed.
• But they change their environment.
• They experience evolved adversarial behaviour.

• Stuxnet

### An Intelligent System

Joint work with M. Milo

### Peppercorns

• A new name for system failures which aren’t bugs.
• Difference between finding a fly in your soup vs a peppercorn in your soup.

### Turnaround And Update

• There is a massive need for turnaround and update.
• This can mean a redeploy of the entire system.
• This involves changing the way we design and deploy.
• Interface between security engineering and machine learning.

### Conclusion

• The Cell is a Micro Supply Chain.
• Analyzing cell data has a lot in common with analyzing supply chain data.
• In Biology you are fortunate to have many cells (destructive testing).
• In Supply Chain we find it easier to deploy modifications to the system.
• Downstream effects are complex and need monitoring.
• Life is really good at dealing with evolving environments … our designs not so much.