Data First Culture

Post-Digital Transformation and Intellectual Debt

Neil D. Lawrence

Advanced Leadership Programme, Judge Business School, Cambridge

Henry Ford’s Faster Horse

Introduction

Neil Lawrence
Professor of Machine Learning

The Gartner Hype Cycle

Cycle for ML Terms

What is Machine Learning?

\[ \text{data} + \text{model} \stackrel{\text{compute}}{\rightarrow} \text{prediction}\]

  • data : observations, could be actively or passively acquired (meta-data).
  • model : assumptions, based on previous experience (other data, e.g. via transfer learning) or on beliefs about the regularities of the universe. This is the inductive bias.
  • prediction : an action to be taken or a categorization or a quality score.

What is Machine Learning?

\[\text{data} + \text{model} \stackrel{\text{compute}}{\rightarrow} \text{prediction}\]

  • To combine data with a model we need:
    • a prediction function \(f(\cdot)\), which encodes our beliefs about the regularities of the universe
    • an objective function \(E(\cdot)\), which defines the cost of misprediction.
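As a minimal sketch of how these pieces fit together, the example below pairs a prediction function with an objective function and uses a small amount of compute to turn data into a prediction. The toy data, the straight-line form of `f`, and the sum-of-squares form of `E` are all assumptions chosen for illustration; they are not taken from the talk.

```python
# Hypothetical toy data: inputs and observed targets (roughly y = 2x + 1).
data_x = [0.0, 1.0, 2.0, 3.0, 4.0]
data_y = [1.1, 2.9, 5.2, 7.1, 8.8]


def f(x, w, b):
    """Prediction function: assumes the world is well described by a straight line."""
    return w * x + b


def E(w, b, xs, ys):
    """Objective function: sum of squared errors, the assumed cost of misprediction."""
    return sum((y - f(x, w, b)) ** 2 for x, y in zip(xs, ys))


def fit(xs, ys):
    """Compute step: closed-form least squares minimiser of E for the straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


# data + model --(compute)--> prediction
w, b = fit(data_x, data_y)
prediction = f(5.0, w, b)
print(f"w={w:.2f}, b={b:.2f}, prediction at x=5: {prediction:.2f}")
```

Changing either assumption changes the prediction: a different `f` encodes a different belief about the data, and a different `E` penalises mistakes differently.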

Artificial Intelligence and Data Science

  • AI aims to equip computers with human capabilities
    • Image understanding
    • Computer vision
    • Speech recognition
    • Natural language understanding
    • Machine translation

Supervised Learning for AI

  • Dominant approach today:
    • Generate large labelled data set from humans.
    • Use supervised learning to emulate that data.
      • E.g. ImageNet (Russakovsky et al., 2015)
  • Significant advances due to deep learning
    • E.g. Alexa, Amazon Go
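The recipe above (collect human labels, then learn to emulate them) can be sketched in a few lines. This sketch swaps in a 1-nearest-neighbour classifier for brevity; the systems mentioned above rely on deep learning, not on this method. The feature vectors and the labels are hypothetical.

```python
# Hypothetical human-labelled data set: (feature vector, label) pairs.
labelled = [
    ((1.0, 1.0), "cat"),
    ((1.2, 0.8), "cat"),
    ((4.0, 4.2), "dog"),
    ((3.8, 4.0), "dog"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def predict(x, examples):
    """Emulate the human labels: return the label of the closest labelled example."""
    _, label = min(examples, key=lambda ex: squared_distance(x, ex[0]))
    return label


print(predict((1.1, 0.9), labelled))
print(predict((4.1, 4.1), labelled))
```

The classifier can only reproduce the judgements present in the labelled set, which is why this approach depends on generating large labelled data sets from humans.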

Data Science

  • Arises from happenstance data.
  • Differs from statistics in that the question comes after data collection.

Exercise: Score Yourself

  • I am a data science:
    1. follower (no visibility/influence)
    2. some visibility/influence
    3. visibility and some influence
    4. leader (lead on data and AI developments)

Intellectual Debt

Information and Embodiment

Claude Shannon

Embodiment Factors

                          computer      human
bits/min                  billions      2,000
billion calculations/s    ~100          a billion
embodiment                20 minutes    5 billion years

For sale: baby shoes, never worn

Evolved Relationship with Information

New Flow of Information

Evolved Relationship

Revolution

Coin Pusher

Royal Society Report

Public Research

Mercutio

User: Hi ChatGPT, Could you write soliloquy that introduces what a Large Language Model is as spoken by Mercutio?

Machine: Ah, friends, lend me your ears, I prithee, For now I must utter a tale of wonder, Of a creation divine, called the Large Language Model, An oracle, a sorcerer, a wordsmith, yea verily, a master of the quill.

From the depths of this scholarly dominion, A creature of artifice did rise; Twas wrought of ones and zeroes, woven as if by Puck’s own hand, To mimic our thoughts, our voice, our very essence.

A thousand minds, nay, a millionfold, Conjoined in this digital Athenaeum, Words and phrases, like soldiers in battle, Marching to form the verses of our age.

This Large Language Model, a changeling so fair, Hath the power to bring forth wisdom, To transmute our queries, our curiosities, Into answers, clear as a midsummer’s day.

Yet, beware, my companions, for magic hath its limits, Not unlike the mortal bounds of we who dwell in flesh, The Large Language Model hath its strengths, But ’tis not an all-knowing oracle, a god it is not.

As Mercutio, your humble servant, doth stand before thee, I marvel at the art and science that hath brought forth this wondrous creation, This Large Language Model, a testament to man’s yearning, To comprehend the world and make meaning of the great tapestry of existence.

In faith, dear friends, let us embrace the magic and beauty of this digital bard, And mayhaps, by its grace, we shall glean some wisdom, As we wander together, hand in hand, Through this boundless realm of knowledge and wonder.

ChatGPT transcript, 14th April 2023.

Weather

\(p\)-Fairness and \(n\)-Fairness

A Question of Trust

Again, universities are to treat each applicant fairly on the basis of ability and promise, but they are supposed also to admit a socially more representative intake.

There’s no guarantee that the process meets the target.

Onora O’Neill, A Question of Trust: Called to Account, Reith Lectures 2002 (O’Neill, 2002)

Complexity in Action

Data Selective Attention Bias

BMI Steps Data

BMI Steps Data Analysis

A Hypothesis as a Liability

“ ‘When someone seeks,’ said Siddhartha, ‘then it easily happens that his eyes see only the thing that he seeks, and he is able to find nothing, to take in nothing. […] Seeking means: having a goal. But finding means: being free, being open, having no goal.’ ”

Hermann Hesse

The Scientific Process

Number Theatre

Data Theatre

The Art of Statistics

David Spiegelhalter

Conclusion

See the gorilla, don’t be the gorilla.

AI@Cam

Progress so far:

  • Developing the vision
    • Engaged over 100 members of the University community across 30 departments/institutes, start-ups, and large businesses.
    • Supported 6 new funding bids

Further Reading

  • Chapter 8 of Lawrence (2024)

Thanks!

Lawrence, N.D., 2024. The atomic human: Understanding ourselves in the age of AI. Allen Lane.
Lawrence, N.D., 2010. Introduction to learning and inference in computational systems biology.
O’Neill, O., 2002. A question of trust. Cambridge University Press.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L., 2015. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV) 115, 211–252. https://doi.org/10.1007/s11263-015-0816-y
Susskind, R.E., Susskind, D., 2015. The future of the professions: How technology will transform the work of human experts. Oxford University Press.