AI for Science

Paradigms, tacit knowledge, and scientific agency in the age of large models

Neil D. Lawrence

Bellairs Workshop on AI for Science

Questions

  • Where do scientific paradigms live?
  • What role does human understanding and agency play?
  • What’s an emerging playbook for AI-for-science?

The Structure of Scientific Revolutions

  • Paradigms: shift in science or in underpinning information infrastructure?
  • Understanding: what do we still demand (and from whom)?
  • Omni-models vs specialists: what’s gained/lost?
  • We already struggle to see the whole of science across fields and institutions.
  • LLM interfaces can lower the cost of moving across disciplines — but increase the risk of “plausible” error.
  • So the skill that matters is scepticism: what would convince us we’re wrong, and where is the verification boundary?

Specialist Models
  • A model family \(f_\theta\) that maps inputs to outputs.
  • A learning objective that scores how well \(f_\theta\) matches data (and priors/constraints).
  • Optimisation + scale: we fit \(\theta\) with large compute and large datasets.
  • AlphaFold
  • GraphCast/Aardvark
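
The recipe above — a model family, an objective, and optimisation at scale — can be sketched minimally. This is an illustrative toy (linear \(f_\theta\), least-squares objective, gradient descent), not any particular system:

```python
# Toy instance of the recipe: f_theta linear, least-squares objective,
# "scale" reduced to a small gradient-descent loop on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                      # inputs
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta + 0.1 * rng.normal(size=200)    # outputs

theta = np.zeros(3)                                # parameters of f_theta
lr = 0.1
for _ in range(200):                               # optimisation
    grad = 2 * X.T @ (X @ theta - y) / len(y)      # gradient of mean squared error
    theta -= lr * grad

loss = float(np.mean((X @ theta - y) ** 2))        # how well f_theta matches data
```

The specialist systems listed differ enormously in architecture and data, but all fit this three-part shape.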

Generalist Models

Generalist Examples

  • ImageNet
  • BERT
  • Chat interfaces
  • Polymathic
  • If the training data are rigorous equations (PDE/ODE solvers), we have a clearer sense of “ground truth.”
  • These models may not map onto human intuitions — but they can still be scientifically useful.
  • That makes physics a promising sandbox for a science of AI: what is learned, what generalises, and how do we verify?
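
When the training data come from a solver, ground truth is explicit and verification is cheap. A minimal sketch (the damped harmonic oscillator and Euler integrator are illustrative choices, not from any particular system):

```python
# Training data from a "rigorous equation": Euler steps of a damped
# harmonic oscillator x'' = -omega^2 x - damping * x', yielding
# (state, next_state) pairs a surrogate model could be trained on --
# and verified against, e.g. via the energy-decay invariant.
import numpy as np

def step(state, dt=0.01, omega=1.0, damping=0.1):
    """One explicit Euler step of the oscillator dynamics."""
    x, v = state
    return np.array([x + dt * v, v + dt * (-omega**2 * x - damping * v)])

state = np.array([1.0, 0.0])
pairs = []
for _ in range(1000):
    nxt = step(state)
    pairs.append((state, nxt))
    state = nxt

def energy(s):
    """Total energy; under damping it should decay -- a checkable invariant."""
    return 0.5 * (s[0] ** 2 + s[1] ** 2)
```

Any learned surrogate can be checked against the solver and against invariants like energy decay — the verification boundary is unusually crisp.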

The Unreasonable Effectiveness of Orchestration

  • LLMs emulate human behaviour.
  • Unsurprising that they work better “in collaboration”.
  • They also benefit from rapid access to the information infrastructure.

Agent Examples

  • Denario
  • Claude Code
  • Codex

Conjectures and Refutations

What are we Delegating?

  • Science as technology
  • Science as understanding

Judgement Examples

  • COVID-19 Epidemiological Modelling
  • Does your model account for Facemasks?
  • Does your model account for Hotel Closures?

We might not have data, but we can do some arithmetic
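
A back-of-the-envelope sketch of what that arithmetic might look like. Every number below is a hypothetical placeholder, not an estimate from any study; the multiplicative adjustment is the simplest possible way a model could “account for facemasks”:

```python
# "We might not have data, but we can do some arithmetic."
# All values are hypothetical placeholders for illustration only.
r0 = 3.0             # assumed reproduction number without interventions
mask_efficacy = 0.3  # assumed per-contact transmission reduction
mask_coverage = 0.8  # assumed fraction of contacts where masks are worn

# Simplest multiplicative adjustment: 3.0 * (1 - 0.24) = 2.28
r_eff = r0 * (1 - mask_efficacy * mask_coverage)
```

The point is not the numbers but the judgement: someone must decide whether an effect like this belongs in the model at all.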

Intellectual Debt

Technical Debt

  • Compare with technical debt.
  • Highlighted by Sculley et al. (2015).

Separation of Concerns

  • Decompose your complex problem/task into parts.
  • Make each part manageable (e.g. by a small team).
  • Recompose the parts to solve the total problem.
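
The decompose/recompose pattern, sketched in code (a toy word-count pipeline; the function names are hypothetical):

```python
# Separation of concerns: each part is small enough for one person or
# team to own; a generic compose() recomposes them into the total task.

def clean(text: str) -> str:
    """One concern: normalise raw input."""
    return text.strip().lower()

def tokenise(text: str) -> list[str]:
    """Another concern: split into tokens."""
    return text.split()

def count(tokens: list[str]) -> dict[str, int]:
    """A third concern: aggregate token frequencies."""
    freq: dict[str, int] = {}
    for t in tokens:
        freq[t] = freq.get(t, 0) + 1
    return freq

def compose(*steps):
    """Recompose the parts into the whole pipeline."""
    def pipeline(x):
        for step in steps:
            x = step(x)
        return x
    return pipeline

word_counts = compose(clean, tokenise, count)
result = word_counts("  The cat sat on the mat ")
```

Each part is testable in isolation — which is exactly why, as the next slide notes, the whole system can still outrun any one person's understanding.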

Addresses Complex Challenge

  • Highly successful approach to complex tasks.
  • Tuned to the human bandwidth limitation.
  • But the whole system is still hard to understand.

Intellectual Debt

  • Technical debt is the inability to maintain your complex software system.
  • Intellectual debt is the inability to explain your software system.

Agentic Debt

  • Agentic AI could pay down technical and intellectual debt.
  • But it can create agentic debt: delegation without authority or authorship.

Agentic Debt

  • Delegation of workflows without crisp boundaries.
  • Agentic debt is about unsafe or illegible delegation.

Judgment Layer

  • Agentic orchestration compresses reading/writing/coding into a single interface.
  • Tool use turns text into action: search, code execution, lab automation, simulation pipelines.
  • This shifts the bottleneck to verification and scientific judgement.
  • Operational understanding: can I use it safely and know when it fails?
  • Mechanistic understanding: do I have an interpretable causal/mechanistic story?
  • Paradigm understanding: can the community reproduce, contest, and extend it?
  • Social understanding: are the ideas understood in the wider public and other fields?
  • The “judgement layer” in organisations is often tacit: norms, exceptions, escalation paths, and context that rarely makes it into documentation.
  • It lives in handoffs and approvals: what gets challenged, what gets waived, and what triggers a halt.
  • Ceding this tacit knowledge without making it explicit is how we accumulate agentic debt.
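
One way to avoid ceding the judgement layer silently is to write a slice of it down as an explicit escalation rule. A minimal sketch — the categories, threshold set, and sign-off convention are all hypothetical:

```python
# Making tacit judgement explicit: some delegated actions halt until a
# named human signs off, and every decision leaves a traceable record.
from dataclasses import dataclass, field

@dataclass
class Action:
    description: str
    category: str  # e.g. "analysis", "lab_automation" (hypothetical labels)

@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

# The escalation rule, written down rather than left tacit.
REQUIRES_SIGNOFF = {"lab_automation", "publication"}

def dispatch(action: Action, signed_off_by, log: AuditLog) -> bool:
    """Execute only if the escalation rule is satisfied; log either way."""
    if action.category in REQUIRES_SIGNOFF and signed_off_by is None:
        log.entries.append(("halted", action.description, None))
        return False
    log.entries.append(("executed", action.description, signed_off_by))
    return True

log = AuditLog()
dispatch(Action("summarise literature", "analysis"), None, log)      # proceeds
dispatch(Action("start wet-lab run", "lab_automation"), None, log)   # halts
dispatch(Action("start wet-lab run", "lab_automation"), "pi@lab", log)
```

The code is trivial; the hard work is deciding what belongs in `REQUIRES_SIGNOFF` — that decision is the tacit knowledge being made explicit.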

The Information Infrastructure

Communication Bandwidth

  • Human communication: walking pace (2000 bits/minute)
  • Machine communication: light speed (billions of bits/second)
  • Our sharing walks, machine sharing …
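
The gap above as arithmetic. The human figure is from the slide; the machine figure of 1 Gbit/s is an illustrative assumption (real links vary widely):

```python
# Human vs machine communication bandwidth, as a ratio.
human_bits_per_minute = 2_000          # from the slide: "walking pace"
machine_bits_per_second = 1e9          # illustrative assumption: 1 Gbit/s

human_bits_per_second = human_bits_per_minute / 60
ratio = machine_bits_per_second / human_bits_per_second  # ~30 million
```

Even with conservative machine numbers, the ratio is tens of millions — the asymmetry, not the exact figure, is the point.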

New Flow of Information

Evolved Relationship

HAM

In Mathematical Context

  • Databases / tables (OEIS): store patterns and prior results.
  • Solvers (CAS, SAT/SMT): mechanised search and case analysis.
  • Modern triad: proof assistants, machine learning, large language models.
  • Digitally verifiable: proof assistants (e.g. Lean) and machine-checkable artefacts.
  • Operationally reliable: code, simulators, pipelines — repeatable, but not always interpretable.
  • Tacit + contextual: protocols, judgement, and field knowledge (bio/geo/social science).
  • LLMs can compress and transmit tacit knowledge — but they push the bottleneck to verification boundaries.
  • We can hold humans to account for judgement (even when it’s wrong).
  • We can’t hold models to account; they don’t bear responsibility or liability.
  • So we must design systems where a named actor signs off, with traceable evidence.
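
The “digitally verifiable” tier is the easiest to make concrete: a proof checked by a proof assistant's kernel is a machine-checkable artefact, trustworthy independent of who (or what) produced it. A toy Lean 4 example:

```lean
-- Machine-checkable artefacts: the Lean kernel verifies these proofs,
-- regardless of whether a human or a model wrote them.
theorem two_add_two : 2 + 2 = 4 := rfl

theorem add_comm' (a b : Nat) : a + b = b + a := Nat.add_comm a b
```

The other tiers — operationally reliable code and tacit contextual knowledge — have no such kernel, which is why the verification boundary and human sign-off matter there.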

Developing Science

  • What do we want trainees to be able to explain, verify, and challenge?
  • Where is the verification boundary in AI-for-science systems?
  • Which artefacts are the new paradigm stores?

Thanks!

  • company: Trent AI
  • book: The Atomic Human
  • twitter: @lawrennd
  • The Atomic Human pages: Kuhn, Thomas: The Structure of Scientific Revolutions 295–299; Popper, Karl: Conjectures and Refutations 327–328; intellectual debt 84–85, 349, 365, 376; separation of concerns 84–85, 103, 109, 199, 284, 371; topography, information 34–39, 43–48, 57, 62, 104, 115–116, 127, 140, 192, 196, 199, 291, 334, 354–355; anthropomorphization (‘anthrox’) 30–31, 90–91, 93–94, 100, 132, 148, 153, 163, 216–217, 239, 276, 326, 342; human-analogue machine (HAMs) 343–347, 359, 365–368.
  • newspaper: Guardian Profile Page
  • blog: http://inverseprobability.com

References

Kuhn, T.S., 1962. The structure of scientific revolutions. University of Chicago Press, Chicago, IL.
Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips, T., Ebner, D., Chaudhary, V., Young, M., Crespo, J.-F., Dennison, D., 2015. Hidden technical debt in machine learning systems, in: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (Eds.), Advances in Neural Information Processing Systems 28. Curran Associates, Inc., pp. 2503–2511.