Exploring Connections Between Intelligence and Thermodynamics
Neil D. Lawrence
Departmental Seminar, Department of Computer Science, University of Manchester
Entropy \[ S(X) = -\sum_X \rho(X) \log \rho(X) \]
In thermodynamics, the entropy is multiplied by Boltzmann's constant, \(k_B\)
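As a quick numeric check (a minimal sketch, not from the talk), the entropy in nats and its thermodynamic rescaling by \(k_B\):

```python
import numpy as np

def entropy(rho):
    """Shannon entropy S(X) = -sum_x rho(x) log rho(x), in nats."""
    rho = np.asarray(rho, dtype=float)
    rho = rho[rho > 0]               # treat 0 log 0 as 0
    return -np.sum(rho * np.log(rho))

# A fair coin has the maximum entropy for two states: log 2 nats.
print(entropy([0.5, 0.5]))           # ~0.6931
# Thermodynamic entropy multiplies by Boltzmann's constant k_B.
k_B = 1.380649e-23                   # J/K
print(k_B * entropy([0.5, 0.5]))     # ~9.57e-24 J/K
```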
Where \[ E_\rho\left[T(Z)\right] = \nabla_{\boldsymbol{\theta}}A(\boldsymbol{\theta}) \] because \(A(\boldsymbol{\theta})\) is the log-partition function, which
operates as a cumulant generating function for \(\rho(Z)\).
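This identity is easy to verify numerically. A sketch (my example, not from the talk) for a Bernoulli variable in natural parameters, where the first derivative of \(A\) gives the mean of \(T(Z)\) and the second gives the variance:

```python
import numpy as np

# Bernoulli in natural form: rho(z) = exp(theta * z - A(theta)),
# with sufficient statistic T(z) = z and A(theta) = log(1 + e^theta).
A = lambda t: np.log1p(np.exp(t))
theta, eps = 0.7, 1e-4

# First derivative of A recovers the mean E_rho[T(Z)] ...
dA = (A(theta + eps) - A(theta - eps)) / (2 * eps)
p = 1.0 / (1.0 + np.exp(-theta))           # E[Z] for a Bernoulli
print(dA, p)                               # both ~0.6682

# ... and the second derivative recovers the variance (second cumulant).
d2A = (A(theta + eps) - 2 * A(theta) + A(theta - eps)) / eps**2
print(d2A, p * (1 - p))                    # both ~0.2217
```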
Joint entropy can be decomposed \[ S(Z) = S(X,M) = S(X|M) + S(M) = S(X) - I(X;M) + S(M) \]
Mutual information \(I(X;M)\) connects information and energy
Measurement changes system entropy by \(-I(X;M)\)
Increases available energy
Difference in available energy: \[ \Delta A = A(X) - A(X|M) = I(X;M) \]
Can recover \(k_B T \cdot I(X;M)\) in work from the system
\[ I(X;M) = \sum_{x,m} \rho(x,m) \log \frac{\rho(x,m)}{\rho(x)\rho(m)}, \]
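A small numeric sketch of this bound (my construction, assuming a Szilard-style binary measurement; not from the talk), computing \(I(X;M)\) from a joint table and the corresponding work bound \(k_B T \cdot I(X;M)\):

```python
import numpy as np

def mutual_information(joint):
    """I(X;M) in nats from a joint distribution table rho(x, m)."""
    px = joint.sum(axis=1, keepdims=True)
    pm = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return np.sum(joint[mask] * np.log(joint[mask] / (px * pm)[mask]))

# Szilard-style setup: X is the particle's side of the box, M a noisy
# measurement that reports the true side with probability 0.9.
joint = np.array([[0.45, 0.05],
                  [0.05, 0.45]])
I = mutual_information(joint)
k_B, T = 1.380649e-23, 300.0
print(I)              # ~0.368 nats (a perfect measurement would give log 2)
print(k_B * T * I)    # upper bound on extractable work, in joules
```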
Unlike the animal game (which reduces entropy), Jaynes’ World maximizes entropy
The system evolves by ascending the gradient of the entropy \(S(Z)\)
Animal game: max uncertainty → min uncertainty
Jaynes’ World: min uncertainty → max uncertainty
Thought experiment: looking backward from any point
Game appears to come from minimal entropy configuration (“origin”)
Game appears to move toward maximal entropy configuration (“end”)
\[ \rho(Z) = h(Z) \exp(\boldsymbol{\theta}^\top T(Z) - A(\boldsymbol{\theta})), \] where \(h(Z)\) is the base measure, \(T(Z)\) are sufficient statistics, \(A(\boldsymbol{\theta})\) is the log-partition function, and \(\boldsymbol{\theta}\) are the natural parameters of the distribution.
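As a standard concrete instance (not spelled out on the slide), the Bernoulli distribution takes this form:
\[
\rho(z) = p^z (1-p)^{1-z} = \exp\!\big(\theta z - A(\theta)\big), \qquad \theta = \log\frac{p}{1-p}, \quad A(\theta) = \log(1 + e^{\theta}),
\]
with \(h(z) = 1\) and \(T(z) = z\). This is the parameterisation behind the gradient updates further below.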
\(X\) divided into past/present \(X_0\) and future \(X_1\)
Conditional mutual information: \[ I(X_0; X_1 | M) = \sum_{x_0,x_1,m} p(x_0,x_1,m) \log \frac{p(x_0,x_1|m)}{p(x_0|m)p(x_1|m)} \]
Measures dependency between past and future given memory state
Perfect Markovianity: \(I(X_0; X_1 | M) = 0\)
Memory variables capture all dependencies between past and future
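A sketch checking this numerically (my construction, not from the talk): build a joint table for a Markov chain \(X_0 \to M \to X_1\) and confirm the conditional mutual information vanishes.

```python
import numpy as np

def conditional_mi(p):
    """I(X0; X1 | M) in nats from a table p[x0, x1, m]."""
    pm = p.sum(axis=(0, 1))        # p(m)
    p01_m = p / pm                 # p(x0, x1 | m)
    p0_m = p.sum(axis=1) / pm      # p(x0 | m)
    p1_m = p.sum(axis=0) / pm      # p(x1 | m)
    cmi = 0.0
    for x0 in range(p.shape[0]):
        for x1 in range(p.shape[1]):
            for m in range(p.shape[2]):
                if p[x0, x1, m] > 0:
                    cmi += p[x0, x1, m] * np.log(
                        p01_m[x0, x1, m] / (p0_m[x0, m] * p1_m[x1, m]))
    return cmi

# Markov chain X0 -> M -> X1: future depends on the past only via memory.
p_x0 = np.array([0.5, 0.5])
p_m_given_x0 = np.array([[0.9, 0.1], [0.1, 0.9]])
p_x1_given_m = np.array([[0.8, 0.2], [0.2, 0.8]])
p = np.einsum('i,ik,kj->ijk', p_x0, p_m_given_x0, p_x1_given_m)
print(conditional_mi(p))   # ~0: the memory captures all the dependence
```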
Tension between Markovianity and minimal entropy creates uncertainty principle
Steepest-ascent update: \[ \Delta \theta_{\text{steepest}} = \eta \frac{\text{d}S}{\text{d}\theta} = \eta\, p(1-p)\big(\log(1-p) - \log p\big) \]
Fisher information: \[ G(\theta) = p(1-p) \]
Natural-gradient update, premultiplying by \(G(\theta)^{-1}\): \[ \Delta \theta_{\text{natural}} = \eta\, G(\theta)^{-1} \frac{\text{d}S}{\text{d}\theta} = \eta\big(\log(1-p) - \log p\big) \]
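A small numeric comparison of the two updates (my sketch, directly implementing the formulas above): near a deterministic state the steepest-ascent step vanishes with \(p(1-p)\), while the natural-gradient step does not.

```python
import numpy as np

def entropy_ascent_steps(theta, eta=0.1):
    """One steepest-ascent and one natural-gradient step on S(theta)."""
    p = 1.0 / (1.0 + np.exp(-theta))                  # p = sigma(theta)
    dS = p * (1 - p) * (np.log(1 - p) - np.log(p))    # dS/dtheta
    G = p * (1 - p)                                   # Fisher information
    return eta * dS, eta * dS / G

theta = 3.0    # near-deterministic, low-entropy start
for _ in range(3):
    steep, natural = entropy_ascent_steps(theta)
    print(f"theta={theta:+.2f}  steepest={steep:+.5f}  natural={natural:+.5f}")
    theta += natural   # natural gradient drives theta toward 0 (max entropy)
```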
Converging perspectives on intelligence:
Unified core: Intelligence as optimal information processing
Implications: