Information Topography

A Journey Through Jaynes’ World

Neil D. Lawrence

DALI Sorrento Meeting

Jaynes’ World

  • System governed by probability and entropy
  • Framework for observer-free dynamics and entropy-based emergence
  • Aim: a better understanding of the notion of an “information topography”

Definitions and Global Constraints

System Structure

  • Full set of variables: \(Z = \{Z_1, Z_2, \dots, Z_n\}\)
  • Partition at time \(t\):
    • Active variables \(X(t)\): contributing to entropy
    • Latent variables \(M(t)\): information reservoir

Representation via Density Matrix

  • System state: \(\rho(\boldsymbol{\theta}) = \frac{1}{Z(\boldsymbol{\theta})} \exp\left( \sum_i \theta_i H_i \right)\)
  • Entropy: \(S(\boldsymbol{\theta}) = A(\boldsymbol{\theta}) - \boldsymbol{\theta}^\top \nabla A(\boldsymbol{\theta})\)
  • where \(A(\boldsymbol{\theta}) = \log Z(\boldsymbol{\theta})\)
  • Fisher Information: \(G_{ij}(\boldsymbol{\theta}) = \frac{\partial^2 A}{\partial \theta_i \partial \theta_j}\)
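A minimal numerical sketch of these quantities (illustrative, not from the talk): a toy two-level system with Pauli matrices standing in for the generators \(H_i\).

    import numpy as np
    from scipy.linalg import expm

    # Toy generators: Pauli X and Z stand in for the Hermitian observables H_i.
    H = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

    def log_partition(theta):
        # A(theta) = log Z(theta) = log tr exp(sum_i theta_i H_i)
        return np.log(np.trace(expm(sum(t * h for t, h in zip(theta, H)))).real)

    def rho(theta):
        # Exponential-family density matrix rho(theta) = exp(sum_i theta_i H_i) / Z
        unnorm = expm(sum(t * h for t, h in zip(theta, H)))
        return unnorm / np.trace(unnorm).real

    def entropy(theta):
        # S(theta) = A(theta) - theta . grad A(theta), with grad_i A = tr(rho H_i)
        grad_A = np.array([np.trace(rho(theta) @ h).real for h in H])
        return log_partition(theta) - theta @ grad_A

    theta = np.array([0.3, -0.2])
    print(entropy(theta))  # equals -tr(rho log rho) for this state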

Entropy Capacity and Resolution

  • Maximum entropy: \(N\) bits
  • Minimum detectable resolution: \(\varepsilon\)
  • System dynamics show discrete, detectable transitions

Dual Role of Parameters and Variables

  • Each variable \(Z_i\) has:
    • Generator \(H_i\)
    • Natural parameter \(\theta_i\)
  • Active parameters evolve with \(|\dot{\theta}_i| \geq \varepsilon\)

Core Axiom: Entropic Dynamics

  • System evolves by steepest ascent in entropy \[ \frac{d\boldsymbol{\theta}}{dt} = -G(\boldsymbol{\theta}) \boldsymbol{\theta} \]
  • Follows the gradient of entropy in parameter space
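This is just the entropy gradient written out: differentiating \(S = A - \boldsymbol{\theta}^\top \nabla A\) and using \(G = \nabla \nabla^\top A\),

\[ \nabla_{\boldsymbol{\theta}} S = \nabla A - \nabla A - G(\boldsymbol{\theta})\,\boldsymbol{\theta} = -G(\boldsymbol{\theta})\,\boldsymbol{\theta}, \]

so steepest ascent \(\dot{\boldsymbol{\theta}} = \nabla_{\boldsymbol{\theta}} S\) coincides with the stated flow.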

Histogram Game

Two-Bin Histogram Example

  • Simplest example: Two-bin system
  • States represented by probability \(p\) (with \(1-p\) in second bin)

Entropy

  • Entropy \[ S(p) = -p\log p - (1-p)\log(1-p) \]
  • Maximum entropy at \(p = 0.5\)
  • Minimum entropy at \(p = 0\) or \(p = 1\)

Natural Gradients vs Steepest Ascent

\[ \Delta \theta_{\text{steepest}} = \eta \frac{\text{d}S}{\text{d}\theta} = \eta\, p(1-p)\left(\log(1-p) - \log p\right) \]

\[ G(\theta) = p(1-p) \]

\[ \Delta \theta_{\text{natural}} = \eta\, G(\theta)^{-1} \frac{\text{d}S}{\text{d}\theta} = \eta\left(\log(1-p) - \log p\right) \]
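A minimal sketch of the two updates (illustrative; uses the sigmoid link \(p = 1/(1+e^{-\theta})\) and a step size \(\eta\) chosen for demonstration):

    import numpy as np

    def p_of(theta):
        # Sigmoid link between the natural parameter theta and the bin probability p
        return 1.0 / (1.0 + np.exp(-theta))

    def updates(theta, eta=0.5):
        p = p_of(theta)
        dS_dtheta = p * (1 - p) * (np.log(1 - p) - np.log(p))  # dS/dtheta
        G = p * (1 - p)                                        # Fisher information
        return eta * dS_dtheta, eta * dS_dtheta / G            # steepest, natural

    theta = 3.0  # start away from the maximum-entropy point theta = 0 (p = 0.5)
    for _ in range(20):
        theta += updates(theta)[1]  # natural-gradient step
    print(p_of(theta))              # converges to 0.5, the entropy maximum

The natural step rescales the raw gradient by \(G^{-1}\), so it does not vanish as \(p\) approaches 0 or 1, where the steepest-ascent step is crushed by the factor \(p(1-p)\).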

Gradient Ascent in Natural Parameter Space

Gradient Ascent Evolution

Entropy Evolution

Trajectory in Natural Parameter Space

Four Bin Histogram Entropy Game

Constructed Quantities and Lemmas

Variable Partition

\[ X(t) = \left\{ i \,\middle|\, \left| \frac{\text{d}\theta_i}{\text{d}t} \right| \geq \varepsilon \right\}, \quad M(t) = Z \setminus X(t) \]

  • Active variables \(X(t)\): parameters changing at or above the resolution \(\varepsilon\)
  • Latent variables \(M(t)\): the sub-threshold remainder, acting as an information reservoir
  • Partition changes as the system evolves
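A minimal sketch of the thresholding rule, with assumed example velocities:

    import numpy as np

    def partition(theta_dot, eps):
        # Active set X(t): indices whose parameter velocity meets the resolution;
        # latent set M(t): everything else (the information reservoir).
        active = {i for i, v in enumerate(theta_dot) if abs(v) >= eps}
        return active, set(range(len(theta_dot))) - active

    theta_dot = np.array([0.30, 1e-4, -0.07, 2e-5])  # assumed velocities
    print(partition(theta_dot, eps=1e-3))            # ({0, 2}, {1, 3})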

Fisher Information Matrix Partitioning

  • Partition \(G(\boldsymbol{\theta})\) into active/latent blocks
  • \(G_{XX}\): Information geometry of active variables
  • \(G_{MM}\): Structure of latent reservoir
  • \(G_{XM}\): Cross-coupling between domains
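A minimal sketch of the block extraction, assuming a small example \(G\) and the partition from the previous sketch:

    import numpy as np

    def fisher_blocks(G, active, latent):
        # Slice the Fisher information into active/latent/cross blocks
        X, M = sorted(active), sorted(latent)
        return G[np.ix_(X, X)], G[np.ix_(M, M)], G[np.ix_(X, M)]

    G = np.array([[2.0, 0.1, 0.3, 0.0],
                  [0.1, 1.5, 0.0, 0.2],
                  [0.3, 0.0, 1.0, 0.1],
                  [0.0, 0.2, 0.1, 0.8]])
    G_XX, G_MM, G_XM = fisher_blocks(G, active={0, 2}, latent={1, 3})
    print(G_XX)  # geometry of the active variables alone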

Lemma 1: Form of the Minimal Entropy Configuration

  • Minimal entropy state: \(\rho(\boldsymbol{\theta}_o) = \frac{1}{Z(\boldsymbol{\theta}_o)} \exp\left( \sum_i \theta_{oi} H_i \right)\)
  • All parameters sub-threshold: \(|\dot{\theta}_{oi}| < \varepsilon\)
  • Configuration is regular and continuous; changes are detectable only above the resolution scale \(\varepsilon\)

Lemma 2: Symmetry Breaking

  • If \(\theta_k \in M(t)\) and \(|\dot{\theta}_k| \geq \varepsilon\), then \(\theta_k \in X(t + \delta)\)
  • Latent variables can become active when their rate of change exceeds threshold
  • Mechanism for emergence of new active variables

Four-Bin Saddle Point Example

  • Four-bin system creates 3D parameter space
  • Saddle points appear where:
    • Gradient is zero
    • Some directions increase entropy
    • Other directions decrease entropy
  • Information reservoirs form in critically slowed directions
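A minimal sketch of entropy ascent in the four-bin game (assumptions: \(\theta_4\) pinned to zero for identifiability, illustrative starting point and step size; the spread of Fisher eigenvalues shows fast versus critically slowed directions):

    import numpy as np

    def softmax4(theta):
        # Bin probabilities from three free natural parameters (theta_4 = 0)
        z = np.concatenate([theta, [0.0]])
        e = np.exp(z - z.max())
        return e / e.sum()

    def fisher(theta):
        # G = diag(p) - p p^T over the three free coordinates
        p = softmax4(theta)[:3]
        return np.diag(p) - np.outer(p, p)

    theta = np.array([2.0, 1.9, -2.0])  # assumed start: two bins nearly tied
    for _ in range(500):
        theta += 0.1 * (-fisher(theta) @ theta)  # steepest ascent on entropy
    print(softmax4(theta))                       # approaches uniform [0.25]*4
    print(np.linalg.eigvalsh(fisher(theta)))     # fast vs slow eigendirections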

Saddle Point Example

Entropy Evolution

Entropy-Time

  • Entropy-time: \(\tau(t) := S_{X(t)}(t)\)
  • Measures accumulated entropy of active variables

Monotonicity of Entropy-Time

  • \(\tau(t_2) \geq \tau(t_1)\) for all \(t_2 > t_1\)
  • Entropy-time always increases
  • Implies irreversibility of the system

Corollary: Irreversibility

Irreversibility

  • \(\tau(t)\) increases monotonically
  • Prevents time-reversal globally
  • Provides an arrow of time for the system

Conjecture: Frieden-Analogous Extremal Flow

  • When latent-to-active flow is extremal, system exhibits critical slowing
  • System entropy separates into active variables \(I = S[\rho_X]\) and “intrinsic information” \(J = S[\rho_{X|M}]\)
  • Analogous to Frieden (1998) extreme physical information principle \(\delta(I - J) = 0\)

The Information Game

  • Introduced an information game
  • Simple dynamics: gradient ascent on the von Neumann entropy
  • Hypothesised that complicated behaviour emerges from these simple rules

Thanks!

Appendix

Variational Derivation of the Initial Curvature Structure

Variational Derivation

  • Determine constraints on Fisher Information Matrix \(G(\boldsymbol{\theta})\)
  • Follow Jaynes’ approach to solve variational problem
  • Capture structure of system’s minimal entropy state

Uncertainty Principles

  • Hirschman uncertainty principle: bounds the total entropy of a function and its Fourier transform
  • Beckner strengthened the inequality with the optimal constant
  • Białynicki-Birula and Mycielski extended it to information entropy in wave mechanics
  • System respects these limits via the von Neumann entropy

Density Matrix Form

  • \(\rho(\boldsymbol{\theta}) = \frac{1}{Z(\boldsymbol{\theta})} \exp\left( \sum_i \theta_i H_i \right)\)
  • \(Z(\boldsymbol{\theta}) = \mathrm{tr}\left[\exp\left( \sum_i \theta_i H_i \right)\right]\)
  • \(\boldsymbol{\theta} \in \mathbb{R}^d\), \(H_i\) are Hermitian observables

Von Neumann Entropy

  • \(S[\rho] = -\text{tr} (\rho \log \rho)\)
  • Minimal entropy configuration via Jaynes’ variational approach
  • Derive density matrix from information-theoretic constraints
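A minimal sketch of the entropy computation via the eigenvalues of \(\rho\) (example state assumed):

    import numpy as np

    def von_neumann_entropy(rho):
        # S[rho] = -sum_k lambda_k log lambda_k over the eigenvalues of rho
        lam = np.linalg.eigvalsh(rho)
        lam = lam[lam > 1e-12]  # drop numerical zeros
        return float(-(lam * np.log(lam)).sum())

    rho = np.array([[0.7, 0.1],
                    [0.1, 0.3]])  # assumed example density matrix
    print(von_neumann_entropy(rho))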

The Minimal Entropy State

  • System begins in state of minimal entropy
  • Represented by density matrix \(\rho(\boldsymbol{\theta})\)
  • Resolution constraint \(\varepsilon \sim \frac{1}{2^N}\)

Jaynesian Derivation of Minimal Entropy Configuration

  • Assign the density matrix that is maximally noncommittal with respect to missing information
  • Adapt Jaynes’ maximum entropy principle to derive the minimum entropy configuration
  • Assume the initial entropy is minimal, with total capacity bounded by \(N\) bits

Minimizing Von Neumann Entropy

  • Minimize \(S[\rho] = -\mathrm{tr}(\rho \log \rho)\)
  • Subject to constraints encoding resolution bounds
  • System begins in minimal entropy state
  • State cannot be delta function (must obey resolution constraint \(\varepsilon\))
  • Entropy bounded above by \(N\) bits: \(S[\rho] \leq N\)

Constraints

  • Normalization: \(\mathrm{tr}(\rho) = 1\)
  • Resolution constraint: \(\mathrm{tr}(\rho \hat{Z}^2) \geq \varepsilon^2\)
  • Optional dual-space constraint: \(\mathrm{tr}(\rho \hat{P}^2) \geq \delta^2\)
  • These ensure system has finite resolution

Lagrangian Formulation

  • Lagrangian with Lagrange multipliers: \[ \mathcal{L}[\rho] = -\mathrm{tr}(\rho \log \rho) + \lambda_0 (\mathrm{tr}(\rho) - 1) - \lambda_z (\mathrm{tr}(\rho \hat{Z}^2) - \varepsilon^2) - \lambda_p (\mathrm{tr}(\rho \hat{P}^2) - \delta^2) \]

Solution: Gaussian State

  • Functional derivative: \(\frac{\delta \mathcal{L}}{\delta \rho} = -\log \rho - 1 - \lambda_z \hat{Z}^2 - \lambda_p \hat{P}^2 + \lambda_0 = 0\)
  • Solution: \(\rho = \frac{1}{Z} \exp\left(-\lambda_z \hat{Z}^2 - \lambda_p \hat{P}^2\right)\)
  • Partition function: \(Z = \mathrm{tr}\left[\exp\left(-\lambda_z \hat{Z}^2 - \lambda_p \hat{P}^2\right)\right]\)
  • Results in a Gaussian state for density matrix
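A minimal numerical sketch of this solution (assumptions: \(\hat{Z}\) discretized on a finite grid, \(\hat{P}^2 = -\mathrm{d}^2/\mathrm{d}z^2\) via the three-point Laplacian, illustrative multiplier values):

    import numpy as np
    from scipy.linalg import expm

    n, L = 128, 10.0
    grid = np.linspace(-L / 2, L / 2, n)
    dz = grid[1] - grid[0]
    Z2 = np.diag(grid**2)  # \hat{Z}^2 on the grid
    lap = (np.diag(np.ones(n - 1), 1) - 2 * np.eye(n)
           + np.diag(np.ones(n - 1), -1)) / dz**2
    P2 = -lap              # \hat{P}^2 = -d^2/dz^2 (three-point stencil)

    lam_z, lam_p = 1.0, 1.0  # assumed Lagrange multiplier values
    rho = expm(-(lam_z * Z2 + lam_p * P2))
    rho /= np.trace(rho)

    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    print("tr(rho Z^2) =", np.trace(rho @ Z2))          # resolution constraint
    print("S[rho]      =", -(lam * np.log(lam)).sum())  # Gaussian-state entropy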

Natural Parameters and Fisher Information

  • Lagrange multipliers define natural parameters: \(\theta_z = -\lambda_z\), \(\theta_p = -\lambda_p\)
  • Exponential family form: \(\rho(\boldsymbol{\theta}) \propto \exp(\boldsymbol{\theta} \cdot \mathbf{H})\)
  • Fisher Information: \(G(\boldsymbol{\theta})\) from second derivative of \(\log Z(\boldsymbol{\theta})\)
  • Steepest ascent in \(\boldsymbol{\theta}\) space traces entropy dynamics
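A minimal sketch of \(G(\boldsymbol{\theta})\) as a finite-difference Hessian of \(A(\boldsymbol{\theta}) = \log Z(\boldsymbol{\theta})\), reusing the toy Pauli generators from the earlier sketch, together with the sub-threshold check from the next slide:

    import numpy as np
    from scipy.linalg import expm

    H = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

    def A(theta):
        # Log partition function A(theta) = log tr exp(theta . H)
        return np.log(np.trace(expm(sum(t * h for t, h in zip(theta, H)))).real)

    def fisher(theta, h=1e-4):
        # Central finite-difference Hessian of A: G_ij = d^2 A / dtheta_i dtheta_j
        d = len(theta)
        G = np.zeros((d, d))
        for i in range(d):
            for j in range(d):
                ei, ej = h * np.eye(d)[i], h * np.eye(d)[j]
                G[i, j] = (A(theta + ei + ej) - A(theta + ei - ej)
                           - A(theta - ei + ej) + A(theta - ei - ej)) / (4 * h**2)
        return G

    theta = np.array([0.05, -0.02])       # assumed near-origin parameters
    print(np.abs(fisher(theta) @ theta))  # compare entrywise against epsilon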

Information Geometry and Uncertainty

  • Verify \(\left| \left[G(\boldsymbol{\theta}) \boldsymbol{\theta}\right]_i \right| < \varepsilon\) for all \(i\)
  • Non-commuting observables: \([H_i, H_j] \neq 0\)
  • Uncertainty relation: \(\mathrm{Var}(H_i) \cdot \mathrm{Var}(H_j) \geq C > 0\)
  • Bounded curvature: \(\mathrm{tr}(G(\boldsymbol{\theta})) \geq \gamma > 0\)

Initial State and Landscape Unfolding

  • Initial density matrix: \(\rho(\boldsymbol{\theta}_o)\)
  • Permissible curvature geometry: \(G(\boldsymbol{\theta}_o)\)
  • Constraint-consistent basis of observables \(\{H_i\}\) with quadratic form
  • System begins in regular, latent, low-entropy state
  • From here entropy ascent and symmetry-breaking transitions emerge

Key Insights

  • Information topography emerges from precision/capacity trade-off
  • Density matrix structure encodes fundamental limits
  • System dynamics follow steepest entropy ascent

References

Frieden, B.R., 1998. Physics from Fisher Information: A Unification. Cambridge University Press, Cambridge, UK. https://doi.org/10.1017/CBO9780511622670