Darwin as Minimum Description Length: Selection, Variation, and Modularity as Code-Length Optimization
Recognition Science & Recognition Physics Institute; Austin, Texas, USA
Abstract
We develop a methods-first theory that identifies fitness with negative description length. For an environment \(\mathcal{E}\) and an organism \(g\), we define the evolutionary code length \(L_g = L(\text{model}_g) + L(\text{errors}\mid \mathcal{E})\). If replication rates obey \(r(g) \propto e^{-\beta L_g}\) for a resource factor \(\beta>0\), then replicator dynamics descend the population mean code length and the stationary distribution is \(\pi^*(g) \propto e^{-\beta L_g}\). Variation follows an anisotropic proposal law \(q(\Delta)\propto e^{-J(\Delta)}\), where \(J\) is a symmetric, convex ledger cost of the variation \(\Delta\). Modularity is favored when environmental tasks share information \(M\): reusing a module of description length \(b\) saves at least \(M-b\) bits. The empirical program is operational and auditable under a preregistered MDL protocol with explicit falsifiers.
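As a concreteness check, the minimal simulation sketch below runs discrete-time replicator dynamics with fitness \(e^{-\beta L_g}\) on a finite genotype set and verifies that the population mean of \(L_g\) is non-increasing. It is not the paper's protocol: the genotype set, the code lengths \(L_g\), and the value of \(\beta\) are illustrative assumptions.

# Minimal sketch (illustrative values, not measured MDL): discrete-time
# replicator dynamics with fitness f_g = exp(-beta * L_g).
import numpy as np

rng = np.random.default_rng(0)
beta = 0.5                               # assumed resource factor
L = rng.uniform(10.0, 40.0, size=50)     # hypothetical code lengths L_g (bits)
x = np.full(L.size, 1.0 / L.size)        # uniform initial population shares
fitness = np.exp(-beta * L)              # r(g) proportional to exp(-beta * L_g)

mean_L = [float(x @ L)]
for _ in range(200):
    x = x * fitness                      # replicator update x'_g = x_g f_g / mean(f)
    x /= x.sum()
    mean_L.append(float(x @ L))

# The population mean code length descends along the trajectory.
assert all(b <= a + 1e-9 for a, b in zip(mean_L, mean_L[1:]))
print(f"mean L: {mean_L[0]:.2f} -> {mean_L[-1]:.2f} bits (min L_g = {L.min():.2f})")

Note that this selection-only sketch concentrates mass on the minimum-\(L_g\) genotype; reproducing the stated Boltzmann stationary form \(\pi^*(g)\propto e^{-\beta L_g}\) presumably requires the paper's variation process, which the sketch does not model.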
Key Contributions
- Formal MDL fitness: Replicator dynamics descend the population mean of \(L_g\).
- Variation anisotropy: A convex, symmetric ledger cost \(J\) yields the proposal law \(q(\Delta)\propto e^{-J(\Delta)}\) (a sampling sketch follows this list).
- Modularity lower bound: Reusing a module of description length \(b\) saves \(\Delta L_{\text{reuse}}\ge M-b\) bits when tasks share information \(M\).
- Measurement plan: Preregistered, auditable protocol with explicit falsifiers.
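For the variation law, the sketch below assumes, purely for illustration, a separable quadratic ledger cost \(J(\Delta)=\sum_i w_i \Delta_i^2\) (symmetric and convex, though not necessarily the paper's form). Under this assumption \(q(\Delta)\propto e^{-J(\Delta)}\) is a zero-mean Gaussian with per-coordinate variance \(1/(2w_i)\), so costlier directions receive proportionally less variation.

# Minimal sketch, assuming a separable quadratic ledger cost
# J(Delta) = sum_i w_i * Delta_i**2 (an illustrative choice, not the paper's).
# Then q(Delta) proportional to exp(-J(Delta)) is Gaussian with variance 1/(2 w_i).
import numpy as np

rng = np.random.default_rng(1)
w = np.array([0.5, 2.0, 8.0])            # hypothetical ledger weights per direction
sigma = np.sqrt(1.0 / (2.0 * w))         # implied proposal scale per direction

samples = rng.normal(0.0, sigma, size=(100_000, w.size))   # draws from q(Delta)
print("empirical variances:", samples.var(axis=0).round(4))
print("predicted 1/(2 w)  :", (1.0 / (2.0 * w)).round(4))
# Costlier directions (larger w_i) receive smaller variation: the anisotropy claim.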
Keywords
evolution, minimum description length, fitness, modularity, replicator dynamics, Recognition Science