RS-Fold · φ-lattice

Glass-box protein folding. Every distance derived from the golden ratio. No training data. No free parameters. Watch the physics work.

pip install rsfold

rsfold fold --sequence NLYIQWLKDGGPSSGRPPPS --output result.pdb
rsfold benchmark --output results.json

Source on GitHub · MIT License · Benchmarks →

[Interactive demo: protein sequence input, iteration count, energy trace, contact map]

What Is This?

RS-Fold is a protein structure prediction engine built entirely from first principles. Unlike statistical methods that learn patterns from known structures, RS-Fold derives every geometric constraint from a single mathematical object: the golden ratio φ = (1+√5)/2.

The protein you see above was folded in your browser, right now, using only the amino acid sequence as input. No neural network. No database lookup. No server. Pure physics running in JavaScript.

The result is a “glass-box” model: every force, every distance, and every contact decision can be traced back to a specific theorem proved in Lean 4.

The φ-Lattice

Every distance in the model is a measured bond length scaled by a power or root of φ. Nothing is fitted.

Quantity	Value	Derivation / note
Cα–Cα backbone	3.85 Å	φ² × 1.47 Å
Helix i→i+4	6.23 Å	φ × backbone
β-sheet interstrand	4.90 Å	√φ × backbone
Helix bundle packing	10.08 Å	φ² × backbone
Contact budget	N / φ²	≈ 38% of residues form long-range contacts
Radius of gyration	(N/φ)^(1/3) × 3.85 Å	Compact globule scaling from chain length
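
The entries above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the rsfold source: the function names are hypothetical, and rounding the contact budget down to an integer is an assumption. The only inputs are φ and the 1.47 Å reference bond length quoted above.

```python
import math

PHI = (1 + math.sqrt(5)) / 2   # golden ratio
BOND_REF = 1.47                # Å, reference bond length quoted in the text
BACKBONE = PHI**2 * BOND_REF   # Cα–Cα distance, about 3.85 Å


def lattice_distances():
    """Return the four φ-lattice distances (Å) listed in the text."""
    return {
        "ca_ca_backbone": BACKBONE,
        "helix_i_i4": PHI * BACKBONE,
        "beta_interstrand": math.sqrt(PHI) * BACKBONE,
        "helix_bundle": PHI**2 * BACKBONE,
    }


def contact_budget(n_residues):
    """Contact budget N / φ², rounded down (the rounding is an assumption)."""
    return int(n_residues / PHI**2)


def radius_of_gyration(n_residues):
    """Radius of gyration in Å: (N/φ)^(1/3) × backbone."""
    return (n_residues / PHI) ** (1 / 3) * BACKBONE
```

Calling lattice_distances() and rounding each value to two decimals gives the figures listed above; contact_budget(20) gives 7 for the 20-residue example sequence.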

How It Works

Encode

Each amino acid is represented as an 8-channel chemistry vector: volume, charge, polarity, H-bond donors, H-bond acceptors, aromaticity, flexibility, and sulfur content. These are physical observables, not learned embeddings.

DFT-8 Spectral Analysis

A sliding 8-point Discrete Fourier Transform extracts frequency content from each chemistry channel. The dominant DFT mode, its amplitude, and its phase at each residue become a W-token, the “recognition fingerprint” of that position.
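
To make the windowing concrete, here is a minimal sketch of a sliding 8-point DFT over one chemistry channel, returning the (mode, amplitude, phase) triple described above. It is an illustration under stated assumptions rather than the rsfold implementation: the function names are hypothetical, windows are not padded at the chain ends, the DC mode is excluded, and "dominant" is taken to mean largest magnitude among modes 1 to 4.

```python
import cmath
import math

WINDOW = 8  # DFT length applied at every window position


def dft8(window):
    """Plain 8-point DFT of a list of 8 real numbers."""
    return [
        sum(x * cmath.exp(-2j * math.pi * k * n / WINDOW)
            for n, x in enumerate(window))
        for k in range(WINDOW)
    ]


def sliding_tokens(channel):
    """Return one (mode, amplitude, phase) triple per full 8-residue window.

    Assumption: no padding, so a chain of length L yields L - 7 triples.
    """
    tokens = []
    for start in range(len(channel) - WINDOW + 1):
        spectrum = dft8(channel[start:start + WINDOW])
        # Mode 0 is the window mean; for real input, modes 5..7 mirror 3..1.
        mode = max(range(1, WINDOW // 2 + 1), key=lambda k: abs(spectrum[k]))
        tokens.append((mode, abs(spectrum[mode]), cmath.phase(spectrum[mode])))
    return tokens
```

In a full encoder this would run once per chemistry channel, giving eight triples per window position.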

Predict Contacts

Residue pairs are scored by phase coherence, amplitude resonance, mode compatibility, and chemistry gating (charge attraction, H-bonds, aromatic stacking). The top N/φ² contacts are kept — a budget derived from the contact theorem, not tuned.
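
The budget step can be sketched as a simple top-k selection. This is a hedged illustration, not the package's code: the pair scores are assumed to be computed already, and the function name, the dictionary layout, and the min_separation cutoff for "long-range" pairs are all hypothetical.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def select_contacts(scores, n_residues, min_separation=4):
    """Keep the highest-scoring residue pairs, up to a budget of N / φ².

    scores maps (i, j) residue-index pairs to a precomputed score.
    min_separation is an illustrative cutoff, not a value from the source.
    """
    budget = int(n_residues / PHI**2)  # rounding down is an assumption
    candidates = [
        (score, i, j)
        for (i, j), score in scores.items()
        if abs(i - j) >= min_separation
    ]
    candidates.sort(reverse=True)  # best score first
    return [(i, j) for _, i, j in candidates[:budget]]
```

For an 8-residue chain the budget is 3, so only the three best pairs that pass the separation filter are returned.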

Minimize J-Cost

The energy function is the Recognition Science cost J(r) = ½(r + 1/r) − 1 applied to distance ratios. Backbone bonds, helix contacts, tertiary contacts, sterics, and compactness all use J-cost. Gradient descent with momentum drives the structure to the φ-lattice minimum.
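
Because J(r) = (r − 1)² / (2r), the cost is zero exactly at r = 1 and positive for every other r > 0, so minimizing it pulls each distance ratio toward 1. The sketch below relaxes a single distance toward its target with momentum gradient descent. It is a one-dimensional illustration only: taking the ratio as distance divided by target, as well as the function names, learning rate, momentum, and step count, are assumptions rather than values from the source.

```python
def j_cost(r):
    """J(r) = 0.5 * (r + 1/r) - 1, defined for r > 0."""
    return 0.5 * (r + 1.0 / r) - 1.0


def j_grad(r):
    """Derivative dJ/dr = 0.5 * (1 - 1/r**2)."""
    return 0.5 * (1.0 - 1.0 / (r * r))


def relax_distance(d, target, lr=0.5, momentum=0.8, steps=200):
    """Minimize J(d / target) over d using gradient descent with momentum."""
    velocity = 0.0
    for _ in range(steps):
        grad = j_grad(d / target) / target  # chain rule: dJ/dd = J'(r) / target
        velocity = momentum * velocity - lr * grad
        d += velocity
    return d
```

Starting from 5.0 Å with a 3.85 Å target, relax_distance settles at the target. A full fold would sum such terms over backbone, helix, tertiary, steric, and compactness pairs and update 3D coordinates instead of a single scalar.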

RS-Fold vs AlphaFold

	RS-Fold	AlphaFold
Approach	First-principles physics	Deep learning on MSA + templates
Parameters	Zero (all derived from φ)	~93 million trained weights
Training data	None	~170,000 PDB structures
Typical RMSD	8–16 Å	~1 Å
Explainability	Every force has a Lean proof	Attention weights (opaque)
Novel folds	Can design folds not in PDB	Limited to evolutionary space
Speed	~30 ms in browser	Minutes on GPU
Runs offline	Yes (browser or CLI)	Requires server + GPU

RS-Fold does not compete with AlphaFold on accuracy. It answers a different question: why does a protein fold the way it does, not just what shape does it take? The glass-box mechanism enables protein design from first principles — including folds that evolution never explored.

Machine-Verified Derivation Chain

Every geometric constant traces back to a Lean 4 theorem proved with zero uses of sorry (Lean's placeholder for an unfinished proof).

T5 J-cost uniqueness: J(x) = ½(x + 1/x) − 1 is the unique solution to the RCL

↓

T6 φ forced: the golden ratio is uniquely pinned by self-similarity on the discrete ledger

↓

T7 8-tick cycle: minimal period = 2^D = 8 for D=3 spatial dimensions

↓

D2 φ-geometry: Cα–Cα = φ² × 1.47 Å, helix pitch, β-rise (within 2% of PDB values)

↓

D5 Contact budget: max contacts ≤ N/φ² from the DFT-8 neutral subspace

↓

D9 Jamming frequency: f_jam = 1/(τ₀·φ¹⁹) ≈ 14.65 GHz

Empirical Validation

✓ 10/10 Helix Design

10 helical sequences designed from φ-geometry alone. All formed helices when cross-validated with ESMFold and AlphaFold. 10/10 negative controls (Pro insertions) disrupted the helix as predicted.

✓ PDB Geometry <2% Error

Derived bond lengths (Cα–Cα = 3.85 Å, H-bond = 2.85 Å) match the Protein Data Bank to within 2%.

✓ Contact Quantization

PDB contact-distance histograms show peaks at φ⁰, φ¹, φ², and φ^2.5 Å, exactly the φ-ladder rungs the theory predicts.

✗ Static W-Token Contacts

W-token-based contact prediction was not better than random in ablation studies. The theory now rests on the 8-tick dynamic clock, not static sequence encoding. This is disclosed as a falsified claim.

Install the CLI

For longer sequences or batch processing, use the Python package.

pip install rsfold

rsfold fold --sequence NLYIQWLKDGGPSSGRPPPS --output result.pdb
rsfold benchmark --output results.json
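
For batch processing, one option is to drive the CLI from a short script. The sketch below assumes rsfold is installed and on the PATH, and uses only the fold subcommand with the --sequence and --output flags shown above. The helper names and the one-file-per-sequence naming scheme are hypothetical.

```python
import subprocess


def build_fold_command(sequence, output_path):
    """Build the argument list for one `rsfold fold` call."""
    return ["rsfold", "fold", "--sequence", sequence, "--output", output_path]


def fold_batch(sequences, runner=subprocess.run):
    """Fold each named sequence into <name>.pdb and return the output paths.

    sequences maps a label to an amino acid string. The runner argument
    exists so the loop can be exercised without the CLI installed.
    """
    outputs = []
    for name, sequence in sequences.items():
        output_path = f"{name}.pdb"
        runner(build_fold_command(sequence, output_path), check=True)
        outputs.append(output_path)
    return outputs
```

With check=True, a non-zero exit from any fold stops the batch with a CalledProcessError instead of continuing silently.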

Source: github.com/jonwashburn/recognition-science · License: MIT · Benchmark results →