Research Demo — Not For Clinical Use

Amino Acid Substitution Geometry

A zero-parameter geometric score for amino acid substitutions, derived entirely from the 8-tick DFT cycle and the canonical cost function J(x) = ½(x + 1/x) − 1. No fitted parameters. No training data. Just the mathematics of recognition.

Free Parameters: 0
Amino Acids Resolved: 20
ClinVar AUC-ROC: --
Grantham AUC (baseline): --
Family Separation p-value: --

What is this?

This page is an interactive demonstration showing how to calculate the "distance" or difference between any two amino acids using pure geometry, without relying on empirical biological data or training sets.

Why is this significant? For decades, biologists have used tools like the Grantham score to predict whether a missense mutation (swapping one amino acid for another) will cause disease. These tools depend on empirical inputs: the Grantham score, for example, combines measured composition, polarity, and molecular volume using three fitted weighting parameters.

Recognition Science takes the opposite approach. It derives the family-level distances from first principles (the 8-tick cycle and the unique J-cost equation) with zero fitted parameters; only the within-family refinement draws on measured molecular weight and hydrophobicity. On 10,000 ClinVar variant records, the resulting score significantly separates the four chemical families and approaches, but does not yet exceed, the accuracy of the industry-standard Grantham score. This is offered as evidence that the structure of the genetic code is a geometric inevitability rather than an evolutionary accident.

Substitution Scorer

RS Substitution Cost: --
Scale: Conservative, Moderate, Radical

Comparison

Grantham: --
RS distance: --

Reference

Name: --
Family: --
MW: --
Hydro: --

Alternate

Name: --
Family: --
MW: --
Hydro: --

20 × 20 Substitution Matrix

Hover over any cell to see the substitution cost. Toggle between RS and Grantham to compare.

Empirical Evidence

ClinVar Validation

AUC-ROC: -- (on -- variants)

The RS 2D product metric separates pathogenic from benign missense variants with -- significance. The industry-standard Grantham score achieves -- on the same data using three fitted parameters.
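For readers unfamiliar with the metric, AUC-ROC can be read as the probability that a randomly chosen pathogenic variant receives a higher score than a randomly chosen benign one. The sketch below computes it that way in plain Python. The function name `auc_roc` and the toy scores and labels are hypothetical illustrations, not the page's actual ClinVar data or pipeline.

```python
def auc_roc(scores, labels):
    """AUC as P(pathogenic score > benign score), counting ties as one half.

    scores: substitution costs (any real numbers).
    labels: 1 for pathogenic, 0 for benign.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one pathogenic and one benign variant")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: five variants with made-up scores.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_labels = [1, 0, 1, 0, 0]
toy_auc = auc_roc(toy_scores, toy_labels)
```

The pairwise loop is quadratic in the number of variants, which is fine for a demonstration; a rank-based implementation would be the usual choice for a large variant set.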

Family Separation

Mann-Whitney p-value: --

Cross-family substitutions cost significantly more than within-family substitutions. Mean intra-family distance: --. Mean cross-family distance: --.
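The comparison above is a two-sample rank test. A minimal sketch of a one-sided Mann-Whitney test follows, assuming a normal approximation without tie correction (which is only reasonable for larger samples). The helper names and the two toy distance lists are hypothetical; the page's own test settings are not stated.

```python
import math


def mann_whitney_u(x, y):
    """U statistic for x versus y: number of pairs with x_i > y_j (ties count 0.5)."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def one_sided_p_normal(u, n1, n2):
    """Approximate p-value for the alternative 'x tends to exceed y'.

    Uses the normal approximation to the null distribution of U,
    with no tie or continuity correction.
    """
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical toy distances, not values from the page.
cross_family = [3.1, 2.7, 2.9, 3.4]
intra_family = [0.5, 0.8, 1.1, 0.6]
u_stat = mann_whitney_u(cross_family, intra_family)
p_value = one_sided_p_normal(u_stat, len(cross_family), len(intra_family))
```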

Degeneracy Resolution

Distinct amino acid points: 20 / 20

The DFT-8 construction alone resolves only 11 distinct points on CP⁶ due to mode-4 degeneracy. Adding within-family J-cost of molecular weight and hydrophobicity ratios breaks all degeneracies, yielding 20 fully distinct positions.
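To illustrate how the within-family term can separate two residues that share the same DFT-8 point, the sketch below evaluates J on molecular weight and hydrophobicity ratios. The two residues and their property values are hypothetical, and the sketch assumes both properties are expressed on strictly positive scales so that the ratios are well defined.

```python
import math


def j_cost(x):
    """Canonical reciprocal cost J(x) = 0.5 * (x + 1/x) - 1, defined for x > 0."""
    if x <= 0:
        raise ValueError("J is only defined here for positive ratios")
    return 0.5 * (x + 1.0 / x) - 1.0


def within_family_term(mw_a, mw_b, h_a, h_b):
    """Euclidean combination of the J-costs of the MW and hydrophobicity ratios."""
    return math.hypot(j_cost(mw_a / mw_b), j_cost(h_a / h_b))


# Two hypothetical residues assumed to sit on the same DFT-8 point:
# equal molecular weight, different hydrophobicity (made-up positive values).
degenerate_gap = within_family_term(131.17, 131.17, 1.0, 1.25)
```

Because J(1) = 0 and J(x) = J(1/x), the term vanishes only when both properties match and does not depend on which residue is taken as the reference.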

Current Limitations

Current status: < Grantham

The RS score approaches but does not yet exceed the Grantham distance on ClinVar pathogenicity prediction. This page is a proof-of-mechanism research demo, not a clinical replacement. The within-family resolution using MW and hydrophobicity is an empirically informed model, not a zero-parameter theorem.

How The Score Is Built

Every RS number on this page is derived from two ingredients: the 8-tick DFT cycle (DFT-8) and the canonical cost function J(x) = ½(x + 1/x) − 1.

The total RS substitution distance between amino acids a and b is:

d(a, b) = d_FS(a, b) + √( J(MW_a / MW_b)² + J(H_a / H_b)² )

where d_FS is the Fubini-Study distance between the DFT-8 WToken vectors of a and b on CP⁶, MW is molecular weight, H is hydrophobicity, and J(x) = ½(x + 1/x) − 1 is the unique canonical reciprocal cost forced by the Recognition Composition Law.
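The formula can be sketched end to end as follows. This is an illustration under stated assumptions, not the page's implementation: the Fubini-Study distance is taken in its standard form arccos(|⟨u, v⟩| / (‖u‖ ‖v‖)) on 7-component complex vectors (points of CP⁶), and the vectors and property values used are hypothetical placeholders, since the actual WToken vectors are not listed here.

```python
import cmath
import math


def j_cost(x):
    """J(x) = 0.5 * (x + 1/x) - 1 for a positive ratio x."""
    if x <= 0:
        raise ValueError("ratio must be positive")
    return 0.5 * (x + 1.0 / x) - 1.0


def fubini_study(u, v):
    """Standard Fubini-Study distance between the rays through u and v."""
    inner = sum(a.conjugate() * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(abs(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(abs(b) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("zero vector has no projective point")
    # Clamp to guard against rounding slightly above 1.
    return math.acos(min(1.0, abs(inner) / (norm_u * norm_v)))


def rs_distance(u, v, mw_a, mw_b, h_a, h_b):
    """d(a, b) = d_FS(a, b) + sqrt(J(MW_a/MW_b)^2 + J(H_a/H_b)^2)."""
    return fubini_study(u, v) + math.hypot(j_cost(mw_a / mw_b), j_cost(h_a / h_b))


# Hypothetical 7-component vectors standing in for WToken vectors.
e0 = [1 + 0j, 0j, 0j, 0j, 0j, 0j, 0j]
e1 = [0j, 1 + 0j, 0j, 0j, 0j, 0j, 0j]
w = [1 + 0j, 0.5j, 0j, 0j, 0.25 + 0j, 0j, 0j]
w_rotated = [cmath.exp(0.7j) * z for z in w]  # same ray, different global phase

orthogonal_distance = rs_distance(e0, e1, 100.0, 100.0, 1.0, 1.0)
phase_distance = fubini_study(w, w_rotated)
```

Two design points follow from the definitions: multiplying a vector by a global phase leaves the distance unchanged, as expected on a projective space, and swapping a and b leaves d(a, b) unchanged because J(x) = J(1/x).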