This workflow uses ESM2 and its log-likelihood scoring function to propose beneficial protein mutations. Starting from a known wild-type sequence, it searches for mutations that raise the sequence's overall log-likelihood (a measure of how well the sequence matches the distribution of the model's training data), which can yield variants with improved stability, expression, and affinity. Candidate mutations can be sampled with various algorithms, from greedy search to methods that balance exploration and exploitation. Per the ProteinGym benchmark, log-likelihood scores correlate well with these properties.
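The greedy end of that spectrum can be sketched as follows. This is a minimal illustration, not the workflow's actual implementation: `greedy_mutations` and `toy_score` are hypothetical names, and `toy_score` is a deliberately trivial stand-in for an ESM2 log-likelihood so that the example runs without downloading a model. In practice the `score` argument would be a function that returns the model's log-likelihood for a sequence.

```python
from typing import Callable, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def greedy_mutations(
    wild_type: str,
    score: Callable[[str], float],
    n_rounds: int = 3,
) -> Tuple[str, List[str]]:
    """Each round, accept the single point mutation that raises the score most.

    Stops early when no point mutation improves on the current sequence.
    Returns the final sequence and the accepted mutations (e.g. "M1A").
    """
    current, current_score = wild_type, score(wild_type)
    accepted: List[str] = []
    for _ in range(n_rounds):
        best_seq, best_score, best_label = current, current_score, None
        for pos, old in enumerate(current):
            for new in AMINO_ACIDS:
                if new == old:
                    continue
                candidate = current[:pos] + new + current[pos + 1:]
                s = score(candidate)
                if s > best_score:  # strict improvement only
                    best_seq, best_score = candidate, s
                    best_label = f"{old}{pos + 1}{new}"  # 1-based position
        if best_label is None:
            break  # local optimum under single-point mutations
        current, current_score = best_seq, best_score
        accepted.append(best_label)
    return current, accepted


def toy_score(seq: str) -> float:
    # Hypothetical stand-in for a model log-likelihood: counts alanines.
    # It has no biological meaning; it only makes the example runnable.
    return float(seq.count("A"))


if __name__ == "__main__":
    print(greedy_mutations("MKT", toy_score, n_rounds=2))
```

A purely greedy search like this only exploits the current best sequence. Methods that also explore would sometimes accept lower-scoring candidates, or keep several candidates per round, at the cost of more scoring calls.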
id: golden-boar-flint | EGFR | Weak | 2.2e-6 M | True | 6.7 kDa | 58
id: jade-deer-cedar | EGFR | None | 80.75 | True | 6.4 kDa | 58
id: brisk-yak-ruby | EGFR | None | 84.63 | True | 6.4 kDa | 58
id: quick-moth-plume | EGFR | None | 80.76 | True | 6.4 kDa | 58
id: silver-boar-jade | EGFR | None | 83.70 | True | 6.4 kDa | 58
id: solid-hawk-willow | EGFR | Weak | 80.15 | True | 6.4 kDa | 58
id: vast-quail-cypress | EGFR | None | 82.19 | True | 6.4 kDa | 58
id: bright-otter-iron | EGFR | None | 83.27 | True | 5.8 kDa | 52
id: violet-vole-opal | EGFR | None | 86.54 | True | 5.4 kDa | 49
id: noble-fox-maple | EGFR | None | 82.83 | True | 5.8 kDa | 52
id: rough-falcon-cypress | EGFR | None | 81.64 | True | 6.5 kDa | 58
id: quiet-orca-ruby | EGFR | None | 81.16 | True | 6.6 kDa | 58
id: calm-crane-bronze | EGFR | None | 86.73 | True | 6.1 kDa | 58
id: steady-wolf-stone | EGFR | None | 72.95 | True | 6.4 kDa | 58
id: solid-gecko-cloud | EGFR | None | 77.53 | True | 6.4 kDa | 58
id: swift-dove-opal | EGFR | None | 79.88 | True | 6.4 kDa | 58
id: quick-hawk-topaz | EGFR | None | 80.28 | True | 6.5 kDa | 58
id: brisk-swan-frost | EGFR | None | 80.97 | True | 5.5 kDa | 49
id: strong-panda-lava | EGFR | None | 83.55 | True | 6.5 kDa | 58
id: mellow-panther-ember | EGFR | None | 81.24 | True | 6.6 kDa | 58
id: quick-zebra-stone | EGFR | Weak | 84.76 | True | 5.9 kDa | 53