TIMED Nanobody Workflow
TIMED is a family of convolutional neural network (CNN) models for protein sequence design. For this competition we used TIMED (vanilla), TIMED-Charge (charge-aware), and co-TIMED (side-chain-aware).
We began with the backbones of existing nanobodies (2VSM, 8XPY, 9B9E). We fixed the framework region and allowed sequence design only around the CDRs.
We then used the TIMED models to generate per-position probability distributions and sampled ~500K sequences in total. These sequences were scored using the E1 Large Language Model (LLM). For each backbone we picked the best-scoring sequences and folded them with Boltz-2 to obtain iPSAE scores.
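The sampling step can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: it assumes the model output is available as one 20-value probability vector per residue, and the names `sample_sequence`, `sample_library`, and `designable` are hypothetical. Positions outside the designable set keep their native residue, which mirrors fixing the framework region.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def sample_sequence(native, probs, designable, rng):
    """Sample one sequence; positions outside `designable` keep the native residue."""
    residues = []
    for i, aa in enumerate(native):
        if i in designable:
            # Draw a residue from the (assumed) per-position distribution.
            residues.append(rng.choices(AMINO_ACIDS, weights=probs[i], k=1)[0])
        else:
            residues.append(aa)
    return "".join(residues)

def sample_library(native, probs, designable, n, seed=0):
    """Draw `n` samples and return the unique sequences as a set."""
    rng = random.Random(seed)
    return {sample_sequence(native, probs, designable, rng) for _ in range(n)}

# Toy usage: a 9-residue stretch with three designable positions
# and uniform probabilities standing in for real model output.
native = "QVQLVESGG"
uniform = [1.0 / 20] * 20
library = sample_library(native, [uniform] * len(native), {3, 4, 5}, n=50, seed=1)
```

Returning a set removes duplicate samples, which become common when the distributions are sharply peaked.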
We treated these Boltz-2 scores as ground truth to train an ensemble of surrogate models (Ridge Regression and Random Forest). These models guided a round of in silico directed evolution, where we generated mutant pools from the top candidates, predicted their efficacy, and selected the highest-confidence, novel sequences for the final library using DE-STRESS.
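One round of surrogate-guided selection could look roughly like the sketch below. Several things here are assumptions for illustration: the surrogates are passed in as plain callables (standing in for the trained Ridge Regression and Random Forest models), mutants carry a single substitution, and "highest-confidence" is approximated as the ensemble mean minus the ensemble standard deviation. The function names are hypothetical.

```python
import random
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(seq, positions, rng):
    """Return a copy of `seq` with one designable position changed to a different residue."""
    i = rng.choice(positions)
    new = rng.choice([a for a in AMINO_ACIDS if a != seq[i]])
    return seq[:i] + new + seq[i + 1:]

def evolve_round(parents, positions, surrogates, seen, pool_size=200, keep=10, seed=0):
    """Build a mutant pool, score it with the ensemble, and keep the top novel sequences."""
    rng = random.Random(seed)
    pool = {mutate(rng.choice(parents), positions, rng) for _ in range(pool_size)}
    pool -= set(seen)  # novelty filter: drop anything already evaluated
    scored = []
    for seq in pool:
        preds = [model(seq) for model in surrogates]
        # Assumed heuristic: penalise disagreement between ensemble members.
        scored.append((mean(preds) - pstdev(preds), seq))
    scored.sort(reverse=True)
    return [seq for _, seq in scored[:keep]]

# Toy usage with two made-up surrogates that reward tryptophan.
toy_models = [lambda s: s.count("W"), lambda s: s.count("W") + 0.1 * s.count("Y")]
selected = evolve_round(["AAAAAA"], [2, 3], toy_models, seen=["AAAAAA"], keep=3)
```

Subtracting the spread favours sequences on which the models agree, which is one simple way to trade predicted score against confidence.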
BindCraft & TIMED
TIMED is a family of convolutional neural network (CNN) models for protein sequence design. For this competition we used TIMED (vanilla), TIMED-Charge (charge-aware), and co-TIMED (side-chain-aware).
We began by creating several backbone designs using BindCraft and SolubleMPNN. We then used the TIMED models to generate per-position probability distributions and sampled ~500K sequences in total. These sequences were scored using the E1 Large Language Model (LLM). For each backbone we picked the best-scoring sequences and folded them with Boltz-2 to obtain iPSAE scores.
We treated these Boltz-2 scores as ground truth to train an ensemble of surrogate models (Ridge Regression and Random Forest). These models guided a round of in silico directed evolution, where we generated mutant pools from the top candidates, predicted their efficacy, and selected the highest-confidence, novel sequences for the final library using DE-STRESS.
Molecular dynamics simulations & DE-STRESS analysis
We ran molecular dynamics simulations on some of our designs to assess the dynamic stability of the complex. We used the default protocol implemented in the drMD molecular dynamics package, with three replicas per complex. Each production simulation was run for 100 ns. The metrics used to assess the stability of the designs were the RMSD and RMSF of the receptor, binder and complex; the solvent-accessible surface area (SASA) of the complex; the ΔSASA (the SASA of the complex minus the sum of the SASAs of the receptor and the binder, so more negative values indicate more buried surface); the number of contacts formed between receptor and binder (with a cutoff of 4.5 Å); and the minimum distance between receptor and binder.
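The interface metrics for a single frame can be illustrated with the short sketch below. It is not the drMD implementation: it assumes that contacts are counted per atom pair, that the cutoff is inclusive, and that coordinates are given in Å as plain tuples. The function names are hypothetical.

```python
from math import dist

def delta_sasa(sasa_complex, sasa_receptor, sasa_binder):
    """SASA of the complex minus the summed SASAs of the separate partners."""
    return sasa_complex - (sasa_receptor + sasa_binder)

def interface_contacts(receptor_xyz, binder_xyz, cutoff=4.5):
    """Return (number of receptor-binder atom pairs within `cutoff`, minimum distance)."""
    distances = [dist(a, b) for a in receptor_xyz for b in binder_xyz]
    return sum(d <= cutoff for d in distances), min(distances)

# Toy frame: two receptor atoms and two binder atoms.
receptor = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
binder = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
n_contacts, d_min = interface_contacts(receptor, binder)  # 2 contacts, minimum 3.0
buried = delta_sasa(9000.0, 7000.0, 3000.0)               # -1000.0
```

The all-pairs loop is quadratic in the number of atoms, so a real trajectory analysis would normally use a neighbour search instead.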
Finally, proteins were ranked with a custom score based on DE-STRESS features. This score was a weighted sum of the NetSolP usability, the Rosetta total energy of the complex, the Rosetta energy of the binder, and the Rosetta interaction energy between the receptor and the binder.
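A weighted-sum ranking of this kind can be sketched as below. The weights, feature keys, and feature values are all made up for illustration, since the actual weights and any normalisation are not given here. The sketch assumes that Rosetta energies receive negative weights because lower energies are more favourable.

```python
def destress_score(features, weights):
    """Weighted sum over the named features (hypothetical feature keys)."""
    return sum(weights[name] * features[name] for name in weights)

def rank_designs(designs, weights):
    """Return design names sorted from highest to lowest score."""
    return sorted(designs, key=lambda d: destress_score(designs[d], weights), reverse=True)

# Illustrative weights only; the real values are not stated.
weights = {
    "netsolp_usability": 1.0,
    "rosetta_total_complex": -0.01,
    "rosetta_total_binder": -0.01,
    "rosetta_interaction": -0.05,
}
# Invented feature values for two placeholder designs.
designs = {
    "design_a": {"netsolp_usability": 0.8, "rosetta_total_complex": -900.0,
                 "rosetta_total_binder": -300.0, "rosetta_interaction": -40.0},
    "design_b": {"netsolp_usability": 0.6, "rosetta_total_complex": -850.0,
                 "rosetta_total_binder": -280.0, "rosetta_interaction": -30.0},
}
ranking = rank_designs(designs, weights)
```

Because the features live on very different scales, the choice of weights (or a prior normalisation step) largely determines which term dominates the ranking.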
id: dark-gecko-pine

Nipah Virus Glycoprotein G
0.74
82.87
--
14.0 kDa
124
id: jade-crane-thorn

Nipah Virus Glycoprotein G
0.73
80.94
--
13.5 kDa
119
id: shy-owl-opal

Nipah Virus Glycoprotein G
0.70
76.16
--
9.5 kDa
80
id: wild-falcon-birch

Nipah Virus Glycoprotein G
0.66
84.28
--
13.3 kDa
119
id: radiant-mole-dust

Nipah Virus Glycoprotein G
0.50
81.64
--
9.3 kDa
80
id: vast-seal-stone

Nipah Virus Glycoprotein G
0.32
79.20
--
9.7 kDa
81
id: jade-wolf-sand

Nipah Virus Glycoprotein G
0.00
82.82
--
13.2 kDa
114
id: rapid-fox-stone

Nipah Virus Glycoprotein G
0.00
84.82
--
13.1 kDa
124