[GEM x Adaptyv: RBX1 Binder Design Competition] Submission 1

Alex Burlacu

Submitted to GEM x Adaptyv: RBX1 Binder Design Competition

Description

Looking at the shape of the protein, both in complex with Cul1 (PDB:1U6G) and by itself (PDB:2LGV), I figured the long chain from residues 1 to 35 would cause issues during design, so I decided to trim the structure and keep only residues 35 and upwards.
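
The trimming step can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the exact tool I used: `trim_pdb` and `make_ca_line` are hypothetical helpers, the input is assumed to be a single-chain PDB-format file, and only ATOM/HETATM records are filtered (TER, ANISOU and similar records pass through untouched).

```python
def trim_pdb(pdb_text, first_residue=35):
    """Drop ATOM/HETATM records whose residue number is below first_residue."""
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # PDB columns 23-26 (0-based slice 22:26) hold the residue number.
            if int(line[22:26]) < first_residue:
                continue
        kept.append(line)
    return "\n".join(kept) + "\n"


def make_ca_line(serial, res_name, res_num):
    """Build a fixed-width CA ATOM record for the toy example (hypothetical helper)."""
    return (
        f"ATOM  {serial:>5}  CA  {res_name:>3} A{res_num:>4}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}  1.00  0.00           C"
    )


# Toy input: residues 34, 35 and 36; only 35 and 36 should survive.
demo = "\n".join(
    [
        make_ca_line(1, "MET", 34),
        make_ca_line(2, "ALA", 35),
        make_ca_line(3, "GLY", 36),
        "END",
    ]
)
trimmed = trim_pdb(demo)
print(trimmed)
```

On a real structure the same function would be applied to the file contents read from disk, after selecting the RBX1 chain.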

I also decided to target the side with residues PHE 81, SER 85, LEU 88, PHE 103, and TYR 106, so that the binder acts as an inhibitor, blocking RBX1 from binding Cul1. I ran a few experiments targeting these hotspots directly, but specifying anti-hotspots proved a better approach. So in my experiments I configured BoltzGen to avoid residues 35 to 53 (to avoid the non-binding site, based on analyzing PDB:1U6G) and 69 to 71 (to prevent clashes with the truncated tail).
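
As a quick sanity check, the residue selections above can be written down as plain sets to confirm that none of the intended interface residues fall inside the avoided ranges. This sketch is independent of BoltzGen's actual configuration syntax, which is not reproduced here; the names `HOTSPOTS`, `AVOID_RANGES` and `expand` are just for illustration.

```python
def expand(ranges):
    """Expand inclusive (start, end) residue ranges into a set of residue numbers."""
    residues = set()
    for start, end in ranges:
        residues.update(range(start, end + 1))
    return residues


# Residue numbers taken from the description above.
HOTSPOTS = {81, 85, 88, 103, 106}
AVOID_RANGES = [(35, 53), (69, 71)]

avoided = expand(AVOID_RANGES)
overlap = HOTSPOTS & avoided

print(f"{len(avoided)} residues avoided, overlap with hotspots: {sorted(overlap)}")
```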

During the first few experiments, I discovered a few interesting ideas: (1) maybe I should design a peptide, to improve designability and achieve good binding at the same time, and (2) I noticed some of the generated designs were beta-barrels or beta-sandwiches with long-ish loops, so I wanted to design a VHH-inspired molecule with a beta-barrel framework region and binding loops.

So my second iteration with BoltzGen was to redesign the loops of two of the more successful beta-barrel designs using the "nanobody-anything" protocol. I fixed the non-loop regions and let BoltzGen regenerate only the loops.

Finally, I decided to try running a few redesign rounds for the non-binding regions using SolubleMPNN (T=0.1, with and without 0.1 backbone noise), in hopes of improving the pTM and pLDDT and making the full molecule "look" more plausible.

For filtering I used the following query in pandas: "design_ipsae_min > 0.12 & sc > 0.50 & interaction_pae < 15 & complex_plddt > 0.6 & CYS_fraction < 0.01 & ptm > 0.68 & seq_instability_index < 35 & liability_HydroPatch_count < 1 & liability_score < 100 & seq_gravy < 0.4"
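
For reference, here is a minimal sketch of how such a query string is applied with `DataFrame.query`. The column names and thresholds are the ones quoted above; the three example rows and their values are made up purely to show one design passing and two failing. Inside `query`, comparison operators bind tighter than `&`, so no parentheses are needed.

```python
import pandas as pd

FILTER_QUERY = (
    "design_ipsae_min > 0.12 & sc > 0.50 & interaction_pae < 15 "
    "& complex_plddt > 0.6 & CYS_fraction < 0.01 & ptm > 0.68 "
    "& seq_instability_index < 35 & liability_HydroPatch_count < 1 "
    "& liability_score < 100 & seq_gravy < 0.4"
)

# Made-up metrics for illustration only; real values come from the design runs.
base = {
    "design_ipsae_min": 0.30,
    "sc": 0.62,
    "interaction_pae": 9.0,
    "complex_plddt": 0.81,
    "CYS_fraction": 0.0,
    "ptm": 0.75,
    "seq_instability_index": 28.0,
    "liability_HydroPatch_count": 0,
    "liability_score": 40.0,
    "seq_gravy": -0.3,
}
designs = pd.DataFrame(
    [
        dict(base, name="passes_all"),
        dict(base, name="low_sc", sc=0.41),
        dict(base, name="unstable", seq_instability_index=42.0),
    ]
)

passed = designs.query(FILTER_QUERY)
print(passed["name"].tolist())
```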

The seq_instability_index and seq_gravy are computed using BioPython, and SC is computed using sc-rs (https://github.com/cytokineking/sc-rs). Everything else is obtained from BoltzGen.
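
In BioPython both sequence metrics are available on `Bio.SeqUtils.ProtParam.ProteinAnalysis`, as `.gravy()` and `.instability_index()`. To show what the GRAVY number means, here is a dependency-free sketch that averages the Kyte-Doolittle hydropathy values over the sequence. It assumes seq_gravy is the standard GRAVY score; the instability index is not reproduced here because it relies on a large dipeptide weight table.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


# Toy sequences, not actual designs.
print(round(gravy("AR"), 3))
print(round(gravy("ILVF"), 3))
```

More negative values indicate a more hydrophilic sequence, which is why the filter keeps only designs below a GRAVY cutoff.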

I added 4 more designs (submission_id={23,24,25}-an-original and 0-extra-selection1) without applying these filters: 23, 24, and 25 are the sequences before SolubleMPNN redesign, and 0-extra-selection1 is the design predicted to be the best-performing binder according to HADDOCK3 PRODIGY.

All the work was done on a MacBook M4 Max 128GB, running BoltzGen with changes from this PR (https://github.com/HannesStark/boltzgen/pull/145), starting March 6. All in all, I designed (and redesigned) about 4000 to 5000 molecules.

Proteins (26)
