[GEM x Adaptyv: RBX1 Binder Design Competition] Submission 1

N200486 DADI DILEEP KUMAR/[GEM x Adaptyv: RBX1 Binder Design Competition] Submission 1

Submitted to GEM x Adaptyv: RBX1 Binder Design Competition

Description

We developed a target-aware de novo binder design and optimization workflow for the GEM x Adaptyv RBX1 competition. The goal was not only to produce binders with strong structural confidence, but also to bias designs toward the correct structured RBX1 surface and then improve them through iterative sequence optimization.

We used the organizer-provided RBX1 sequence: MAAAMDVDTPSGTNSGAGKKRFEVKKWNAVALWAWDIVVDNCAICRNHIMDLCIECQANQASATSEECTVAWGVCNHAFHFHCISRWLKTRQVCPLDNREWEFQKYGH. Because RBX1 contains an intrinsically disordered N-terminal region and a structured zinc-coordinated C-terminal RING domain, we focused modeling and optimization on the structured region RBX1 residues 40-108, including the zinc-stabilized fold. This gave a cleaner and more reliable target for structural prediction and sequence optimization than using the entire protein uniformly.
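As an illustration only (not the code used for this submission), the target truncation can be sketched in Python. The helper name `slice_region` is hypothetical; the sequence is the organizer-provided one quoted above with whitespace removed, and residue numbering is assumed to be 1-based and inclusive.

```python
# Organizer-provided RBX1 sequence (108 residues), whitespace removed.
RBX1 = (
    "MAAAMDVDTPSGTNSGAGKKRFEVKKWNAVALWAWDIVVDNCAICRNHIMDLCIECQANQ"
    "ASATSEECTVAWGVCNHAFHFHCISRWLKTRQVCPLDNREWEFQKYGH"
)


def slice_region(seq: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive numbering."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"region {start}-{end} is outside 1-{len(seq)}")
    return seq[start - 1:end]


# Structured RING-domain region used as the modeling target.
target = slice_region(RBX1, 40, 108)
print(len(RBX1), len(target))
```

Keeping the original numbering (rather than renumbering the truncated chain from 1) makes it easier to refer to hotspot residues consistently later in the workflow.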

We also fixed an explicit target epitope rather than allowing binders to drift toward arbitrary RBX1 surfaces. The expanded hotspot set used throughout the workflow was ALA43, ILE44, ARG46, ASN47, ILE54, GLU55, GLN57, ALA58, TRP87, ARG91, PRO95, LEU96, corresponding to chain B positions 43, 44, 46, 47, 54, 55, 57, 58, 87, 91, 95, 96. Within this set, we treated ILE44, GLN57, TRP87, ARG91, PRO95, LEU96 as a core hotspot subset. Candidate binders were rewarded only if they retained contact with this intended hotspot face.
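A quick consistency check of the hotspot labels against the RBX1 sequence might look like the following sketch. The function names are hypothetical, and the check assumes the hotspot numbers refer to full-length RBX1 numbering (1-based), which is what the labels above imply.

```python
RBX1 = (
    "MAAAMDVDTPSGTNSGAGKKRFEVKKWNAVALWAWDIVVDNCAICRNHIMDLCIECQANQ"
    "ASATSEECTVAWGVCNHAFHFHCISRWLKTRQVCPLDNREWEFQKYGH"
)

# Only the three-letter codes that occur in the hotspot list are needed.
THREE_TO_ONE = {
    "ALA": "A", "ILE": "I", "ARG": "R", "ASN": "N", "GLU": "E",
    "GLN": "Q", "TRP": "W", "PRO": "P", "LEU": "L",
}

EXPANDED = ["ALA43", "ILE44", "ARG46", "ASN47", "ILE54", "GLU55",
            "GLN57", "ALA58", "TRP87", "ARG91", "PRO95", "LEU96"]
CORE = ["ILE44", "GLN57", "TRP87", "ARG91", "PRO95", "LEU96"]


def parse_hotspot(label: str) -> tuple[str, int]:
    """Split a label such as 'ILE44' into ('ILE', 44)."""
    return label[:3], int(label[3:])


def mismatched_hotspots(seq: str, labels: list[str]) -> list[str]:
    """Return labels whose residue type disagrees with the sequence."""
    bad = []
    for label in labels:
        name, pos = parse_hotspot(label)
        if not (1 <= pos <= len(seq)) or seq[pos - 1] != THREE_TO_ONE[name]:
            bad.append(label)
    return bad


print(mismatched_hotspots(RBX1, EXPANDED))
```

An empty result means every hotspot label agrees with the residue at that position, and all positions fall inside the 40-108 modeling window.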

Our initial sequence pool came from multiple open-source de novo design workflows rather than a single generator. We used designs produced from PPIFlow, PXDesign, RFdiffusion-related pipelines, Boltz-centered workflows, and additional internal de novo binder generation routes. The purpose of mixing sources was to increase structural and sequence diversity before optimization, which is important for RBX1 because it is a compact and difficult target with limited surface area.

All candidates were evaluated with a common scoring stack centered on Boltz2. For each binder, we predicted an RBX1-binder complex against the fixed RBX1 40-108 target, then extracted interface-relevant confidence metrics. We treated ipSAE as the primary optimization score because it was the most useful measure of interface quality in our setting. We also tracked protein_iptm, complex confidence, and related interface metrics. In parallel, we estimated binding affinity with PRODIGY as a secondary proxy and applied simple developability filters, including checks on hydrophobic runs, sequence liabilities, cysteine burden, and overall sequence composition. This was necessary because small targets can otherwise favor sticky or unrealistic binders.
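The developability checks are described only in general terms, so the following is a minimal sketch of what such filters could look like rather than the filters actually used. The thresholds (`max_run`, `max_cys`, `max_hydrophobic_fraction`), the hydrophobic residue set, and the choice of the NG deamidation motif as the example sequence liability are all assumptions made for illustration.

```python
HYDROPHOBIC = set("AVILMFWY")  # assumed definition of "hydrophobic"


def longest_hydrophobic_run(seq: str) -> int:
    """Length of the longest stretch of consecutive hydrophobic residues."""
    best = run = 0
    for aa in seq:
        run = run + 1 if aa in HYDROPHOBIC else 0
        best = max(best, run)
    return best


def developability_flags(
    seq: str,
    max_run: int = 5,                        # assumed threshold
    max_cys: int = 2,                        # assumed threshold
    max_hydrophobic_fraction: float = 0.45,  # assumed threshold
) -> list[str]:
    """Return a list of triggered flags; an empty list means no flags."""
    if not seq:
        raise ValueError("empty sequence")
    flags = []
    if longest_hydrophobic_run(seq) > max_run:
        flags.append("hydrophobic_run")
    if seq.count("C") > max_cys:
        flags.append("cysteine_burden")
    if sum(aa in HYDROPHOBIC for aa in seq) / len(seq) > max_hydrophobic_fraction:
        flags.append("hydrophobic_composition")
    if "NG" in seq:
        flags.append("deamidation_motif_NG")
    return flags
```

Returning named flags instead of a single pass/fail value makes it possible to compare a mutant against its parent flag by flag.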

Our ranking logic was multi-objective but ipSAE-heavy. A sequence was considered promising only if it combined high interface confidence, correct hotspot engagement, acceptable predicted affinity, and acceptable developability. We rejected candidates that improved a single metric while clearly worsening hotspot placement or sequence quality.
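One way to express an ipSAE-heavy, multi-objective ranking is to apply hard filters first and then sort by a weighted score. The sketch below is illustrative only: the `Candidate` fields, the filter cutoffs, and the weights (0.7 / 0.2 / 0.1) are assumptions, not the values used for this submission.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    ipsae: float         # interface score in [0, 1], higher is better
    iptm: float          # protein_iptm in [0, 1], higher is better
    core_contacts: int   # core hotspots contacted, 0..6
    dg_kcal_mol: float   # predicted binding free energy, lower is better
    dev_flags: int       # number of developability flags triggered


def passes_filters(c: Candidate, min_core_contacts: int = 4,
                   max_dg: float = -8.0, max_dev_flags: int = 0) -> bool:
    """Hard requirements: hotspot engagement, affinity, developability."""
    return (c.core_contacts >= min_core_contacts
            and c.dg_kcal_mol <= max_dg
            and c.dev_flags <= max_dev_flags)


def score(c: Candidate, w_ipsae: float = 0.7, w_iptm: float = 0.2,
          w_contacts: float = 0.1) -> float:
    """Weighted score dominated by ipSAE (weights are assumed)."""
    return (w_ipsae * c.ipsae + w_iptm * c.iptm
            + w_contacts * c.core_contacts / 6)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Drop candidates failing any hard filter, then sort best-first."""
    return sorted((c for c in candidates if passes_filters(c)),
                  key=score, reverse=True)
```

With this structure, a candidate with the highest ipSAE is still excluded if it misses the hotspot face, which mirrors the rule of rejecting single-metric improvements.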

After broad generation and first-pass filtering, we switched to sequence-only local optimization. We did not regenerate full backbones once good parents existed. Instead, we treated the best binders as parent sequences and optimized them through iterative local mutations. For each parent, we first performed a baseline evaluation, then identified a small number of mutable positions, usually interface residues, lower-confidence positions, or residues associated with aggregation or poor surface chemistry.

We then generated restricted single mutants and rescored every variant with the same full stack: Boltz2 complex prediction, ipSAE computation, PRODIGY, hotspot-contact counting, and developability analysis. Surviving single mutations were then combined into focused double-mutant candidates, and in a few cases into limited triple-mutant candidates. We did not assume additivity; every combination was rescored from scratch because epistasis was common.
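The enumeration step can be sketched as follows. This is a hypothetical illustration: the function names, the representation of a variant as a tuple of `(position, residue)` pairs, and the `allowed` mapping of mutable positions to permitted residues are all assumptions. Scoring is deliberately left out, since each variant would be rescored with the full stack rather than inferred from its components.

```python
from itertools import combinations

Mutation = tuple[int, str]          # (1-based position, new residue)
Variant = tuple[Mutation, ...]      # one or more mutations on a parent


def single_mutants(parent: str, allowed: dict[int, str]) -> list[Variant]:
    """Enumerate restricted single mutants at the chosen positions."""
    variants = []
    for pos in sorted(allowed):
        wild_type = parent[pos - 1]
        for aa in allowed[pos]:
            if aa != wild_type:
                variants.append(((pos, aa),))
    return variants


def apply_mutations(parent: str, variant: Variant) -> str:
    """Return the mutated sequence for a variant."""
    seq = list(parent)
    for pos, aa in variant:
        seq[pos - 1] = aa
    return "".join(seq)


def combine(survivors: list[Variant], order: int = 2) -> list[Variant]:
    """Merge surviving variants into higher-order ones (2 = doubles),
    skipping combinations that touch the same position twice."""
    combos = []
    for group in combinations(survivors, order):
        merged = tuple(m for variant in group for m in variant)
        positions = [pos for pos, _ in merged]
        if len(set(positions)) == len(positions):
            combos.append(merged)
    return combos
```

For example, `combine(survivors, order=3)` would produce the limited triple-mutant candidates from the same list of surviving singles.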

We used strict parent-relative gating. A mutant was kept only if it improved or maintained ipSAE, kept protein_iptm within an acceptable range, preserved core and expanded hotspot contacts, did not worsen predicted affinity beyond a small tolerance, and did not regress in developability. This prevented false positives that looked good in isolation but were not true improvements over the parent binder.
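The parent-relative gate can be written as a single predicate over parent and mutant metrics. In this sketch, the `Metrics` fields and the tolerance values (`iptm_tol`, `dg_tol`) are assumptions chosen for illustration; the text does not state the actual tolerances.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    ipsae: float             # higher is better
    iptm: float              # protein_iptm, higher is better
    core_contacts: int       # core hotspots contacted
    expanded_contacts: int   # expanded hotspots contacted
    dg_kcal_mol: float       # predicted binding free energy, lower is better
    dev_flags: int           # developability flags triggered


def passes_gate(parent: Metrics, mutant: Metrics,
                iptm_tol: float = 0.02,   # assumed tolerance
                dg_tol: float = 0.3       # assumed tolerance, kcal/mol
                ) -> bool:
    """Keep a mutant only if it is no worse than its parent on every axis."""
    return (mutant.ipsae >= parent.ipsae
            and mutant.iptm >= parent.iptm - iptm_tol
            and mutant.core_contacts >= parent.core_contacts
            and mutant.expanded_contacts >= parent.expanded_contacts
            and mutant.dg_kcal_mol <= parent.dg_kcal_mol + dg_tol
            and mutant.dev_flags <= parent.dev_flags)
```

Because every condition is relative to the parent, a mutant with a high absolute score is still rejected if its own parent scored higher.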

We also ran a broader exploration branch across diverse design families to avoid overcommitting too early to one scaffold class. This confirmed that while one family yielded the strongest optimized variants, other families also produced competitive starting points.

The final ranked submission therefore came from two linked stages: diverse de novo binder generation from multiple open-source methods, followed by target-specific local sequence optimization under a consistent RBX1 structural scoring stack.

Proteins (89)
