[GEM x Adaptyv: RBX1 Binder Design Competition] Submission 1

Bingyi Zhao/[GEM x Adaptyv: RBX1 Binder Design Competition] Submission 1

Submitted to GEM x Adaptyv: RBX1 Binder Design Competition

Description

NVIDIA Proteina-Complexa [1] was used to design de novo single-chain binders against RBX1, running on an NVIDIA A100 GPU.

The workflow was contributed by Bingyi Zhao, Dr. Haipei Liu, and Dr. Zhiwen Jiang. It integrates considerations from structure-based design, single-molecule stability analysis, and data-driven prioritization strategies motivated by downstream protein engineering and pharmaceutical constraints.

In contrast to purely structure-driven design workflows, our approach explicitly reflects preclinical pipeline considerations, where early-stage decisions strongly influence downstream success rates. In such settings, candidates are not evaluated solely on predicted binding performance, but on their likelihood to remain viable through expression, purification, formulation, and repeated handling under non-ideal conditions. Accordingly, the workflow emphasizes early risk identification rather than late-stage optimization. Design choices, filtering criteria, and ranking signals are aligned with common failure modes observed in protein development, including aggregation propensity, structural instability, and sensitivity to minor perturbations. This shifts the objective from maximizing peak in silico performance to selecting candidates with more robust overall profiles and fewer latent liabilities.

From a data perspective, this corresponds to a prioritization strategy that integrates multiple weak but complementary indicators (structural confidence, compactness, sequence-level features, and consistency across representations) rather than relying on a single dominant metric. Such multi-factor selection is intended to better approximate real-world decision-making in preclinical pipelines, where uncertainty is high and failure is often driven by combined effects rather than single-point deficiencies.
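
One way to combine several weak indicators without letting any single one dominate is to rank candidates by their mean rank across metrics. The sketch below is illustrative only: the metric names, their scales, the "lower is better" conventions, and the toy values are assumptions, not the scoring actually used for this submission.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    structural_confidence: float  # hypothetical 0-1 score, higher is better
    radius_of_gyration: float     # angstroms, lower taken as more compact
    hydrophobic_fraction: float   # assumed sequence-level proxy, lower is better


def rank_positions(values, higher_is_better):
    """Return the 1-based rank of each value (1 = best)."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    ranks = [0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = position + 1
    return ranks


def mean_rank_order(candidates):
    """Sort candidates by their average rank over all metrics."""
    metric_specs = [
        ("structural_confidence", True),
        ("radius_of_gyration", False),
        ("hydrophobic_fraction", False),
    ]
    totals = [0.0] * len(candidates)
    for attribute, higher_is_better in metric_specs:
        values = [getattr(c, attribute) for c in candidates]
        for i, rank in enumerate(rank_positions(values, higher_is_better)):
            totals[i] += rank
    scored = [(totals[i] / len(metric_specs), c.name)
              for i, c in enumerate(candidates)]
    return sorted(scored)  # lowest mean rank first


# Toy pool: design_a has the best single metric but weaker support elsewhere.
pool = [
    Candidate("design_a", 0.92, 16.5, 0.48),
    Candidate("design_b", 0.85, 13.2, 0.35),
    Candidate("design_c", 0.80, 13.9, 0.38),
]
ranking = mean_rank_order(pool)
print(ranking[0][1])  # design_b
```

In this toy pool the candidate with the highest structural confidence does not come first, because the other two indicators do not support it. That is the behaviour the multi-factor strategy is meant to produce.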

Multiple RBX1 target representations were tested, including trimmed constructs in which part of the disordered N-terminus was removed while the zinc-stabilized folded region was retained. Priority was given to trimmed targets centered on the structured C-terminal domain. Hotspot-guided runs were then performed on selected surface residues, particularly around positions 44, 89, 90, and 91. Binder lengths were explored across several ranges depending on the target construct.
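
When a target is trimmed, hotspot positions given in full-length numbering have to be shifted to the numbering of the trimmed construct. The following is a minimal sketch of that bookkeeping. The trim boundaries (40 to 108) and the placeholder sequence are hypothetical, chosen only so that the hotspot positions mentioned above fall inside the retained region. They are not the constructs used in this submission.

```python
def trim_and_renumber(sequence, first_kept, last_kept, hotspots):
    """Trim a sequence to [first_kept, last_kept] (1-based, inclusive)
    and map each original hotspot position to its trimmed position."""
    if not 1 <= first_kept <= last_kept <= len(sequence):
        raise ValueError("trim boundaries fall outside the sequence")
    trimmed = sequence[first_kept - 1:last_kept]
    remapped = {}
    for position in hotspots:
        if not first_kept <= position <= last_kept:
            raise ValueError(f"hotspot {position} was removed by trimming")
        remapped[position] = position - first_kept + 1
    return trimmed, remapped


# Placeholder string, NOT the real RBX1 sequence; the length is arbitrary.
placeholder_target = "G" * 108

trimmed, remapped = trim_and_renumber(
    placeholder_target,
    first_kept=40,   # hypothetical start of the retained folded region
    last_kept=108,   # hypothetical end of the construct
    hotspots=[44, 89, 90, 91],
)
print(len(trimmed))  # 69
print(remapped)      # {44: 5, 89: 50, 90: 51, 91: 52}
```

Raising an error when a hotspot falls outside the retained region guards against silently conditioning on a residue that no longer exists in the trimmed target.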

The standard Complexa workflow of generate → filter → evaluate → analyze was used.

Large candidate pools were sampled first, followed by reward-based filtering and stricter downstream selection thresholds in order to reduce computational cost while preserving the most promising candidates. Final ranking emphasized generative reward, complex confidence, and monomer structural plausibility.
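
The cost-saving logic of this funnel can be sketched as two stages: a cheap reward cut over the whole pool, then an expensive evaluation applied only to the shortlist. The code below is a schematic, not the Complexa implementation. The field names (`reward`, `complex_confidence`, `monomer_confidence`), the thresholds, and the stub evaluator are all assumptions made for illustration.

```python
def top_fraction_by_reward(candidates, keep_fraction):
    """Stage 1: keep the best-scoring fraction of the pool by cheap reward."""
    keep_count = max(1, int(len(candidates) * keep_fraction))
    ranked = sorted(candidates, key=lambda c: c["reward"], reverse=True)
    return ranked[:keep_count]


def passes_thresholds(candidate, thresholds):
    """Stage 2 check: every listed metric must reach its minimum."""
    return all(candidate[metric] >= minimum
               for metric, minimum in thresholds.items())


def run_funnel(candidates, keep_fraction, thresholds, evaluate):
    """Run the expensive `evaluate` step only on reward-filtered candidates."""
    shortlisted = top_fraction_by_reward(candidates, keep_fraction)
    evaluated = [{**c, **evaluate(c)} for c in shortlisted]
    return [c for c in evaluated if passes_thresholds(c, thresholds)]


# Toy pool with hypothetical generative rewards.
rewards = [0.1, 0.9, 0.4, 0.8, 0.3, 0.7, 0.2, 0.6]
pool = [{"name": f"d{i}", "reward": r} for i, r in enumerate(rewards)]

# Stub standing in for an expensive structure-prediction step.
fake_metrics = {
    "d1": {"complex_confidence": 0.85, "monomer_confidence": 0.90},
    "d3": {"complex_confidence": 0.60, "monomer_confidence": 0.95},
    "d5": {"complex_confidence": 0.82, "monomer_confidence": 0.70},
    "d7": {"complex_confidence": 0.90, "monomer_confidence": 0.88},
}
evaluation_calls = []


def stub_evaluate(candidate):
    evaluation_calls.append(candidate["name"])
    return fake_metrics[candidate["name"]]


survivors = run_funnel(
    pool,
    keep_fraction=0.5,
    thresholds={"complex_confidence": 0.8, "monomer_confidence": 0.8},
    evaluate=stub_evaluate,
)
print([c["name"] for c in survivors])  # ['d1', 'd7']
print(len(evaluation_calls))           # 4
```

Only half of the toy pool reaches the expensive step, and a candidate must clear both confidence thresholds to survive. A high reward alone is not sufficient.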

Hypothesis and design rationale. The main hypothesis was that RBX1 would be more tractable when the structured, zinc-coordinated region was used as the principal conditioning surface, while hotspot residues could bias the search toward more relevant epitopes. It was not assumed that the disordered N-terminus alone would provide a sufficiently reliable docking surface for de novo binder discovery. Instead, trimming flexible low-information segments was expected to improve geometric consistency, and hotspot-guided runs were expected to improve interface quality compared with unrestricted exploration.

A second hypothesis was that predicted binding alone should not determine prioritization. Candidates were considered more valuable when binding-oriented metrics were supported by signs of structural stability and early developability.

Final selection was additionally guided by PRIB developability considerations. Beyond predicted binding quality, priority was given to candidates with more plausible monomer behavior, better structural compactness, and fewer apparent liabilities related to instability, weak packing, or aggregation risk. This was intended to enrich for binders that are not only computationally plausible, but also more credible as wet-lab starting points.
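
Sequence-level liability screens of this kind are often approximated with simple heuristics. The sketch below flags sequences with a high hydrophobic fraction or a long uninterrupted hydrophobic stretch, two crude proxies sometimes associated with aggregation risk. The residue set, both cutoffs, and the example sequences are assumptions for illustration. They are not the PRIB criteria, which are not specified here.

```python
# Assumed hydrophobic residue set; other definitions exist.
HYDROPHOBIC = set("AVILMFWY")


def hydrophobic_fraction(sequence):
    """Fraction of residues in the assumed hydrophobic set."""
    return sum(residue in HYDROPHOBIC for residue in sequence) / len(sequence)


def longest_hydrophobic_run(sequence):
    """Length of the longest uninterrupted hydrophobic stretch."""
    longest = current = 0
    for residue in sequence:
        current = current + 1 if residue in HYDROPHOBIC else 0
        longest = max(longest, current)
    return longest


def liability_flags(sequence, max_fraction=0.5, max_run=6):
    """Return a list of heuristic flags; cutoffs are illustrative defaults."""
    flags = []
    if hydrophobic_fraction(sequence) > max_fraction:
        flags.append("high_hydrophobic_fraction")
    if longest_hydrophobic_run(sequence) > max_run:
        flags.append("long_hydrophobic_run")
    return flags


print(liability_flags("DKELLKKAEE"))  # []
print(liability_flags("AVILMFWYKD"))  # both flags raised
```

Flags like these are best read as prompts for closer inspection rather than as pass/fail verdicts, in keeping with the use of multiple weak indicators described above.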

Supplementary notes. Target choice, hotspot selection, and filtering thresholds were found to strongly influence output quality. Free exploration was capable of producing plausible interfaces, but hotspot-guided runs were more effective for concentrating the search on interpretable RBX1 surface patches. Final submitted candidates were therefore selected conservatively, with preference given to designs supported by both binding-oriented metrics and PRIB-inspired developability considerations.

Reference [1] Didi, Kieran, et al. "Scaling Atomistic Protein Binder Design with Generative Pretraining and Test-Time Compute." The Fourteenth International Conference on Learning Representations.

Proteins (21)
