[GEM x Adaptyv: RBX1 Binder Design Competition] Submission 1

Submitted to GEM x Adaptyv: RBX1 Binder Design Competition

Description

Submission Overview — Void Labs
Harsha Poonepalle & Vedant Kalipatnapu

Design Method: Optimized Mosaic Pipeline

Our approach builds on the Mosaic binder design pipeline (Protenix2025 hallucination + Boltz2 cross-validation), which we thank the original authors for making available. We optimized its loss weights, filtering thresholds, and ranking strategy based on a large-scale analysis of experimental binding outcomes.

We started by analyzing the full Proteinbase dataset (5,253 designs, 427 confirmed binders across multiple methods). We computed ROC-AUC for every available structural metric against experimental binding to determine which computational scores actually predict real-world success. The results were clear: min_ipSAE (the minimum of the binder→target and target→binder ipSAE interface scores) achieved an AUC of 0.806, far outperforming the commonly used ipTM (AUC 0.628) and pDockQ (AUC 0.500, effectively random). We also found that sequence-level features such as glutamate enrichment (AUC 0.594) and low glycine content (AUC 0.625) correlate with binding success, consistent with charged helical interfaces dominating the hit distribution.
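
The metric comparison above can be sketched in a few lines of plain Python. This is a minimal illustration, not our analysis code: the function names (`roc_auc`, `min_ipsae`, `residue_fraction`) are hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) definition, where ties count as half a win.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC-AUC: probability that a random binder outscores a random non-binder.

    labels use 1 for confirmed binders and 0 for non-binders; ties count 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def min_ipsae(binder_to_target: float, target_to_binder: float) -> float:
    """Worst-direction interface score: the lower of the two directional ipSAE values."""
    return min(binder_to_target, target_to_binder)


def residue_fraction(sequence: str, residue: str) -> float:
    """Fraction of positions in `sequence` occupied by the one-letter `residue`."""
    if not sequence:
        raise ValueError("empty sequence")
    return sequence.upper().count(residue.upper()) / len(sequence)
```

For a feature where lower values are expected to favor binding, such as glycine content, the negated fraction can be passed as the score so that an AUC above 0.5 still means "predictive in the expected direction".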

These findings directly informed how we modified the Mosaic loss function. Rather than using the default weights, we retuned 14 loss terms to emphasize what the data showed matters most: ipSAE_min receives a 1.5x weight to enforce a high floor on the worst-direction interface score, the two directional ipSAE terms are each weighted at 0.5x, and a hard NoCysteine penalty (5.0x) eliminates disulfide-dependent designs that tend to fail in expression. MPNN-guided sequence recovery (weight 8.0) maintains designability, and we run a two-phase optimization: 120 steps of continuous relaxation followed by 20 sharpening steps.
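
As a rough illustration of the weighting described above, the sketch below combines per-term loss values into a single weighted sum. It does not use the Mosaic API: the dictionary keys, `SCHEDULE`, and `total_loss` are hypothetical names, and only the five weights quoted in this description are listed (the remaining retuned terms are omitted rather than guessed).

```python
from typing import Mapping

# Weights quoted in the description; key names are illustrative assumptions.
LOSS_WEIGHTS = {
    "ipsae_min": 1.5,                # floor on the worst-direction interface score
    "ipsae_binder_to_target": 0.5,   # directional ipSAE term
    "ipsae_target_to_binder": 0.5,   # directional ipSAE term
    "no_cysteine": 5.0,              # hard penalty on cysteine-containing designs
    "mpnn_sequence_recovery": 8.0,   # keeps sequences designable
}

# Two-phase optimization schedule quoted in the description.
SCHEDULE = {"relaxation_steps": 120, "sharpening_steps": 20}


def total_loss(term_values: Mapping[str, float],
               weights: Mapping[str, float] = LOSS_WEIGHTS) -> float:
    """Weighted sum of loss terms; every weighted term must have a value."""
    missing = sorted(set(weights) - set(term_values))
    if missing:
        raise KeyError(f"missing loss terms: {missing}")
    return sum(weights[name] * term_values[name] for name in weights)
```

With every term equal to 1.0, the listed weights sum to 15.5, which makes clear how strongly sequence recovery and the cysteine penalty dominate the objective relative to the interface terms.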

After Protenix generates candidates, every design passes through Boltz2 cross-validation (best-of-5 sampling, 8 recycling steps). Because the design predictor and the validator share no weights, agreement between them provides an independent signal rather than mere self-consistency. Candidates are filtered on Boltz2 ipTM ≥ 0.90, Boltz2 min_ipSAE ≥ 0.70, binder pLDDT ≥ 0.90, and zero cysteines, then ranked by a composite score weighted toward the metrics we validated as most predictive (40% min_ipSAE, 30% Boltz2 ipTM, 20% Protenix ipTM, 10% Boltz2 pLDDT).
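
The filtering and ranking step above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the pipeline's code: the `Candidate` fields and function names are hypothetical, pLDDT is assumed to be on a 0 to 1 scale, and the min_ipSAE used in the composite is assumed to be the Boltz2 value.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Candidate:
    name: str
    sequence: str
    boltz2_iptm: float
    boltz2_min_ipsae: float
    boltz2_plddt: float    # binder pLDDT, assumed 0-1 scale
    protenix_iptm: float


def passes_filters(c: Candidate) -> bool:
    """Hard thresholds quoted in the description, plus the zero-cysteine rule."""
    return (
        c.boltz2_iptm >= 0.90
        and c.boltz2_min_ipsae >= 0.70
        and c.boltz2_plddt >= 0.90
        and "C" not in c.sequence.upper()
    )


def composite_score(c: Candidate) -> float:
    """40% min_ipSAE, 30% Boltz2 ipTM, 20% Protenix ipTM, 10% Boltz2 pLDDT."""
    return (
        0.40 * c.boltz2_min_ipsae
        + 0.30 * c.boltz2_iptm
        + 0.20 * c.protenix_iptm
        + 0.10 * c.boltz2_plddt
    )


def rank_candidates(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Drop candidates that fail any filter, then sort best-first by composite score."""
    kept = [c for c in candidates if passes_filters(c)]
    return sorted(kept, key=composite_score, reverse=True)
```

Filtering before ranking means a design with an excellent composite score is still discarded if it fails any single threshold, which matches the "strict thresholds first" ordering described above.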

In our Proteinbase benchmarking, designs run through this optimized pipeline achieved an 82.1% hit rate, the highest of any method in the dataset and a statistically significant difference (p < 10⁻¹⁵). We attribute this to: (1) loss weights calibrated against real experimental outcomes rather than defaults, (2) dual-predictor filtering that eliminates false positives from any single model, and (3) strict metric thresholds derived from AUC analysis rather than convention.

Proteins (57)

Void Labs

id: vast-dove-vine

Binder

Miniprotein

Mosaic (Weights-optimized) — Protenix2025 + Boltz2

Target: RBX1

ipSAE: 0.81

pLDDT: 88.21