Submission Overview — Void Labs
Harsha Poonepalle & Vedant Kalipatnapu
Design Method: Optimized Mosaic Pipeline
Our approach builds on the Mosaic binder design pipeline (Protenix2025 hallucination + Boltz2 cross-validation); we thank the original authors for
making it available. We tuned its loss weights, filtering thresholds, and ranking strategy based on a large-scale analysis of experimental binding
outcomes.
We started by analyzing the full Proteinbase dataset (5,253 designs, 427 confirmed binders across multiple methods). We computed ROC-AUC for every
available structural metric against experimental binding to determine which computational scores actually predict experimental success. The results were
clear: min_ipSAE (the minimum of binder→target and target→binder interface AlignmentError scores) achieved an AUC of 0.806, far outperforming the
commonly used ipTM (AUC 0.628) and pDockQ (AUC 0.500, effectively random). We also found that sequence-level features such as glutamate enrichment (AUC
0.594) and low glycine content (AUC 0.625) correlate with binding success, consistent with charged helical interfaces dominating the hit distribution.
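The metric comparison above can be illustrated with a minimal sketch. It assumes each design has a numeric score and a 0/1 binding label; the function names and the toy values are hypothetical and are not taken from the Proteinbase data or from our analysis code. The AUC here is the rank-based form: the probability that a randomly chosen binder scores higher than a randomly chosen non-binder, with ties counted as half.

```python
from typing import Sequence


def min_ipsae(binder_to_target: float, target_to_binder: float) -> float:
    """Worst-direction interface score: the smaller of the two directional ipSAE values."""
    return min(binder_to_target, target_to_binder)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based ROC-AUC for a score where higher is expected to mean 'more likely to bind'.

    labels: 1 for an experimentally confirmed binder, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos) * len(neg))


# Toy, made-up values purely for illustration.
toy_scores = [0.82, 0.41, 0.77, 0.35, 0.60]
toy_labels = [1, 0, 0, 0, 1]
print(roc_auc(toy_scores, toy_labels))
```

For a feature where lower values are expected to favor binding (such as glycine content), the same function can be applied to the negated feature.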
These findings directly informed how we modified the Mosaic loss function. Rather than using default weights, we retuned 14 loss terms to emphasize what
the data told us matters most: ipSAE_min receives a 1.5x weight to enforce a high floor on the worst-direction interface score, bidirectional ipSAE terms
are each weighted at 0.5x, and a hard NoCysteine penalty (5.0x) eliminates disulfide-dependent designs that tend to fail in expression. MPNN-guided
sequence recovery (8.0x) maintains designability, and we run a two-phase optimization: 120 steps of continuous relaxation followed by 20
sharpening steps.
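As a rough illustration of how such weights combine, the sketch below forms a weighted sum over the loss terms named above. This is not the Mosaic API: the dictionary keys, the `total_loss` helper, and the schedule tuple are hypothetical names, and only the five weights stated in the text are included (the remaining retuned terms are omitted rather than guessed).

```python
from typing import Mapping

# Weights stated in the text; key names are illustrative, not Mosaic identifiers.
LOSS_WEIGHTS = {
    "ipsae_min": 1.5,               # floor on the worst-direction interface score
    "ipsae_binder_to_target": 0.5,  # directional ipSAE term
    "ipsae_target_to_binder": 0.5,  # directional ipSAE term
    "no_cysteine": 5.0,             # hard penalty on cysteine content
    "mpnn_sequence_recovery": 8.0,  # MPNN-guided designability term
}

# Two-phase schedule from the text: (phase name, number of steps).
SCHEDULE = (("continuous_relaxation", 120), ("sharpening", 20))


def total_loss(terms: Mapping[str, float],
               weights: Mapping[str, float] = LOSS_WEIGHTS) -> float:
    """Weighted sum of per-term loss values; every weighted term must be supplied."""
    missing = set(weights) - set(terms)
    if missing:
        raise KeyError(f"missing loss terms: {sorted(missing)}")
    return sum(weights[name] * terms[name] for name in weights)


# Made-up per-term values, only to show the call.
example_terms = {name: 1.0 for name in LOSS_WEIGHTS}
print(total_loss(example_terms))
print(sum(steps for _, steps in SCHEDULE))
```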
After Protenix generates candidates, every design passes through Boltz2 cross-validation (best-of-5 sampling, 8 recycling steps). Because the design
predictor and the validator share no weights, agreement between them provides a genuine orthogonal signal rather than self-consistency. Candidates are
filtered on Boltz2 ipTM ≥ 0.90, Boltz2 min_ipSAE ≥ 0.70, binder pLDDT ≥ 0.90, and zero cysteines, then ranked by a composite score weighted toward the
metrics we validated as most predictive (40% min_ipSAE, 30% Boltz2 ipTM, 20% Protenix ipTM, 10% Boltz2 pLDDT).
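The filtering and ranking step can be sketched as follows. The thresholds and composite weights are the ones given above; the `Candidate` class, its field names, and the example values are hypothetical. Two assumptions are made for the sketch: pLDDT is on a 0 to 1 scale (matching the 0.90 threshold), and the min_ipSAE and pLDDT entering the composite are the same Boltz2 values used for filtering.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    name: str
    sequence: str           # binder amino-acid sequence, one-letter codes
    boltz_iptm: float
    boltz_min_ipsae: float
    boltz_plddt: float      # binder pLDDT, assumed to be on a 0-1 scale
    protenix_iptm: float


def passes_filters(c: Candidate) -> bool:
    """Hard filters: Boltz2 ipTM, Boltz2 min_ipSAE, binder pLDDT, and zero cysteines."""
    return (
        c.boltz_iptm >= 0.90
        and c.boltz_min_ipsae >= 0.70
        and c.boltz_plddt >= 0.90
        and "C" not in c.sequence.upper()
    )


def composite_score(c: Candidate) -> float:
    """40% min_ipSAE, 30% Boltz2 ipTM, 20% Protenix ipTM, 10% Boltz2 pLDDT."""
    return (
        0.40 * c.boltz_min_ipsae
        + 0.30 * c.boltz_iptm
        + 0.20 * c.protenix_iptm
        + 0.10 * c.boltz_plddt
    )


def rank_candidates(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Drop candidates that fail any filter, then sort best-first by composite score."""
    return sorted(filter(passes_filters, candidates), key=composite_score, reverse=True)


# Made-up candidates for illustration only.
pool = [
    Candidate("a", "MEELLKK", 0.92, 0.75, 0.93, 0.88),
    Candidate("b", "MEELLKR", 0.95, 0.85, 0.95, 0.90),
    Candidate("c", "MCEELLK", 0.97, 0.90, 0.96, 0.93),  # contains a cysteine
    Candidate("d", "MEELLKE", 0.85, 0.80, 0.95, 0.90),  # Boltz2 ipTM below 0.90
]
print([c.name for c in rank_candidates(pool)])
```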
In our Proteinbase benchmarking, designs run through this optimized pipeline achieved an 82.1% hit rate, significantly higher than any other method in
the dataset (p < 10⁻¹⁵). We attribute this to: (1) loss weights calibrated against real experimental outcomes rather than defaults, (2) dual-predictor
filtering that eliminates false positives from any single model, and (3) strict metric thresholds derived from AUC analysis rather than convention.
id: vast-dove-vine | RBX1 | 0.81 | 88.21 | -- | 7.1 kDa | 65
id: calm-bear-quartz | RBX1 | 0.90 | 75.84 | -- | 7.4 kDa | 65
id: rapid-yak-topaz | RBX1 | 0.84 | 85.65 | -- | 7.2 kDa | 65
id: shy-dove-frost | RBX1 | 0.82 | 86.11 | -- | 7.3 kDa | 65
id: silver-fox-jade | RBX1 | 0.83 | 79.40 | -- | 7.6 kDa | 65
id: rough-cobra-clay | RBX1 | 0.91 | 87.52 | -- | 7.2 kDa | 65
id: crimson-fox-leaf | -- | -- | -- | -- | -- | 65
id: strong-bat-granite | RBX1 | 0.71 | 87.98 | -- | 6.9 kDa | 65
id: gentle-falcon-vine | RBX1 | 0.73 | 85.02 | -- | 7.5 kDa | 65
id: vast-ram-oak | -- | -- | -- | -- | -- | 65
id: soft-kiwi-snow | RBX1 | 0.86 | 87.27 | -- | 7.2 kDa | 65
id: bright-goat-lava | RBX1 | 0.86 | 87.69 | -- | 7.0 kDa | 65
id: strong-crow-lotus | RBX1 | 0.88 | 85.33 | -- | 7.4 kDa | 65
id: mellow-raven-iron | RBX1 | 0.84 | 86.71 | -- | 7.6 kDa | 65
id: shy-hawk-marble | RBX1 | 0.86 | 86.12 | -- | 7.5 kDa | 65
id: golden-bat-clay | RBX1 | 0.87 | 85.56 | -- | 7.7 kDa | 65
id: quiet-crane-oak | RBX1 | 0.83 | 79.85 | -- | 7.5 kDa | 65
id: mellow-bat-cypress | RBX1 | 0.84 | 84.19 | -- | 7.1 kDa | 65
id: solid-mole-clay | RBX1 | 0.80 | 87.52 | -- | 7.0 kDa | 65
id: amber-mole-iron | RBX1 | 0.70 | 84.75 | -- | 7.2 kDa | 65
id: quiet-dove-snow | RBX1 | 0.89 | 84.80 | -- | 7.2 kDa | 65