De Novo Design of RBX1 Protein Binders Using RFDiffusion 3
Background and Objective
RBX1 (RING-box protein 1) is the catalytic subunit of Cullin-RING E3 ubiquitin ligase complexes, enabling substrate ubiquitination and proteasomal degradation. Its dysregulation is associated with tumorigenesis, making it an important target for protein binder design with applications in functional modulation and targeted protein degradation. Designing binders against RBX1 is challenging due to complex protein–protein interfaces and the vast combinatorial sequence space. To address this, we developed a fully computational, de novo pipeline using RFDiffusion 3. All designs were generated from scratch without motif scaffolding or lead optimization, ensuring compliance with competition constraints.
Overall Approach
The workflow integrates backbone generation, sequence design, and multi-stage filtering. RFDiffusion 3 is used to generate binder backbones targeting the RBX1 surface, followed by sequence generation using ProteinMPNN. The resulting sequences undergo sequence-level filtering, novelty enforcement, diversity selection, and structure-based validation. This staged pipeline is intended to retain only candidates that score with high confidence in silico and are likely to be experimentally viable.
Backbone Generation and Sequence Design
De novo binder backbones were generated using RFDiffusion 3 across multiple length ranges (approximately 80–200 amino acids), enabling exploration of diverse binding geometries and interaction modes. RFDiffusion 3 is a diffusion-based generative model that iteratively refines protein backbone structures conditioned on the target surface, allowing control over binder orientation and interface formation. Each generated backbone was processed using ProteinMPNN for inverse folding. Multiple sequences were sampled per backbone, and the top-scoring sequences by model likelihood were retained. This favors sequences that the model considers compatible with the designed backbone; it does not by itself guarantee folding or binding, which is why the later filtering stages are applied.
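The per-backbone selection step can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: it assumes ProteinMPNN-style FASTA output in which each sampled design has a header containing `sample=` and `score=` fields and a lower score means a higher model likelihood. The function names, the cutoff `k`, and the example sequences are hypothetical.

```python
import re

# Matches "score=" but not "global_score=" in a ProteinMPNN-style header.
SCORE_RE = re.compile(r"(?<![A-Za-z_])score=([0-9]*\.?[0-9]+)")


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            records.append([line[1:], ""])
        elif line and records:
            records[-1][1] += line
    return [(header, seq) for header, seq in records]


def top_sequences(text, k=2):
    """Return the k sampled designs with the lowest score (assumed best)."""
    scored = []
    for header, seq in read_fasta(text):
        match = SCORE_RE.search(header)
        # Skip the first record, which describes the input backbone
        # rather than a sampled design (assumed header layout).
        if "sample=" in header and match:
            scored.append((float(match.group(1)), seq))
    scored.sort(key=lambda pair: pair[0])
    return scored[:k]


# Placeholder output for one backbone; sequences are not real designs.
EXAMPLE = """>backbone_0, score=1.9021, global_score=1.9021
GGGGGGGG
>T=0.1, sample=1, score=0.8512, global_score=0.9100
MKEELLKK
>T=0.1, sample=2, score=0.7210, global_score=0.8800
MSEELLRK
>T=0.1, sample=3, score=0.9903, global_score=1.0100
MTEEALKK
"""

if __name__ == "__main__":
    for score, seq in top_sequences(EXAMPLE, k=2):
        print(f"{score:.4f}\t{seq}")
```

Running the selection independently for each backbone keeps one weak backbone from crowding out sequences from the others.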
Sequence-Level Filtering
Generated sequences were filtered to remove candidates with poor developability prior to structural evaluation. This included validation of amino acid composition, removal of duplicates, and enforcement of the competition length constraint (≤250 amino acids). Physicochemical properties such as hydrophobicity, charge balance, and amino acid diversity were assessed alongside stability indicators including aromaticity and instability index. Sequences were also screened for liabilities such as N-glycosylation, deamidation, and isomerization motifs, as well as aggregation propensity and unpaired cysteine risk. Sequences failing strict thresholds were removed, while borderline candidates were ranked and filtered, ensuring only high-quality sequences progressed further.
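A few of these sequence-level checks can be sketched in plain Python. The motif definitions are common textbook patterns (N-glycosylation sequon N-X-S/T with X not proline, deamidation at NG/NS, isomerization at DG/DS), and the pass/fail rule is an assumption chosen for illustration; the exact motif set and thresholds used in the pipeline are not stated here.

```python
import re

VALID = set("ACDEFGHIKLMNPQRSTVWY")
MAX_LEN = 250  # competition length limit stated in the text

# Lookaheads let overlapping motif occurrences be reported individually.
LIABILITY_PATTERNS = {
    "n_glycosylation": re.compile(r"(?=N[^P][ST])"),
    "deamidation": re.compile(r"(?=N[GS])"),
    "isomerization": re.compile(r"(?=D[GS])"),
}


def screen_sequence(seq):
    """Collect basic developability flags for one sequence."""
    seq = seq.strip().upper()
    report = {
        "length": len(seq),
        "valid_alphabet": bool(seq) and set(seq) <= VALID,
        "within_length": 0 < len(seq) <= MAX_LEN,
        # Fraction of F, W and Y residues.
        "aromaticity": sum(seq.count(a) for a in "FWY") / len(seq) if seq else 0.0,
        # An odd cysteine count means at least one cysteine cannot be paired.
        "odd_cysteines": seq.count("C") % 2 == 1,
    }
    for name, pattern in LIABILITY_PATTERNS.items():
        report[name] = [m.start() for m in pattern.finditer(seq)]
    return report


def passes(report):
    """Hypothetical hard filter; the real thresholds are not given."""
    return (
        report["valid_alphabet"]
        and report["within_length"]
        and not report["odd_cysteines"]
        and not report["n_glycosylation"]
    )


if __name__ == "__main__":
    for seq in ("MKNGSDGCW", "MKELLKKAEELLKR"):
        report = screen_sequence(seq)
        print(seq, passes(report), report)
```

Returning motif positions rather than a single boolean makes it possible to rank borderline candidates by how many liabilities they carry instead of rejecting them outright.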
Novelty and Diversity Constraints
To ensure de novo design compliance, sequences were screened against the UniRef50 database using MMseqs2. Candidates whose edit distance to any known protein was below 25% were excluded, so every retained sequence differs from its closest database match by at least that margin. Subsequently, the remaining sequences were clustered by similarity, and one representative per cluster was selected to maximize diversity and avoid redundancy in the final candidate set.
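The novelty threshold and the diversity selection can be illustrated with a small, self-contained sketch. It does not call MMseqs2; it assumes the closest database hits have already been retrieved and simply applies a Levenshtein edit distance normalized by the longer sequence. The normalization, the greedy clustering strategy, and the 0.3 diversity cutoff are assumptions, not details taken from the pipeline.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Edit distance as a fraction of the longer sequence length."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def is_novel(candidate, known_hits, min_distance=0.25):
    """Keep a candidate only if it is at least 25% away from every hit."""
    return all(normalized_distance(candidate, hit) >= min_distance
               for hit in known_hits)


def pick_representatives(seqs, min_distance=0.3):
    """Greedy selection: keep a sequence if it is far from all kept ones."""
    representatives = []
    for seq in seqs:
        if all(normalized_distance(seq, rep) >= min_distance
               for rep in representatives):
            representatives.append(seq)
    return representatives


if __name__ == "__main__":
    print(is_novel("AAAAAAAA", ["AAAAAAAT"]))
    print(pick_representatives(["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC"]))
```

The greedy pass depends on input order, so sorting candidates by score beforehand makes the best-scoring member of each group the representative.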
Structure-Based Validation
Final candidates were evaluated through structure prediction of binder–RBX1 complexes using a GPU-accelerated framework. Binding confidence was assessed using ipTM and pDockQ2, while interface-specific accuracy was evaluated using ipSAE. Additional metrics included overall structural confidence (pTM), predicted binding affinity (ΔG) using PRODIGY, buried surface area, and interface predicted aligned error (PAE). Candidates with weak interface formation, low confidence scores, or structural inconsistencies were discarded.
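As one example of an interface metric, the sketch below averages the PAE over inter-chain residue pairs only. It assumes the predictor returns a square PAE matrix with the binder residues first and the RBX1 residues after them; that ordering, the function name, and the toy matrix are illustrative assumptions, and this simple mean is not the ipSAE or pDockQ2 calculation.

```python
def mean_interface_pae(pae, binder_len):
    """Average PAE over residue pairs that span the two chains.

    pae        -- square matrix (list of lists) of predicted aligned errors
    binder_len -- number of binder residues, assumed to come first
    """
    n = len(pae)
    if not 0 < binder_len < n:
        raise ValueError("binder_len must split the matrix into two chains")
    values = []
    for i in range(n):
        for j in range(n):
            # Exactly one of the two residues belongs to the binder.
            if (i < binder_len) != (j < binder_len):
                values.append(pae[i][j])
    return sum(values) / len(values)


if __name__ == "__main__":
    # Toy 3-residue complex: residue 0 is the binder, residues 1-2 the target.
    toy_pae = [
        [0.0, 10.0, 20.0],
        [30.0, 0.0, 1.0],
        [40.0, 1.0, 0.0],
    ]
    print(mean_interface_pae(toy_pae, binder_len=1))
```

Because PAE is not symmetric, both off-diagonal blocks are included in the average rather than only one of them.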
Ranking and Final Selection
Final binders were ranked using a composite scoring strategy prioritizing interface-specific confidence metrics (ipSAE and ipTM), followed by docking quality (pDockQ2), predicted binding affinity, and interface properties. Top-ranked candidates were selected for submission, ensuring compliance with all competition requirements.
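A composite score of this kind could look like the sketch below. The weights, the ΔG scaling constant, and the design names and metric values are all hypothetical; they only reflect the stated priority order (ipSAE and ipTM first, then pDockQ2, then predicted affinity), and interface properties such as buried surface area are left out for brevity.

```python
# Hypothetical weights reflecting the stated priority order.
WEIGHTS = {"ipsae": 0.4, "iptm": 0.3, "pdockq2": 0.2, "affinity": 0.1}
DG_SCALE = 15.0  # assumed kcal/mol magnitude that maps to a full affinity score


def affinity_term(dg):
    """Map predicted dG (more negative is better) onto the range 0..1."""
    return min(1.0, max(0.0, -dg / DG_SCALE))


def composite_score(metrics):
    """Weighted sum of confidence metrics and the scaled affinity term."""
    return (
        WEIGHTS["ipsae"] * metrics["ipsae"]
        + WEIGHTS["iptm"] * metrics["iptm"]
        + WEIGHTS["pdockq2"] * metrics["pdockq2"]
        + WEIGHTS["affinity"] * affinity_term(metrics["dg"])
    )


def rank(designs):
    """Return design names sorted from best to worst composite score."""
    return sorted(designs, key=lambda name: composite_score(designs[name]),
                  reverse=True)


if __name__ == "__main__":
    # Placeholder metrics, not values from the submitted designs.
    designs = {
        "design_b": {"ipsae": 0.5, "iptm": 0.6, "pdockq2": 0.4, "dg": -9.0},
        "design_a": {"ipsae": 0.8, "iptm": 0.9, "pdockq2": 0.7, "dg": -12.0},
    }
    for name in rank(designs):
        print(name, round(composite_score(designs[name]), 3))
```

A weighted sum lets a lower-priority metric break near-ties, whereas a strict lexicographic sort on floating-point scores would in practice be decided by the first metric alone.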
Conclusion
The RFDiffusion 3-based pipeline provides a powerful framework for de novo RBX1 binder design by integrating diffusion-based backbone generation with sequence optimization and rigorous filtering. This approach efficiently reduces a large design space into a focused set of high-confidence, novel, and experimentally viable candidates, increasing the likelihood of successful expression and binding.
id: calm-heron-ash          RBX1  0.78  82.42  --  11.5 kDa  100
id: bright-moth-topaz       RBX1  0.84  87.49  --  11.4 kDa  100
id: frozen-kiwi-cedar       RBX1  0.88  82.88  --  10.4 kDa  90
id: violet-seal-wave        RBX1  0.86  87.01  --  10.3 kDa  90
id: silver-moth-birch       RBX1  0.74  87.65  --  10.1 kDa  90
id: mellow-ox-marble        RBX1  0.78  86.18  --  11.8 kDa  100
id: pale-wolf-cloud         RBX1  0.63  85.29  --  11.3 kDa  100
id: ivory-raven-maple       RBX1  0.87  84.85  --  11.4 kDa  100
id: crimson-falcon-granite  RBX1  0.59  88.99  --  11.5 kDa  100
id: bright-quail-fern       RBX1  0.84  86.45  --  9.6 kDa   80
id: quick-gecko-ice         RBX1  0.90  87.90  --  16.4 kDa  150
id: steady-ant-wave         RBX1  0.91  87.49  --  16.8 kDa  150
id: deep-mole-sand          RBX1  0.90  87.44  --  9.2 kDa   80
id: dark-cobra-quartz       RBX1  0.84  83.08  --  9.3 kDa   80
id: frozen-crow-ivy         RBX1  0.61  86.59  --  9.7 kDa   80
id: silent-swan-bronze      RBX1  0.80  86.06  --  10.5 kDa  90
id: young-hawk-fern         RBX1  0.70  78.07  --  9.4 kDa   80
id: pale-owl-topaz          RBX1  0.70  90.89  --  17.0 kDa  150
id: quick-ox-dust           RBX1  0.90  86.19  --  11.7 kDa  100
id: azure-kiwi-fern         RBX1  0.66  86.08  --  9.4 kDa   80
id: young-dove-ash          RBX1  0.90  87.99  --  16.8 kDa  150