The Core Reasoning
The fundamental challenge in computational protein design is that generative models can produce well-folded, structurally plausible proteins that are nonetheless biologically irrelevant to the specific target. This agent-based workflow is designed to bridge the gap between qualitative biological literature and the quantitative constraints that generative models require.
With an LLM agent running in an environment such as Claude Code, the pipeline becomes a traceable decision-making engine instead of a simple "generate and score" script. The agent makes explicit trade-offs between biological reliability and structural diversity (including exploratory positions such as W101 and E100), which keeps the generative models from converging prematurely on a narrow, potentially flawed binding hypothesis.
The 5-Stage Pipeline
This method operates as a computational funnel, starting with broad biological rules and narrowing down to validated structural candidates through five distinct stages (Phases 1 to 5 below):
Phase 1: Knowledge Translation (Target Definition)
The LLM agent reviews existing literature to identify the most critical, experimentally supported interaction points on the RBX1 surface. It maps these biological findings to a specific structural interface, establishing the mandatory target constraints for all downstream generation.
Phase 2: Constrained Generation (Expanding the Search Space)
The predefined interface constraints are fed into RFdiffusion to construct custom binder backbones specifically aimed at that spatial region. To thoroughly explore the sequence space, ProteinMPNN then generates dozens of amino acid sequence variations for each of those backbones, resulting in a structurally diverse library of 480,000 raw binder candidates.
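As a rough illustration of how interface residues might be handed to RFdiffusion, the sketch below formats residue numbers into a `ppi.hotspot_res` override string. The helper name `hotspot_override`, the chain ID "A", and the choice of residues 100 and 101 in the usage line are assumptions made for illustration; the actual hotspot set used in this pipeline is not stated here.

```python
def hotspot_override(residues, chain="A"):
    """Format target residue numbers as an RFdiffusion hotspot override.

    `chain` is an assumed chain ID for the target structure.
    Duplicates are dropped and residues are sorted for reproducibility.
    """
    unique = sorted(set(residues))
    labels = ",".join(f"{chain}{number}" for number in unique)
    return f"ppi.hotspot_res=[{labels}]"


# Illustrative residue numbers only (taken from the exploratory positions).
example_override = hotspot_override([101, 100])
print(example_override)
```

Sorting and de-duplicating the residues keeps the generated command line stable across runs, which helps when the agent's decisions are audited later.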
Phase 3: Agent-Driven Quality Control (Filtering)
To manage this scale, the agent shifts its focus to reliability, applying strict, automated filtering criteria based on sequence recovery and structural scores from ProteinMPNN. The library is distilled down to 515 high-confidence candidates, with all metadata preserved for later auditing.
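A minimal sketch of this kind of filter is shown below. The thresholds (`min_recovery`, `max_score`), the `Candidate` fields, and the audit-record layout are hypothetical; the actual cutoffs used to reach 515 candidates are not given in this write-up. The sketch assumes the ProteinMPNN score is a negative log-probability, so lower is better.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    design_id: str
    seq_recovery: float          # fraction of positions recovered, 0..1
    mpnn_score: float            # assumed: lower is better
    metadata: dict = field(default_factory=dict)


def filter_candidates(candidates, min_recovery=0.35, max_score=1.0):
    """Return (kept, audit_log). Every candidate gets an audit record,
    including rejected ones, so filtering decisions stay traceable."""
    kept, audit_log = [], []
    for cand in candidates:
        reasons = []
        if cand.seq_recovery < min_recovery:
            reasons.append("low_seq_recovery")
        if cand.mpnn_score > max_score:
            reasons.append("high_mpnn_score")
        audit_log.append({
            "design_id": cand.design_id,
            "passed": not reasons,
            "reasons": reasons,
            "metadata": dict(cand.metadata),  # copied, never discarded
        })
        if not reasons:
            kept.append(cand)
    return kept, audit_log


pool = [
    Candidate("design_a", 0.42, 0.85, {"backbone": "bb_001"}),
    Candidate("design_b", 0.30, 0.90, {"backbone": "bb_001"}),
    Candidate("design_c", 0.50, 1.40, {"backbone": "bb_002"}),
]
kept, audit_log = filter_candidates(pool)
```

Recording the rejection reasons alongside the untouched metadata is what allows a later audit to reconstruct why a given design was dropped.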
Phase 4: Orthogonal Validation (Boltz-2 Rescoring)
The agent independently validates the generated complexes using Boltz-2 to predict their binding affinity and binding poses. It then measures heavy-atom distances in each predicted complex and counts a target residue as contacted when any binder heavy atom lies within 5.0 Å of it, confirming that the binders physically touch the residues defined in Phase 1.
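The contact check can be sketched as follows, assuming heavy-atom coordinates have already been extracted from the predicted complex (hydrogens removed upstream). The function name `contacted_residues`, the input layout, and the toy coordinates are illustrative assumptions, not the pipeline's actual code.

```python
import math


def contacted_residues(binder_atoms, target_atoms, cutoff=5.0):
    """Return the set of target residues with any heavy atom closer than
    `cutoff` angstroms to any binder heavy atom.

    binder_atoms: list of (x, y, z) tuples
    target_atoms: dict mapping residue label -> list of (x, y, z) tuples
    """
    hits = set()
    for residue, coords in target_atoms.items():
        if any(math.dist(t, b) < cutoff for t in coords for b in binder_atoms):
            hits.add(residue)
    return hits


# Toy coordinates for illustration only.
binder = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
target = {
    "W101": [(4.0, 0.0, 0.0)],   # 2.5 A from the nearest binder atom
    "E100": [(12.0, 0.0, 0.0)],  # too far from both binder atoms
}
hits = contacted_residues(binder, target)
```

This brute-force loop is quadratic in atom count, which is fine for a few hundred candidates; a spatial index would be worth considering for larger sets.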
Phase 5: Asymmetric Prioritization (Closing the Loop)
The agent ranks the final designs using an "asymmetric scoring rule" tied directly back to the initial literature. It heavily rewards binders that hit the primary anchors and gives secondary credit for hitting the exploratory positions, producing a prioritized list of candidates that balance targeted biological engagement with high structural confidence.
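One way such an asymmetric rule could look is sketched below. The weights (1.0 per primary anchor, 0.25 per exploratory position), the placeholder anchor names, and the use of a confidence value as a tie-breaker are all assumptions for illustration; the write-up does not specify the actual weights or anchor residues.

```python
PRIMARY_WEIGHT = 1.0       # assumed weight per primary anchor contacted
EXPLORATORY_WEIGHT = 0.25  # assumed weight per exploratory position contacted


def asymmetric_score(contacts, primary, exploratory):
    """Score a design from the set of target residues it contacts."""
    primary_hits = len(contacts & primary)
    exploratory_hits = len(contacts & exploratory)
    return PRIMARY_WEIGHT * primary_hits + EXPLORATORY_WEIGHT * exploratory_hits


def rank_designs(designs, primary, exploratory):
    """Sort designs by score, then by confidence, both descending.

    designs: list of dicts with keys "id", "contacts", "confidence".
    """
    return sorted(
        designs,
        key=lambda d: (
            asymmetric_score(d["contacts"], primary, exploratory),
            d["confidence"],
        ),
        reverse=True,
    )


primary = {"ANCHOR_1", "ANCHOR_2"}   # placeholder names, not real residues
exploratory = {"W101", "E100"}
designs = [
    {"id": "d1", "contacts": {"W101", "E100"}, "confidence": 0.90},
    {"id": "d2", "contacts": {"ANCHOR_1"}, "confidence": 0.70},
    {"id": "d3", "contacts": {"ANCHOR_1", "ANCHOR_2", "W101"}, "confidence": 0.80},
]
ranked_ids = [d["id"] for d in rank_designs(designs, primary, exploratory)]
```

With these weights, a design that hits a single primary anchor outranks one that hits both exploratory positions, which is the asymmetry the rule is meant to express.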
id: strong-panther-stone | RBX1 | None | 80.11 | True | 12.8 kDa | 117
id: ivory-falcon-ice | RBX1 | None | 87.96 | True | 11.2 kDa | 102
id: bright-ox-pine | RBX1 | None | 58.64 | True | 11.5 kDa | 105
id: dark-orca-flint | RBX1 | None | 87.94 | True | 13.9 kDa | 119
id: hollow-jaguar-oak | RBX1 | None | 76.69 | True | 12.3 kDa | 112
id: dark-mole-fern | RBX1 | None | 50.08 | True | 10.3 kDa | 91
id: gentle-mole-pine | RBX1 | None | 76.38 | True | 10.8 kDa | 96
id: silent-lion-sand | -- | -- | -- | -- | -- | 114
id: strong-yak-ruby | -- | -- | -- | -- | -- | 98
id: strong-vole-reed | -- | -- | -- | -- | -- | 124
id: green-tiger-lotus | -- | -- | -- | -- | -- | 91
id: quick-vole-orchid | -- | -- | -- | -- | -- | 109
id: rough-boar-snow | -- | -- | -- | -- | -- | 115
id: brisk-falcon-maple | -- | -- | -- | -- | -- | 123
id: swift-goat-jade | -- | -- | -- | -- | -- | 119
id: quick-fox-dust | -- | -- | -- | -- | -- | 88
id: pale-quail-pearl | -- | -- | -- | -- | -- | 79
id: jade-bat-thorn | -- | -- | -- | -- | -- | 124
id: azure-lion-plume | -- | -- | -- | -- | -- | 122
id: pale-deer-lotus | -- | -- | -- | -- | -- | 97
id: calm-tiger-ice | -- | -- | -- | -- | -- | 107