[GEM x Adaptyv: RBX1 Binder Design Competition] Submission 2


Submitted to GEM x Adaptyv: RBX1 Binder Design Competition

Description

Method: RFdiffusion + AlphaFold2 + CY1/CY2/SURF_CY/mGLI

Authors: Guoqing Hu, Hongyu Yu, Stephen S.-T. Yau

Affiliations: Guoqing Hu: Hetao Institute of Mathematics and Interdisciplinary Sciences (HIMIS), Shenzhen 518000, Guangdong, P.R. China

Hongyu Yu: Department of Mathematical Sciences, Tsinghua University, Beijing 100084, P.R. China

Stephen S.-T. Yau: Beijing Institute of Mathematical Sciences and Applications (BIMSA), Beijing 101408, P.R. China

Contact: Guoqing Hu, email: huguoqing@himis-sz.cn

Team background: Guoqing Hu received the Ph.D. degree in Computer Science from the University of Illinois at Chicago, USA, in 1997. He is a Research Fellow at HIMIS, Shenzhen. His research focuses on biomathematics, artificial intelligence, and neural networks.

Hongyu Yu received the B.S. degree from Tsinghua University in 2022 and is currently pursuing the Ph.D. degree in Applied Mathematics at Tsinghua University. His research focuses on computational biology.

Stephen S.-T. Yau received the Ph.D. degree in Mathematics from SUNY Stony Brook, USA, in 1976. He served at Harvard University and later at the University of Illinois at Chicago, and after retirement joined Tsinghua University. His research interests include bioinformatics, computational biology, nonlinear filtering, complex algebraic geometry, CR geometry, and singularity theory.

Design strategy: We designed a final panel of 100 de novo protein binders against human RBX1 using a multi-stage computational workflow combining target-specific structural ranking, geometry-aware interface descriptors, transfer learning from public Proteinbase binder competition data, and novelty filtering.

Target and generation: RBX1 was modeled primarily using the 2LGV-derived target structure and epitope patches defined on the exposed RBX1 surface. Candidate binders were generated with RFdiffusion, redesigned with ProteinMPNN, and evaluated with AlphaFold2-multimer / ColabFold. In round 1, we sampled 80 aa and 100 aa binders across multiple RBX1 patches. In round 2, we focused only on the strongest patches and added 72 aa, 88 aa, and 120 aa binders, increasing diversity while concentrating computation on the most promising regions.
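The two-round sampling plan described above can be laid out as a simple job grid. The following is only a minimal sketch: the patch labels (`patch_A`, etc.) and the function name `design_grid` are hypothetical placeholders, and the actual number and identity of RBX1 patches are not stated in this description. Only the binder lengths are taken from the text.

```python
from itertools import product

# Binder lengths stated in the description.
ROUND1_LENGTHS = (80, 100)
ROUND2_LENGTHS = (72, 88, 120)


def design_grid(round1_patches, round2_patches):
    """Enumerate (round, patch, length) generation jobs.

    Patch names are hypothetical labels; round 2 is expected to receive
    only the patches that performed best in round 1.
    """
    jobs = []
    for patch, length in product(round1_patches, ROUND1_LENGTHS):
        jobs.append({"round": 1, "patch": patch, "length_aa": length})
    for patch, length in product(round2_patches, ROUND2_LENGTHS):
        jobs.append({"round": 2, "patch": patch, "length_aa": length})
    return jobs


# Illustrative call with made-up patch labels.
jobs = design_grid(["patch_A", "patch_B", "patch_C"], ["patch_A"])
```

Each job would then be handed to the RFdiffusion, ProteinMPNN, and AlphaFold2-multimer steps; those calls are deliberately left out here because their exact settings are not given.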

RBX1-specific ranking: For each candidate complex, we extracted the predicted RBX1-binder interface pocket from the AF2-multimer rank_001 model and computed four feature families: CY1, CY2, SURF_CY, and mGLI. These features were designed to capture interface geometry, local structural organization, surface-field smoothness, and multiscale interaction topology. We combined them with AF2-derived metrics, especially ipTM, pTM, interface PAE, and pLDDT. pLDDT was used as a quality gate rather than the dominant score. This produced an RBX1-specific ranking focused on interface plausibility, geometric consistency, and structural confidence.
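One way to read "pLDDT as a quality gate rather than the dominant score" is sketched below. This is an illustration under stated assumptions, not the scoring function actually used: the weights, the pLDDT threshold of 80, the PAE cap of 30, the dictionary keys, and the assumption that CY1, CY2, SURF_CY, and mGLI have each been pre-scaled to a single value in [0, 1] are all hypothetical.

```python
GEOMETRY_KEYS = ("cy1", "cy2", "surf_cy", "mgli")  # assumed pre-scaled to [0, 1]


def rbx1_score(cand, plddt_gate=80.0, max_pae=30.0):
    """Return a composite score, or None if the candidate fails the pLDDT gate.

    All thresholds and weights are illustrative assumptions.
    """
    # pLDDT only decides pass/fail; it does not enter the score itself.
    if cand["plddt"] < plddt_gate:
        return None
    # Lower interface PAE is better, so map it to a [0, 1] "higher is better" term.
    pae_term = 1.0 - min(cand["interface_pae"], max_pae) / max_pae
    # AF2 confidence block, with ipTM weighted most heavily.
    af2 = (2.0 * cand["iptm"] + cand["ptm"] + pae_term) / 4.0
    # Geometry block: plain mean of the four descriptor families.
    geometry = sum(cand[k] for k in GEOMETRY_KEYS) / len(GEOMETRY_KEYS)
    return 0.5 * af2 + 0.5 * geometry


example = {
    "plddt": 88.0, "iptm": 0.8, "ptm": 0.8, "interface_pae": 7.5,
    "cy1": 0.5, "cy2": 0.5, "surf_cy": 0.5, "mgli": 0.5,
}
score = rbx1_score(example)
```

Keeping pLDDT out of the weighted sum means a well-folded but poorly docked binder cannot outrank a candidate with a more convincing interface.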

Transfer learning prior: Because RBX1 has no direct experimental labels in this campaign, we also incorporated a transferable binder prior learned from public Proteinbase competition data. We previously trained a binding classifier and a binder-strength regressor on public binder datasets. RBX1 candidates were scored with these pretrained models using the same merged feature-table format as in pretraining. These predictions were used only as an auxiliary prior to refine the shortlist while preserving RBX1-specific evidence from AF2 and geometry-based ranking.
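Scoring with a pretrained model requires each RBX1 row to follow the pretraining column order, and it helps to record which features had to be filled in. The sketch below shows that alignment step only. The column names, the fill value of 0.0, and the function name `align_features` are assumptions for illustration; the actual feature table and models are not described in detail here.

```python
import math


def align_features(row, training_columns, fill_value=0.0):
    """Order a candidate's features like the pretraining table.

    Returns (values, missing), where `missing` lists every column that was
    absent, None, or NaN and therefore replaced by `fill_value`.
    """
    values, missing = [], []
    for col in training_columns:
        v = row.get(col)
        if v is None or (isinstance(v, float) and math.isnan(v)):
            values.append(fill_value)
            missing.append(col)
        else:
            values.append(float(v))
    return values, missing


# Hypothetical column names and candidate row.
columns = ["iptm", "cy1", "mgli"]
values, missing = align_features({"iptm": 0.8, "cy1": None}, columns)
```

A non-empty `missing` list marks a prediction that partly reflects default inputs rather than real RBX1 features, so such candidates can be flagged instead of trusted.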

Filtering and final selection: Shortlisted candidates were organized into four decision groups: strong keep, rescue, suspicious default-prediction uplift, and de-prioritize. Candidates with missing transferable features were explicitly flagged and not allowed to benefit from artificial default-score uplift. The final top 100 panel was assembled from strong-keep candidates, selected rescue candidates with valid features, and additional high-ranking non-default candidates to preserve diversity across RBX1 patches and binder lengths.
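The four decision groups can be expressed as a small rule table. The rules below are an assumed interpretation, since the exact criteria are not given: here "strong keep" means the RBX1-specific ranking and the transfer prior agree, "rescue" means only one of them supports the candidate, and "suspicious default-prediction uplift" means the prior looks favorable while transferable features are missing. The 0.5 threshold and the argument names are hypothetical.

```python
def decision_group(rbx1_rank_ok, prior_prob, has_missing_features, threshold=0.5):
    """Assign a candidate to one of four decision groups (assumed rules)."""
    prior_ok = prior_prob >= threshold
    # A favorable prior computed from default-filled features is not trusted.
    if prior_ok and has_missing_features:
        return "suspicious default-prediction uplift"
    if rbx1_rank_ok and prior_ok:
        return "strong keep"
    if rbx1_rank_ok or prior_ok:
        return "rescue"
    return "de-prioritize"


groups = [
    decision_group(True, 0.9, False),
    decision_group(False, 0.9, False),
    decision_group(True, 0.9, True),
    decision_group(False, 0.1, False),
]
```

Checking the missing-feature case first ensures that no candidate reaches "strong keep" or "rescue" through the prior alone when its features were filled with defaults.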

Novelty and submission constraints: All final sequences were filtered to satisfy competition-style constraints: single-chain proteins, standard amino acids only, length less than or equal to 250 aa, diversity within the final panel, and UniRef50 novelty screening using MMseqs2 plus edit-distance verification. The final 100-sequence panel was rechecked as a whole, and all 100 candidates passed our local novelty validation workflow.
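The sequence-level checks listed above (standard residues, length limit, and within-panel diversity by edit distance) can be sketched as follows. The minimum pairwise distance `min_dist` is a hypothetical parameter, and the UniRef50 search with MMseqs2 is not reproduced here; only the local checks are shown.

```python
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def passes_basic_constraints(seq, max_len=250):
    """Single chain of standard amino acids, 1..max_len residues.

    Chain separators such as ':' or '/' fail the alphabet check.
    """
    return 0 < len(seq) <= max_len and set(seq) <= STANDARD_AA


def edit_distance(a, b):
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def diverse_panel(seqs, min_dist=10):
    """Greedily keep sequences at least min_dist edits from all kept ones."""
    kept = []
    for s in seqs:
        if passes_basic_constraints(s) and all(
            edit_distance(s, k) >= min_dist for k in kept
        ):
            kept.append(s)
    return kept


panel = diverse_panel(["AAAA", "AAAC", "WWWW", "AXAA"], min_dist=2)
```

Because the filter is greedy, the input should be sorted by rank first so that the higher-ranked member of any near-duplicate pair is the one retained.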

Final rationale: Our submission is not the output of a single score, but the result of a staged design-and-selection pipeline integrating generative design, AF2-multimer complex prediction, CY/SURF_CY/mGLI geometry-aware ranking, transferable Proteinbase reranking, removal of zero-feature/default-score artifacts, and final diversity and novelty validation.

Proteins (100)


Prof. Guoqing Hu

id: hollow-otter-reed

Binder

Other