Author: Lucia Urcelay and colleagues, Ferruz Lab, CRG
Design method: Germinal + proprietary pLM
First stage:
The first version of the nanobody was designed using the Germinal generative framework (reference: https://www.biorxiv.org/content/10.1101/2025.09.19.677421v1). The workflow began by identifying the key interface residues in the Nipah virus Glycoprotein G–ephrin-B2 complex. From this analysis, a set of target hotspots was selected to guide Germinal’s conditional generation. Multiple hotspot combinations were tested, yielding approximately 200 de novo nanobody sequences. Among the combinations tested, the hotspot set A304, A305, A504, A558, A579 produced the most consistent performance across generations. To select the top candidate, several structural and binding-relevant metrics were evaluated, including ipSAE, min-ipSAE, oDockQ, and ipLDDT. The nanobody with the best overall performance across these metrics was selected for optimization.
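As an illustration of how a "best overall performance" selection across several metrics could be made, the sketch below orders candidates by their mean rank. This is not necessarily the procedure used here: the aggregation rule (mean rank), the assumption that higher is better for every metric, and the function name and input format are all hypothetical.

```python
def rank_candidates(candidates, metrics=("ipSAE", "min_ipSAE", "oDockQ", "ipLDDT")):
    """Order candidate names by mean rank across metrics (rank 1 = best).

    candidates: dict mapping a candidate name to a dict of metric values.
    Assumption: higher is better for every metric listed.
    """
    names = list(candidates)
    total_rank = {name: 0 for name in names}
    for metric in metrics:
        # Sort from best to worst for this metric; ties keep input order.
        ordered = sorted(names, key=lambda n: candidates[n][metric], reverse=True)
        for rank, name in enumerate(ordered, start=1):
            total_rank[name] += rank
    # Lowest mean rank first, i.e. best overall candidate first.
    return sorted(names, key=lambda n: total_rank[n] / len(metrics))
```

The first element of the returned list would be the candidate carried forward to optimization. Mean rank is used here because it avoids having to put metrics with different scales on a common footing; a weighted score would be an equally plausible choice.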
Second stage:
A proprietary protein language model was fine-tuned on approximately 100 curated sequences characterized using in silico metrics, yielding a model capable of generating diverse nanobodies against Nipah virus. Inference was performed on the sequence from the first stage. The residues in the lowest 10th percentile of log-likelihood were then selected and replaced with the highest log-likelihood residues given by the model at those positions. This sequence is the result of that optimization.
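The replacement step can be sketched as follows. The model itself is proprietary, so this sketch takes its output as an input: `log_probs` is a hypothetical list with one dictionary per position, mapping each residue to its log-likelihood. It also assumes that "10th percentile" means the 10% of positions with the lowest log-likelihood, and that all positions are scored in a single pass rather than iteratively.

```python
import math


def optimize_low_likelihood_positions(sequence, log_probs, fraction=0.10):
    """Replace the least likely fraction of residues with the model's top choice.

    sequence:  amino-acid string.
    log_probs: one dict per position mapping residue -> log-likelihood
               (hypothetical format; a real pLM would supply these values).
    Returns the new sequence and the sorted list of mutated positions (0-based).
    """
    if len(sequence) != len(log_probs):
        raise ValueError("sequence and log_probs must have the same length")
    # Log-likelihood of the residue actually present at each position.
    observed = [log_probs[i][aa] for i, aa in enumerate(sequence)]
    # Number of positions to mutate (at least one for a non-empty sequence).
    n_mutate = max(1, math.floor(fraction * len(sequence))) if sequence else 0
    # Indices from least to most likely; ties are broken by position.
    worst = sorted(range(len(sequence)), key=lambda i: (observed[i], i))[:n_mutate]
    residues = list(sequence)
    for i in worst:
        # Swap in the residue the model scores highest at this position.
        residues[i] = max(log_probs[i], key=log_probs[i].get)
    return "".join(residues), sorted(worst)
```

If the model already ranks the existing residue highest at a selected position, that position is left unchanged, so the number of actual substitutions can be smaller than the number of positions selected.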