A focused protein sequence library of 100 variants was constructed using a structure-guided, semi-rational design strategy. The design process combined conserved functional motifs with targeted sequence diversification to generate variants suitable for functional screening and structural characterization.

Scaffold definition and conserved region identification. A reference scaffold was defined from multiple sequence alignments of a functionally characterized protein family. Conserved motifs critical for structural integrity and catalytic activity were retained across all variants. These include a conserved N-terminal motif ([AP][PT]PEQIA[QRK]MRALDA[KFY]FNSPS[CTN]NPR[PR]T[RK]VIGGD...) and key residues implicated in substrate binding and conformational stability. The scaffold was derived from curated sequences in the Pfam and UniProt databases to ensure functional relevance.

Diversification strategy. Variable positions were selected by sequence entropy analysis and structural mapping. Two classes of positions were prioritized for diversification: those showing natural sequence variation among homologs, and surface-exposed loops predicted to tolerate mutations without disrupting the overall fold. Amino acid substitutions were introduced combinatorially, guided by substitution matrices (e.g., BLOSUM62) so that replacements remained compatible with the structural context. Conservative substitutions were applied in regions adjacent to functional sites, while more exploratory mutations were introduced in peripheral loops and the C-terminal tail.

Computational screening and structural filtering. Each designed sequence was assessed for structural plausibility using a rapid folding simulation pipeline. Secondary structure propensities were evaluated using PSIPRED, and stability was estimated via empirical force field calculations (FoldX). Sequences predicted to deviate substantially from the reference fold, or to have unfavorable energetic profiles, were deprioritized. The final library was curated to retain structural diversity while excluding sequences with predicted misfolding or aggregation propensity.

Library composition and diversity metrics. The final set of 100 sequences spans a length range of 50 to 150 residues, with pairwise sequence identities ranging from 40% to 85%. This diversity profile balances coverage of the design space against the structural relatedness needed for meaningful comparative analysis. C-terminal extensions of 0 to 5 residues were incorporated in a subset of variants to explore the functional impact of variable tail length.

Sequence redundancy control. To maximize functional information yield, sequences sharing greater than 90% global sequence identity were clustered, and a single representative from each cluster was retained. This filtering ensures that the final library covers a broad range of sequence space without overrepresenting closely related variants.

This design approach combines rational structure-guided engineering with targeted sequence diversification, yielding a library that is both functionally relevant and sufficiently diverse for downstream applications such as directed evolution screening, structure–function relationship studies, and machine learning–guided optimization.
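
The redundancy-control step can be illustrated with a minimal Python sketch. It rests on two assumptions that the text does not state: clustering is greedy and longest-first, and global identity is approximated as one minus the Levenshtein distance divided by the longer sequence length. A real pipeline would more likely use a dedicated alignment or clustering tool. The function names and the toy fragments are hypothetical; the fragments are not library members.

```python
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def approx_identity(a: str, b: str) -> float:
    """ASSUMPTION: a stand-in for alignment-based global identity."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def greedy_cluster(seqs: Dict[str, str],
                   threshold: float = 0.90) -> Dict[str, List[str]]:
    """Map each cluster representative to the ids of its members.

    Sequences are visited longest first (ties broken by id). A sequence
    joins the first representative it matches at strictly greater than
    `threshold`; otherwise it becomes a new representative.
    """
    clusters: Dict[str, List[str]] = {}
    for sid in sorted(seqs, key=lambda s: (-len(seqs[s]), s)):
        for rep in clusters:
            if approx_identity(seqs[sid], seqs[rep]) > threshold:
                clusters[rep].append(sid)
                break
        else:
            clusters[sid] = [sid]
    return clusters


# Hypothetical 20-residue fragments used only to exercise the functions.
toy = {
    "v1": "APPEQIAQMRALDAKFNSPS",
    "v2": "APPEQIAKMRALDAKFNSPS",  # one substitution relative to v1
    "v3": "PTPEQIARMRALDAYFNSPS",  # four substitutions relative to v1
}
clusters = greedy_cluster(toy)
representatives = sorted(clusters)
```

In this toy run, v2 falls into the cluster represented by v1, while v3 stays separate and is retained as its own representative. Visiting sequences longest-first is one common convention for choosing representatives; other orderings would be equally consistent with the description above.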
id: shy-goat-snow, 61
id: young-dove-jade, 61
id: soft-toad-frost, 61
id: calm-deer-vine, 61
id: pale-bear-ash, 61
id: steady-moth-pearl, 61
id: calm-gecko-sand, 61
id: silent-quail-thorn, 61
id: lunar-yak-cedar, 63
id: scarlet-seal-plume, 61
id: ivory-panda-moss, 61
id: small-mole-bronze, 63
id: deep-panther-plume, 63
id: steady-toad-sand, 63
id: noble-shark-snow, 63
id: young-lynx-ice, 62
id: scarlet-ant-lotus, 63
id: silent-toad-pearl, 63
id: solid-fox-lava, 62
id: dark-ibis-dust, 63
id: steady-fox-bronze, 61