We used ProFam, an as-yet unreleased protein family language model (to be open-sourced within the next week; spotlight presentation paper at NeurIPS MLSB). We conditioned on homologs of EPHRIN-B2 and generated sequences from scratch, selecting those with less than 65% sequence identity to any natural sequence.
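The novelty filter described above could look roughly like the sketch below. This is only an illustration, not the pipeline that was actually run: the function names (`lcs_length`, `sequence_identity`, `filter_novel`) are hypothetical, and identity is approximated here as the longest common subsequence length divided by the length of the longer sequence. A real run would more likely rely on a dedicated alignment or search tool against a natural sequence database.

```python
from typing import Dict, List


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def sequence_identity(a: str, b: str) -> float:
    """Approximate identity: matched residues over the longer sequence length.

    Assumption: this LCS-based proxy stands in for a proper pairwise alignment.
    """
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def filter_novel(
    generated: Dict[str, str],
    natural: List[str],
    max_identity: float = 0.65,  # threshold taken from the description above
) -> Dict[str, str]:
    """Keep generated sequences whose identity to every natural sequence is below the threshold."""
    return {
        name: seq
        for name, seq in generated.items()
        if all(sequence_identity(seq, nat) < max_identity for nat in natural)
    }
```

With this sketch, a generated sequence is dropped as soon as a single natural sequence reaches the 65% threshold, which matches the "to any natural sequence" wording.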
id: steady-owl-ash

Nipah Virus Glycoprotein G
0.75
77.28
--
14.5 kDa
125
id: gentle-swan-ember

Nipah Virus Glycoprotein G
0.38
80.89
--
15.0 kDa
135
id: dark-lynx-stone

Nipah Virus Glycoprotein G
0.36
81.62
--
15.1 kDa
132
id: jade-deer-oak

Nipah Virus Glycoprotein G
0.27
76.27
--
14.8 kDa
130
id: frozen-lion-opal

Nipah Virus Glycoprotein G
0.13
85.26
--
25.2 kDa
229
id: rapid-deer-lotus

Nipah Virus Glycoprotein G
0.13
83.48
--
15.6 kDa
135
id: pale-kiwi-fern

Nipah Virus Glycoprotein G
0.11
78.31
--
15.6 kDa
134
id: vast-fox-birch

Nipah Virus Glycoprotein G
0.07
81.66
--
15.5 kDa
137
id: shy-lynx-wave

Nipah Virus Glycoprotein G
0.00
78.72
--
15.6 kDa
132
id: crimson-deer-quartz

Nipah Virus Glycoprotein G
0.00
83.15
--
15.3 kDa
134