End-to-end guidance for protein design pipelines. Use this skill when: (1) starting a new protein design project, (2) needing step-by-step workflow guidance, (3) trying to understand the full design pipeline, (4) planning compute resources and timelines, (5) integrating multiple design tools. For tool selection, use binder-design. For QC thresholds, use protein-qc.
Target Preparation (pdb skill)
--> Backbone Generation (rfdiffusion)
--> Sequence Design (proteinmpnn)
--> Structure Validation (alphafold/chai)
--> Filtering (protein-qc)
# Download from PDB
curl -o target.pdb "https://files.rcsb.org/download/XXXX.pdb"
# Extract target chain
# Remove waters, ligands if needed
# Trim to binding region + 10 Å buffer
Output: target_prepared.pdb, hotspot list
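The chain-extraction and cleanup steps above are listed only as comments. A minimal sketch of one way to do them with the Python standard library follows. The function name `prepare_target` and the choice of chain `A` are hypothetical, and the sketch assumes a single-model PDB file in the standard fixed-width format. It drops every HETATM record, so modified residues stored as HETATM (for example MSE) are lost, and it does not trim to the binding region.

```python
from pathlib import Path


def prepare_target(pdb_text: str, chain_id: str) -> str:
    """Keep ATOM records of one chain; drop waters, ligands, and other chains."""
    kept = []
    for line in pdb_text.splitlines():
        # PDB fixed-width format: record name in columns 1-6, chain ID in column 22.
        # Waters and ligands are HETATM records, so keeping only ATOM removes them.
        if line.startswith("ATOM") and len(line) > 21 and line[21] == chain_id:
            kept.append(line)
    kept.append("END")
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    src = Path("target.pdb")
    if src.exists():
        # Chain "A" is an assumption; use the chain that carries your binding site.
        cleaned = prepare_target(src.read_text(), "A")
        Path("target_prepared.pdb").write_text(cleaned)
```

Inspect the result in a structure viewer before choosing hotspots, since residue numbering is left unchanged.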
modal run modal_rfdiffusion.py \
--pdb target_prepared.pdb \
--contigs "A1-150/0 70-100" \
--hotspot "A45,A67,A89" \
--num-designs 500
modal run modal_bindcraft.py \
--target-pdb target_prepared.pdb \
--hotspots "A45,A67,A89" \
--num-designs 100
Output: 100-500 backbone PDBs
for backbone in backbones/*.pdb; do
  modal run modal_proteinmpnn.py \
    --pdb-path "$backbone" \
    --num-seq-per-target 8 \
    --sampling-temp 0.1
done
Output: 8 sequences per backbone (800-4000 total)
# Prepare a FASTA file containing each binder together with the target
# Use binder:target format (chains joined by ':') for multimer prediction
modal run modal_colabfold.py \
--input-faa all_sequences.fasta \
--out-dir predictions/
Output: AF2 predictions with pLDDT, ipTM, PAE
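The binder:target FASTA mentioned in the validation step can be assembled with a few lines of Python. This is a sketch under assumptions: the function name `complex_fasta`, the design names, and the sequences are hypothetical placeholders, and it relies on ColabFold's convention of treating `:` inside a sequence as a chain break.

```python
def complex_fasta(binders: dict[str, str], target_seq: str) -> str:
    """Build one FASTA record per binder, each paired with the same target."""
    records = []
    for name, binder_seq in binders.items():
        # ':' marks a chain break, so each record is a two-chain complex:
        # binder first, then target.
        records.append(f">{name}\n{binder_seq}:{target_seq}")
    return "\n".join(records) + "\n"


# Placeholder sequences for illustration only.
example = complex_fasta(
    {"design_0001": "MKTAYIAKQR", "design_0002": "MSEELLKKAL"},
    target_seq="GSHMLEDPVA",
)
print(example)
```

Writing the returned string to `all_sequences.fasta` gives the input file used in the command above.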
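import pandas as pd

# Load per-design metrics (one row per design)
designs = pd.read_csv('all_metrics.csv')

# Keep designs that pass every threshold.
# pLDDT is assumed to be on a 0-1 scale here (divide by 100 if yours is 0-100).
filtered = designs[
    (designs['pLDDT'] > 0.85) &
    (designs['ipTM'] > 0.50) &
    (designs['PAE_interface'] < 10) &
    (designs['scRMSD'] < 2.0) &
    (designs['esm2_pll'] > 0.0)
].copy()  # copy so that adding a column does not raise SettingWithCopyWarning

# Rank by composite score (PAE is rescaled so that lower error scores higher)
filtered['score'] = (
    0.3 * filtered['pLDDT'] +
    0.3 * filtered['ipTM'] +
    0.2 * (1 - filtered['PAE_interface'] / 20) +
    0.2 * filtered['esm2_pll']
)
top_designs = filtered.nlargest(50, 'score')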
Output: 50-200 filtered candidates
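To make the composite score concrete, here is the same weighting applied to a single design. The metric values are made up for illustration, and `composite_score` is a hypothetical helper, not part of any tool named in this workflow.

```python
def composite_score(plddt: float, iptm: float, pae_interface: float, esm2_pll: float) -> float:
    """Same weights as the ranking step: 0.3, 0.3, 0.2, 0.2."""
    return (
        0.3 * plddt
        + 0.3 * iptm
        + 0.2 * (1 - pae_interface / 20)  # lower interface PAE gives a larger term
        + 0.2 * esm2_pll
    )


# Hypothetical design: pLDDT 0.90, ipTM 0.70, interface PAE 6, ESM2 PLL 0.20
score = composite_score(0.90, 0.70, 6.0, 0.20)
# 0.27 + 0.21 + 0.14 + 0.04 = 0.66
print(round(score, 2))
```

pLDDT and ipTM together carry 60% of the weight, so in this example they contribute 0.48 of the 0.66 total.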
| Stage | GPU | Time (100 designs) |
|---|---|---|
| RFdiffusion | A10G | 30 min |
| ProteinMPNN | T4 | 15 min |
| ColabFold | A100 | 4-8 hours |
| Filtering | CPU | 15 min |
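For planning larger runs, the per-100-design times in the table can be scaled up. The sketch below assumes that runtime grows linearly with the number of designs and that stages run one after another; both are simplifying assumptions, and real runs may be faster when jobs execute in parallel. The names `MINUTES_PER_100`, `estimate_minutes`, and `total_hours` are hypothetical.

```python
# (low, high) minutes per 100 designs, taken from the table above.
# ColabFold is listed as a 4-8 hour range; the other stages have a single value.
MINUTES_PER_100 = {
    "RFdiffusion": (30, 30),
    "ProteinMPNN": (15, 15),
    "ColabFold": (240, 480),
    "Filtering": (15, 15),
}


def estimate_minutes(num_designs: int) -> dict[str, tuple[float, float]]:
    """Scale each stage linearly with the number of designs (an assumption)."""
    scale = num_designs / 100
    return {stage: (lo * scale, hi * scale) for stage, (lo, hi) in MINUTES_PER_100.items()}


def total_hours(num_designs: int) -> tuple[float, float]:
    """Sum all stages, assuming they run sequentially."""
    estimate = estimate_minutes(num_designs)
    low = sum(lo for lo, _ in estimate.values()) / 60
    high = sum(hi for _, hi in estimate.values()) / 60
    return low, high


print(total_hours(500))
```

Under these assumptions structure prediction dominates the total, so reducing the number of sequences sent to ColabFold is the most direct way to shorten a run.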
| Problem | Solution |
|---|---|
| Low ipTM | Check hotspots, increase designs |
| Poor diversity | Higher temperature, more backbones |
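| High scRMSD | Backbone may be unusual and hard to design; try more sequences or new backbones |
| Low pLDDT | Check design quality; inspect low-confidence regions before advancing the design |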