ligandmpnn

Ligand-aware protein sequence design using LigandMPNN. Use this skill when: (1) Designing sequences around small molecules, (2) Enzyme active site design, (3) Ligand binding pocket optimization, (4) Metal coordination site design, (5) Cofactor binding proteins. For standard protein design, use proteinmpnn. For solubility optimization, use solublempnn.

Installation instructions

# 1. Ensure Claude Code is installed

# 2. Install the marketplace

/plugin marketplace add adaptyvbio/protein-design-skills

# 3. Install this skill

/plugin install ligandmpnn@protein-design-skills

LigandMPNN Ligand-Aware Design

Prerequisites

Requirement	Minimum	Recommended
Python	3.8+	3.10
CUDA	11.0+	11.7+
GPU VRAM	8GB	16GB (T4)
RAM	8GB	16GB

How to run

First time? See Getting started to set up Modal and biomodals.

Option 1: Modal (recommended)

cd biomodals
# modal_ligandmpnn.py takes --input-pdb; LigandMPNN run.py args go in --params-str
modal run modal_ligandmpnn.py \
  --input-pdb protein_ligand.pdb \
  --params-str "--model_type ligand_mpnn --number_of_batches 16 --temperature 0.1"

GPU: A10G default | Timeout: 900s default

Option 2: Local installation

git clone https://github.com/dauparas/LigandMPNN.git
cd LigandMPNN

python run.py \
  --model_type ligand_mpnn \
  --pdb_path protein_ligand.pdb \
  --out_folder output/ \
  --number_of_batches 16 \
  --temperature 0.1

Key parameters (LigandMPNN run.py)

Parameter	Default	Description
`--pdb_path`	required	PDB with ligand
`--model_type`	`protein_mpnn`	`ligand_mpnn`, `soluble_mpnn`, etc.
`--temperature`	0.1	Sampling temperature
`--number_of_batches`	1	Batches (sequences = batch_size x batches)
`--batch_size`	1	Sequences per batch
`--ligand_mpnn_use_side_chain_context`	0	Set to 1 to use side-chain atoms of fixed residues as additional context
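
The relationship between `--batch_size`, `--number_of_batches`, and the number of designed sequences can be sketched in Python. The helper name `build_run_args` is hypothetical and not part of LigandMPNN; it only assembles flags that appear in the table above and assumes it is run from the LigandMPNN checkout.

```python
import shlex


def build_run_args(pdb_path, out_folder, batch_size=1, number_of_batches=1,
                   temperature=0.1, model_type="ligand_mpnn"):
    """Return (argv, total_sequences) for a LigandMPNN run.py call.

    Hypothetical helper: it only assembles flags listed in the table above.
    """
    if batch_size < 1 or number_of_batches < 1:
        raise ValueError("batch_size and number_of_batches must be >= 1")
    argv = [
        "python", "run.py",
        "--model_type", model_type,
        "--pdb_path", pdb_path,
        "--out_folder", out_folder,
        "--batch_size", str(batch_size),
        "--number_of_batches", str(number_of_batches),
        "--temperature", str(temperature),
    ]
    # Total designs per input structure = batch_size x number_of_batches.
    return argv, batch_size * number_of_batches


argv, total = build_run_args("protein_ligand.pdb", "output/",
                             batch_size=2, number_of_batches=8)
print(shlex.join(argv))
print("sequences:", total)
```

The returned list can be passed to `subprocess.run(argv, check=True)`; keeping it as a list avoids shell quoting issues with paths.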

Ligand Specification

In PDB File

Ligand must be present as HETATM records:

ATOM    ...protein atoms...
HETATM  1  C1  LIG A 999      x.xxx  y.yyy  z.zzz  1.00  0.00           C

Supported Ligand Types

Small molecules (HETATM)
Metals (Zn, Fe, Mg, Ca, etc.)
Cofactors (NAD, FAD, ATP)
DNA/RNA
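
Before running a design, it can help to confirm that the ligand really is present as HETATM records. The following is a minimal sketch, assuming standard fixed-column PDB formatting (residue name in columns 18-20, chain in column 22, residue number in columns 23-26). The function names and the sample coordinates are illustrative only.

```python
from collections import Counter


def list_hetero_groups(pdb_text, skip=("HOH",)):
    """Count HETATM atoms per (residue name, chain, residue number)."""
    groups = Counter()
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        resname = line[17:20].strip()
        if resname in skip:  # ignore waters by default
            continue
        chain = line[21:22].strip()
        resseq = line[22:26].strip()
        groups[(resname, chain, resseq)] += 1
    return dict(groups)


def hetatm(serial, name, resname, chain, resseq, x, y, z, element):
    """Format one fixed-column HETATM record (made-up coordinates for the demo)."""
    return (f"HETATM{serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}          {element:>2}")


sample = "\n".join([
    hetatm(1, "C1", "LIG", "A", 999, 1.0, 2.0, 3.0, "C"),
    hetatm(2, "O1", "LIG", "A", 999, 2.0, 2.0, 3.0, "O"),
    hetatm(3, "ZN", "ZN", "A", 1000, 5.0, 5.0, 5.0, "ZN"),
    hetatm(4, "O", "HOH", "A", 1001, 9.0, 9.0, 9.0, "O"),
])
groups = list_hetero_groups(sample)
print(groups)
```

An empty result means the file has no non-water HETATM records, which is the situation the troubleshooting notes describe as "ligand not recognized".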

Output format

output/
├── seqs/
│   └── protein.fa          # FASTA sequences
└── backbones/
    └── protein_1.pdb       # Backbone PDBs with designed sequence

Sample output

Successful run

$ python run.py --model_type ligand_mpnn --pdb_path enzyme_substrate.pdb --out_folder output/ --number_of_batches 8
Loading LigandMPNN model weights...
Processing enzyme_substrate.pdb
Found ligand: LIG (12 atoms)
Generated 8 sequences in 3.1 seconds

output/seqs/enzyme_substrate.fa:
>enzyme_substrate_0001, score=1.45, global_score=1.38
MKTAYIAKQRQISFVKSHFSRQLE...
>enzyme_substrate_0002, score=1.52, global_score=1.41
MKTAYIAKQRQISFVKSQFSRQLD...

What good output looks like:

Score: 1.0-2.0 (lower = more confident)
Ligand detected and incorporated in context
Active site residues preserved or optimized
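
The score check above can be automated with a small FASTA parser. This is a sketch that assumes headers follow the `name, key=value, key=value` pattern shown in the sample output; the function name is hypothetical, and the sequences are the truncated ones from the sample with the ellipsis dropped.

```python
def parse_design_fasta(text):
    """Parse FASTA text into a list of {'name', 'meta', 'seq'} dicts.

    Header fields after the first comma are read as key=value pairs;
    numeric values are converted to float.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name, *fields = [part.strip() for part in line[1:].split(",")]
            meta = {}
            for field in fields:
                key, sep, value = field.partition("=")
                if sep:
                    try:
                        meta[key.strip()] = float(value)
                    except ValueError:
                        meta[key.strip()] = value.strip()
            records.append({"name": name, "meta": meta, "seq": ""})
        elif records:
            records[-1]["seq"] += line
    return records


sample = """>enzyme_substrate_0001, score=1.45, global_score=1.38
MKTAYIAKQRQISFVKSHFSRQLE
>enzyme_substrate_0002, score=1.52, global_score=1.41
MKTAYIAKQRQISFVKSQFSRQLD
"""
records = parse_design_fasta(sample)
# Keep designs whose score falls in the 1.0-2.0 range, then pick the lowest.
in_range = [r for r in records
            if 1.0 <= r["meta"].get("score", float("nan")) <= 2.0]
best = min(in_range, key=lambda r: r["meta"]["score"])
print(best["name"], best["meta"]["score"])
```

Records without a `score` field are excluded from the filter rather than raising, so the same parser tolerates headers that carry other keys.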

Decision tree

Should I use LigandMPNN?
│
├─ What's in your binding site?
│  ├─ Small molecule / ligand → LigandMPNN ✓
│  ├─ Metal ion (Zn, Fe, etc.) → LigandMPNN ✓
│  ├─ Cofactor (NAD, FAD, ATP) → LigandMPNN ✓
│  ├─ DNA/RNA → LigandMPNN ✓
│  └─ Nothing / protein only → Use ProteinMPNN
│
├─ What type of design?
│  ├─ Enzyme active site → LigandMPNN ✓
│  ├─ Metal binding site → LigandMPNN ✓
│  ├─ Protein-protein binder → Use ProteinMPNN
│  └─ De novo scaffold → Use ProteinMPNN
│
└─ Priority?
   ├─ Solubility/expression → Consider SolubleMPNN
   └─ Ligand context accuracy → LigandMPNN ✓
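
One possible reading of the decision tree as code is sketched below. The category strings and the function name `suggest_tool` are hypothetical, and the precedence (ligand context wins over the solubility priority) is an assumption; the tree itself does not rank the three questions.

```python
# Hypothetical labels for the branches of the decision tree above.
LIGAND_CONTEXTS = {"small_molecule", "metal", "cofactor", "nucleic_acid"}
LIGAND_DESIGNS = {"enzyme_active_site", "metal_binding_site"}


def suggest_tool(binding_site="none", design_type=None, priority=None):
    """Map the three questions of the tree to a skill name.

    Assumption: any ligand context or ligand-centric design type selects
    LigandMPNN before the solubility priority is considered.
    """
    if binding_site in LIGAND_CONTEXTS or design_type in LIGAND_DESIGNS:
        return "ligandmpnn"
    if priority == "solubility":
        return "solublempnn"
    return "proteinmpnn"


print(suggest_tool(binding_site="metal"))
print(suggest_tool(design_type="protein_protein_binder"))
print(suggest_tool(priority="solubility"))
```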

Typical performance

Campaign Size	Time (T4)	Cost (Modal)	Notes
100 backbones × 8 seq	15-20 min	~$2	Standard
500 backbones × 8 seq	1-1.5h	~$8	Large campaign

Throughput: ~50-100 sequences/minute on T4 GPU.
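
The quoted throughput gives a quick lower bound on sampling time for a campaign. The sketch below is plain arithmetic on the numbers above; the function name is hypothetical. The table's times are somewhat longer than these estimates, presumably because they include overhead such as model loading and file I/O (an assumption, not something the table states).

```python
def estimate_minutes(backbones, seqs_per_backbone, rate_range=(50, 100)):
    """Return (fastest, slowest) sampling time in minutes.

    rate_range is the throughput in sequences per minute (slow, fast).
    """
    total = backbones * seqs_per_backbone
    slow, fast = rate_range
    return total / fast, total / slow


for n in (100, 500):
    low, high = estimate_minutes(n, 8)
    print(f"{n} backbones x 8 seq: {low:.0f}-{high:.0f} min of sampling")
```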

Verify

grep -c "^>" output/seqs/*.fa  # Prints one count per backbone file; each should equal batch_size × number_of_batches
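
The same check can be scripted so that only mismatching files are reported. This is a minimal sketch with hypothetical function names; `extra_per_file` is an assumption added to cover setups where the FASTA also contains the input sequence as its first record.

```python
from pathlib import Path
import tempfile


def count_fasta_records(seqs_dir):
    """Return {file name: number of '>' header lines} for every .fa file."""
    counts = {}
    for fa in sorted(Path(seqs_dir).glob("*.fa")):
        with fa.open() as handle:
            counts[fa.name] = sum(1 for line in handle if line.startswith(">"))
    return counts


def check_counts(counts, seqs_per_backbone, extra_per_file=0):
    """Return only the files whose record count differs from the expectation."""
    expected = seqs_per_backbone + extra_per_file
    return {name: n for name, n in counts.items() if n != expected}


# Demo on throwaway files; point seqs_dir at output/seqs for a real run.
with tempfile.TemporaryDirectory() as tmp:
    seqs = Path(tmp) / "seqs"
    seqs.mkdir()
    (seqs / "backbone_a.fa").write_text(">a_1\nMKT\n>a_2\nMKS\n")
    (seqs / "backbone_b.fa").write_text(">b_1\nMKT\n")
    counts = count_fasta_records(seqs)
    mismatches = check_counts(counts, seqs_per_backbone=2)

print(counts)
print(mismatches)
```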

Troubleshooting

Ligand not recognized: Check HETATM format, verify ligand residue name
Poor binding residues: Increase sampling around active site
Missing contacts: Verify ligand coordinates in PDB

Error interpretation

Error	Cause	Fix
`RuntimeError: CUDA out of memory`	Long protein or large batch	Reduce batch_size
`KeyError: 'LIG'`	Ligand not found in PDB	Check HETATM records
`ValueError: no ligand atoms`	Empty ligand	Verify ligand has atoms in PDB
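
For the out-of-memory case, reducing `--batch_size` need not reduce the number of designs if `--number_of_batches` is raised to compensate. The sketch below shows that adjustment; `shrink_batch` is a hypothetical helper, and halving is just one reasonable step size.

```python
def shrink_batch(batch_size, number_of_batches):
    """Halve batch_size and raise number_of_batches to keep at least the same total."""
    if batch_size <= 1:
        raise ValueError("batch_size is already 1; try a shorter input or a larger GPU")
    total = batch_size * number_of_batches
    new_batch = batch_size // 2
    new_batches = -(-total // new_batch)  # ceiling division
    return new_batch, new_batches


print(shrink_batch(8, 2))
print(shrink_batch(3, 1))
```

Smaller batches lower peak GPU memory per forward pass at the cost of more passes, so total runtime may increase.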

Next: Structure prediction for validation → protein-qc for filtering.