ESMFold predicts protein 3D structure directly from a single amino acid sequence using Meta’s end-to-end language model approach. Because it does not require multiple sequence alignments (MSAs), it is fast enough for rapid prototyping.

Quick example

from boileroom import ESMFold

model = ESMFold(backend="modal")
result = model.fold("MKTVRQERLKSIVRI")

print(result.atom_array)   # list of Biotite AtomArray objects
print(result.plddt.shape)  # shape of the pLDDT confidence array

Methods

.fold()

Predict the 3D structure of one or more protein sequences.
result = model.fold(sequences, options=None)
sequences
str | Sequence[str]
required
A single amino acid sequence string or a list of sequences. Use ":" to separate chains in a multimer (e.g., "CHAIN_A:CHAIN_B").
options
dict | None
default: None
Per-call configuration overrides. Only dynamic (non-static) config keys can be set here; static keys raise ValueError. See Configuration.
Returns: ESMFoldOutput (see Output below)
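
As a small sketch of the ":" multimer convention described for sequences, the helper below joins chains into one input string and rejects characters outside the 20 standard amino acid codes. join_chains is a hypothetical helper, not part of boileroom, and the strict 20-letter alphabet is an assumption; the library may accept other codes.

```python
# Hypothetical helper (not part of boileroom): build a ':'-separated multimer string.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: only the 20 standard codes


def join_chains(chains):
    """Join chain sequences into one ':'-separated string, validating each chain."""
    cleaned = []
    for chain in chains:
        seq = chain.strip().upper()
        if not seq:
            raise ValueError("empty chain sequence")
        unexpected = sorted(set(seq) - STANDARD_AA)
        if unexpected:
            raise ValueError(f"unexpected residue codes: {unexpected}")
        cleaned.append(seq)
    return ":".join(cleaned)


multimer = join_chains(["MKTVRQERLKSIVRI", "LERSKEPVSGAQLAEE"])
print(multimer)
```

The resulting string can be passed to model.fold() in place of a hand-written "CHAIN_A:CHAIN_B" value.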

Output

The ESMFoldOutput dataclass returned by .fold().

Always included

metadata
PredictionMetadata
Prediction metadata with timing information. See PredictionMetadata.
atom_array
list[AtomArray] | None
List of Biotite AtomArray objects, one per input sequence. Always generated.

Confidence metrics

plddt
np.ndarray | None
Predicted local distance difference test (pLDDT) confidence, with 37 values per residue. Shape: (batch, residue, 37).
ptm
np.ndarray | None
Predicted TM-score (scalar per sample).
pae
np.ndarray | None
Predicted aligned error. Shape: (batch, residue, residue).
max_pae
np.ndarray | None
Maximum predicted aligned error (scalar per sample).
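
The documented shapes suggest how these arrays might be reduced to per-sample summaries. The sketch below uses synthetic NumPy arrays in place of a real ESMFoldOutput, so the values carry no meaning. Reading the last pLDDT axis as the atom37 layout (with CA at slot 1) is an assumption borrowed from AlphaFold-style code, not something this page states.

```python
import numpy as np

# Synthetic stand-ins with the documented shapes; values are placeholders only.
rng = np.random.default_rng(0)
batch, n_res = 1, 15
plddt = rng.uniform(0.0, 1.0, size=(batch, n_res, 37))   # (batch, residue, 37)
pae = rng.uniform(0.0, 30.0, size=(batch, n_res, n_res))  # (batch, residue, residue)

CA_SLOT = 1  # assumption: last axis follows atom37 ordering, where slot 1 is CA

per_residue_plddt = plddt[:, :, CA_SLOT]         # (batch, residue)
mean_plddt = per_residue_plddt.mean(axis=1)      # one value per sample
max_pae_from_pae = pae.max(axis=(1, 2))          # one value per sample

print(per_residue_plddt.shape, mean_plddt.shape, max_pae_from_pae.shape)
```

With real output, the last line's reduction should agree with the max_pae field if that field is computed as a plain maximum over the pae matrix, which is worth checking rather than assuming.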

Structure representations

pdb
list[str] | None
PDB-formatted structure strings. Only generated when include_fields contains "pdb" or "*".
cif
list[str] | None
mmCIF-formatted structure strings. Only generated when include_fields contains "cif" or "*".
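
Since pdb and cif are lists of strings (or None when not requested), saving them takes a few lines. The sketch below is a minimal illustration using placeholder strings in place of result.pdb; write_structures and the prediction_{i} file naming are hypothetical choices, not part of boileroom.

```python
import tempfile
from pathlib import Path


def write_structures(strings, out_dir, suffix):
    """Write one file per structure string and return the written paths."""
    if strings is None:
        # The field is None unless it was requested through include_fields.
        raise ValueError(f"no {suffix} strings available; check include_fields")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, text in enumerate(strings):
        path = out_dir / f"prediction_{i}.{suffix}"
        path.write_text(text)
        paths.append(path)
    return paths


# Placeholder strings standing in for result.pdb
pdb_strings = ["REMARK placeholder 0\nEND\n", "REMARK placeholder 1\nEND\n"]
paths = write_structures(pdb_strings, tempfile.mkdtemp(), "pdb")
print([p.name for p in paths])
```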

Index arrays

chain_index
np.ndarray | None
Per-residue chain assignment. Shape: (batch, residue).
residue_index
np.ndarray | None
Per-residue residue numbering. Shape: (batch, residue).

Model internals

frames
np.ndarray | None
Backbone frames. Shape: (model_layer, batch, residue, 7).
sidechain_frames
np.ndarray | None
Sidechain frames. Shape: (model_layer, batch, residue, 8, 4, 4).
angles
np.ndarray | None
Torsion angles. Shape: (model_layer, batch, residue, 7, 2).
states
np.ndarray | None
Intermediate structure module states. Shape: (model_layer, batch, residue, dim).
s_s
np.ndarray | None
Single representation from the trunk. Shape: (batch, residue, 1024).
s_z
np.ndarray | None
Pair representation from the trunk. Shape: (batch, residue, residue, 128).
distogram_logits
np.ndarray | None
Distogram logits. Shape: (batch, residue, residue, 64).
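
To illustrate how these arrays might be consumed, the sketch below converts distogram logits to per-pair bin probabilities with a softmax over the last axis and picks the final entry along the model_layer axis of frames. The arrays are synthetic, and the layer count of 8 is a placeholder rather than a documented value. Mapping bins to distances would require the bin edges, which this page does not give.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


rng = np.random.default_rng(0)
n_res = 15

# Synthetic stand-in for distogram_logits: (batch, residue, residue, 64)
distogram_logits = rng.normal(size=(1, n_res, n_res, 64))
probs = softmax(distogram_logits, axis=-1)   # probabilities over the 64 bins
most_likely_bin = probs.argmax(axis=-1)      # (batch, residue, residue)

# Synthetic stand-in for frames: (model_layer, batch, residue, 7)
frames = rng.normal(size=(8, 1, n_res, 7))   # 8 layers is a placeholder
final_frames = frames[-1]                    # last layer: (batch, residue, 7)

print(probs.shape, most_likely_bin.shape, final_frames.shape)
```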

Configuration

Set these keys via config={} at initialization. Keys not marked static can also be overridden per call via options={}.
| Key | Type | Default | Static | Description |
| --- | --- | --- | --- | --- |
| device | str | "cuda:0" | Yes | GPU device identifier |
| glycine_linker | str | "" | No | Linker string inserted between chains for multimer tokenization |
| position_ids_skip | int | 512 | No | Position ID offset between chains in multimer mode |
| include_fields | list[str] \| None | None | No | Which output fields to return. None returns all; use ["pdb"] or ["cif"] to generate structure strings. Use ["*"] for everything including PDB/CIF |
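
The static/dynamic split can be pictured with a small stand-alone sketch. This is not boileroom's implementation: merge_options, STATIC_KEYS, and DEFAULTS are hypothetical names that mirror the documented keys, defaults, and the rule that static keys raise ValueError when passed per call.

```python
# Hypothetical model of the documented config rules; not boileroom's own code.
STATIC_KEYS = {"device"}
DEFAULTS = {
    "device": "cuda:0",
    "glycine_linker": "",
    "position_ids_skip": 512,
    "include_fields": None,
}


def merge_options(config=None, options=None):
    """Layer per-call options over init-time config over defaults."""
    config = config or {}
    options = options or {}
    static_overrides = sorted(STATIC_KEYS & set(options))
    if static_overrides:
        # Mirrors the documented behavior: static keys cannot be set per call.
        raise ValueError(f"static keys cannot be set per call: {static_overrides}")
    return {**DEFAULTS, **config, **options}


merged = merge_options(
    config={"device": "cuda:1"},           # allowed at initialization
    options={"include_fields": ["pdb"]},   # allowed per call
)
print(merged)
```

Under this model, passing options={"device": "cuda:1"} raises ValueError, which matches the behavior described for .fold().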

Multimer prediction

Separate chains with ":" in the sequence string:
result = model.fold("MKTVRQERLKSIVRI:LERSKEPVSGAQLAEE")

# Chain information is available in the output
print(result.chain_index)    # per-residue chain assignment
print(result.residue_index)  # per-residue residue numbering
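
As a sketch of how chain_index might be used downstream, the example below derives chain lengths and a mean inter-chain PAE from arrays with the documented shapes. The arrays are synthetic, and the assumption that chains are labeled 0 and 1 in input order is not confirmed by this page.

```python
import numpy as np

# Synthetic stand-ins for a two-chain prediction (15 + 16 residues).
# Assumption: chain_index labels chains 0, 1, ... in input order.
chain_index = np.array([[0] * 15 + [1] * 16])            # (batch, residue)
rng = np.random.default_rng(0)
pae = rng.uniform(0.0, 30.0, size=(1, 31, 31))           # (batch, residue, residue)

sample = 0
chain_ids, counts = np.unique(chain_index[sample], return_counts=True)
chain_lengths = dict(zip(chain_ids.tolist(), counts.tolist()))

# Boolean masks select the residues belonging to each chain.
mask_a = chain_index[sample] == 0
mask_b = chain_index[sample] == 1

# Block of the PAE matrix pairing chain 0 residues with chain 1 residues.
inter_chain_pae = pae[sample][np.ix_(mask_a, mask_b)]
print(chain_lengths, inter_chain_pae.shape, float(inter_chain_pae.mean()))
```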