ESMFold - BAGEL

ESMFold predicts protein 3D structure directly from a single amino acid sequence using Meta’s end-to-end language model approach. It does not require multiple sequence alignments (MSAs), which keeps inference fast and makes it well suited to rapid prototyping.

Quick example

from boileroom import ESMFold

model = ESMFold(backend="modal")
result = model.fold("MKTVRQERLKSIVRI", options={"include_fields": ["plddt"]})

print(result.atom_array)   # list of Biotite AtomArray objects
print(result.plddt.shape)  # per-residue confidence scores

Methods

`.fold()`

Predict the 3D structure of one or more protein sequences.

result = model.fold(sequences, options=None)

sequences

str | Sequence[str]

required

A single amino acid sequence string or a list of sequences. Use ":" to separate chains in a multimer (e.g., "CHAIN_A:CHAIN_B").
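As a minimal sketch of the input convention described above, the helper below joins chains into one ":"-separated string and rejects characters outside the 20 standard amino acids. `build_multimer_input` and `STANDARD_AA` are hypothetical names used for illustration, not part of boileroom, and the library itself may accept additional residue codes.

```python
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard one-letter codes


def build_multimer_input(chains):
    """Validate each chain and join them with ':' (hypothetical helper)."""
    if not chains:
        raise ValueError("at least one chain is required")
    cleaned = []
    for i, chain in enumerate(chains):
        chain = chain.strip().upper()
        if not chain:
            raise ValueError(f"chain {i} is empty")
        unknown = sorted(set(chain) - STANDARD_AA)
        if unknown:
            raise ValueError(f"chain {i} has non-standard residues: {unknown}")
        cleaned.append(chain)
    return ":".join(cleaned)


# The resulting string is what would be passed as `sequences` to .fold().
sequence = build_multimer_input(["MKTVRQERLKSIVRI", "LERSKEPVSGAQLAEE"])
```

Validating before the call can save a round trip to a remote backend when a sequence contains a typo.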

options

dict | None

default: None

Per-call configuration overrides. Only dynamic config keys can be set here; passing a static key raises ValueError. See Configuration.
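The static/dynamic split can be pictured with the following sketch. It is not boileroom's implementation: `STATIC_KEYS` and `merge_options` are hypothetical, and the key classification is copied from the Configuration table on this page.

```python
STATIC_KEYS = {"device"}  # marked static in the Configuration table


def merge_options(config, options=None):
    """Return a copy of config updated with per-call options.

    Static keys are rejected with ValueError, mirroring the documented rule.
    """
    options = dict(options or {})
    blocked = sorted(STATIC_KEYS.intersection(options))
    if blocked:
        raise ValueError(f"static config keys cannot be set per call: {blocked}")
    return {**config, **options}


config = {
    "device": "cuda:0",
    "glycine_linker": "",
    "position_ids_skip": 512,
    "include_fields": None,
}
per_call = merge_options(config, {"include_fields": ["plddt"]})
```

Returning a new dict rather than mutating `config` keeps the initialization-time settings intact between calls.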

Returns: ESMFoldOutput (see Output below)

Output

The ESMFoldOutput dataclass returned by .fold().

Always included

metadata

PredictionMetadata

Prediction metadata with timing information. See PredictionMetadata.

atom_array

list[AtomArray] | None

List of Biotite AtomArray objects, one per input sequence. Always generated.

Confidence metrics

plddt

np.ndarray | None

Predicted local distance difference test (pLDDT) confidence. Shape: (batch, residue, 37), so each residue carries 37 values rather than a single score.
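Because the last axis has 37 entries, a single score per residue requires a reduction. The sketch below assumes that axis follows the atom37 ordering used by AlphaFold-style models, in which index 1 is the alpha carbon; this page does not confirm that, so treat `CA_INDEX` as an assumption. A synthetic array stands in for `result.plddt`, and `per_residue_plddt` is a hypothetical helper.

```python
import numpy as np

CA_INDEX = 1  # alpha-carbon slot in atom37 ordering (assumed to apply here)


def per_residue_plddt(plddt):
    """Reduce (batch, residue, 37) to (batch, residue) by taking the CA slot."""
    plddt = np.asarray(plddt)
    if plddt.ndim != 3 or plddt.shape[-1] != 37:
        raise ValueError(f"expected (batch, residue, 37), got {plddt.shape}")
    return plddt[:, :, CA_INDEX]


# Synthetic stand-in for result.plddt: 2 samples, 15 residues each.
rng = np.random.default_rng(0)
fake_plddt = rng.uniform(0.0, 1.0, size=(2, 15, 37))

ca_plddt = per_residue_plddt(fake_plddt)   # one score per residue
mean_per_sample = ca_plddt.mean(axis=1)    # one score per sample
```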

ptm

np.ndarray | None

Predicted TM-score (scalar per sample).

pae

np.ndarray | None

Predicted aligned error. Shape: (batch, residue, residue).

max_pae

np.ndarray | None

Maximum predicted aligned error (scalar per sample).
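For the pae field, a common first step is to collapse each (residue, residue) matrix to one number per sample. The sketch below uses a synthetic array in place of `result.pae`; `mean_pae` and `fraction_below` are hypothetical helpers, and the cutoff of 1.0 is arbitrary. It makes no claim about how the library derives max_pae.

```python
import numpy as np


def _check_pae(pae):
    """Coerce to a float array and verify the (batch, residue, residue) layout."""
    pae = np.asarray(pae, dtype=float)
    if pae.ndim != 3 or pae.shape[1] != pae.shape[2]:
        raise ValueError(f"expected (batch, residue, residue), got {pae.shape}")
    return pae


def mean_pae(pae):
    """Average predicted aligned error per sample."""
    return _check_pae(pae).mean(axis=(1, 2))


def fraction_below(pae, cutoff):
    """Fraction of residue pairs per sample with PAE under the cutoff."""
    return (_check_pae(pae) < cutoff).mean(axis=(1, 2))


# Synthetic stand-in for result.pae: one sample with two residues.
fake_pae = np.array([[[0.5, 4.0], [3.0, 0.5]]])
sample_means = mean_pae(fake_pae)
confident_fraction = fraction_below(fake_pae, cutoff=1.0)
```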

Structure representations

pdb

list[str] | None

PDB-formatted structure strings. Only generated when include_fields contains "pdb" or "*".

cif

list[str] | None

mmCIF-formatted structure strings. Only generated when include_fields contains "cif" or "*".
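Since pdb and cif are plain lists of strings, saving them needs nothing beyond the standard library. A minimal sketch, with a placeholder string standing in for `result.pdb`; `write_structures` and the file naming scheme are hypothetical.

```python
from pathlib import Path
import tempfile


def write_structures(structures, out_dir, stem="prediction", suffix=".pdb"):
    """Write one file per structure string and return the paths."""
    if structures is None:
        raise ValueError(
            "no structure strings; request 'pdb' or 'cif' via include_fields"
        )
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, text in enumerate(structures):
        path = out_dir / f"{stem}_{i}{suffix}"
        path.write_text(text)
        paths.append(path)
    return paths


# Placeholder content; in practice this would be result.pdb (or result.cif
# with suffix=".cif").
fake_pdb = ["REMARK placeholder for a predicted structure\nEND\n"]
with tempfile.TemporaryDirectory() as tmp:
    written = write_structures(fake_pdb, tmp)
    names = [p.name for p in written]
    contents = [p.read_text() for p in written]
```

The explicit `None` check matters because both fields stay unset unless they were requested through include_fields.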

Index arrays

chain_index

np.ndarray | None

Per-residue chain assignment. Shape: (batch, residue).

residue_index

np.ndarray | None

Per-residue residue numbering. Shape: (batch, residue).
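The index arrays become useful when combined with other outputs. As one illustration, the sketch below uses chain assignments to average the predicted aligned error over residue pairs that sit on different chains. It works on a single sample, that is, one row of chain_index and one matrix of pae; the arrays are synthetic and `mean_interchain_pae` is a hypothetical helper.

```python
import numpy as np


def mean_interchain_pae(pae, chain_index):
    """Mean PAE over residue pairs whose chains differ.

    pae: (residue, residue) matrix for one sample.
    chain_index: (residue,) chain assignment for the same sample.
    """
    pae = np.asarray(pae, dtype=float)
    chain_index = np.asarray(chain_index)
    if pae.shape != (chain_index.size, chain_index.size):
        raise ValueError("pae and chain_index disagree on residue count")
    # True where the row residue and column residue belong to different chains.
    different_chain = chain_index[:, None] != chain_index[None, :]
    if not different_chain.any():
        raise ValueError("only one chain present")
    return float(pae[different_chain].mean())


# Synthetic sample: residues 0 and 1 on chain 0, residue 2 on chain 1.
fake_chain_index = np.array([0, 0, 1])
fake_pae = np.array([[0.0, 1.0, 8.0],
                     [1.0, 0.0, 6.0],
                     [7.0, 5.0, 0.0]])
interchain = mean_interchain_pae(fake_pae, fake_chain_index)
```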

Model internals

frames

np.ndarray | None

Backbone frames. Shape: (model_layer, batch, residue, 7).

sidechain_frames

np.ndarray | None

Sidechain frames. Shape: (model_layer, batch, residue, 8, 4, 4).

angles

np.ndarray | None

Torsion angles. Shape: (model_layer, batch, residue, 7, 2).

states

np.ndarray | None

Intermediate structure module states. Shape: (model_layer, batch, residue, dim).

s_s

np.ndarray | None

Single representation from the trunk. Shape: (batch, residue, 1024).

s_z

np.ndarray | None

Pair representation from the trunk. Shape: (batch, residue, residue, 128).

distogram_logits

np.ndarray | None

Distogram logits. Shape: (batch, residue, residue, 64).
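Logits are unnormalized, so turning distogram_logits into per-bin probabilities takes a softmax over the last axis. The sketch below uses random numbers in place of real output and `distogram_probabilities` is a hypothetical helper. This page does not state which distance range each of the 64 bins covers, so the example stops at bin indices.

```python
import numpy as np


def distogram_probabilities(logits):
    """Numerically stable softmax over the final (bin) axis."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)  # avoid overflow
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Synthetic stand-in for result.distogram_logits: 1 sample, 4 residues, 64 bins.
rng = np.random.default_rng(0)
fake_logits = rng.normal(size=(1, 4, 4, 64))

probs = distogram_probabilities(fake_logits)
most_likely_bin = probs.argmax(axis=-1)  # shape (batch, residue, residue)
```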

Configuration

These keys can be set via config={} at initialization. Keys not marked static can also be overridden per call via options={}.

Key	Type	Default	Static	Description
`device`	`str`	`"cuda:0"`	Yes	GPU device identifier
`glycine_linker`	`str`	`""`	No	Linker string inserted between chains for multimer tokenization
`position_ids_skip`	`int`	`512`	No	Position ID offset between chains in multimer mode
`include_fields`	`list[str] \| None`	`None`	No	Which output fields to return. `None` returns all fields except the PDB/mmCIF strings; add `"pdb"` or `"cif"` to generate those, or use `["*"]` for everything including PDB/mmCIF
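To make the two multimer keys concrete, here is a rough sketch of how a linker and a position ID offset could combine when chains are joined for tokenization. It is loosely modeled on the multimer encoding in the reference ESMFold code and is an assumption about boileroom's behavior, not a description of it; `multimer_position_ids` is a hypothetical helper.

```python
def multimer_position_ids(chains, position_ids_skip=512, glycine_linker=""):
    """Join chains with a linker and offset position IDs per chain.

    Each chain (together with the linker that follows it) has its positions
    shifted by chain_number * position_ids_skip, so consecutive chains appear
    far apart to the model.
    """
    joined = glycine_linker.join(chains)
    position_ids = list(range(len(joined)))
    start = 0
    for chain_number, chain in enumerate(chains):
        # The last chain has no trailing linker, hence the clamp to len(joined).
        end = min(start + len(chain) + len(glycine_linker), len(joined))
        for j in range(start, end):
            position_ids[j] += chain_number * position_ids_skip
        start = end
    return joined, position_ids


# Defaults from the table: empty linker, offset of 512.
joined, position_ids = multimer_position_ids(["MKT", "LE"])
```

With the defaults, the second chain's positions start 512 higher than they would in a plain concatenation, which is what signals the chain break.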

Multimer prediction

Separate chains with ":" in the sequence string:

result = model.fold("MKTVRQERLKSIVRI:LERSKEPVSGAQLAEE")

# Chain information is available in the output
print(result.chain_index)    # per-residue chain assignment
print(result.residue_index)  # per-residue residue numbering
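A quick sanity check on a multimer result is to count residues per chain and compare against the input. The sketch below assumes chains are numbered from 0 in input order and that any linker residues are excluded from chain_index; neither is confirmed on this page. The array is synthetic and represents one row of `result.chain_index`; `chain_lengths` is a hypothetical helper.

```python
import numpy as np


def chain_lengths(chain_index_row):
    """Map each chain ID to its residue count for one sample."""
    ids, counts = np.unique(np.asarray(chain_index_row), return_counts=True)
    return {int(chain_id): int(n) for chain_id, n in zip(ids, counts)}


sequence = "MKTVRQERLKSIVRI:LERSKEPVSGAQLAEE"
expected = [len(chain) for chain in sequence.split(":")]

# Synthetic stand-in for result.chain_index[0] under the assumptions above.
fake_chain_index = np.array([0] * expected[0] + [1] * expected[1])
lengths = chain_lengths(fake_chain_index)
matches_input = [lengths[i] for i in sorted(lengths)] == expected
```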
