ESM-2 - BAGEL

ESM2 is an embedding oracle that wraps Meta’s ESM-2 protein language model. It produces high-dimensional per-residue embeddings that capture biochemical and evolutionary context. Embedding-based energy terms such as EmbeddingsSimilarityEnergy consume these embeddings. By default, BAGEL uses the 650M-parameter version of ESM-2, but any ESM-2 model size can be specified via the config parameter. For multimers, ESM-2 uses the same linker and positional encoding approach as ESMFold.

ESM-2 inference is powered by boileroom. boileroom handles model loading, GPU execution, and dependency isolation — either serverlessly via Modal or locally via Apptainer. See the boileroom ESM-2 reference for backend configuration details.

Parameters

use_modal

bool

Whether to run ESM-2 on Modal’s serverless GPU infrastructure. Set to True for serverless execution (no local GPU required), or False for local GPU execution.

config

dict[str, Any]

default:"{}"

Model-specific configuration. Can be used to specify model size, linker parameters, and other options.

Optional Modal app context for reusing an existing Modal session.

Methods

embed

Calculate the embeddings of the residues in the chains.

Parameters

chains

list[Chain]

required

The chains to embed. Sequences are concatenated with appropriate linker handling.
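To make the linker handling concrete, here is a minimal, self-contained sketch of how several chain sequences might be joined into one model input. It is not BAGEL's or boileroom's actual code: the function name `link_chains`, the `LinkedInput` container, and the default values (a 25-glycine linker and a position skip of 512, borrowed from ESMFold's published defaults) are assumptions for illustration only. The real behaviour is controlled through the oracle's config.

```python
from dataclasses import dataclass


@dataclass
class LinkedInput:
    """Hypothetical container for a concatenated multimer input."""
    sequence: str            # all residues, linker residues included
    position_ids: list[int]  # position index assigned to each residue
    is_linker: list[bool]    # True where the residue belongs to a linker


def link_chains(sequences, linker="G" * 25, position_skip=512):
    """Join chain sequences with a linker and offset positions between chains.

    Assumed defaults (not confirmed for BAGEL): 25-glycine linker, skip of 512.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")

    parts, position_ids, is_linker = [], [], []
    next_pos = 0
    for i, seq in enumerate(sequences):
        if i > 0:
            # Linker positions continue directly after the previous chain.
            parts.append(linker)
            position_ids.extend(range(next_pos, next_pos + len(linker)))
            is_linker.extend([True] * len(linker))
            # The jump in position index tells the model a new chain starts.
            next_pos += len(linker) + position_skip
        parts.append(seq)
        position_ids.extend(range(next_pos, next_pos + len(seq)))
        is_linker.extend([False] * len(seq))
        next_pos += len(seq)

    return LinkedInput("".join(parts), position_ids, is_linker)


def drop_linker_rows(per_residue_values, is_linker):
    """Keep only the rows that correspond to real (non-linker) residues."""
    return [v for v, skip in zip(per_residue_values, is_linker) if not skip]
```

With a mask like `is_linker`, the rows for linker residues can be discarded after inference so that the returned embeddings line up one-to-one with the residues of the original chains.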

Example

import bagel as bg

# Create an ESM-2 oracle using Modal
esm2 = bg.oracles.ESM2(use_modal=True)

# Extract reference embeddings from a conserved region
# (`chains` here, and `chain` and `conserved_residues` below, are assumed to be defined earlier)
reference_embeddings = esm2.predict(chains).embeddings

# Use with EmbeddingsSimilarityEnergy to maintain functional similarity
state = bg.State(
    chains=[chain],
    energy_terms=[
        bg.energies.EmbeddingsSimilarityEnergy(
            oracle=esm2,
            weight=1.0,
            residues=[conserved_residues],
            reference_embeddings=reference_embeddings,
        ),
    ],
    name="enzyme_variant",
)
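For intuition about what an embedding-similarity term can measure, the sketch below compares two sets of per-residue embeddings with cosine similarity and turns the result into a penalty that is 0 for identical embeddings. This is an illustrative assumption, not the actual implementation of EmbeddingsSimilarityEnergy, which may use a different metric or normalization. The function names are hypothetical, and plain Python lists stand in for the embedding arrays.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def embedding_similarity_penalty(current, reference):
    """Return 1 - mean per-residue cosine similarity (hypothetical energy).

    `current` and `reference` are lists of per-residue embedding vectors,
    one vector per tracked residue, in the same residue order.
    """
    if not current or len(current) != len(reference):
        raise ValueError("embeddings must be non-empty and the same length")
    sims = [cosine_similarity(c, r) for c, r in zip(current, reference)]
    return 1.0 - sum(sims) / len(sims)
```

Under this definition, a design whose tracked residues keep the same embedding directions as the reference scores close to 0, and the penalty grows as the embeddings drift apart.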
