.fold() or .embed() the same way, and get the same output types back.
## Backend selection
Set the backend when creating a model:

## Modal backend
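The names below are illustrative, not from the library's documentation: a minimal sketch of what choosing a backend at model creation could look like, assuming a hypothetical `load_model` entry point and the `backend` keyword described in this section.

```python
# Hypothetical constructor; the real entry point may differ.
def load_model(name: str, backend: str = "modal"):
    """Sketch: create a model wrapper bound to an execution backend."""
    class ModelWrapper:
        def __init__(self, name, backend):
            self.name = name
            self.backend = backend
    return ModelWrapper(name, backend)

# backend="modal" is the default, per the setup steps below.
model = load_model("example-model")
print(model.backend)  # modal

# Run in a local Apptainer container instead.
local_model = load_model("example-model", backend="apptainer")
print(local_model.backend)  # apptainer
```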
Modal provides serverless GPU execution. Your code runs locally, but model inference runs on Modal's cloud GPUs. Containers scale to zero when idle and spin up on demand.

### Setup
- Install Modal: `pip install modal`
- Authenticate: `modal token new`
- Use the model with `backend="modal"` (this is the default)
### GPU options
Modal automatically selects a GPU based on the model's requirements. The following GPU types are available on Modal:

| GPU | VRAM |
|---|---|
| T4 | 16 GB |
| L4 | 24 GB |
| A10G | 24 GB |
| A100-40GB | 40 GB |
| A100-80GB | 80 GB |
| L40S | 48 GB |
| H100 | 80 GB |
### Serverless scaling
- Containers cold-start when first called (may take 30-60 seconds)
- Subsequent calls reuse warm containers
- Containers automatically scale down after a period of inactivity
## Apptainer backend
Apptainer (formerly Singularity) runs models in a container on your local machine or HPC cluster. This is useful when you need local execution, cannot use cloud services, or are running on institutional compute.

### Setup
- Install Apptainer on your system
- Use `backend="apptainer"` or `backend="apptainer:tag"` to specify a Docker image tag
### Tag syntax
By default, the Apptainer backend uses the `latest` Docker image tag. You can specify a different tag:
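As a hedged illustration of the `backend` string format, the parsing helper below is invented for this sketch (it is not part of the library); it only mirrors the documented syntax, where the tag defaults to `latest` when omitted.

```python
def parse_backend(spec: str) -> tuple[str, str]:
    """Split a backend spec like "apptainer:tag" into (backend, image tag).

    Illustrative only: the tag falls back to "latest" when omitted,
    matching the documented default.
    """
    backend, _, tag = spec.partition(":")
    return backend, tag or "latest"

print(parse_backend("apptainer"))       # ('apptainer', 'latest')
print(parse_backend("apptainer:v1.2"))  # ('apptainer', 'v1.2')
```

In practice, a tagged spec would be passed directly when creating the model, e.g. `backend="apptainer:v1.2"`.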
### Local execution
The Apptainer backend runs the model on your local GPU. Ensure you have a CUDA-compatible GPU with sufficient VRAM for the model.

## Context manager
Both backends support context manager usage to ensure proper cleanup. Without a context manager, cleanup happens when the `ModelWrapper` instance is garbage collected or when the Python process exits.
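A sketch of the context-manager pattern, using a stub class that only imitates the documented cleanup behavior (the real `ModelWrapper` is assumed here, not reproduced):

```python
class ModelWrapper:
    """Stub imitating the documented behavior: resources are released
    by the context manager, or as a fallback at garbage collection."""

    def __init__(self, name: str, backend: str = "modal"):
        self.name = name
        self.backend = backend
        self.closed = False

    def close(self):
        # Release containers / GPU resources here.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not suppress exceptions

    def __del__(self):
        self.close()  # fallback cleanup at garbage collection


# The `with` block guarantees cleanup even if inference raises.
with ModelWrapper("example-model", backend="apptainer") as model:
    pass  # model.fold(...) / model.embed(...) calls would go here
print(model.closed)  # True
```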
## Device selection
The `device` parameter works the same way across backends:
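To illustrate (again with a hypothetical `load_model`; the accepted `device` values are an assumption based on typical PyTorch-style device strings, not taken from the library's documentation):

```python
# Hypothetical constructor; shown only to illustrate that `device`
# is passed identically regardless of backend.
def load_model(name: str, backend: str = "modal", device: str = "cuda"):
    class ModelWrapper:
        def __init__(self, name, backend, device):
            self.name, self.backend, self.device = name, backend, device
    return ModelWrapper(name, backend, device)

# Same keyword on the serverless backend...
remote = load_model("example-model", backend="modal", device="cuda")
# ...and on the local one.
local = load_model("example-model", backend="apptainer", device="cuda:0")
print(remote.device, local.device)
```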
