Contract-first protein-design platform

Protein design,
as a pipeline.

Compose a design-to-fold DAG across RFdiffusion, ProteinMPNN, Boltz and AlphaFold2. FoldForge schedules each step on GPU sidecars, streams progress as it runs, and returns structures by reference.

See the API · Read the open contract ↗

open gRPC + OpenAPI contracts · self-hostable · streaming progress over SSE

The pipeline

Four models, one DAG

Compose the standard design-to-fold pipeline — or any subgraph of it. Each step runs on its own GPU sidecar behind a typed contract, so steps are independently scalable and swappable.
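
As a rough illustration of what "one DAG" means for a client, the sketch below checks a step list and derives an execution order before anything is submitted. The `id`, `tool`, and `depends_on` fields mirror the workflow JSON shown further down this page; the `execution_order` helper is hypothetical and is not part of FoldForge. How the orchestrator validates DAGs internally is not described here.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(steps):
    """Return step ids in an order that respects depends_on.

    Raises ValueError on duplicate ids, unknown dependencies, or cycles.
    Hypothetical client-side helper, not a FoldForge API.
    """
    ids = [s["id"] for s in steps]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate step id")

    graph = {}
    for s in steps:
        deps = s.get("depends_on", [])
        unknown = [d for d in deps if d not in ids]
        if unknown:
            raise ValueError(f"step {s['id']!r} depends on unknown step(s): {unknown}")
        graph[s["id"]] = set(deps)

    try:
        # static_order() yields each node after all of its predecessors.
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"cycle in workflow: {exc.args[1]}") from exc


steps = [
    {"id": "design", "tool": "rfdiffusion"},
    {"id": "seqs", "tool": "proteinmpnn", "depends_on": ["design"]},
    {"id": "fold", "tool": "af2", "depends_on": ["seqs"]},
]
order = execution_order(steps)
```

A subgraph is the same list with steps removed, provided no remaining step depends on a removed one; the unknown-dependency check catches that case.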

RFdiffusion

De novo backbone generation via diffusion, from contig / motif specs.

ProteinMPNN

Inverse folding — designs sequences for a fixed backbone, ranked by score.

Boltz

Open AlphaFold3-class prediction for complexes — proteins, nucleic acids, ligands.

AlphaFold2

Monomer / multimer structure prediction with first-class MSA caching.

The platform

A control plane built for GPU work

FoldForge is the orchestration layer around the models — a workflow engine that survives restarts, streams progress, and treats expensive compute as something to schedule, cache, and cancel, not waste.

Contract-first

One schema set — gRPC internally, OpenAPI at the edge. The proto repo is public so integrators build against a stable contract.

MSA cache as a cost lever

AlphaFold2's MSA search dominates wall-clock time. The alignment cache is a first-class API, durable and shared across replicas, so the dominant cost is paid once.
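
To make the "paid once" idea concrete, here is a minimal sketch of a content-addressed alignment cache. Everything in it is an assumption for illustration: the key scheme (a SHA-256 over a database label plus the normalized sequence), the `MsaCache` class, and the in-memory dict standing in for a durable shared store. FoldForge's actual cache keys and API are defined by its contract, not by this example.

```python
import hashlib


def msa_cache_key(sequence: str, database: str) -> str:
    """Derive a stable key from the sequence and the database it was searched against."""
    normalized = "".join(sequence.split()).upper()  # drop whitespace, ignore case
    return hashlib.sha256(f"{database}\n{normalized}".encode()).hexdigest()


class MsaCache:
    """In-memory stand-in for a durable, shared alignment cache (hypothetical)."""

    def __init__(self):
        self._store = {}
        self.searches = 0  # counts how often the expensive search actually ran

    def get_or_compute(self, sequence, database, search):
        key = msa_cache_key(sequence, database)
        if key not in self._store:
            self._store[key] = search(sequence)
            self.searches += 1
        return self._store[key]


def fake_search(sequence):
    # Placeholder for the expensive MSA search; returns a dummy alignment.
    return [sequence]


cache = MsaCache()
first = cache.get_or_compute("MKTAYIAK", "uniref-demo", fake_search)
second = cache.get_or_compute("mktay iak\n", "uniref-demo", fake_search)
```

Including the database label in the key means a refreshed database produces a new entry instead of silently serving a stale alignment.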

Live progress over SSE

Every workflow streams per-step progress and state transitions to the client as Server-Sent Events — resumable across reconnects.
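
For readers new to SSE, the sketch below parses the wire format and shows the standard resumption mechanism: the client remembers the last `id:` it saw and sends it back in a `Last-Event-ID` header when reconnecting. The event names and payloads in the sample are invented for illustration, and whether FoldForge resumes via `Last-Event-ID` specifically is an assumption; check the contract for the actual event schema.

```python
def parse_sse(lines):
    """Yield (event_id, event_type, data) tuples from an iterable of SSE text lines."""
    event_id, event_type, data = None, "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it carries data.
            if data:
                yield event_id, event_type, "\n".join(data)
            event_type, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is stripped
        if field == "id":
            event_id = value  # the last seen id persists across events
        elif field == "event":
            event_type = value
        elif field == "data":
            data.append(value)


# Hypothetical stream contents; real event names come from the contract.
sample = [
    "id: 1\n", "event: step_started\n", 'data: {"step": "design"}\n', "\n",
    ": keep-alive\n",
    "id: 2\n", "event: step_progress\n", 'data: {"step": "design", "pct": 50}\n', "\n",
]
events = list(parse_sse(sample))

# On reconnect, a client would send the last id it processed.
resume_headers = {"Last-Event-ID": events[-1][0]}
```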

Artifacts by reference

Structures and alignments (PDB / CIF / MSA) move through S3-compatible storage as references and are never inlined into RPC messages, so large blobs stay out of the hot path.

Cancellable GPU runs

Cancel a workflow and the GPU subprocess group is actually killed — the accelerator is freed instead of finishing work no one is waiting for.
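
Killing the whole process group matters because model runners often spawn helpers that outlive a plain `kill` of the parent. The sketch below shows the general POSIX technique with Python's standard library: start the child in its own session, then signal the group. It is not FoldForge's sidecar code; the function names and the five-second grace period are assumptions, and the `sleep` child stands in for a GPU job.

```python
import os
import signal
import subprocess
import sys


def start_in_own_group(argv):
    # start_new_session=True calls setsid() in the child, so the child leads
    # a fresh process group that any helpers it spawns will inherit.
    return subprocess.Popen(argv, start_new_session=True)


def cancel(proc, grace_seconds=5.0):
    """Send SIGTERM to the whole group, escalating to SIGKILL after a grace period."""
    pgid = os.getpgid(proc.pid)
    os.killpg(pgid, signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        os.killpg(pgid, signal.SIGKILL)
        proc.wait()
    return proc.returncode


# Stand-in for a long-running GPU job (POSIX only).
proc = start_in_own_group([sys.executable, "-c", "import time; time.sleep(60)"])
rc = cancel(proc)
```

A negative return code from `Popen` means the child was terminated by that signal number, which is how a caller can tell a cancelled run from a failed one.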

HA & crash-recovery

The database is the source of truth for workflow state, with leased execution and reclaim: an orchestrator can crash mid-run and another picks the work back up.
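
Leased execution can be pictured as a row per workflow holding an owner and an expiry time. The sketch below models that with an in-memory table and an explicit clock so the reclaim path is easy to follow. The `LeaseTable` class, its TTL, and the orchestrator names are hypothetical; in a database-backed engine the check-and-set would typically be a single conditional update rather than Python code.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    owner: str
    expires_at: float


class LeaseTable:
    """In-memory stand-in for a database table of workflow leases (hypothetical)."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._leases = {}

    def try_acquire(self, workflow_id, owner, now):
        """Acquire or renew a lease; fail if another owner holds an unexpired one."""
        lease = self._leases.get(workflow_id)
        if lease is not None and lease.owner != owner and lease.expires_at > now:
            return False
        self._leases[workflow_id] = Lease(owner, now + self.ttl)
        return True


table = LeaseTable(ttl=30)
a_first = table.try_acquire("wf-1", "orch-a", now=0)      # orch-a takes the lease
b_blocked = table.try_acquire("wf-1", "orch-b", now=10)   # still held by orch-a
a_renew = table.try_acquire("wf-1", "orch-a", now=20)     # heartbeat extends to t=50
# orch-a crashes here and stops renewing.
b_early = table.try_acquire("wf-1", "orch-b", now=45)     # lease not yet expired
b_reclaim = table.try_acquire("wf-1", "orch-b", now=51)   # expired, orch-b reclaims
```

The TTL trades recovery latency against tolerance for slow heartbeats: a shorter lease is reclaimed sooner after a crash but is easier to lose during a pause.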

Per-user keys & quotas

Per-tenant API keys with fixed-window quotas, enforced atomically across gateway replicas. Keys are stored only as hashes.
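
A fixed-window quota counts requests per key within aligned time buckets and resets when the next bucket starts. The sketch below shows the arithmetic, with counters indexed by a hash of the key so the raw key is never stored. It runs in a single process with a plain dict; the class name, SHA-256 as the hash, and the limits are assumptions. Enforcing this across gateway replicas additionally needs an atomic increment in a shared store, which this example does not attempt.

```python
import hashlib


def hash_key(api_key: str) -> str:
    """Return the digest that is stored in place of the raw key."""
    return hashlib.sha256(api_key.encode()).hexdigest()


class FixedWindowQuota:
    """Single-process sketch of a fixed-window counter (hypothetical)."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self._counts = {}  # (key_hash, window_index) -> requests seen

    def allow(self, api_key, now):
        window = int(now // self.window_seconds)  # aligned bucket index
        slot = (hash_key(api_key), window)
        count = self._counts.get(slot, 0)
        if count >= self.limit:
            return False
        self._counts[slot] = count + 1
        return True


quota = FixedWindowQuota(limit=2, window_seconds=60)
decisions = [
    quota.allow("demo-key", now=0),    # 1st request in window 0
    quota.allow("demo-key", now=10),   # 2nd request in window 0
    quota.allow("demo-key", now=59),   # over the limit for window 0
    quota.allow("demo-key", now=60),   # window 1 starts, counter resets
]
```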

End-to-end tracing

One trace-id threads from the gateway through the orchestrator to every sidecar — and is persisted so it survives a crash-recovery handoff.

Self-hostable

Run it on your own hardware: a Docker Compose control plane plus your GPU hosts. Object storage is any S3-compatible store — Cloudflare R2 or local MinIO.

For developers

One POST, a structure back

Describe the pipeline as JSON, send it to the gateway, and stream the result. The same workflow runs against the mock pipeline with no GPU — so you can integrate before you provision one.

→ Typed clients generated from the OpenAPI spec
→ Bearer-auth at the edge; the gateway stays stateless
→ Poll, stream (SSE), retry, and cancel from the same API
→ Run the whole pipeline in mock mode for local dev

submit-workflow.sh

# Submit a design → fold workflow
curl -X POST https://api.your-foldforge/v1/workflows \
  -H "Authorization: Bearer $FOLDFORGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "design-then-fold",
    "steps": [
      { "id": "design", "tool": "rfdiffusion",
        "params": { "contigs": "100-100", "num_designs": 3 } },
      { "id": "seqs",   "tool": "proteinmpnn",
        "depends_on": ["design"],
        "params": { "num_sequences": 5 } },
      { "id": "fold",   "tool": "af2",
        "depends_on": ["seqs"],
        "params": { "msa_cache_policy": "USE_CACHE" } }
    ]
  }'

# Stream live progress ($ID is the workflow id returned by the submit call above)
curl -N "https://api.your-foldforge/v1/workflows/$ID/events"
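
The same submission can be sketched in Python using only the standard library. The example builds the request without sending it, so it runs offline. The path, headers, and JSON body are taken from the curl call above; `build_submit_request` is a hypothetical helper, `test-key` is a dummy value, and the response shape is not specified on this page, so it is not parsed here.

```python
import json
import urllib.request


def build_submit_request(base_url, api_key, workflow):
    """Build (but do not send) the POST that submits a workflow."""
    return urllib.request.Request(
        url=f"{base_url}/v1/workflows",
        data=json.dumps(workflow).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


workflow = {
    "name": "design-then-fold",
    "steps": [
        {"id": "design", "tool": "rfdiffusion",
         "params": {"contigs": "100-100", "num_designs": 3}},
        {"id": "seqs", "tool": "proteinmpnn", "depends_on": ["design"],
         "params": {"num_sequences": 5}},
        {"id": "fold", "tool": "af2", "depends_on": ["seqs"],
         "params": {"msa_cache_policy": "USE_CACHE"}},
    ],
}

req = build_submit_request("https://api.your-foldforge", "test-key", workflow)
```

Passing `req` to `urllib.request.urlopen` would send it; pointing `base_url` at a gateway running in mock mode is one way to exercise this without a GPU.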

Architecture

Stateless edge, durable core, GPU leaves

A thin HTTP gateway fronts a database-backed workflow engine, which dispatches each step to a model sidecar. Every hop speaks the same contract.

edge

Gateway

HTTP/JSON, OpenAPI-validated, bearer auth, metrics & readiness. Stateless — scales horizontally.

core

Orchestrator

DAG validation, Postgres-backed state, leased execution, retries, and the SSE event stream.

compute

GPU sidecars

One per model — RFdiffusion, ProteinMPNN, Boltz, AF2 — behind a uniform streaming Run RPC.

storage

Object store + DB

S3-compatible artifacts (R2 / MinIO) by reference; Postgres as the source of truth.

Build against the contract today

The schemas are open. Read the proto, generate a client, and run the pipeline in mock mode before you ever touch a GPU.

Explore the proto ↗

Early access is invite-only while we scale GPU capacity · we'll email you when a slot opens
