Self-host

Run it on your own hardware

FoldForge ships as a Docker Compose control plane — gateway, orchestrator, Postgres and Caddy (auto-HTTPS) — that you point at your own GPU hosts and any S3-compatible object store. No managed dependency you don't control.

Topology

A thin always-on control plane runs anywhere Docker does; the expensive GPU work runs on hosts you rent or own and wire in by endpoint.

edge
Caddy → Gateway: auto-TLS reverse proxy in front of the HTTP/JSON API gateway.
core
Orchestrator + Postgres: the DAG engine and its durable state; workflow truth lives in Postgres on a volume.
compute
GPU sidecars: RFdiffusion / ProteinMPNN / Boltz / AF2 on rented GPU hosts, wired in via SIDECAR_* endpoints.
storage
S3-compatible object store: artifact blobs plus the AF2 MSA cache; Cloudflare R2 in production, MinIO locally.
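Because the compute tier is attached purely through SIDECAR_* settings, a missing endpoint is an easy mistake to catch before anything starts. The following is a minimal sketch, not part of the reference deployment: it assumes the endpoints are stored as KEY=value lines in an env file, and the four key names are expanded from the SIDECAR_* shorthand used on this page, so treat them as assumptions and adjust them to match your env example file.

```shell
#!/usr/bin/env bash
# Pre-flight sketch: list expected keys that are missing or empty in an env file.
# The key names below are assumed from the SIDECAR_* shorthand, not confirmed.

REQUIRED_KEYS="SIDECAR_RFDIFFUSION SIDECAR_PROTEINMPNN SIDECAR_BOLTZ SIDECAR_AF2"

# missing_keys FILE
# Prints one line per required key that has no non-empty KEY=value line in FILE.
# Commented-out lines do not count; a quoted empty string ("") would still pass.
missing_keys() {
  local env_file="$1" key
  for key in $REQUIRED_KEYS; do
    # Anchored match: the key at the start of a line, then at least one character.
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "$key"
    fi
  done
}

# Usage (the path is the env file described under "Configure the stack"):
#   missing_keys compose/.env
```

An empty result means every listed key has a value; it says nothing about whether the host behind the endpoint is reachable.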

What you'll need

A control-plane host

Any Linux box with Docker + Docker Compose. The MVP reference uses a single Hetzner Cloud node.

GPU host(s)

One or more machines with NVIDIA GPUs to run the model sidecars. Rent or supply them separately; they are not provisioned for you.

S3-compatible storage

A bucket for artifacts and one for the MSA cache. Cloudflare R2, AWS S3, or self-hosted MinIO.

Postgres

Runs as a container on a volume in the MVP; swap for a managed DB when scale demands it.

Bring up the stack

The reference deployment lives in the infra repo: Terraform to provision, Docker Compose to run.

  1. Provision (optional, Terraform)

    The reference Terraform provisions a Hetzner node, a Postgres volume, and Cloudflare R2 buckets. Skip it if you're bringing your own host and storage.

    terraform
    cd terraform
    cp terraform.tfvars.example terraform.tfvars   # tokens + SSH keys
    cp backend.hcl.example backend.hcl             # R2 state creds
    ../scripts/bootstrap-state.sh                  # one-time: state bucket
    terraform init -backend-config=backend.hcl
    terraform apply
  2. Configure the stack

    Copy the example env file and fill in secrets, image tags, and your SIDECAR_* GPU-host endpoints + object-store credentials.

    compose/.env
    cp compose/.env.example compose/.env
    # set: GATEWAY_TAG, ORCHESTRATOR_TAG, DB creds,
    #      SIDECAR_RFDIFFUSION / _PROTEINMPNN / _BOLTZ / _AF2,
    #      R2 / S3 endpoint + bucket + keys
  3. Deploy & wait for health

    The deploy script pulls the pinned images, brings the stack up, and waits on the gateway's health endpoint. Bump the image tags to roll forward.

    deploy
    ./scripts/deploy.sh
    # pulls pinned images, `docker compose up`, waits on /v1/healthz
  4. Verify it's ready

    Readiness pings the orchestrator and DB end to end — a 200 means the whole path is live.

    verify
    curl https://your-gateway/v1/readyz
    # {"orchestrator":"ok","status":"ready"}

    No GPU yet? The pipeline runs end to end in mock mode, so you can stand up the control plane and integrate against the API before wiring real accelerators.
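The one-shot curl in step 4 can fail simply because the stack is still starting. A small polling helper is one way to handle that in CI or a post-deploy hook. This is a hedged sketch rather than what scripts/deploy.sh does internally: the function name, retry count, and delay are illustrative, and your-gateway is the same placeholder host used above.

```shell
#!/usr/bin/env bash
# Sketch: poll a readiness URL until curl reports success, or give up.

# wait_for_ready URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_ready() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP status >= 400;
    # -sS keeps it quiet but still prints errors; the body is discarded.
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}

# Usage:
#   wait_for_ready https://your-gateway/v1/readyz 30 2
```

The helper only looks at the HTTP status, which matches the "a 200 means the whole path is live" contract; it does not parse the JSON body.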

Operate

An optional distributed-tracing overlay (compose/docker-compose.trace.yml) adds an OpenTelemetry collector and Jaeger; the services already propagate a W3C traceparent header end to end. Prometheus scrapes /metrics, and /v1/healthz and /v1/readyz are the endpoints to point your load balancer's health checks at.
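Compose overlays are applied by passing more than one -f file, with later files extending and overriding earlier ones. The sketch below shows how the tracing overlay might be layered on that way. Only the overlay path comes from this page; the base file name compose/docker-compose.yml and the helper itself are assumptions, so check the runbook for the exact invocation.

```shell
#!/usr/bin/env bash
# Sketch: run docker compose with the tracing overlay layered on the base file.
# BASE_FILE is an assumed name; override it if your base file differs.

BASE_FILE="${BASE_FILE:-compose/docker-compose.yml}"
TRACE_FILE="compose/docker-compose.trace.yml"

# trace_compose ARGS...
# Passes ARGS to docker compose with both files selected.
# If DRY_RUN is set and non-empty, the command is printed instead of executed.
trace_compose() {
  ${DRY_RUN:+echo} docker compose -f "$BASE_FILE" -f "$TRACE_FILE" "$@"
}

# Usage:
#   DRY_RUN=1 trace_compose up -d    # print the command for review
#   trace_compose up -d              # start the stack with tracing enabled
#   trace_compose ps                 # list services, including the overlay's
```

Using the same pair of -f files for every later command (ps, logs, down) keeps Compose aware of the overlay's services.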

Full runbooks live in the infra repo (provisioning, deploy, and the tracing overlay). The API reference covers the request side.