A ready-to-run example is available here!
In warm-pool deployments, server pods are booted before a user is matched to one. The pod starts in a dormant state: stateless services (tool preload, VSCode, etc.) come up normally, but all /api/* routes return 503 until POST /api/init delivers the per-user runtime configuration (credentials, workspace paths, session keys). This pattern reduces cold-start latency for users while keeping per-user data out of the image.

State Machine

dormant ──(POST /api/init)──▶ initializing ──▶ ready
   ▲                               │
   └───────────(on error)──────────┘
| State | /health, /ready | GET /api/init | POST /api/init | /api/* |
|---|---|---|---|---|
| dormant | 200 | 200 (state: dormant) | 200 → starts init | 503 |
| initializing | 200 | 200 (state: initializing) | 400 (already running) | 503 |
| ready | 200 | 200 (state: ready) | 400 (already done) | live |
When deferred_init is false (the default), the /api/init endpoints return 404 and all /api/* routes are live immediately.
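The gating rules in the table can be written down as a small lookup function. This is only an illustrative model of the documented behavior, not the server's implementation; `InitState` and `status_for` are names invented for this sketch, and 200 stands in for "live" on /api/* routes (the real handler decides the actual status).

```python
from enum import Enum


class InitState(str, Enum):
    DORMANT = "dormant"
    INITIALIZING = "initializing"
    READY = "ready"


def status_for(state: InitState, method: str, path: str) -> int:
    """Return the status code the table gives for a request made in `state`."""
    if path in ("/health", "/ready"):
        return 200  # health endpoints answer in every state
    if path == "/api/init":
        if method == "GET":
            return 200  # state is always readable
        if method == "POST":
            # Only a dormant server accepts an init request.
            return 200 if state is InitState.DORMANT else 400
        raise ValueError(f"{method} /api/init is not covered by the table")
    if path.startswith("/api/"):
        # Gated routes: 503 until ready, then "live" (modelled here as 200).
        return 200 if state is InitState.READY else 503
    raise ValueError(f"{path} is not covered by the table")
```

Reading the function top to bottom mirrors the table's columns left to right: health endpoints first, then the two /api/init verbs, then everything else under /api/.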

Enabling Dormant Mode

Set the OH_DEFERRED_INIT environment variable when starting the server:
OH_DEFERRED_INIT=true OH_SECRET_KEY=<bootstrap-secret> python -m openhands.agent_server
The OH_SECRET_KEY value is used to authenticate POST /api/init via the X-Init-API-Key request header. The orchestrator already holds this key for encryption purposes, so no additional secret distribution is required.
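An orchestrator that launches the server from Python might prepare the environment and the matching init header as follows. This is a minimal sketch: `dormant_env` and `init_headers` are hypothetical helper names, and only the two environment variables and the header name come from this page.

```python
import os
from typing import Dict, Optional


def dormant_env(
    bootstrap_secret: str, base_env: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with dormant mode switched on."""
    env = dict(os.environ if base_env is None else base_env)
    env["OH_DEFERRED_INIT"] = "true"
    env["OH_SECRET_KEY"] = bootstrap_secret
    return env


def init_headers(bootstrap_secret: str) -> Dict[str, str]:
    """Header that authenticates POST /api/init with the same secret."""
    return {"X-Init-API-Key": bootstrap_secret}
```

The returned dictionary can be passed as `env=` to `subprocess.Popen([sys.executable, "-m", "openhands.agent_server"], env=...)`; keeping both helpers next to each other makes it harder for the launch secret and the header secret to drift apart.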

Checking the Init State

GET /api/init is unauthenticated and returns the current state at any time:
curl http://localhost:8000/api/init
# {"state":"dormant","error":null}
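Because the endpoint is unauthenticated, an orchestrator can poll it to confirm that a freshly booted pod has reached the dormant state before adding it to the pool. The helper below is a sketch using only the standard library; `fetch_init_state` and `wait_for_state` are hypothetical names, and the timeout and interval defaults are arbitrary assumptions.

```python
import json
import time
import urllib.request
from typing import Callable, Optional


def fetch_init_state(base_url: str = "http://localhost:8000") -> dict:
    """GET /api/init and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/api/init", timeout=5) as resp:
        return json.load(resp)


def wait_for_state(
    target: str,
    fetch: Callable[[], dict] = fetch_init_state,
    timeout: float = 30.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the reported state equals `target` or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    last: Optional[dict] = None
    while True:
        try:
            last = fetch()
        except OSError:
            last = None  # server is not accepting connections yet
        if last is not None and last.get("state") == target:
            return last
        if time.monotonic() >= deadline:
            raise TimeoutError(f"state {target!r} not reached; last response: {last!r}")
        sleep(interval)
```

Calling `wait_for_state("dormant")` after the process starts blocks until the pod is eligible for the pool; the `fetch` and `sleep` parameters exist so the loop can be exercised without a running server.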

Activating the Server

Send POST /api/init with the X-Init-API-Key header set to the bootstrap secret. The body is an InitRequest and all fields are optional — only the values you provide override the dormant configuration:
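import os

import httpx

# Placeholder inputs for illustration; in practice the orchestrator supplies these.
BOOTSTRAP_SECRET_KEY = os.environ["OH_SECRET_KEY"]  # same value the server was started with
user_api_key = os.environ["LLM_API_KEY"]
user_session_key = "<per-user-session-key>"

client = httpx.Client(base_url="http://localhost:8000")

resp = client.post(
    "/api/init",
    json={
        # Credentials that should not be baked into the warm image arrive here.
        "env": {"LLM_API_KEY": user_api_key},
        # Point at the user's mounted workspace.
        "conversations_path": "/mnt/user-workspace/conversations",
        # Lock down the API to this user's session key.
        "session_api_keys": [user_session_key],
    },
    headers={"X-Init-API-Key": BOOTSTRAP_SECRET_KEY},
)
# Fail loudly on a 4xx/5xx response before inspecting the body.
resp.raise_for_status()
assert resp.json()["state"] == "ready"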
InitRequest fields:

| Field | Type | Description |
|---|---|---|
| session_api_keys | list[str] | Per-user API keys for subsequent /api/* calls |
| secret_key | str | Encryption secret (defaults to first session_api_key) |
| conversations_path | path | Where conversations are persisted |
| bash_events_dir | path | Where bash events are persisted |
| env | dict[str, str] | Process env vars set before services start (e.g. credentials) |
| webhooks | list | Per-user webhooks for event streaming |
| web_url | str | External server URL for root-path calculation |
| allow_cors_origins | list[str] | CORS origins added to the localhost allowlist |
| max_concurrent_runs | int | Override conversation-step concurrency limit |
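Since every field is optional, a client can keep a typed mirror of the request and send only what it has set. The dataclass below is a client-side sketch, not the server's InitRequest class: `InitPayload` and `to_json` are hypothetical names, paths are represented as strings, and the element type of `webhooks` is left open because this page does not specify it.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class InitPayload:
    """Client-side mirror of the InitRequest fields listed above."""

    session_api_keys: Optional[List[str]] = None
    secret_key: Optional[str] = None
    conversations_path: Optional[str] = None
    bash_events_dir: Optional[str] = None
    env: Optional[Dict[str, str]] = None
    webhooks: Optional[list] = None  # element schema not documented here
    web_url: Optional[str] = None
    allow_cors_origins: Optional[List[str]] = None
    max_concurrent_runs: Optional[int] = None

    def to_json(self) -> dict:
        """Drop unset fields so the dormant configuration is left untouched."""
        return {key: value for key, value in asdict(self).items() if value is not None}
```

The result of `to_json()` can be passed directly as the `json=` argument of the POST /api/init call; omitting unset fields, rather than sending nulls, matches the rule that only provided values override the dormant configuration.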

Error Handling

If initialization fails, the state rolls back to dormant and the error is stored in the error field of the GET /api/init response. The orchestrator can then retry POST /api/init:
curl http://localhost:8000/api/init
# {"state":"dormant","error":"ConversationService failed to start: ..."}
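The retry described above could look like the loop below. It is a sketch under two assumptions: the POST call returns only after initialization has finished or rolled back, and the caller's `post_init` does not raise on a failed attempt. `activate_with_retry` and its parameters are hypothetical names; the HTTP calls are injected as callables so the policy stays independent of the client library.

```python
import time
from typing import Callable


def activate_with_retry(
    post_init: Callable[[], object],
    get_state: Callable[[], dict],
    attempts: int = 3,
    backoff: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """POST /api/init up to `attempts` times, checking GET /api/init after each."""
    last_error = None
    for attempt in range(1, attempts + 1):
        post_init()
        status = get_state()
        if status["state"] == "ready":
            return status
        # Rolled back to dormant: remember why, then back off exponentially.
        last_error = status.get("error")
        if attempt < attempts:
            sleep(backoff * 2 ** (attempt - 1))
    raise RuntimeError(f"init failed after {attempts} attempts: {last_error}")
```

Raising with the last stored error lets the orchestrator decide whether to recycle the pod or surface the failure, instead of retrying indefinitely against a server that cannot start.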

Ready-to-run Example

This example is available on GitHub: examples/02_remote_agent_server/16_deferred_init.py
This example walks through the full warm-pool lifecycle: starting a dormant server, verifying the 503 gate, activating it via POST /api/init, and running a conversation on the ready server.
examples/02_remote_agent_server/16_deferred_init.py
<placeholder — auto-synced from agent-sdk>
You can run the example code as-is.
The model name should follow the LiteLLM convention: provider/model_name (e.g., anthropic/claude-sonnet-4-5-20250929, openai/gpt-4o). The LLM_API_KEY should be the API key for your chosen provider.
ChatGPT Plus/Pro subscribers: You can use LLM.subscription_login() to authenticate with your ChatGPT account and access Codex models without consuming API credits. See the LLM Subscriptions guide for details.

Next Steps