A ready-to-run example is available here!
In warm-pool deployments, server pods are booted before a user is matched to one. The pod starts in a dormant state: stateless services (tool preload, VSCode, etc.) come up normally, but all
/api/* routes return 503 until
POST /api/init delivers the per-user runtime configuration (credentials,
workspace paths, session keys).
This pattern reduces cold-start latency for users while keeping per-user data
out of the image.
State Machine
| State | /health, /ready | GET /api/init | POST /api/init | /api/* |
|---|---|---|---|---|
| dormant | 200 | 200 state: dormant | 200 → starts init | 503 |
| initializing | 200 | 200 state: initializing | 400 (already running) | 503 |
| ready | 200 | 200 state: ready | 400 (already done) | live |
When deferred_init is false (the default), the /api/init endpoints return
404 and all /api/* routes are live immediately.
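The table above can be read as a small lookup function. The sketch below is only an illustration of that table, not the server's implementation: `InitState`, `expected_status`, and the `/api/conversations` path used in the usage note are hypothetical names, and 200 stands in for "live" on a ready server, where real routes return whatever their handlers produce. It models deferred mode only, so the 404 behavior with deferred_init set to false is not covered.

```python
from enum import Enum


class InitState(str, Enum):
    """The three states from the table (names chosen for this sketch)."""

    DORMANT = "dormant"
    INITIALIZING = "initializing"
    READY = "ready"


def expected_status(state: InitState, method: str, path: str) -> int:
    """Return the status code the table gives for a request in deferred mode."""
    # Health and readiness probes answer 200 in every state.
    if path in ("/health", "/ready"):
        return 200
    if path == "/api/init":
        if method == "GET":
            # The state report is always available.
            return 200
        if method == "POST":
            # Only a dormant server accepts activation.
            return 200 if state is InitState.DORMANT else 400
        raise ValueError(f"{method} {path} is not covered by the table")
    if path.startswith("/api/"):
        # All other API routes are gated until the server is ready.
        return 200 if state is InitState.READY else 503
    raise ValueError(f"{path} is not covered by the table")
```

For example, `expected_status(InitState.DORMANT, "GET", "/api/conversations")` yields 503, while the same call with `InitState.READY` yields 200.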
Enabling Dormant Mode
Set the OH_DEFERRED_INIT environment variable when starting the server.
The OH_SECRET_KEY value is used to authenticate POST /api/init via the
X-Init-API-Key request header. The orchestrator already holds this key for
encryption purposes, so no additional secret distribution is required.
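As a rough sketch of how an orchestrator might prepare the environment for a dormant pod, the helper below builds the variables to pass to the server process. The helper name `dormant_env` is hypothetical, and the value `"true"` for OH_DEFERRED_INIT is an assumption about the accepted spelling; check the server's configuration reference for the exact value.

```python
import os
from typing import Dict, Optional


def dormant_env(
    secret_key: str, base_env: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with dormant mode enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: "true" is the truthy spelling the server accepts.
    env["OH_DEFERRED_INIT"] = "true"
    # The same secret later authenticates POST /api/init.
    env["OH_SECRET_KEY"] = secret_key
    return env
```

The returned mapping can be handed to whatever launches the server, for instance as the `env` argument of `subprocess.Popen`.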
Checking the Init State
GET /api/init is unauthenticated and returns the current state at any time.
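A minimal polling helper might look like the following. It assumes the response body is a JSON object with a `state` field (and an `error` field, described under Error Handling); the function names and the `base_url` parameter are placeholders for this sketch.

```python
import json
from urllib.request import urlopen


def parse_init_status(body: bytes) -> dict:
    """Decode a GET /api/init response body (assumed to be a JSON object)."""
    status = json.loads(body.decode("utf-8"))
    if "state" not in status:
        raise ValueError("response has no 'state' field")
    return status


def get_init_status(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch the init state; no auth header is sent because none is required."""
    url = f"{base_url.rstrip('/')}/api/init"
    with urlopen(url, timeout=timeout) as resp:
        return parse_init_status(resp.read())
```

Calling `get_init_status("http://localhost:8000")["state"]` would then return one of dormant, initializing, or ready, provided a server is listening at that (assumed) address.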
Activating the Server
Send POST /api/init with the X-Init-API-Key header set to the bootstrap
secret. The body is an InitRequest and all fields are optional; only the
values you provide override the dormant configuration.
InitRequest fields:
| Field | Type | Description |
|---|---|---|
| session_api_keys | list[str] | Per-user API keys for subsequent /api/* calls |
| secret_key | str | Encryption secret (defaults to first session_api_key) |
| conversations_path | path | Where conversations are persisted |
| bash_events_dir | path | Where bash events are persisted |
| env | dict[str, str] | Process env vars set before services start (e.g. credentials) |
| webhooks | list | Per-user webhooks for event streaming |
| web_url | str | External server URL for root-path calculation |
| allow_cors_origins | list[str] | CORS origins added to the localhost allowlist |
| max_concurrent_runs | int | Override conversation-step concurrency limit |
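To illustrate the activation call, the sketch below builds the POST request with the standard library. The helper name `build_init_request` and the example URL and secrets are placeholders, and it assumes the body is sent as JSON. Fields passed as `None` are dropped so that omitted values keep the dormant configuration.

```python
import json
from urllib.request import Request


def build_init_request(base_url: str, bootstrap_secret: str, **fields) -> Request:
    """Build (but do not send) a POST /api/init request."""
    # Omit unset fields: only provided values override the dormant config.
    body = {key: value for key, value in fields.items() if value is not None}
    return Request(
        f"{base_url.rstrip('/')}/api/init",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Init-API-Key": bootstrap_secret,
        },
        method="POST",
    )
```

A request built with `build_init_request("http://localhost:8000", "bootstrap-secret", session_api_keys=["user-key"])` can be sent with `urllib.request.urlopen`; the table above gives 200 as the response from a dormant server.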
Error Handling
If initialization fails, the state rolls back to dormant and the error is
stored in the error field of the GET /api/init response. The orchestrator can then
retry POST /api/init.
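One way an orchestrator could structure the retry is sketched below. The two callables are injected so the loop stays independent of any HTTP client: `post_init` is assumed to send POST /api/init, and `get_status` to return the decoded GET /api/init response as a dict with `state` and `error` keys. The function name, the attempt and poll limits, and the assumption that the state has already left dormant by the time a successful POST returns are all choices made for this sketch.

```python
import time
from typing import Callable, Optional, Tuple


def activate_with_retry(
    post_init: Callable[[], object],
    get_status: Callable[[], dict],
    attempts: int = 3,
    max_polls: int = 60,
    poll_interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, Optional[str]]:
    """POST /api/init, poll the state, and retry after a rollback.

    Returns (True, None) once the server is ready, otherwise
    (False, last_error) after all attempts are used up.
    """
    last_error: Optional[str] = None
    for _ in range(attempts):
        post_init()
        for _ in range(max_polls):
            status = get_status()
            if status["state"] == "ready":
                return True, None
            if status["state"] == "dormant":
                # Init failed and rolled back: keep the error, retry the POST.
                last_error = status.get("error")
                break
            sleep(poll_interval)  # still initializing
    return False, last_error
```

Passing a no-op `sleep` makes the loop easy to exercise in tests without waiting.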
Ready-to-run Example
This example is available on GitHub: examples/02_remote_agent_server/16_deferred_init.py
It covers the 503 gate on a dormant server, activating the server via POST /api/init, and
running a conversation on the ready server.
The model name should follow the LiteLLM convention:
provider/model_name (e.g., anthropic/claude-sonnet-4-5-20250929, openai/gpt-4o).
The LLM_API_KEY should be the API key for your chosen provider.
Next Steps
- Local Agent Server — Run a server in the same process
- Docker Sandbox — Isolated Docker-based deployment
- Settings & Secrets API — Manage per-user secrets securely
- Agent Server Overview — Architecture and deployment options

