This section is for users who want to connect OpenHands to different LLMs.
OpenHands now delegates all LLM orchestration to the Agent SDK. The guidance on this
page focuses on how the OpenHands interfaces surface those capabilities. When in doubt, refer to the SDK documentation
for the canonical list of supported parameters.
Model Recommendations
Model quality for coding agents changes quickly. These recommendations are based on current OpenHands Index results where available. The linked openhands-index-results repository contains the full scores and trajectories for each run. Use the strongest model you can afford for long-running or high-stakes tasks. Use lower-cost profiles for routine edits, then switch back to a stronger model for planning, debugging, and review.
Best Cloud Models by Family
| Family | Recommended Model | Model String | OpenHands Index Average |
|---|---|---|---|
| Claude | claude-opus-4-8 | Not yet listed | 71.9 |
| GPT | GPT-5.5 | openai/gpt-5.5 | 65.9 |
| Gemini | Gemini-3.5-Flash | Not yet listed | 62.6 |
Strong Open / Open-Weight Models
These open or open-weight models have good OpenHands Index scores or are recommended for local OpenHands setups:
| Model | Suggested Model String | OpenHands Index Average |
|---|---|---|
| GLM-5.1 | openrouter/z-ai/glm-5.1 | 58.2 |
| MiniMax-M3 | openrouter/minimax/minimax-m3 | 57.2 |
| Kimi-K2.6 | openrouter/moonshotai/kimi-k2.6 | 57.1 |
| GLM-5 | openrouter/z-ai/glm-5 | 49.4 |
| Kimi-K2.5 | openrouter/moonshotai/kimi-k2.5 | 49.2 |
Hosted model strings can vary by provider and region. If a model string is not accepted, check the provider console and
the LiteLLM provider list, then use the provider-specific model ID shown there.
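As a rough illustration of how the model strings in the tables above are structured, the sketch below splits a string such as openrouter/z-ai/glm-5.1 into a provider prefix and a provider-specific model ID. It assumes the LiteLLM-style convention that the first path segment names the provider. The helper name split_model_string is hypothetical and is not part of OpenHands or the Agent SDK.

```python
from typing import Optional, Tuple


def split_model_string(model: str) -> Tuple[Optional[str], str]:
    """Split 'provider/model-id' into (provider, model_id).

    Assumption: the first '/'-separated segment is the provider prefix.
    Strings without a '/' are returned with the provider set to None.
    """
    provider, sep, model_id = model.partition("/")
    if not sep:
        return None, model
    return provider, model_id


if __name__ == "__main__":
    for s in ("openai/gpt-5.5", "openrouter/z-ai/glm-5.1"):
        print(s, "->", split_model_string(s))
```

When a provider rejects a model string, comparing the second element against the model ID shown in the provider console is a quick way to spot a mismatch.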
Local / Self-Hosted Models
For local and self-hosted usage, start with Qwen3.6-35B-A3B. See the local LLM guide for LM Studio, Ollama, SGLang, and vLLM setup examples.
Known Issues
Open-weight and local models still vary widely in tool-use reliability. If you see long wait times, poor responses, or
errors about malformed JSON, try a stronger model, increase the context window, or switch to a frontier cloud model for
that task.
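To make the "malformed JSON" symptom concrete, here is a minimal sketch of the kind of check that fails when a model emits broken tool-call arguments. It uses only the Python standard library. The function name parse_tool_arguments is hypothetical, and this is not how OpenHands or the Agent SDK validates tool calls internally.

```python
import json
from typing import Any, Dict, Optional


def parse_tool_arguments(raw: str) -> Optional[Dict[str, Any]]:
    """Return the decoded arguments, or None if `raw` is not a JSON object."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        # Truncated or otherwise malformed output from the model.
        return None
    # Tool arguments are assumed here to be a JSON object (a dict).
    return value if isinstance(value, dict) else None


if __name__ == "__main__":
    print(parse_tool_arguments('{"path": "main.py"}'))  # well-formed
    print(parse_tool_arguments('{"path": '))            # truncated
```

A model that frequently produces the second kind of output is a candidate for replacement with a stronger one.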
LLM Configuration
The following can be set in the OpenHands UI through Settings. Each option is serialized into the LLM.load_from_env() schema before being passed to the Agent SDK:
- LLM Provider
- LLM Model
- API Key
- Base URL (through Advanced settings)
Other options can be set through environment variables (or config.toml) so the SDK picks them up during startup:
- LLM_API_VERSION
- LLM_EMBEDDING_MODEL
- LLM_EMBEDDING_DEPLOYMENT_NAME
- LLM_DROP_PARAMS
- LLM_DISABLE_VISION
- LLM_CACHING_PROMPT
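Before starting OpenHands, it can help to confirm which of these variables are actually set in the current environment. The sketch below does that with the standard library only. The helper collect_llm_env is hypothetical; it only inspects the environment and makes no assumption about how the SDK parses or types the values.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names taken from the list above.
LLM_ENV_VARS = (
    "LLM_API_VERSION",
    "LLM_EMBEDDING_MODEL",
    "LLM_EMBEDDING_DEPLOYMENT_NAME",
    "LLM_DROP_PARAMS",
    "LLM_DISABLE_VISION",
    "LLM_CACHING_PROMPT",
)


def collect_llm_env(environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the subset of LLM_* variables present in `environ`."""
    env = os.environ if environ is None else environ
    return {name: env[name] for name in LLM_ENV_VARS if name in env}


if __name__ == "__main__":
    found = collect_llm_env()
    for name in LLM_ENV_VARS:
        print(f"{name}: {found.get(name, '<not set>')}")
```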
LLM Provider Guides
We have a few guides for running OpenHands with specific model providers:
- AWS Bedrock
- Azure
- Groq
- Local LLMs with SGLang or vLLM
- LiteLLM Proxy
- Moonshot AI
- OpenAI
- OpenHands
- OpenRouter
Model Customization
LLM providers have specific settings that can be customized to optimize their performance with OpenHands, such as:
- Custom Tokenizers: For specialized models, you can add a suitable tokenizer.
- Native Tool Calling: Toggle native function/tool calling capabilities.
API retries and rate limits
LLM providers typically have rate limits, sometimes very low, and may require retries. OpenHands will automatically retry requests if it receives a Rate Limit Error (429 error code). You can customize these options as you need for the provider you’re using. Check their documentation, and set the following environment variables to control the number of retries and the time between retries:
- LLM_NUM_RETRIES (Default of 4 times)
- LLM_RETRY_MIN_WAIT (Default of 5 seconds)
- LLM_RETRY_MAX_WAIT (Default of 30 seconds)
- LLM_RETRY_MULTIPLIER (Default of 2)
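To get a feel for how these four settings interact, the sketch below computes a wait schedule under one common exponential backoff formula: multiplier * 2^(attempt - 1), clamped to the range [min wait, max wait]. That formula is an assumption made for illustration, and the exact schedule used by the SDK may differ. The function names are hypothetical, and parsing the wait values as floats is also an assumption.

```python
import os
from typing import List, Mapping, Optional


def backoff_schedule(
    num_retries: int, min_wait: float, max_wait: float, multiplier: float
) -> List[float]:
    """Wait (in seconds) before each retry, assuming clamped exponential backoff."""
    waits = []
    for attempt in range(1, num_retries + 1):
        raw = multiplier * 2 ** (attempt - 1)
        # Clamp the raw value into [min_wait, max_wait].
        waits.append(max(min_wait, min(raw, max_wait)))
    return waits


def schedule_from_env(environ: Optional[Mapping[str, str]] = None) -> List[float]:
    """Build the schedule from the LLM_* variables, using the documented defaults."""
    env = os.environ if environ is None else environ
    return backoff_schedule(
        int(env.get("LLM_NUM_RETRIES", 4)),
        float(env.get("LLM_RETRY_MIN_WAIT", 5)),
        float(env.get("LLM_RETRY_MAX_WAIT", 30)),
        float(env.get("LLM_RETRY_MULTIPLIER", 2)),
    )


if __name__ == "__main__":
    print(schedule_from_env())
```

Under this assumed formula, the default values give waits of 5, 5, 8, and 16 seconds. Raising LLM_RETRY_MIN_WAIT is the most direct way to slow down the early retries for providers with strict rate limits.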
The same options can also be set in the config.toml file.

