class ImageContent
Bases: BaseContent
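A minimal usage sketch; the import path is an assumption, and the constructor fields follow the properties listed below.

```python
# Assumed import path; adjust to your installation.
from openhands.sdk.llm import ImageContent

# An image content block referencing one or more image URLs.
image = ImageContent(image_urls=["https://example.com/screenshot.png"])

# Convert to the LLM API wire format.
payload = image.to_llm_dict()
```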
Properties
- cache_prompt: bool
- image_urls: list[str]
- model_config: ClassVar[ConfigDict]. Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- type: Literal['image']
Methods
to_llm_dict()
Convert to LLM API format.

class LLM

Bases: BaseModel, RetryMixin, NonNativeToolCallingMixin
Language model interface for OpenHands agents.
The LLM class provides a unified interface for interacting with various language models through the litellm library. It handles model configuration, API authentication, retry logic, and tool calling capabilities.
Example
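A minimal construction-and-call sketch; the import paths and the completion() argument shape are assumptions based on this section.

```python
# Assumed import paths; adjust to your installation.
from pydantic import SecretStr
from openhands.sdk.llm import LLM, Message, TextContent

llm = LLM(
    model="anthropic/claude-sonnet-4-20250514",  # any litellm model string
    api_key=SecretStr("your-api-key"),
    usage_id="demo-llm",
)

# completion() is assumed here to take a list of Message objects;
# it returns an LLMResponse (documented below).
response = llm.completion(
    [Message(role="user", content=[TextContent(text="Hello!")])]
)
print(response.message)
```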
Properties
- OVERRIDE_ON_SERIALIZE: tuple[str, ...]
- api_key: str | SecretStr | None
- api_version: str | None
- aws_access_key_id: str | SecretStr | None
- aws_region_name: str | None
- aws_secret_access_key: str | SecretStr | None
- base_url: str | None
- caching_prompt: bool
- custom_llm_provider: str | None
- custom_tokenizer: str | None
- disable_stop_word: bool | None
- disable_vision: bool | None
- drop_params: bool
- enable_encrypted_reasoning: bool
- extended_thinking_budget: int | None
- extra_headers: dict[str, str] | None
- input_cost_per_token: float | None
- litellm_extra_body: dict[str, Any]
- log_completions: bool
- log_completions_folder: str
- max_input_tokens: int | None
- max_message_chars: int
- max_output_tokens: int | None
- metrics: Metrics. Get usage metrics for this LLM instance. Returns a Metrics object containing token usage, costs, and other statistics.
- model: str
- model_config: ClassVar[ConfigDict]. Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- model_info: dict | None. Returns the model info dictionary.
- modify_params: bool
- native_tool_calling: bool
- num_retries: int
- ollama_base_url: str | None
- openrouter_app_name: str
- openrouter_site_url: str
- output_cost_per_token: float | None
- reasoning_effort: Literal['low', 'medium', 'high', 'none'] | None
- reasoning_summary: Literal['auto', 'concise', 'detailed'] | None
- retry_listener: SkipJsonSchema[Callable[[int, int], None] | None]
- retry_max_wait: int
- retry_min_wait: int
- retry_multiplier: float
- safety_settings: list[dict[str, str]] | None
- seed: int | None
- service_id: str
- temperature: float | None
- timeout: int | None
- top_k: float | None
- top_p: float | None
- usage_id: str
Methods
completion()
Generate a completion from the language model. This is the method for getting responses from the model via the Completion API; it handles message formatting, tool calling, and response processing.
- Returns: LLMResponse containing the model's response and metadata.
- Raises: ValueError if streaming is requested (not supported).
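A sketch of extracting the reply text from the result; `llm` and `messages` are assumed from the construction example above.

```python
response = llm.completion(messages)  # returns an LLMResponse

# The assistant reply as an OpenHands Message; pull out its text parts.
reply = response.message
text = "".join(c.text for c in reply.content if hasattr(c, "text"))
print(text)
```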
format_messages_for_llm()
Formats Message objects for LLM consumption.

format_messages_for_responses()

Prepare (instructions, input[]) for the OpenAI Responses API.
- Skips prompt caching flags and string serializer concerns
- Uses Message.to_responses_value to get either instructions (system) or input items (others)
- Concatenates system instructions into a single instructions string
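A sketch of the two formatting paths; the return shapes are inferred from the descriptions above and should be treated as assumptions.

```python
# Chat Completions path: format Message objects for the provider.
chat_messages = llm.format_messages_for_llm(messages)

# Responses path: assumed to return (instructions, input_items)
# per the description above.
instructions, input_items = llm.format_messages_for_responses(messages)
```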
get_token_count()
is_caching_prompt_active()
Check if prompt caching is supported and enabled for the current model.
- Returns: True if prompt caching is supported and enabled for the given model.
- Return type: bool
classmethod load_from_env()
classmethod load_from_json()
model_post_init()
This function is meant to behave like a BaseModel method to initialise private attributes. It takes context as an argument since that's what pydantic-core passes when calling it.
- Parameters:
  - self: The BaseModel instance.
  - context: The context.
resolve_diff_from_deserialized()
Resolve differences between a deserialized LLM and the current instance. This is needed because fields like api_key are masked during serialization.

responses()

Alternative invocation path using the OpenAI Responses API via LiteLLM. Maps Message[] -> (instructions, input[]) and returns LLMResponse. Non-stream only for v1.

restore_metrics()
uses_responses_api()
Whether this model uses the OpenAI Responses API path.

vision_is_active()
class LLMRegistry
Bases: object
A minimal LLM registry for managing LLM instances by usage ID.
This registry provides a simple way to manage multiple LLM instances,
avoiding the need to recreate LLMs with the same configuration.
Properties
- registry_id: str
- retry_listener: Callable[[int, int], None] | None
- service_to_llm: dict[str, LLM]
- subscriber: Callable[[RegistryEvent], None] | None
- usage_to_llm: dict[str, LLM]. Access the internal usage-ID-to-LLM mapping.
Methods
__init__()

Initialize the LLM registry.
- Parameters:
  - retry_listener: Optional callback for retry events.
add()
Add an LLM instance to the registry.
- Parameters:
  - llm: The LLM instance to register.
- Raises: ValueError if llm.usage_id already exists in the registry.
get()
Get an LLM instance from the registry.
- Parameters:
  - usage_id: Unique identifier for the LLM usage slot.
- Returns: The LLM instance.
- Raises: KeyError if usage_id is not found in the registry.
list_services()
Deprecated alias for list_usage_ids().
list_usage_ids()
List all registered usage IDs.

notify()

Notify subscribers of registry events.
- Parameters:
  - event: The registry event to notify about.
subscribe()
Subscribe to registry events.
- Parameters:
  - callback: Function to call when LLMs are created or updated.
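A minimal usage sketch; the import path is an assumption, and method behavior follows the entries above.

```python
# Assumed import path; adjust to your installation.
from openhands.sdk.llm import LLM, LLMRegistry

registry = LLMRegistry()

# Subscribe before adding so creation events are observed.
registry.subscribe(lambda event: print("event for:", event.llm.usage_id))

llm = LLM(model="gpt-4o-mini", usage_id="agent")
registry.add(llm)                    # ValueError if usage_id already exists
assert registry.get("agent") is llm  # KeyError if usage_id is unknown
print(registry.list_usage_ids())
```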
class LLMResponse
Bases: BaseModel
Result of an LLM completion request.
This type provides a clean interface for LLM completion results, exposing
only OpenHands-native types to consumers while preserving access to the
raw LiteLLM response for internal use.
Properties
- id: str. Get the response ID from the underlying LLM response, supporting both completion mode (ModelResponse) and Responses API mode (ResponsesAPIResponse).
- message: Message
- metrics: MetricsSnapshot
- model_config: ClassVar[ConfigDict]. Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- raw_response: ModelResponse | ResponsesAPIResponse
Attributes

message

The completion message, converted to the OpenHands Message type.

metrics

Snapshot of metrics from the completion request.

raw_response

The original LiteLLM response (ModelResponse or ResponsesAPIResponse), for internal use.
- Type: litellm.types.utils.ModelResponse | litellm.types.llms.openai.ResponsesAPIResponse
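A sketch of consuming an LLMResponse; attribute names are taken from this section, and `llm` and `messages` are assumed from the earlier example.

```python
result = llm.completion(messages)  # LLMResponse

print(result.id)            # provider response ID
print(result.message.role)  # "assistant"

snapshot = result.metrics   # MetricsSnapshot for this request
print(snapshot.accumulated_cost, snapshot.model_name)

# The raw LiteLLM object is available for internal/debug use only.
raw = result.raw_response
```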
class Message
Bases: BaseModel
Properties
- cache_enabled: bool
- contains_image: bool
- content: Sequence[TextContent | ImageContent]
- force_string_serializer: bool
- function_calling_enabled: bool
- model_config: ClassVar[ConfigDict]. Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- name: str | None
- reasoning_content: str | None
- responses_reasoning_item: ReasoningItemModel | None
- role: Literal['user', 'system', 'assistant', 'tool']
- thinking_blocks: Sequence[ThinkingBlock | RedactedThinkingBlock]
- tool_call_id: str | None
- tool_calls: list[MessageToolCall] | None
- vision_enabled: bool
Methods
classmethod from_llm_chat_message()
Convert a LiteLLMMessage (Chat Completions) to our Message class. Provider-agnostic mapping for reasoning:
- Prefer message.reasoning_content if present (LiteLLM normalized field)
- Extract thinking_blocks from content array (Anthropic-specific)
classmethod from_llm_responses_output()
Convert OpenAI Responses API output items into a single assistant Message. Policy (non-stream):
- Collect assistant text by concatenating output_text parts from message items
- Normalize function_call items to MessageToolCall list
to_chat_dict()
Serialize message for OpenAI Chat Completions. Chooses the appropriate content serializer and then injects threading keys:
- Assistant tool call turn: role == "assistant" and self.tool_calls
- Tool result turn: role == "tool" and self.tool_call_id (with name)
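A sketch of building a message and serializing it; the import path is an assumption, and the fields follow the properties above.

```python
# Assumed import path; adjust to your installation.
from openhands.sdk.llm import ImageContent, Message, TextContent

user_msg = Message(
    role="user",
    content=[
        TextContent(text="What is in this image?"),
        ImageContent(image_urls=["https://example.com/cat.png"]),
    ],
)

# Chat Completions form; tool-result turns would additionally carry
# tool_call_id (and name), as noted above.
chat_payload = user_msg.to_chat_dict()
```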
to_responses_dict()
Serialize message for OpenAI Responses (input parameter). Produces a list of "input" items for the Responses API:
- system: returns []; system content is expected in 'instructions'
- user: one 'message' item with content parts -> input_text / input_image (when vision enabled)
- assistant: emits prior assistant content as input_text, and function_call items for tool_calls
- tool: emits function_call_output items (one per TextContent) with matching call_id
to_responses_value()
Return the serialized form: either an instructions string (for system) or input items (for other roles).

class MessageToolCall
Bases: BaseModel
Transport-agnostic tool call representation.
One canonical id is used for linking across actions/observations and
for Responses function_call_output call_id.
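A minimal construction sketch; the import path and the tool name are assumptions, and the fields follow the properties below.

```python
# Assumed import path; adjust to your installation.
from openhands.sdk.llm import MessageToolCall

call = MessageToolCall(
    id="call_abc123",
    name="execute_bash",            # hypothetical tool name
    arguments='{"command": "ls"}',  # arguments are a JSON string
    origin="completion",
)

chat_item = call.to_chat_dict()            # Chat Completions tool_calls entry
responses_item = call.to_responses_dict()  # Responses 'function_call' item
```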
Properties
- arguments: str
- id: str
- model_config: ClassVar[ConfigDict]. Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- name: str
- origin: Literal['completion', 'responses']
Methods
classmethod from_chat_tool_call()
Create a MessageToolCall from a Chat Completions tool call.

classmethod from_responses_function_call()

Create a MessageToolCall from a typed OpenAI Responses function_call item. Note: OpenAI Responses function_call.arguments is already a JSON string.

to_chat_dict()

Serialize to OpenAI Chat Completions tool_calls format.

to_responses_dict()

Serialize to OpenAI Responses 'function_call' input item format.

class Metrics
Bases: MetricsSnapshot
The Metrics class records various metrics during running and evaluation.
We track:
- accumulated_cost and costs
- max_budget_per_task (budget limit)
- A list of ResponseLatency
- A list of TokenUsage (one per call).
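A sketch of recording and reading metrics; the method names come from this section, but the add_cost() argument shape is an assumption.

```python
metrics = llm.metrics  # Metrics instance for this LLM

# Record a cost entry; a plain float argument is an assumption here.
metrics.add_cost(0.0042)

print(metrics.get())           # full metrics as a dictionary
print(metrics.get_snapshot())  # MetricsSnapshot without the detailed lists
```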
Properties
- costs: list[Cost]
- model_config: ClassVar[ConfigDict]. Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- response_latencies: list[ResponseLatency]
- token_usages: list[TokenUsage]
Methods
add_cost()
add_response_latency()
add_token_usage()
Add a single usage record.

deep_copy()

Create a deep copy of the Metrics object.

diff()

Calculate the difference between the current metrics and a baseline. This is useful for tracking metrics for specific operations like delegates.
- Parameters:
  - baseline: A Metrics object representing the baseline state.
- Returns: A new Metrics object containing only the differences since the baseline.
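A sketch of the baseline pattern this enables, using deep_copy() and diff() from this section:

```python
baseline = llm.metrics.deep_copy()  # capture state before the operation

# ... run a delegate or other sub-operation ...

delta = llm.metrics.diff(baseline)  # only what accrued since the baseline
print(delta.accumulated_cost)
```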
get()
Return the metrics in a dictionary.

get_snapshot()

Get a snapshot of the current metrics without the detailed lists.

initialize_accumulated_token_usage()
log()
Log the metrics.

merge()

Merge 'other' metrics into this one.

classmethod validate_accumulated_cost()
class MetricsSnapshot
Bases: BaseModel
A snapshot of metrics at a point in time.
Does not include lists of individual costs, latencies, or token usages.
Properties
- accumulated_cost: float
- accumulated_token_usage: TokenUsage | None
- max_budget_per_task: float | None
- model_config: ClassVar[ConfigDict]. Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- model_name: str
class ReasoningItemModel
Bases: BaseModel
OpenAI Responses reasoning item (non-stream, subset we consume).
Do not log or render encrypted_content.
Properties
- content: list[str] | None
- encrypted_content: str | None
- id: str | None
- model_config: ClassVar[ConfigDict]. Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- status: str | None
- summary: list[str]
class RedactedThinkingBlock
Bases: BaseModel
Redacted thinking block for previous responses without extended thinking.
This is used as a placeholder for assistant messages that were generated
before extended thinking was enabled.
Properties
- data: str
- model_config: ClassVar[ConfigDict]. Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- type: Literal['redacted_thinking']
class RegistryEvent
Bases: BaseModel
Properties
- llm: LLM
- model_config: ClassVar[ConfigDict]. Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
class RouterLLM
Bases: LLM

Base class for multiple LLMs acting as a unified LLM.
This class provides a foundation for implementing model routing by
inheriting from LLM, allowing routers to work with multiple underlying
LLM models while presenting a unified LLM interface to consumers.
Key features:
- Works with multiple LLMs configured via llms_for_routing
- Delegates all other operations/properties to the selected LLM
- Provides routing interface through select_llm() method
Properties
- active_llm: LLM | None
- llms_for_routing: dict[str, LLM]
- model_config: ClassVar[ConfigDict]. Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- router_name: str
Methods
completion()
This method intercepts completion calls and routes them to the appropriate underlying LLM based on the routing logic implemented in select_llm().

model_post_init()

This function is meant to behave like a BaseModel method to initialise private attributes. It takes context as an argument since that's what pydantic-core passes when calling it.
- Parameters:
  - self: The BaseModel instance.
  - context: The context.
abstractmethod select_llm()
Select which LLM to use based on messages and events. This method implements the core routing logic for the RouterLLM. Subclasses should analyze the provided messages to determine which LLM from llms_for_routing is most appropriate for handling the request.
- Parameters:
  - messages: List of messages in the conversation that can be used to inform the routing decision.
- Returns: The key/name of the LLM to use from the llms_for_routing dictionary.
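A sketch of a concrete router; the import path and the select_llm() signature shown are assumptions based on this description.

```python
# Assumed import path; adjust to your installation.
from openhands.sdk.llm import LLM, RouterLLM

class PreferCheapRouter(RouterLLM):
    """Route short conversations to a cheap model, long ones to a strong one."""

    router_name: str = "prefer_cheap"

    def select_llm(self, messages) -> str:
        # Return a key from llms_for_routing.
        return "cheap" if len(messages) < 10 else "strong"

router = PreferCheapRouter(
    llms_for_routing={
        "cheap": LLM(model="gpt-4o-mini", usage_id="cheap"),
        "strong": LLM(model="gpt-4o", usage_id="strong"),
    },
)
```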
classmethod set_placeholder_model()
Guarantee model exists before LLM base validation runs.

classmethod validate_llms_not_empty()
class TextContent
Bases: BaseContent
Properties
- cache_prompt: bool
- model_config: ClassVar[ConfigDict]. Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- text: str
- type: Literal['text']
Methods
to_llm_dict()
Convert to LLM API format.

class ThinkingBlock
Bases: BaseModel
Anthropic thinking block for extended thinking feature.
This represents the raw thinking blocks returned by Anthropic models
when extended thinking is enabled. These blocks must be preserved
and passed back to the API for tool use scenarios.
Properties
- model_config: ClassVar[ConfigDict]. Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- signature: str
- thinking: str
- type: Literal['thinking']

