Overview
The OpenHands SDK provides comprehensive metrics tracking at two levels: individual LLM metrics and aggregated conversation-level costs.
- You can access detailed metrics from each LLM instance using the `llm.metrics` object to track token usage, costs, and latencies per API call.
- For a complete view, use `conversation.conversation_stats` to get aggregated costs across all LLMs used in a conversation, including the primary agent LLM and any auxiliary LLMs (such as those used by the context condenser).
Getting Metrics from Individual LLMs
This example is available on GitHub: examples/01_standalone_sdk/13_get_llm_metrics.py
Running the Example
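From the repository root, run `python examples/01_standalone_sdk/13_get_llm_metrics.py` with an LLM API key configured in your environment (the exact variable name depends on your setup; check the example source).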
Accessing Individual LLM Metrics
Access metrics directly from the LLM object after running the conversation. The `llm.metrics` object is an instance of the `Metrics` class, which provides detailed information including the fields below (a short access sketch follows the list):
- `accumulated_cost` - Total accumulated cost across all API calls
- `accumulated_token_usage` - Aggregated token usage with fields like:
  - `prompt_tokens` - Number of input tokens processed
  - `completion_tokens` - Number of output tokens generated
  - `cache_read_tokens` - Cache hits (if supported by the model)
  - `cache_write_tokens` - Cache writes (if supported by the model)
  - `reasoning_tokens` - Reasoning tokens (for models that support extended thinking)
  - `context_window` - Context window size used
- `costs` - List of individual cost records per API call
- `token_usages` - List of detailed token usage records per API call
- `response_latencies` - List of response latency metrics per API call
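As a minimal sketch (setup of the `llm` and the conversation is elided; only attributes named in the field list above are used), reading these fields might look like:

```python
# Minimal sketch: inspect `llm.metrics` on an LLM that has already served
# a conversation. Only attributes named in the field list above are used.
metrics = llm.metrics

print(f"Accumulated cost: ${metrics.accumulated_cost:.4f}")

usage = metrics.accumulated_token_usage
if usage is not None:
    print(f"Prompt tokens:     {usage.prompt_tokens}")
    print(f"Completion tokens: {usage.completion_tokens}")
    print(f"Cache reads:       {usage.cache_read_tokens}")

# Per-call record lists
print(f"Cost records:    {len(metrics.costs)}")
print(f"Token usages:    {len(metrics.token_usages)}")
print(f"Latency records: {len(metrics.response_latencies)}")
```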
Using LLM Registry for Cost Tracking
This example is available on GitHub: examples/01_standalone_sdk/05_use_llm_registry.py
The LLM registry lets you register and retrieve LLM instances by a unique `usage_id`. This is particularly useful for tracking costs across different LLMs used in your application.
Running the Example
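As above, run `python examples/01_standalone_sdk/05_use_llm_registry.py` from the repository root with an LLM API key configured.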
How the LLM Registry Works
Each LLM is created with a unique `usage_id` (e.g., “agent”, “condenser”) that serves as its identifier in the registry. The registry maintains references to all LLM instances, allowing you to (see the sketch after this list):
- Register LLMs: Add LLM instances to the registry with `llm_registry.add(llm)`
- Retrieve LLMs: Get LLM instances by their usage ID with `llm_registry.get("usage_id")`
- List Usage IDs: View all registered usage IDs with `llm_registry.list_usage_ids()`
- Track Costs Separately: Each LLM’s metrics are tracked independently by its usage ID
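Putting these operations together, a minimal sketch might look like the following; the import path and the LLM constructor arguments are assumptions here, so refer to the linked example for the exact setup:

```python
# Import path and constructor arguments are assumptions based on the
# operations listed above; see the linked example for the exact setup.
from openhands.sdk import LLM, LLMRegistry

registry = LLMRegistry()

# Register an LLM under a unique usage_id
agent_llm = LLM(
    usage_id="agent",
    model="provider/model-name",  # placeholder model identifier
    api_key="your-api-key",       # placeholder credential
)
registry.add(agent_llm)

# Retrieve it later by usage ID; its metrics are tracked independently
same_llm = registry.get("agent")
print(registry.list_usage_ids())          # e.g. ["agent"]
print(same_llm.metrics.accumulated_cost)  # cost for this usage_id only
```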
Getting Aggregated Conversation Costs
This example is available on GitHub: examples/01_standalone_sdk/21_generate_extraneous_conversation_costs.py
You can get a complete view of costs across all LLMs in a conversation with `conversation.conversation_stats`. This is particularly useful when your conversation involves multiple LLMs, such as the main agent LLM and auxiliary LLMs for tasks like context condensing.
Running the Example
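As above, run `python examples/01_standalone_sdk/21_generate_extraneous_conversation_costs.py` from the repository root with an LLM API key configured.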
Understanding Conversation Stats
The `conversation.conversation_stats` object provides comprehensive cost tracking across all LLMs used in a conversation. It is an instance of the `ConversationStats` class, which provides the following key features (a short usage sketch follows the list below):
Key Methods and Properties
- `usage_to_metrics`: A dictionary mapping usage IDs to their respective `Metrics` objects. This allows you to track costs separately for each LLM used in the conversation.
- `get_combined_metrics()`: Returns a single `Metrics` object that aggregates costs across all LLMs used in the conversation. This gives you the total cost of the entire conversation.
- `get_metrics_for_usage(usage_id: str)`: Retrieves the `Metrics` object for a specific usage ID, allowing you to inspect costs for individual LLMs.
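As a sketch of how these fit together (assuming a `conversation` that has already run; only the names listed above are used):

```python
# Sketch: read aggregated and per-LLM costs from an existing conversation.
stats = conversation.conversation_stats

# Total cost across every LLM involved (agent, condenser, ...)
combined = stats.get_combined_metrics()
print(f"Total conversation cost: ${combined.accumulated_cost:.4f}")

# Per-LLM breakdown, keyed by usage_id
for usage_id in stats.usage_to_metrics:
    metrics = stats.get_metrics_for_usage(usage_id)
    print(f"  {usage_id}: ${metrics.accumulated_cost:.4f}")
```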
Next Steps
- Context Condenser - Learn about context management and how it uses separate LLMs
- LLM Routing - Optimize costs with smart routing between different models

