openhands.sdk) is the heart of the OpenHands Agent SDK. It provides the core framework for building agents locally or embedding them in applications.
Source: openhands/sdk/
Purpose
The SDK package handles:- Agent reasoning loop: How agents process messages and make decisions
- State management: Conversation lifecycle and persistence
- LLM integration: Provider-agnostic language model access
- Tool system: Typed actions and observations
- Workspace abstraction: Where code executes
- Extensibility: Microagents, condensers, MCP, security
Core Components
1. Conversation - State & Lifecycle
What it does: Manages the entire conversation lifecycle and state. Key responsibilities:- Maintains conversation state (immutable)
- Handles message flow between user and agent
- Manages turn-taking and async execution
- Persists and restores conversation state
- Emits events for monitoring
- Immutable state: Each operation returns a new Conversation instance
- Serializable: Can be saved to disk or database and restored
- Async-first: Built for streaming and concurrent execution
- Saving conversation to database after each turn
- Implementing undo/redo functionality
- Building multi-session chatbots
- Time-travel debugging
- Guide: Conversation Persistence
- Guide: Pause and Resume
- Source:
conversation.py
2. Agent - The Reasoning Loop
What it does: The core reasoning engine that processes messages and decides what to do. Key responsibilities:- Receives messages and current state
- Consults LLM to reason about next action
- Validates and executes tool calls
- Processes observations and loops until completion
- Integrates with microagents for specialized behavior
- Stateless: Agent doesnβt hold state, operates on Conversation
- Extensible: Behavior can be modified via microagents
- Provider-agnostic: Works with any LLM through unified interface
- Receive message from Conversation
- Add message to context
- Consult LLM with full conversation history
- If LLM returns tool call β validate and execute tool
- If tool returns observation β add to context, go to step 3
- If LLM returns response β done, return to user
- Planning agents that break tasks into steps
- Code review agents with specific checks
- Agents with domain-specific reasoning patterns
- Guide: Custom Agents
- Guide: Agent Stuck Detector
- Source:
agent.py
3. LLM - Language Model Integration
What it does: Provides a provider-agnostic interface to language models. Key responsibilities:- Abstracts different LLM providers (OpenAI, Anthropic, etc.)
- Handles message formatting and conversion
- Manages streaming responses
- Supports tool calling and reasoning modes
- Handles retries and error recovery
- Provider-agnostic: Same API works with any provider
- Streaming-first: Built for real-time responses
- Type-safe: Pydantic models for all messages
- Extensible: Easy to add new providers
- Cost optimization (switch to cheaper models)
- Testing with different models
- Avoiding vendor lock-in
- Supporting customer choice
- Routing requests to different models based on complexity
- Implementing custom caching strategies
- Adding observability hooks
- Guide: LLM Registry
- Guide: LLM Routing
- Guide: Reasoning and Tool Use
- Source:
llm.py
4. Tool System - Typed Capabilities
What it does: Defines what agents can do through a typed action/observation pattern. Key responsibilities:- Defines tool schemas (inputs and outputs)
- Validates actions before execution
- Executes tools and returns typed observations
- Generates JSON schemas for LLM tool calling
- Registers tools with the agent
- Action/Observation pattern: Tools are defined as type-safe input/output pairs
- Schema generation: Pydantic models auto-generate JSON schemas
- Executor pattern: Separation of tool definition and execution
- Composable: Tools can call other tools
- Action: Input schema (what the tool accepts)
- Observation: Output schema (what the tool returns)
- ToolExecutor: Logic that transforms Action β Observation
- Type safety catches errors early
- LLMs get accurate schemas for tool calling
- Tools are testable in isolation
- Easy to compose tools
- Database query tools
- API integration tools
- Custom file format parsers
- Domain-specific calculators
- Guide: Custom Tools
- Source:
tool/
5. Workspace - Execution Abstraction
What it does: Abstracts where code executes (local, Docker, remote). Key responsibilities:- Provides unified interface for code execution
- Handles file operations across environments
- Manages working directories
- Supports different isolation levels
- Abstract interface: LocalWorkspace in SDK, advanced types in workspace package
- Environment-agnostic: Code works the same locally or remotely
- Lazy initialization: Workspace setup happens on first use
- Architecture: Workspace Architecture
- Guides: Remote Agent Server
- Source:
workspace.py
6. Events - Component Communication
What it does: Enables observability and debugging through event emissions. Key responsibilities:- Defines event types (messages, actions, observations, errors)
- Emitted by Conversation, Agent, Tools
- Enables logging, debugging, and monitoring
- Supports custom event handlers
- Immutable: Events are snapshots, not mutable objects
- Serializable: Can be logged, stored, replayed
- Type-safe: Pydantic models for all events
- Debugging agent behavior
- Understanding decision-making
- Building observability dashboards
- Implementing custom logging
- Guide: Metrics and Observability
- Source:
event.py
7. Condenser - Memory Management
What it does: Compresses conversation history when it gets too long. Key responsibilities:- Monitors conversation length
- Summarizes older messages
- Preserves important context
- Keeps conversation within token limits
- Pluggable: Different condensing strategies
- Automatic: Triggered when context gets large
- Preserves semantics: Important information retained
- Summarize old messages
- Keep only last N turns
- Preserve task-related messages
- Guide: Context Condenser
- Source:
condenser/
8. MCP - Model Context Protocol
What it does: Integrates external tool servers via Model Context Protocol. Key responsibilities:- Connects to MCP-compatible tool servers
- Translates MCP tools to SDK tool format
- Manages server lifecycle
- Handles server communication
- Standard protocol: Uses MCP specification
- Transparent integration: MCP tools look like regular tools to agents
- Process management: Handles server startup/shutdown
- Already have MCP servers (fetch, filesystem, etc.)
- Are too complex to rewrite as SDK tools
- Need to run in separate processes
- Are provided by third parties
- Guide: MCP Integration
- Spec: Model Context Protocol
- Source:
mcp/
9. Microagents - Behavior Modules
What it does: Specialized modules that modify agent behavior for specific tasks. Key responsibilities:- Provide domain-specific instructions
- Modify system prompts
- Guide agent decision-making
- Compose to create specialized agents
- Composable: Multiple microagents can work together
- Declarative: Defined as configuration, not code
- Reusable: Share microagents across agents
- GitHub operations (issue creation, PRs)
- Code review guidelines
- Documentation style enforcement
- Project-specific conventions
- Guide: Agent Skills & Context
- Source:
microagents/
10. Security - Validation & Sandboxing
What it does: Validates inputs and enforces security constraints. Key responsibilities:- Input validation
- Command sanitization
- Path traversal prevention
- Resource limits
- Defense in depth: Multiple validation layers
- Fail-safe: Rejects suspicious inputs by default
- Configurable: Adjust security levels as needed
- Malicious prompts escaping sandboxes
- Path traversal attacks
- Resource exhaustion
- Unintended system access
- Guide: Security and Secrets
- Source:
security/
How Components Work Together
Example: User asks agent to create a file
- Events are emitted for observability
- Condenser may trigger if history gets long
- Microagents influence LLMβs decision-making
- Security validates file paths and operations
- MCP could provide additional tools if configured
Design Patterns
Immutability
All core objects are immutable. Operations return new instances:Composition Over Inheritance
Agents are composed from:- LLM provider
- Tool list
- Microagent list
- Condenser strategy
- Security policy
Type Safety
Everything uses Pydantic models:- Messages, actions, observations are typed
- Validation happens automatically
- Schemas generate from types
Next Steps
For Usage Examples
- Getting Started - Build your first agent
- Custom Tools - Extend capabilities
- LLM Configuration - Configure providers
- Conversation Management - State handling
For Related Architecture
- Tool System - Built-in tool implementations
- Workspace Architecture - Execution environments
- Agent Server Architecture - Remote execution
For Implementation Details
openhands/sdk/- Full source codeexamples/- Working examples

