The SDK package (openhands.sdk) is the heart of the OpenHands Software Agent SDK. It provides the core framework for building agents locally or embedding them in applications.
Source: sdk/
Purpose
The SDK package handles:
- Agent reasoning loop: How agents process messages and make decisions
- State management: Conversation lifecycle and persistence
- LLM integration: Provider-agnostic language model access
- Tool system: Typed actions and observations
- Workspace abstraction: Where code executes
- Extensibility: Skills, condensers, MCP, security
Core Components
1. Conversation - State & Lifecycle
What it does: Manages the entire conversation lifecycle and state.
Key responsibilities:
- Maintains conversation state (immutable)
- Handles message flow between user and agent
- Manages turn-taking and async execution
- Persists and restores conversation state
- Emits events for monitoring
- Immutable state: Each operation returns a new Conversation instance
- Serializable: Can be saved to disk or database and restored
- Async-first: Built for streaming and concurrent execution
- Saving conversation to database after each turn
- Implementing undo/redo functionality
- Building multi-session chatbots
- Time-travel debugging
- Guide: Conversation Persistence
- Guide: Pause and Resume
- Source: conversation/
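The immutability and serializability properties listed above can be illustrated with a minimal sketch. It uses only the standard library; `ConversationState`, `Message`, and `add_message` are hypothetical names chosen for illustration and are not the SDK's actual classes.

```python
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class Message:
    role: str
    content: str


@dataclass(frozen=True)
class ConversationState:
    """Hypothetical immutable conversation state (not the SDK's class)."""

    messages: tuple[Message, ...] = ()

    def add_message(self, role: str, content: str) -> "ConversationState":
        # Return a new instance instead of mutating this one.
        return replace(self, messages=self.messages + (Message(role, content),))

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "ConversationState":
        data = json.loads(raw)
        return cls(messages=tuple(Message(**m) for m in data["messages"]))


s0 = ConversationState()
s1 = s0.add_message("user", "Create hello.txt")
s2 = s1.add_message("assistant", "Done.")

# Earlier snapshots stay untouched, so undo/redo is just picking a snapshot.
history = [s0, s1, s2]

# Round trip through JSON, as one might do when saving to disk or a database.
restored = ConversationState.from_json(s2.to_json())
```

Because every operation yields a new snapshot, time-travel debugging in this sketch amounts to indexing into `history`.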
2. Agent - The Reasoning Loop
What it does: The core reasoning engine that processes messages and decides what to do.
Key responsibilities:
- Receives messages and current state
- Consults LLM to reason about next action
- Validates and executes tool calls
- Processes observations and loops until completion
- Integrates with skills for specialized behavior
- Stateless: Agent doesn’t hold state, operates on Conversation
- Extensible: Behavior can be modified via skills
- Provider-agnostic: Works with any LLM through unified interface
1. Receive message from Conversation
2. Add message to context
3. Consult LLM with full conversation history
4. If LLM returns tool call → validate and execute tool
5. If tool returns observation → add to context, go to step 3
6. If LLM returns response → done, return to user
- Planning agents that break tasks into steps
- Code review agents with specific checks
- Agents with domain-specific reasoning patterns
- Guide: Custom Agents
- Guide: Agent Stuck Detector
- Source: agent/
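The reasoning loop described in this section can be sketched in a few lines. This is an illustration under stated assumptions, not the SDK's implementation: `LLMReply`, `run_agent`, and `scripted_llm` are hypothetical names, and the "LLM" is a scripted stand-in so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LLMReply:
    """Hypothetical LLM output: either a tool call or a final text response."""

    text: Optional[str] = None
    tool_name: Optional[str] = None
    tool_args: Optional[dict] = None


def run_agent(
    user_message: str,
    llm: Callable[[list], LLMReply],
    tools: dict,
    max_steps: int = 10,
) -> str:
    context = [("user", user_message)]  # receive the message and add it to context
    for _ in range(max_steps):
        reply = llm(context)  # consult the LLM with the full history
        if reply.tool_name is not None:
            # Validate the tool call before executing it.
            if reply.tool_name not in tools:
                observation = f"error: unknown tool {reply.tool_name!r}"
            else:
                observation = tools[reply.tool_name](**(reply.tool_args or {}))
            context.append(("observation", observation))  # feed back and loop
            continue
        return reply.text or ""  # plain response: done
    raise RuntimeError("agent did not finish within max_steps")


def scripted_llm(context: list) -> LLMReply:
    # Stand-in for a real model: call the tool once, then answer.
    if not any(role == "observation" for role, _ in context):
        return LLMReply(tool_name="add", tool_args={"a": 2, "b": 3})
    return LLMReply(text=f"The result is {context[-1][1]}")


answer = run_agent("What is 2 + 3?", scripted_llm, {"add": lambda a, b: a + b})
```

Note that `run_agent` keeps no state between calls; everything it needs lives in `context`, which mirrors the stateless design principle above.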
3. LLM - Language Model Integration
What it does: Provides a provider-agnostic interface to language models.
Key responsibilities:
- Abstracts different LLM providers (OpenAI, Anthropic, etc.)
- Handles message formatting and conversion
- Manages streaming responses
- Supports tool calling and reasoning modes
- Handles retries and error recovery
- Provider-agnostic: Same API works with any provider
- Streaming-first: Built for real-time responses
- Type-safe: Pydantic models for all messages
- Extensible: Easy to add new providers
- Cost optimization (switch to cheaper models)
- Testing with different models
- Avoiding vendor lock-in
- Supporting customer choice
- Routing requests to different models based on complexity
- Implementing custom caching strategies
- Adding observability hooks
- Guide: LLM Registry
- Guide: LLM Routing
- Guide: Reasoning and Tool Use
- Source: llm/
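To make "same API works with any provider" and complexity-based routing concrete, here is a minimal sketch. The `ChatModel` protocol, `EchoModel`, and `route` are hypothetical names, and the length-based heuristic is an assumption chosen only to keep the example small; the SDK's own LLM and routing classes may look quite different.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Hypothetical provider-agnostic interface (not the SDK's API)."""

    name: str

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in provider so the sketch runs without network access."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


def route(prompt: str, cheap: ChatModel, strong: ChatModel, threshold: int = 80) -> ChatModel:
    # Crude complexity heuristic (an assumption): long prompts go to the stronger model.
    return strong if len(prompt) > threshold else cheap


cheap, strong = EchoModel("small-model"), EchoModel("large-model")

short_prompt = "Rename this variable."
long_prompt = "Refactor the module. " * 10

short_reply = route(short_prompt, cheap, strong).complete(short_prompt)
long_reply = route(long_prompt, cheap, strong).complete(long_prompt)
```

Callers only depend on `complete`, so swapping a provider means constructing a different object, not changing call sites.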
4. Tool System - Typed Capabilities
What it does: Defines what agents can do through a typed action/observation pattern.
Key responsibilities:
- Defines tool schemas (inputs and outputs)
- Validates actions before execution
- Executes tools and returns typed observations
- Generates JSON schemas for LLM tool calling
- Registers tools with the agent
- Action/Observation pattern: Tools are defined as type-safe input/output pairs
- Schema generation: Pydantic models auto-generate JSON schemas
- Executor pattern: Separation of tool definition and execution
- Composable: Tools can call other tools
- Action: Input schema (what the tool accepts)
- Observation: Output schema (what the tool returns)
- ToolExecutor: Logic that transforms Action → Observation
- Type safety catches errors early
- LLMs get accurate schemas for tool calling
- Tools are testable in isolation
- Easy to compose tools
- Database query tools
- API integration tools
- Custom file format parsers
- Domain-specific calculators
- Guide: Custom Tools
- Source: tools/
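The Action / Observation / ToolExecutor split can be sketched as follows. To stay dependency-free this example uses standard-library dataclasses and a hand-written `validate` helper instead of Pydantic, so it does not show schema generation; all class and function names here are hypothetical, not the SDK's.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class WordCountAction:
    """Input schema: what the tool accepts."""

    text: str


@dataclass(frozen=True)
class WordCountObservation:
    """Output schema: what the tool returns."""

    count: int


class WordCountExecutor:
    """Logic that turns an Action into an Observation."""

    def __call__(self, action: WordCountAction) -> WordCountObservation:
        return WordCountObservation(count=len(action.text.split()))


def validate(action_cls, raw: dict):
    # Reject missing or unknown fields and wrong types before execution.
    expected = {f.name: f.type for f in fields(action_cls)}
    if set(raw) != set(expected):
        raise ValueError(f"expected fields {sorted(expected)}, got {sorted(raw)}")
    for name, typ in expected.items():
        if not isinstance(raw[name], typ):
            raise TypeError(f"{name} must be {typ.__name__}")
    return action_cls(**raw)


# Arguments as they might arrive from an LLM tool call.
action = validate(WordCountAction, {"text": "typed actions and observations"})
observation = WordCountExecutor()(action)
```

Keeping the executor separate from the schemas is what makes the tool testable in isolation: the test only needs to construct an action and inspect the observation.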
5. Workspace - Execution Abstraction
What it does: Abstracts where code executes (local, Docker, remote).
Key responsibilities:
- Provides unified interface for code execution
- Handles file operations across environments
- Manages working directories
- Supports different isolation levels
- Abstract interface: LocalWorkspace in SDK, advanced types in workspace package
- Environment-agnostic: Code works the same locally or remotely
- Lazy initialization: Workspace setup happens on first use
- Architecture: Workspace Architecture
- Guides: Remote Agent Server
- Source: workspace/
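A rough sketch of an abstract workspace with lazy initialization follows. `Workspace` and `LocalSketchWorkspace` are hypothetical classes written for this example (the SDK's `LocalWorkspace` has its own interface, which is not reproduced here); the local variant simply runs commands in a temporary directory.

```python
import shutil
import subprocess
import sys
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Optional


class Workspace(ABC):
    """Hypothetical unified interface; not the SDK's actual base class."""

    @abstractmethod
    def run(self, argv: list[str]) -> str: ...

    @abstractmethod
    def write_file(self, relative_path: str, content: str) -> None: ...


class LocalSketchWorkspace(Workspace):
    def __init__(self) -> None:
        self._root: Optional[Path] = None  # nothing is created yet

    @property
    def root(self) -> Path:
        # Lazy initialization: the directory is created on first use.
        if self._root is None:
            self._root = Path(tempfile.mkdtemp(prefix="workspace-"))
        return self._root

    def run(self, argv: list[str]) -> str:
        result = subprocess.run(
            argv, cwd=self.root, capture_output=True, text=True, check=True
        )
        return result.stdout

    def write_file(self, relative_path: str, content: str) -> None:
        (self.root / relative_path).write_text(content)


ws = LocalSketchWorkspace()
initialized_before_use = ws._root is not None

ws.write_file("hello.py", "print('hello from the workspace')")
output = ws.run([sys.executable, "hello.py"])

shutil.rmtree(ws.root)  # clean up the temporary directory
```

A Docker or remote variant would implement the same two methods, which is what lets calling code stay environment-agnostic.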
6. Events - Component Communication
What it does: Enables observability and debugging through event emissions.
Key responsibilities:
- Defines event types (messages, actions, observations, errors)
- Emitted by Conversation, Agent, Tools
- Enables logging, debugging, and monitoring
- Supports custom event handlers
- Immutable: Events are snapshots, not mutable objects
- Serializable: Can be logged, stored, replayed
- Type-safe: Pydantic models for all events
- Debugging agent behavior
- Understanding decision-making
- Building observability dashboards
- Implementing custom logging
- Guide: Metrics and Observability
- Source: event/
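Event emission with custom handlers can be sketched as below. `Event` and `EventBus` are hypothetical names for illustration; the SDK defines its own event types, which this example does not attempt to reproduce.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Event:
    """Hypothetical immutable event snapshot."""

    kind: str     # e.g. "message", "action", "observation", "error"
    source: str   # e.g. "conversation", "agent", "tool"
    payload: str


@dataclass
class EventBus:
    handlers: list[Callable[[Event], None]] = field(default_factory=list)

    def subscribe(self, handler: Callable[[Event], None]) -> None:
        self.handlers.append(handler)

    def emit(self, event: Event) -> None:
        for handler in self.handlers:
            handler(event)


log: list[Event] = []
counts: Counter = Counter()

bus = EventBus()
bus.subscribe(log.append)                         # custom logging handler
bus.subscribe(lambda e: counts.update([e.kind]))  # simple metrics handler

bus.emit(Event("message", "conversation", "Create hello.txt"))
bus.emit(Event("action", "agent", "write_file(hello.txt)"))
bus.emit(Event("observation", "tool", "file written"))
```

Since events are frozen snapshots, the stored `log` can be replayed later without worrying that a handler modified an entry.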
7. Condenser - Memory Management
What it does: Compresses conversation history when it gets too long.
Key responsibilities:
- Monitors conversation length
- Summarizes older messages
- Preserves important context
- Keeps conversation within token limits
- Pluggable: Different condensing strategies
- Automatic: Triggered when context gets large
- Preserves semantics: Important information retained
- Summarize old messages
- Keep only last N turns
- Preserve task-related messages
- Guide: Context Condenser
- Source: condenser/
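A minimal sketch of the "summarize old messages, keep the last N" idea is shown below. The `condense` function and its parameters are hypothetical, and the summary is a placeholder string; a real condenser would typically ask an LLM for the summary and count tokens rather than messages.

```python
from typing import Callable


def condense(
    messages: list[str],
    max_messages: int,
    keep_last: int,
    summarize: Callable[[list[str]], str],
) -> list[str]:
    """Replace older messages with one summary once the history grows too long."""
    if not 0 < keep_last <= max_messages:
        raise ValueError("keep_last must be between 1 and max_messages")
    if len(messages) <= max_messages:
        return list(messages)  # under the limit: nothing to do
    older, recent = messages[:-keep_last], messages[-keep_last:]
    return [summarize(older)] + recent


def count_summary(older: list[str]) -> str:
    # Placeholder strategy; swap in any callable to change how history is summarized.
    return f"[summary of {len(older)} earlier messages]"


history = [f"turn {i}" for i in range(1, 9)]  # eight turns
condensed = condense(history, max_messages=5, keep_last=3, summarize=count_summary)
```

Passing `summarize` as a parameter is one simple way to get the pluggable behavior described above.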
8. MCP - Model Context Protocol
What it does: Integrates external tool servers via Model Context Protocol.
Key responsibilities:
- Connects to MCP-compatible tool servers
- Translates MCP tools to SDK tool format
- Manages server lifecycle
- Handles server communication
- Standard protocol: Uses MCP specification
- Transparent integration: MCP tools look like regular tools to agents
- Process management: Handles server startup/shutdown
Use MCP for tools that:
- Already have MCP servers (fetch, filesystem, etc.)
- Are too complex to rewrite as SDK tools
- Need to run in separate processes
- Are provided by third parties
- Guide: MCP Integration
- Spec: Model Context Protocol
- Source: mcp/
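The translation step ("MCP tools look like regular tools to agents") might be sketched like this. `LocalTool`, `adapt_mcp_tool`, and the `send_request` transport are hypothetical; the `name` / `description` / `inputSchema` fields and the `tools/call` request shape follow the MCP specification as understood here, so check the spec before relying on them. The server is faked so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class LocalTool:
    """Hypothetical local tool wrapper (not the SDK's tool class)."""

    name: str
    description: str
    input_schema: dict
    call: Callable[[dict], str]


def adapt_mcp_tool(definition: dict, send_request: Callable[[str, dict], dict]) -> LocalTool:
    def call(arguments: dict) -> str:
        result = send_request(
            "tools/call", {"name": definition["name"], "arguments": arguments}
        )
        # Join the text parts of the result into one observation string.
        return "".join(p["text"] for p in result["content"] if p["type"] == "text")

    return LocalTool(
        name=definition["name"],
        description=definition.get("description", ""),
        input_schema=definition["inputSchema"],
        call=call,
    )


def fake_server(method: str, params: dict) -> dict:
    # Stand-in for a real MCP server connection.
    url = params["arguments"]["url"]
    return {"content": [{"type": "text", "text": f"fetched {url}"}]}


fetch = adapt_mcp_tool(
    {
        "name": "fetch",
        "description": "Fetch a URL",
        "inputSchema": {"type": "object", "properties": {"url": {"type": "string"}}},
    },
    fake_server,
)
observation = fetch.call({"url": "https://example.com"})
```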
9. Skills (formerly Microagents) - Behavior Modules
What it does: Specialized modules that modify agent behavior for specific tasks.
Key responsibilities:
- Provide domain-specific instructions
- Modify system prompts
- Guide agent decision-making
- Compose to create specialized agents
- Composable: Multiple skills can work together
- Declarative: Defined as configuration, not code
- Reusable: Share skills across agents
- GitHub operations (issue creation, PRs)
- Code review guidelines
- Documentation style enforcement
- Project-specific conventions
- Guide: Agent Skills & Context
- Source: skills/
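One way to picture declarative, composable skills is as plain data that gets folded into the system prompt. The dictionary layout, the keyword-based activation, and `build_system_prompt` are all assumptions made for this sketch; they are not the SDK's skill format.

```python
# Skills as configuration, not code (hypothetical layout).
SKILLS = [
    {
        "name": "code-review",
        "keywords": ["review"],
        "instructions": "Check error handling and test coverage.",
    },
    {
        "name": "docs-style",
        "keywords": ["docs", "readme"],
        "instructions": "Use short sentences and active voice.",
    },
]


def build_system_prompt(base: str, skills: list[dict], user_message: str) -> str:
    text = user_message.lower()
    # A skill is active when any of its keywords appears in the message.
    active = [s for s in skills if any(k in text for k in s["keywords"])]
    sections = [base] + [f"## {s['name']}\n{s['instructions']}" for s in active]
    return "\n\n".join(sections)


base = "You are a helpful coding agent."
review_prompt = build_system_prompt(base, SKILLS, "Please review the README")
plain_prompt = build_system_prompt(base, SKILLS, "Fix the failing test")
```

Because each skill is just data, the same list can be shared across agents or extended without touching the prompt-building logic.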
10. Security - Validation & Sandboxing
What it does: Validates inputs and enforces security constraints.
Key responsibilities:
- Input validation
- Command sanitization
- Path traversal prevention
- Resource limits
- Defense in depth: Multiple validation layers
- Fail-safe: Rejects suspicious inputs by default
- Configurable: Adjust security levels as needed
- Malicious prompts escaping sandboxes
- Path traversal attacks
- Resource exhaustion
- Unintended system access
- Guide: Security and Secrets
- Source: security/
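Path traversal prevention, one of the responsibilities listed above, is commonly done by resolving the requested path and checking that it still lies under the workspace root. The helper below is a generic sketch of that technique with a hypothetical name, `resolve_inside`; it is not taken from the SDK.

```python
from pathlib import Path


def resolve_inside(root: Path, user_path: str) -> Path:
    """Resolve `user_path` under `root`, rejecting anything that escapes it."""
    root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes workspace: {user_path}")
    return candidate


workspace = Path("workspace")
allowed = resolve_inside(workspace, "src/main.py")

rejected = []
for attempt in ("../secrets.txt", "src/../../secrets.txt", "/etc/passwd"):
    try:
        resolve_inside(workspace, attempt)
    except PermissionError:
        rejected.append(attempt)  # fail-safe: suspicious paths are refused
```

Resolving before comparing matters: a plain string prefix check would accept `workspace/../secrets.txt`.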
How Components Work Together
Example: User asks agent to create a file
- Events are emitted for observability
- Condenser may trigger if history gets long
- Skills influence LLM’s decision-making
- Security validates file paths and operations
- MCP could provide additional tools if configured
Design Patterns
Immutability
All core objects are immutable. Operations return new instances.
Composition Over Inheritance
Agents are composed from:
- LLM provider
- Tool list
- Skill list
- Condenser strategy
- Security policy
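The composition pattern can be sketched with a frozen dataclass whose fields are the parts listed above. `AgentConfig` and its string-valued fields are hypothetical placeholders for illustration; in the SDK these would be real component objects rather than names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical agent assembled from parts; field names are illustrative."""

    llm: str
    tools: tuple[str, ...] = ()
    skills: tuple[str, ...] = ()
    condenser: str = "none"
    security_policy: str = "default"


base = AgentConfig(llm="model-a", tools=("file_editor",))

# A specialized agent is built by swapping parts, not by subclassing.
reviewer = replace(base, skills=("code-review",), condenser="summarize")
```

`replace` leaves `base` unchanged and returns a new instance, so variants of an agent can coexist without affecting each other.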
Type Safety
Everything uses Pydantic models:
- Messages, actions, observations are typed
- Validation happens automatically
- Schemas generate from types
Next Steps
For Usage Examples
- Getting Started - Build your first agent
- Custom Tools - Extend capabilities
- LLM Configuration - Configure providers
- Conversation Management - State handling
For Related Architecture
- Tool System - Built-in tool implementations
- Workspace Architecture - Execution environments
- Agent Server Architecture - Remote execution
For Implementation Details
- openhands-sdk/ - SDK source code
- openhands-tools/ - Tools source code
- openhands-workspace/ - Workspace source code
- examples/ - Working examples

