
Overview

Sometimes you need tools that schedule actions for the future without blocking the agent. The SDK supports this through deferred tool execution: the tool returns an observation immediately, but spawns a background task that can inject messages into the conversation later. This pattern is useful for:
  • Reminders - Schedule messages to be sent after a delay
  • Human-in-the-loop - Allow external input to be injected into conversations
  • Async operations - Start long-running tasks without blocking the agent

Example: Remind Tool

This example is available on GitHub: examples/01_standalone_sdk/33_remind_tool.py
The remind tool demonstrates how to create a deferred action tool.

Running the Example

export LLM_API_KEY="your-api-key"
cd agent-sdk
uv run python examples/01_standalone_sdk/33_remind_tool.py

Key Implementation Details

1. The Executor Receives the Conversation

The key to deferred tools is that the executor receives the conversation parameter in its __call__ method:
class RemindExecutor(ToolExecutor[RemindAction, RemindObservation]):
    def __call__(
        self, action: RemindAction, conversation: "LocalConversation | None" = None
    ) -> RemindObservation:
        # conversation is available here for deferred actions;
        # it defaults to None, so check it before use
        ...

2. Background Thread for Deferred Action

The executor spawns a background thread that sleeps for the specified delay, then uses conversation.send_message() to inject the reminder:
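import threading
import time

def send_reminder():
    # Runs in the background thread: wait, then inject the reminder
    time.sleep(action.delay_seconds)
    reminder_text = f"[REMINDER]: {action.message}"
    conversation.send_message(reminder_text)

# daemon=True so a pending reminder does not keep the process alive
thread = threading.Thread(target=send_reminder, daemon=True)
thread.start()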

3. Immediate Return

The tool returns immediately with an observation confirming the reminder was scheduled, allowing the agent to continue working:
return RemindObservation(
    scheduled=True,
    message=action.message,
    delay_seconds=action.delay_seconds,
)
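
The three pieces above can be exercised together without the SDK. The following is a minimal sketch, not the SDK's implementation: FakeConversation, Reminder, and schedule_reminder are hypothetical stand-ins for the conversation, RemindAction, and the executor's __call__, and a plain dict stands in for RemindObservation. It also guards against conversation being None, which the signature above allows.

```python
import threading
import time
from dataclasses import dataclass
from typing import Optional


class FakeConversation:
    """Hypothetical stand-in for the SDK conversation; it only records messages."""

    def __init__(self) -> None:
        self.messages = []
        self._lock = threading.Lock()

    def send_message(self, text: str) -> None:
        # The lock keeps appends safe when several background threads report back.
        with self._lock:
            self.messages.append(text)


@dataclass
class Reminder:
    """Hypothetical stand-in for RemindAction."""

    message: str
    delay_seconds: float


def schedule_reminder(action: Reminder, conversation: Optional[FakeConversation]) -> dict:
    """Return at once; deliver the reminder later from a background thread."""
    if conversation is None:
        # Nothing to inject into, so report that no reminder was scheduled.
        return {"scheduled": False, "message": action.message}

    def send_reminder() -> None:
        time.sleep(action.delay_seconds)
        conversation.send_message(f"[REMINDER]: {action.message}")

    thread = threading.Thread(target=send_reminder, daemon=True)
    thread.start()
    # The thread handle is returned only so callers (and tests) can join it.
    return {"scheduled": True, "message": action.message, "thread": thread}
```

Calling schedule_reminder(Reminder("stand up", 0.2), FakeConversation()) returns before the delay has elapsed; the message appears in the conversation only once the background thread wakes up.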

Use Cases

Human-in-the-Loop

This pattern enables human-in-the-loop interactions where external input can be injected into a conversation at any time. For example, you could create a tool that:
  1. Registers a callback with an external system
  2. Returns immediately to let the agent continue
  3. Injects messages when the external system responds
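
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not SDK code: FakeApprovalService is a hypothetical external system, FakeConversation is a hypothetical stand-in that only mirrors the send_message() call used earlier, and request_approval plays the role of the tool's executor.

```python
from typing import Callable, Dict, List


class FakeConversation:
    """Hypothetical stand-in for the SDK conversation; it only records messages."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def send_message(self, text: str) -> None:
        self.messages.append(text)


class FakeApprovalService:
    """Hypothetical external system that calls back when a human responds."""

    def __init__(self) -> None:
        self._callbacks: Dict[str, Callable[[str], None]] = {}

    def register(self, request_id: str, callback: Callable[[str], None]) -> None:
        self._callbacks[request_id] = callback

    def respond(self, request_id: str, answer: str) -> None:
        # Simulates the human answering; each callback fires at most once.
        callback = self._callbacks.pop(request_id)
        callback(answer)


def request_approval(
    service: FakeApprovalService,
    conversation: FakeConversation,
    request_id: str,
    question: str,
) -> dict:
    """Register a callback, then return immediately with a pending observation."""

    def on_response(answer: str) -> None:
        # Step 3: inject the external input into the conversation.
        conversation.send_message(f"[HUMAN RESPONSE to {request_id}]: {answer}")

    service.register(request_id, on_response)  # step 1
    return {"pending": True, "request_id": request_id, "question": question}  # step 2
```

In a real integration the callback would usually fire on another thread or from a webhook handler, so the conversation's message-injection call needs to be safe to use from outside the agent loop.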

Long-Running Operations

For operations that take significant time (like API calls to slow services), you can:
  1. Start the operation in a background thread
  2. Return a “processing” observation immediately
  3. Inject the results when they’re ready
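
A minimal sketch of these steps, assuming only that the conversation exposes send_message() as in the remind example. FakeConversation and start_in_background are hypothetical names introduced for illustration. The worker also reports failures, since an exception raised in a background thread would otherwise never reach the agent.

```python
import threading
import time
from typing import Callable, List, Tuple


class FakeConversation:
    """Hypothetical stand-in for the SDK conversation; it only records messages."""

    def __init__(self) -> None:
        self.messages: List[str] = []
        self._lock = threading.Lock()

    def send_message(self, text: str) -> None:
        with self._lock:
            self.messages.append(text)


def start_in_background(
    operation: Callable[[], str],
    conversation: FakeConversation,
    label: str,
) -> Tuple[dict, threading.Thread]:
    """Start a slow operation and return a "processing" observation at once."""

    def worker() -> None:
        try:
            result = operation()
            conversation.send_message(f"[RESULT {label}]: {result}")
        except Exception as exc:
            # Surface the failure in the conversation instead of losing it.
            conversation.send_message(f"[ERROR {label}]: {exc}")

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    # The thread handle is returned only so callers (and tests) can join it.
    return {"status": "processing", "label": label}, thread


def slow_export() -> str:
    """Hypothetical slow call, simulated with a short sleep."""
    time.sleep(0.1)
    return "42 rows"
```

Calling start_in_background(slow_export, conversation, "export") returns the "processing" observation right away; the result message is injected once slow_export finishes.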

Next Steps