Agents

Introduction

Agents are PydanticAI's primary interface for interacting with LLMs.

In some use cases a single Agent will control an entire application or component, but multiple agents can also interact to embody more complex workflows.

The Agent class is well documented, but in essence you can think of an agent as a container for:

A system prompt — a set of instructions for the LLM written by the developer
One or more retrievers — functions that the LLM may call to get information while generating a response
An optional structured result type — the structured datatype the LLM must return at the end of a run
A dependency type constraint — system prompt functions, retrievers and result validators may all use dependencies when they're run
Agents may optionally also have a default model associated with them; the model to use can also be specified when running the agent.

In typing terms, agents are generic in their dependency and result types, e.g. an agent which requires Foobar dependencies and returns results of type list[str] would have type Agent[Foobar, list[str]].
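To make the typing point concrete, here is a minimal sketch of a class that is generic in two type parameters. It does not use PydanticAI at all: `MiniAgent`, `Foobar` and the canned result are hypothetical names made up for illustration, and the real Agent does far more than this.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

DepsT = TypeVar('DepsT')
ResultT = TypeVar('ResultT')


@dataclass
class Foobar:
    """Hypothetical dependency container."""

    api_key: str


class MiniAgent(Generic[DepsT, ResultT]):
    """Toy stand-in for an agent, generic in its dependency and result types."""

    def __init__(self, deps_type: type[DepsT], canned_result: ResultT) -> None:
        self.deps_type = deps_type
        self._canned_result = canned_result

    def run_sync(self, prompt: str, deps: DepsT) -> ResultT:
        # A real agent would call a model here; this toy returns a fixed value.
        if not isinstance(deps, self.deps_type):
            raise TypeError(f'expected {self.deps_type.__name__} dependencies')
        return self._canned_result


# The annotation plays the same role as Agent[Foobar, list[str]] in the text.
agent: MiniAgent[Foobar, list[str]] = MiniAgent(Foobar, ['a', 'b'])
words = agent.run_sync('List two letters', deps=Foobar(api_key='secret'))
print(words)
#> ['a', 'b']
```

With this annotation a static type checker would infer `words` as `list[str]` and would flag a call that passes, say, an `int` as `deps`.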

Here's a toy example of an agent that simulates a roulette wheel:

roulette_wheel.py

from pydantic_ai import Agent, CallContext

roulette_agent = Agent(  # (1)!
    'openai:gpt-4o',
    deps_type=int,
    result_type=bool,
    system_prompt=(
        'Use the `roulette_wheel` to see if the '
        'customer has won based on the number they provide.'
    ),
)


@roulette_agent.retriever_context
async def roulette_wheel(ctx: CallContext[int], square: int) -> str:  # (2)!
    """check if the square is a winner"""
    return 'winner' if square == ctx.deps else 'loser'


# Run the agent
success_number = 18  # (3)!
result = roulette_agent.run_sync('Put my money on square eighteen', deps=success_number)
print(result.data)  # (4)!
#> True

result = roulette_agent.run_sync('I bet five is the winner', deps=success_number)
print(result.data)
#> False

1. Create an agent which expects an integer dependency and returns a boolean result. This agent will have type Agent[int, bool].
2. Define a retriever that checks if the square is a winner. Here CallContext is parameterized with the dependency type int; if you got the dependency type wrong you'd get a typing error.
3. In reality, you might want to use a random number here, e.g. random.randint(0, 36).
4. result.data will be a boolean indicating if the square is a winner. Pydantic performs the result validation, and the value is typed as a bool since its type is derived from the result_type generic parameter of the agent.

Agents are Singletons, like FastAPI

Agents are designed to be created once and reused, much like a small FastAPI app or an APIRouter.

Running Agents

There are three ways to run an agent:

agent.run(): a coroutine which returns a RunResult containing a completed response
agent.run_sync(): a plain, synchronous function which returns a RunResult containing a completed response (internally, this just calls asyncio.run(self.run()))
agent.run_stream(): an async context manager which yields a StreamedRunResult, whose methods stream the response as an async iterable

Here's a simple example demonstrating all three:

run_agent.py

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

result_sync = agent.run_sync('What is the capital of Italy?')
print(result_sync.data)
#> Rome


async def main():
    result = await agent.run('What is the capital of France?')
    print(result.data)
    #> Paris

    async with agent.run_stream('What is the capital of the UK?') as response:
        print(await response.get_data())
        #> London

(This example is complete, it can be run "as is")
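The relationship between the three run methods can be sketched without calling any model. In the sketch that follows, `ToyRunner` is a hypothetical stand-in, not part of PydanticAI; it only mirrors the shape described above: a coroutine, a synchronous wrapper built on asyncio.run, and an async context manager that yields something you can iterate over asynchronously.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator


class ToyRunner:
    """Hypothetical stand-in mirroring the shape of the three run methods."""

    async def run(self, prompt: str) -> str:
        await asyncio.sleep(0)  # stands in for the network call to a model
        return f'answer to {prompt!r}'

    def run_sync(self, prompt: str) -> str:
        # Same work as `run`, driven to completion by a fresh event loop.
        return asyncio.run(self.run(prompt))

    @asynccontextmanager
    async def run_stream(self, prompt: str) -> AsyncIterator[AsyncIterator[str]]:
        async def chunks() -> AsyncIterator[str]:
            for word in f'answer to {prompt!r}'.split():
                await asyncio.sleep(0)
                yield word

        yield chunks()


async def main() -> list[str]:
    async with ToyRunner().run_stream('q') as stream:
        return [chunk async for chunk in stream]


sync_answer = ToyRunner().run_sync('q')
streamed = asyncio.run(main())
print(sync_answer)
#> answer to 'q'
print(streamed)
#> ['answer', 'to', "'q'"]
```

One consequence of wrapping asyncio.run: a synchronous wrapper like this cannot be called from code that is already running inside an event loop, since asyncio.run raises a RuntimeError in that situation.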

You can also pass messages from previous runs to continue a conversation or provide context, as described in Messages and Chat History.

Runs vs. Conversations

An agent run might represent an entire conversation — there's no limit to how many messages can be exchanged in a single run. However, a conversation might also be composed of multiple runs, especially if you need to maintain state between separate interactions or API calls.

Here's an example of a conversation composed of multiple runs:

conversation_example.py

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

# First run
result1 = agent.run_sync('Who was Albert Einstein?')
print(result1.data)
#> Albert Einstein was a German-born theoretical physicist.

# Second run, passing previous messages
result2 = agent.run_sync(
    'What was his most famous equation?', message_history=result1.new_messages()  # (1)!
)
print(result2.data)
#> Albert Einstein's most famous equation is (E = mc^2).

1. Continue the conversation; without message_history the model would not know who "his" referred to.
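How a run combines system prompts, prior history and the new user prompt can be sketched in a few lines. This follows the logic of the `_prepare_messages` method in the source listing on this page, but uses plain dicts as an assumed stand-in for the library's message objects; `prepare_messages` is a hypothetical free function, not a library API.

```python
def prepare_messages(system_prompts, user_prompt, message_history=None):
    """Return (index of the first new message, full message list)."""
    if message_history and any(m['role'] == 'system' for m in message_history):
        # History already carries system prompts: copy it, don't regenerate them.
        messages = list(message_history)
    else:
        messages = [{'role': 'system', 'content': p} for p in system_prompts]
        if message_history:
            messages += message_history

    new_message_index = len(messages)
    messages.append({'role': 'user', 'content': user_prompt})
    return new_message_index, messages


# First run: no history, so the system prompt is generated.
first_index, first = prepare_messages(['Be brief.'], 'Who was Albert Einstein?')

# Second run: pass the whole first exchange back in as history.
history = first + [{'role': 'model-text-response', 'content': 'A physicist.'}]
second_index, second = prepare_messages(
    ['Be brief.'], 'What was his most famous equation?', history
)
print(first_index, second_index)
#> 1 3
```

The returned index marks where the messages of the current run begin, which is what lets a result distinguish new messages from the history it was given.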

System Prompts

System prompts might seem simple at first glance since they're just strings (or sequences of strings that are concatenated), but crafting the right system prompt is key to getting the model to behave as you want.

Generally, system prompts fall into two categories:

Static system prompts: These are known when writing the code and can be defined via the system_prompt parameter of the Agent constructor.
Dynamic system prompts: These aren't known until runtime and should be defined via functions decorated with @agent.system_prompt.

You can add both to a single agent; they're concatenated at runtime in the order they're defined.

Here's an example using both types of system prompts:

system_prompts.py

from datetime import date

from pydantic_ai import Agent, CallContext

agent = Agent(
    'openai:gpt-4o',
    deps_type=str,  # (1)!
    system_prompt="Use the customer's name while replying to them.",  # (2)!
)


@agent.system_prompt  # (3)!
def add_the_users_name(ctx: CallContext[str]) -> str:
    return f"The user's name is {ctx.deps}."


@agent.system_prompt
def add_the_date() -> str:  # (4)!
    return f'The date is {date.today()}.'


result = agent.run_sync('What is the date?', deps='Frank')
print(result.data)
#> Hello Frank, the date today is 2032-01-02.

1. The agent expects a string dependency.
2. Static system prompt defined at agent creation time.
3. Dynamic system prompt defined via a decorator.
4. Another dynamic system prompt; system prompt functions don't have to take the CallContext parameter.
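As a rough illustration of how static and dynamic prompts might be combined, here is a sketch of a small prompt registry. `PromptRegistry` is hypothetical and makes several assumptions that the text above does not confirm: static prompts come first, dynamic functions follow in registration order, the dependency is passed directly instead of a CallContext, and inspect.signature is used to decide whether a function wants it. The date is fixed so the output is reproducible.

```python
import inspect
from datetime import date
from typing import Any, Callable


class PromptRegistry:
    """Hypothetical registry: static prompts, then dynamic ones in order."""

    def __init__(self, *static_prompts: str) -> None:
        self._static = list(static_prompts)
        self._functions: list[Callable[..., str]] = []

    def system_prompt(self, func: Callable[..., str]) -> Callable[..., str]:
        # Used as a decorator; returns the function unchanged.
        self._functions.append(func)
        return func

    def build(self, deps: Any) -> list[str]:
        prompts = list(self._static)
        for func in self._functions:
            # Assumption: a function with no parameters does not want the deps.
            takes_deps = len(inspect.signature(func).parameters) > 0
            prompts.append(func(deps) if takes_deps else func())
        return prompts


registry = PromptRegistry("Use the customer's name while replying to them.")


@registry.system_prompt
def add_the_users_name(name: str) -> str:
    return f"The user's name is {name}."


@registry.system_prompt
def add_the_date() -> str:
    return f'The date is {date(2032, 1, 2)}.'


prompts = registry.build('Frank')
print(prompts[-1])
#> The date is 2032-01-02.
```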

Retrievers

There are two retriever decorators, retriever_plain and retriever_context; which one to use depends on whether the retriever needs access to the context. (TODO: show an example using both.)
Retriever parameters are extracted from the function signature and used to build the schema for the tool; they are then validated with Pydantic.
If a retriever has a single "model-like" parameter (e.g. a Pydantic model, dataclass or typed dict), the schema for the tool will be just that type.
Docstrings are parsed to get the tool description. Thanks to griffe, the description of each parameter is extracted from Google, NumPy or Sphinx style docstrings.
You can raise ModelRetry from within a retriever to tell the model it should retry.
The return type of a retriever can be either str or a JSON object typed as dict[str, Any]. Some models (e.g. Gemini) support structured return values, while others (e.g. OpenAI) expect text but seem to be just as good at extracting meaning from the data.
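The idea of deriving a tool schema from a function signature and docstring can be illustrated with the standard library alone. This is only a sketch: the library itself relies on Pydantic and griffe, whereas `tool_schema` and `JSON_TYPES` here are hypothetical helpers that handle a handful of primitive types and use the whole docstring as the description.

```python
import inspect
from typing import Any, get_type_hints

# Minimal mapping from Python types to JSON Schema type names.
JSON_TYPES = {int: 'integer', float: 'number', str: 'string', bool: 'boolean'}


def tool_schema(func) -> dict[str, Any]:
    """Build a JSON-Schema-like description of `func` for a model to call."""
    hints = get_type_hints(func)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {'type': JSON_TYPES.get(hints.get(name), 'object')}
        if param.default is inspect.Parameter.empty:
            # Parameters without a default must be supplied by the model.
            required.append(name)
    return {
        'name': func.__name__,
        'description': inspect.getdoc(func) or '',
        'parameters': {
            'type': 'object',
            'properties': properties,
            'required': required,
        },
    }


def roulette_wheel(square: int, table: str = 'main') -> str:
    """Check if the square is a winner."""
    return 'winner' if square == 18 else 'loser'


schema = tool_schema(roulette_wheel)
print(schema['parameters']['required'])
#> ['square']
```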

Reflection and self-correction

Validation errors from both retriever parameter validation and structured result validation can be passed back to the model with a request to retry.
As described above, you can also raise ModelRetry from within a retriever or result validator to tell the model it should retry.
The default retry count is 1, but it can be changed for the whole agent, or for a specific retriever or result validator.
You can access the current retry count from within a retriever or result validator via ctx.retry.
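The retry mechanism can be pictured as a loop that feeds an error message back to the model until the tool accepts the arguments or the retry budget runs out. The sketch here is a simplification under stated assumptions: the model's successive attempts are supplied as a plain list, `ModelRetry` is a local stand-in for the library's exception, and `RetriesExceeded` and `call_with_retries` are hypothetical names.

```python
class ModelRetry(Exception):
    """Local stand-in for the library's ModelRetry exception."""


class RetriesExceeded(Exception):
    """Hypothetical error raised by this sketch when the budget runs out."""


def call_with_retries(tool, attempts, max_retries=1):
    """Try each attempt in turn; return (result, feedback sent to the model)."""
    feedback = []
    for retry, kwargs in enumerate(attempts):
        if retry > max_retries:
            break
        try:
            # The current retry count is passed in, much like ctx.retry.
            return tool(retry, **kwargs), feedback
        except ModelRetry as exc:
            feedback.append(f'{exc} Please try again.')
    raise RetriesExceeded(f'no accepted attempt within {max_retries} retries')


def lookup_square(retry: int, square: int) -> str:
    if not 0 <= square <= 36:
        raise ModelRetry(f'Square {square} is not on the wheel (retry {retry}).')
    return 'winner' if square == 18 else 'loser'


# The "model" first sends an invalid square, then corrects itself.
result, feedback = call_with_retries(lookup_square, [{'square': 40}, {'square': 18}])
print(result)
#> winner
print(feedback)
#> ['Square 40 is not on the wheel (retry 0). Please try again.']
```

With the default budget of one retry, a second invalid attempt ends the loop with an error instead of a third call.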

Model errors

If models behave unexpectedly, e.g. the retry limit is exceeded, agent runs will raise UnexpectedModelBehaviour exceptions.
If you use PydanticAI incorrectly, we try to raise a UserError with a helpful message.
TODO: show an example of an UnexpectedModelBehaviour being raised.
If an UnexpectedModelBehaviour is raised, you may want to access the .last_run_messages attribute of the agent to see the messages exchanged that led to the error. TODO: show an example of accessing .last_run_messages in an except block to get more details.
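The error-handling pattern described here might look roughly like the following. To keep it runnable without a model, `StubAgent` is a hypothetical stand-in whose run always fails, `UnexpectedModelBehaviour` is redefined locally, and the error text is invented for illustration. The one detail taken from the source listing on this page is that last_run_messages is assigned before the model is called, so it is still available after a failure.

```python
class UnexpectedModelBehaviour(Exception):
    """Local stand-in for the library's exception of the same name."""


class StubAgent:
    """Hypothetical agent whose run always fails, to show the pattern."""

    def __init__(self) -> None:
        self.last_run_messages = None

    def run_sync(self, user_prompt: str) -> str:
        # Messages are recorded before the model is called,
        # so they survive a failed run.
        self.last_run_messages = [{'role': 'user', 'content': user_prompt}]
        # Illustrative message only, not the library's wording.
        raise UnexpectedModelBehaviour('Retry limit exceeded')


agent = StubAgent()
try:
    agent.run_sync('Put my money on square eighteen')
except UnexpectedModelBehaviour as exc:
    # Keep the cause together with the conversation that led to it.
    error_report = {'cause': str(exc), 'messages': agent.last_run_messages}
    print(error_report['cause'])
    #> Retry limit exceeded
```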

API Reference

Bases: Generic[AgentDeps, ResultData]

Class for defining "agents" - a way to have a specific type of "conversation" with an LLM.

Agents are generic in the dependency type they take, AgentDeps, and the result data type they return, ResultData.

By default, if neither generic parameter is customised, agents have type Agent[None, str].

Minimal usage example:

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')
result = agent.run_sync('What is the capital of France?')
print(result.data)
#> Paris

Source code in pydantic_ai/agent.py

@final
@dataclass(init=False)
class Agent(Generic[AgentDeps, ResultData]):
    """Class for defining "agents" - a way to have a specific type of "conversation" with an LLM.

    Agents are generic in the dependency type they take [`AgentDeps`][pydantic_ai.dependencies.AgentDeps]
    and the result data type they return, [`ResultData`][pydantic_ai.result.ResultData].

    By default, if neither generic parameter is customised, agents have type `Agent[None, str]`.

    Minimal usage example:

    ```py
    from pydantic_ai import Agent

    agent = Agent('openai:gpt-4o')
    result = agent.run_sync('What is the capital of France?')
    print(result.data)
    #> Paris
    ```
    """

    # dataclass fields mostly for my sanity — knowing what attributes are available
    model: models.Model | models.KnownModelName | None
    """The default model configured for this agent."""
    _result_schema: _result.ResultSchema[ResultData] | None
    _result_validators: list[_result.ResultValidator[AgentDeps, ResultData]]
    _allow_text_result: bool
    _system_prompts: tuple[str, ...]
    _retrievers: dict[str, _r.Retriever[AgentDeps, Any]]
    _default_retries: int
    _system_prompt_functions: list[_system_prompt.SystemPromptRunner[AgentDeps]]
    _deps_type: type[AgentDeps]
    _max_result_retries: int
    _current_result_retry: int
    _override_deps: _utils.Option[AgentDeps] = None
    _override_model: _utils.Option[models.Model] = None
    last_run_messages: list[_messages.Message] | None = None
    """The messages from the last run, useful when a run raised an exception.

    Note: these are not used by the agent, e.g. in future runs, they are just stored for developers' convenience.
    """

    def __init__(
        self,
        model: models.Model | models.KnownModelName | None = None,
        result_type: type[ResultData] = str,
        *,
        system_prompt: str | Sequence[str] = (),
        deps_type: type[AgentDeps] = NoneType,
        retries: int = 1,
        result_tool_name: str = 'final_result',
        result_tool_description: str | None = None,
        result_retries: int | None = None,
        defer_model_check: bool = False,
    ):
        """Create an agent.

        Args:
            model: The default model to use for this agent, if not provided,
                you must provide the model when calling the agent.
            result_type: The type of the result data, used to validate the result data, defaults to `str`.
            system_prompt: Static system prompts to use for this agent, you can also register system
                prompts via a function with [`system_prompt`][pydantic_ai.Agent.system_prompt].
            deps_type: The type used for dependency injection, this parameter exists solely to allow you to fully
                parameterize the agent, and therefore get the best out of static type checking.
                If you're not using deps, but want type checking to pass, you can set `deps=None` to satisfy Pyright
                or add a type hint `: Agent[None, <return type>]`.
            retries: The default number of retries to allow before raising an error.
            result_tool_name: The name of the tool to use for the final result.
            result_tool_description: The description of the final result tool.
            result_retries: The maximum number of retries to allow for result validation, defaults to `retries`.
            defer_model_check: by default, if you provide a [named][pydantic_ai.models.KnownModelName] model,
                it's evaluated to create a [`Model`][pydantic_ai.models.Model] instance immediately,
                which checks for the necessary environment variables. Set this to `True`
                to defer the evaluation until the first run. Useful if you want to
                [override the model][pydantic_ai.Agent.override_model] for testing.
        """
        if model is None or defer_model_check:
            self.model = model
        else:
            self.model = models.infer_model(model)

        self._result_schema = _result.ResultSchema[result_type].build(
            result_type, result_tool_name, result_tool_description
        )
        # if the result tool is None, or its schema allows `str`, we allow plain text results
        self._allow_text_result = self._result_schema is None or self._result_schema.allow_text_result

        self._system_prompts = (system_prompt,) if isinstance(system_prompt, str) else tuple(system_prompt)
        self._retrievers: dict[str, _r.Retriever[AgentDeps, Any]] = {}
        self._deps_type = deps_type
        self._default_retries = retries
        self._system_prompt_functions = []
        self._max_result_retries = result_retries if result_retries is not None else retries
        self._current_result_retry = 0
        self._result_validators = []

    async def run(
        self,
        user_prompt: str,
        *,
        message_history: list[_messages.Message] | None = None,
        model: models.Model | models.KnownModelName | None = None,
        deps: AgentDeps = None,
    ) -> result.RunResult[ResultData]:
        """Run the agent with a user prompt in async mode.

        Args:
            user_prompt: User input to start/continue the conversation.
            message_history: History of the conversation so far.
            model: Optional model to use for this run, required if `model` was not set when creating the agent.
            deps: Optional dependencies to use for this run.

        Returns:
            The result of the run.
        """
        model_used, custom_model, agent_model = await self._get_agent_model(model)

        deps = self._get_deps(deps)

        new_message_index, messages = await self._prepare_messages(deps, user_prompt, message_history)
        self.last_run_messages = messages

        for retriever in self._retrievers.values():
            retriever.reset()

        cost = result.Cost()

        with _logfire.span(
            'agent run {prompt=}',
            prompt=user_prompt,
            agent=self,
            custom_model=custom_model,
            model_name=model_used.name(),
        ) as run_span:
            run_step = 0
            while True:
                run_step += 1
                with _logfire.span('model request {run_step=}', run_step=run_step) as model_req_span:
                    model_response, request_cost = await agent_model.request(messages)
                    model_req_span.set_attribute('response', model_response)
                    model_req_span.set_attribute('cost', request_cost)
                    model_req_span.message = f'model request -> {model_response.role}'

                messages.append(model_response)
                cost += request_cost

                with _logfire.span('handle model response') as handle_span:
                    either = await self._handle_model_response(model_response, deps)

                    if isinstance(either, _MarkFinalResult):
                        # we have a final result, end the conversation
                        result_data = either.data
                        run_span.set_attribute('all_messages', messages)
                        run_span.set_attribute('cost', cost)
                        handle_span.set_attribute('result', result_data)
                        handle_span.message = 'handle model response -> final result'
                        return result.RunResult(messages, new_message_index, result_data, cost)
                    else:
                        # continue the conversation
                        tool_responses = either
                        handle_span.set_attribute('tool_responses', tool_responses)
                        response_msgs = ' '.join(m.role for m in tool_responses)
                        handle_span.message = f'handle model response -> {response_msgs}'
                        messages.extend(tool_responses)

    def run_sync(
        self,
        user_prompt: str,
        *,
        message_history: list[_messages.Message] | None = None,
        model: models.Model | models.KnownModelName | None = None,
        deps: AgentDeps = None,
    ) -> result.RunResult[ResultData]:
        """Run the agent with a user prompt synchronously.

        This is a convenience method that wraps `self.run` with `asyncio.run()`.

        Args:
            user_prompt: User input to start/continue the conversation.
            message_history: History of the conversation so far.
            model: Optional model to use for this run, required if `model` was not set when creating the agent.
            deps: Optional dependencies to use for this run.

        Returns:
            The result of the run.
        """
        return asyncio.run(self.run(user_prompt, message_history=message_history, model=model, deps=deps))

    @asynccontextmanager
    async def run_stream(
        self,
        user_prompt: str,
        *,
        message_history: list[_messages.Message] | None = None,
        model: models.Model | models.KnownModelName | None = None,
        deps: AgentDeps = None,
    ) -> AsyncIterator[result.StreamedRunResult[AgentDeps, ResultData]]:
        """Run the agent with a user prompt in async mode, returning a streamed response.

        Args:
            user_prompt: User input to start/continue the conversation.
            message_history: History of the conversation so far.
            model: Optional model to use for this run, required if `model` was not set when creating the agent.
            deps: Optional dependencies to use for this run.

        Returns:
            The result of the run.
        """
        model_used, custom_model, agent_model = await self._get_agent_model(model)

        deps = self._get_deps(deps)

        new_message_index, messages = await self._prepare_messages(deps, user_prompt, message_history)
        self.last_run_messages = messages

        for retriever in self._retrievers.values():
            retriever.reset()

        cost = result.Cost()

        with _logfire.span(
            'agent run stream {prompt=}',
            prompt=user_prompt,
            agent=self,
            custom_model=custom_model,
            model_name=model_used.name(),
        ) as run_span:
            run_step = 0
            while True:
                run_step += 1
                with _logfire.span('model request {run_step=}', run_step=run_step) as model_req_span:
                    async with agent_model.request_stream(messages) as model_response:
                        model_req_span.set_attribute('response_type', model_response.__class__.__name__)
                        # We want to end the "model request" span here, but we can't exit the context manager
                        # in the traditional way
                        model_req_span.__exit__(None, None, None)

                        with _logfire.span('handle model response') as handle_span:
                            either = await self._handle_streamed_model_response(model_response, deps)

                            if isinstance(either, _MarkFinalResult):
                                result_stream = either.data
                                run_span.set_attribute('all_messages', messages)
                                handle_span.set_attribute('result_type', result_stream.__class__.__name__)
                                handle_span.message = 'handle model response -> final result'
                                yield result.StreamedRunResult(
                                    messages,
                                    new_message_index,
                                    cost,
                                    result_stream,
                                    self._result_schema,
                                    deps,
                                    self._result_validators,
                                )
                                return
                            else:
                                tool_responses = either
                                handle_span.set_attribute('tool_responses', tool_responses)
                                response_msgs = ' '.join(m.role for m in tool_responses)
                                handle_span.message = f'handle model response -> {response_msgs}'
                                messages.extend(tool_responses)
                                # the model_response should have been fully streamed by now, we can add its cost
                                cost += model_response.cost()

    @contextmanager
    def override_deps(self, overriding_deps: AgentDeps) -> Iterator[None]:
        """Context manager to temporarily override agent dependencies, this is particularly useful when testing.

        Args:
            overriding_deps: The dependencies to use instead of the dependencies passed to the agent run.
        """
        override_deps_before = self._override_deps
        self._override_deps = _utils.Some(overriding_deps)
        try:
            yield
        finally:
            self._override_deps = override_deps_before

    @contextmanager
    def override_model(self, overriding_model: models.Model | models.KnownModelName) -> Iterator[None]:
        """Context manager to temporarily override the model used by the agent.

        Args:
            overriding_model: The model to use instead of the model passed to the agent run.
        """
        override_model_before = self._override_model
        self._override_model = _utils.Some(models.infer_model(overriding_model))
        try:
            yield
        finally:
            self._override_model = override_model_before

    def system_prompt(
        self, func: _system_prompt.SystemPromptFunc[AgentDeps]
    ) -> _system_prompt.SystemPromptFunc[AgentDeps]:
        """Decorator to register a system prompt function that takes `CallContext` as its only argument."""
        self._system_prompt_functions.append(_system_prompt.SystemPromptRunner(func))
        return func

    def result_validator(
        self, func: _result.ResultValidatorFunc[AgentDeps, ResultData]
    ) -> _result.ResultValidatorFunc[AgentDeps, ResultData]:
        """Decorator to register a result validator function."""
        self._result_validators.append(_result.ResultValidator(func))
        return func

    @overload
    def retriever_context(
        self, func: RetrieverContextFunc[AgentDeps, RetrieverParams], /
    ) -> _r.Retriever[AgentDeps, RetrieverParams]: ...

    @overload
    def retriever_context(
        self, /, *, retries: int | None = None
    ) -> Callable[[RetrieverContextFunc[AgentDeps, RetrieverParams]], _r.Retriever[AgentDeps, RetrieverParams]]: ...

    def retriever_context(
        self,
        func: RetrieverContextFunc[AgentDeps, RetrieverParams] | None = None,
        /,
        *,
        retries: int | None = None,
    ) -> Any:
        """Decorator to register a retriever function."""
        if func is None:

            def retriever_decorator(
                func_: RetrieverContextFunc[AgentDeps, RetrieverParams],
            ) -> _r.Retriever[AgentDeps, RetrieverParams]:
                # noinspection PyTypeChecker
                return self._register_retriever(_utils.Either(left=func_), retries)

            return retriever_decorator
        else:
            # noinspection PyTypeChecker
            return self._register_retriever(_utils.Either(left=func), retries)

    @overload
    def retriever_plain(
        self, func: RetrieverPlainFunc[RetrieverParams], /
    ) -> _r.Retriever[AgentDeps, RetrieverParams]: ...

    @overload
    def retriever_plain(
        self, /, *, retries: int | None = None
    ) -> Callable[[RetrieverPlainFunc[RetrieverParams]], _r.Retriever[AgentDeps, RetrieverParams]]: ...

    def retriever_plain(
        self, func: RetrieverPlainFunc[RetrieverParams] | None = None, /, *, retries: int | None = None
    ) -> Any:
        """Decorator to register a retriever function."""
        if func is None:

            def retriever_decorator(
                func_: RetrieverPlainFunc[RetrieverParams],
            ) -> _r.Retriever[AgentDeps, RetrieverParams]:
                # noinspection PyTypeChecker
                return self._register_retriever(_utils.Either(right=func_), retries)

            return retriever_decorator
        else:
            return self._register_retriever(_utils.Either(right=func), retries)

    def _register_retriever(
        self, func: _r.RetrieverEitherFunc[AgentDeps, RetrieverParams], retries: int | None
    ) -> _r.Retriever[AgentDeps, RetrieverParams]:
        """Private utility to register a retriever function."""
        retries_ = retries if retries is not None else self._default_retries
        retriever = _r.Retriever[AgentDeps, RetrieverParams](func, retries_)

        if self._result_schema and retriever.name in self._result_schema.tools:
            raise ValueError(f'Retriever name conflicts with result schema name: {retriever.name!r}')

        if retriever.name in self._retrievers:
            raise ValueError(f'Retriever name conflicts with existing retriever: {retriever.name!r}')

        self._retrievers[retriever.name] = retriever
        return retriever

    async def _get_agent_model(
        self, model: models.Model | models.KnownModelName | None
    ) -> tuple[models.Model, models.Model | None, models.AgentModel]:
        """Create a model configured for this agent.

        Args:
            model: model to use for this run, required if `model` was not set when creating the agent.

        Returns:
            a tuple of `(model used, custom_model if any, agent_model)`
        """
        model_: models.Model
        if some_model := self._override_model:
            # we don't want `override_model()` to cover up errors from the model not being defined, hence this check
            if model is None and self.model is None:
                raise exceptions.UserError(
                    '`model` must be set either when creating the agent or when calling it. '
                    '(Even when `override_model()` is customizing the model that will actually be called)'
                )
            model_ = some_model.value
            custom_model = None
        elif model is not None:
            custom_model = model_ = models.infer_model(model)
        elif self.model is not None:
            # noinspection PyTypeChecker
            model_ = self.model = models.infer_model(self.model)
            custom_model = None
        else:
            raise exceptions.UserError('`model` must be set either when creating the agent or when calling it.')

        result_tools = list(self._result_schema.tools.values()) if self._result_schema else None
        return model_, custom_model, model_.agent_model(self._retrievers, self._allow_text_result, result_tools)

    async def _prepare_messages(
        self, deps: AgentDeps, user_prompt: str, message_history: list[_messages.Message] | None
    ) -> tuple[int, list[_messages.Message]]:
        # if message history includes system prompts, we don't want to regenerate them
        if message_history and any(m.role == 'system' for m in message_history):
            # shallow copy messages
            messages = message_history.copy()
        else:
            messages = await self._init_messages(deps)
            if message_history:
                messages += message_history

        new_message_index = len(messages)
        messages.append(_messages.UserPrompt(user_prompt))
        return new_message_index, messages

    async def _handle_model_response(
        self, model_response: _messages.ModelAnyResponse, deps: AgentDeps
    ) -> _MarkFinalResult[ResultData] | list[_messages.Message]:
        """Process a non-streamed response from the model.

        Returns:
            Return `Either` — left: final result data, right: list of messages to send back to the model.
        """
        if model_response.role == 'model-text-response':
            # plain string response
            if self._allow_text_result:
                result_data_input = cast(ResultData, model_response.content)
                try:
                    result_data = await self._validate_result(result_data_input, deps, None)
                except _result.ToolRetryError as e:
                    self._incr_result_retry()
                    return [e.tool_retry]
                else:
                    return _MarkFinalResult(result_data)
            else:
                self._incr_result_retry()
                response = _messages.RetryPrompt(
                    content='Plain text responses are not permitted, please call one of the functions instead.',
                )
                return [response]
        elif model_response.role == 'model-structured-response':
            if self._result_schema is not None:
                # if there's a result schema, and any of the calls match one of its tools, return the result
                # NOTE: this means we ignore any other tools called here
                if match := self._result_schema.find_tool(model_response):
                    call, result_tool = match
                    try:
                        result_data = result_tool.validate(call)
                        result_data = await self._validate_result(result_data, deps, call)
                    except _result.ToolRetryError as e:
                        self._incr_result_retry()
                        return [e.tool_retry]
                    else:
                        return _MarkFinalResult(result_data)

            if not model_response.calls:
                raise exceptions.UnexpectedModelBehaviour('Received empty tool call message')

            # otherwise we run all retriever functions in parallel
            messages: list[_messages.Message] = []
            tasks: list[asyncio.Task[_messages.Message]] = []
            for call in model_response.calls:
                if retriever := self._retrievers.get(call.tool_name):
                    tasks.append(asyncio.create_task(retriever.run(deps, call), name=call.tool_name))
                else:
                    messages.append(self._unknown_tool(call.tool_name))

            with _logfire.span('running {tools=}', tools=[t.get_name() for t in tasks]):
                messages += await asyncio.gather(*tasks)
            return messages
        else:
            assert_never(model_response)

    async def _handle_streamed_model_response(
        self, model_response: models.EitherStreamedResponse, deps: AgentDeps
    ) -> _MarkFinalResult[models.EitherStreamedResponse] | list[_messages.Message]:
        """Process a streamed response from the model.

        TODO: change the response type to `models.EitherStreamedResponse | list[_messages.Message]` once we drop 3.9
        (with 3.9 we get `TypeError: Subscripted generics cannot be used with class and instance checks`)

        Returns:
            Return `Either` — left: final result data, right: list of messages to send back to the model.
        """
        if isinstance(model_response, models.StreamTextResponse):
            # plain string response
            if self._allow_text_result:
                return _MarkFinalResult(model_response)
            else:
                self._incr_result_retry()
                response = _messages.RetryPrompt(
                    content='Plain text responses are not permitted, please call one of the functions instead.',
                )
                # stream the response, so cost is correct
                async for _ in model_response:
                    pass

                return [response]
        else:
            assert isinstance(model_response, models.StreamStructuredResponse), f'Unexpected response: {model_response}'
            if self._result_schema is not None:
                # if there's a result schema, iterate over the stream until we find at least one tool
                # NOTE: this means we ignore any other tools called here
                structured_msg = model_response.get()
                while not structured_msg.calls:
                    try:
                        await model_response.__anext__()
                    except StopAsyncIteration:
                        break
                    structured_msg = model_response.get()

                if self._result_schema.find_tool(structured_msg):
                    return _MarkFinalResult(model_response)

            # the model is calling a retriever function, consume the response to get the next message
            async for _ in model_response:
                pass
            structured_msg = model_response.get()
            if not structured_msg.calls:
                raise exceptions.UnexpectedModelBehaviour('Received empty tool call message')
            messages: list[_messages.Message] = [structured_msg]

            # we now run all retriever functions in parallel
            tasks: list[asyncio.Task[_messages.Message]] = []
            for call in structured_msg.calls:
                if retriever := self._retrievers.get(call.tool_name):
                    tasks.append(asyncio.create_task(retriever.run(deps, call), name=call.tool_name))
                else:
                    messages.append(self._unknown_tool(call.tool_name))

            with _logfire.span('running {tools=}', tools=[t.get_name() for t in tasks]):
                messages += await asyncio.gather(*tasks)
            return messages

    async def _validate_result(
        self, result_data: ResultData, deps: AgentDeps, tool_call: _messages.ToolCall | None
    ) -> ResultData:
        for validator in self._result_validators:
            result_data = await validator.validate(result_data, deps, self._current_result_retry, tool_call)
        return result_data

    def _incr_result_retry(self) -> None:
        self._current_result_retry += 1
        if self._current_result_retry > self._max_result_retries:
            raise exceptions.UnexpectedModelBehaviour(
                f'Exceeded maximum retries ({self._max_result_retries}) for result validation'
            )

    async def _init_messages(self, deps: AgentDeps) -> list[_messages.Message]:
        """Build the initial messages for the conversation."""
        messages: list[_messages.Message] = [_messages.SystemPrompt(p) for p in self._system_prompts]
        for sys_prompt_runner in self._system_prompt_functions:
            prompt = await sys_prompt_runner.run(deps)
            messages.append(_messages.SystemPrompt(prompt))
        return messages

    def _unknown_tool(self, tool_name: str) -> _messages.RetryPrompt:
        self._incr_result_retry()
        names = list(self._retrievers.keys())
        if self._result_schema:
            names.extend(self._result_schema.tool_names())
        if names:
            msg = f'Available tools: {", ".join(names)}'
        else:
            msg = 'No tools available.'
        return _messages.RetryPrompt(content=f'Unknown tool name: {tool_name!r}. {msg}')

    def _get_deps(self, deps: AgentDeps) -> AgentDeps:
        """Get deps for a run.

        If we've overridden deps via `_override_deps`, use that, otherwise use the deps passed to the call.

        We could do runtime type checking of deps against `self._deps_type`, but that's a slippery slope.
        """
        if some_deps := self._override_deps:
            return some_deps.value
        else:
            return deps
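
The handlers above start every recognised tool call as a named `asyncio` task, answer unknown tool names straight away, and then gather the task results. The following is a minimal stdlib-only sketch of that pattern, not the library's implementation: `RETRIEVERS`, `get_weather`, `get_time` and `run_calls` are hypothetical names, and plain strings stand in for message objects.

```python
import asyncio


# Hypothetical stand-ins for registered retrievers.
async def get_weather(city: str) -> str:
    await asyncio.sleep(0.01)
    return f'sunny in {city}'


async def get_time(city: str) -> str:
    await asyncio.sleep(0.01)
    return f'12:00 in {city}'


RETRIEVERS = {'get_weather': get_weather, 'get_time': get_time}


async def run_calls(calls: list[tuple[str, str]]) -> list[str]:
    """Run every known tool call concurrently; report unknown names immediately."""
    messages: list[str] = []
    tasks = []
    for tool_name, arg in calls:
        if func := RETRIEVERS.get(tool_name):
            # naming the task lets tracing code list which tools are running
            tasks.append(asyncio.create_task(func(arg), name=tool_name))
        else:
            messages.append(f'Unknown tool name: {tool_name!r}')
    # gather returns results in the order the tasks were passed in
    messages += await asyncio.gather(*tasks)
    return messages


result = asyncio.run(
    run_calls([('get_weather', 'Paris'), ('nope', 'x'), ('get_time', 'Paris')])
)
print(result)
#> ["Unknown tool name: 'nope'", 'sunny in Paris', '12:00 in Paris']
```

Note that, as in the listing above, messages for unknown tools end up ahead of the gathered results regardless of the order of the calls.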

__init__

__init__(
    model: Model | KnownModelName | None = None,
    result_type: type[ResultData] = str,
    *,
    system_prompt: str | Sequence[str] = (),
    deps_type: type[AgentDeps] = NoneType,
    retries: int = 1,
    result_tool_name: str = "final_result",
    result_tool_description: str | None = None,
    result_retries: int | None = None,
    defer_model_check: bool = False
)

Create an agent.

Parameters:

Name	Type	Description	Default
`model`	`Model \| KnownModelName \| None`	The default model to use for this agent; if not provided, you must provide the model when calling the agent.	`None`
`result_type`	`type[ResultData]`	The type of the result data, used to validate the result data, defaults to `str`.	`str`
`system_prompt`	`str \| Sequence[str]`	Static system prompts to use for this agent, you can also register system prompts via a function with `system_prompt`.	`()`
`deps_type`	`type[AgentDeps]`	The type used for dependency injection, this parameter exists solely to allow you to fully parameterize the agent, and therefore get the best out of static type checking. If you're not using deps, but want type checking to pass, you can set `deps=None` to satisfy Pyright or add a type hint `: Agent[None, <return type>]`.	`NoneType`
`retries`	`int`	The default number of retries to allow before raising an error.	`1`
`result_tool_name`	`str`	The name of the tool to use for the final result.	`'final_result'`
`result_tool_description`	`str \| None`	The description of the final result tool.	`None`
`result_retries`	`int \| None`	The maximum number of retries to allow for result validation, defaults to `retries`.	`None`
`defer_model_check`	`bool`	By default, if you provide a named model, it's evaluated to create a `Model` instance immediately, which checks for the necessary environment variables. Set this to `True` to defer the evaluation until the first run. Useful if you want to override the model for testing.	`False`

Source code in pydantic_ai/agent.py

def __init__(
    self,
    model: models.Model | models.KnownModelName | None = None,
    result_type: type[ResultData] = str,
    *,
    system_prompt: str | Sequence[str] = (),
    deps_type: type[AgentDeps] = NoneType,
    retries: int = 1,
    result_tool_name: str = 'final_result',
    result_tool_description: str | None = None,
    result_retries: int | None = None,
    defer_model_check: bool = False,
):
    """Create an agent.

    Args:
        model: The default model to use for this agent; if not provided,
            you must provide the model when calling the agent.
        result_type: The type of the result data, used to validate the result data, defaults to `str`.
        system_prompt: Static system prompts to use for this agent, you can also register system
            prompts via a function with [`system_prompt`][pydantic_ai.Agent.system_prompt].
        deps_type: The type used for dependency injection, this parameter exists solely to allow you to fully
            parameterize the agent, and therefore get the best out of static type checking.
            If you're not using deps, but want type checking to pass, you can set `deps=None` to satisfy Pyright
            or add a type hint `: Agent[None, <return type>]`.
        retries: The default number of retries to allow before raising an error.
        result_tool_name: The name of the tool to use for the final result.
        result_tool_description: The description of the final result tool.
        result_retries: The maximum number of retries to allow for result validation, defaults to `retries`.
        defer_model_check: by default, if you provide a [named][pydantic_ai.models.KnownModelName] model,
            it's evaluated to create a [`Model`][pydantic_ai.models.Model] instance immediately,
            which checks for the necessary environment variables. Set this to `True`
            to defer the evaluation until the first run. Useful if you want to
            [override the model][pydantic_ai.Agent.override_model] for testing.
    """
    if model is None or defer_model_check:
        self.model = model
    else:
        self.model = models.infer_model(model)

    self._result_schema = _result.ResultSchema[result_type].build(
        result_type, result_tool_name, result_tool_description
    )
    # if the result tool is None, or its schema allows `str`, we allow plain text results
    self._allow_text_result = self._result_schema is None or self._result_schema.allow_text_result

    self._system_prompts = (system_prompt,) if isinstance(system_prompt, str) else tuple(system_prompt)
    self._retrievers: dict[str, _r.Retriever[AgentDeps, Any]] = {}
    self._deps_type = deps_type
    self._default_retries = retries
    self._system_prompt_functions = []
    self._max_result_retries = result_retries if result_retries is not None else retries
    self._current_result_retry = 0
    self._result_validators = []
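
Two of the defaults handled in `__init__` are easy to misread: a single string `system_prompt` is wrapped into a one-element tuple, and `result_retries` falls back to `retries` only when it is `None`, so an explicit `0` is respected. The sketch below isolates just those two expressions; `normalize_prompts` and `resolve_result_retries` are hypothetical helper names, not part of the library.

```python
from __future__ import annotations

from collections.abc import Sequence


def normalize_prompts(system_prompt: str | Sequence[str] = ()) -> tuple[str, ...]:
    """Wrap a single string in a tuple, otherwise copy the sequence into a tuple."""
    return (system_prompt,) if isinstance(system_prompt, str) else tuple(system_prompt)


def resolve_result_retries(retries: int = 1, result_retries: int | None = None) -> int:
    """Fall back to `retries` only when `result_retries` was not given."""
    return result_retries if result_retries is not None else retries


print(normalize_prompts('Be concise.'))
#> ('Be concise.',)
print(resolve_result_retries(retries=3))
#> 3
print(resolve_result_retries(retries=3, result_retries=0))
#> 0
```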

run `async`

run(
    user_prompt: str,
    *,
    message_history: list[Message] | None = None,
    model: Model | KnownModelName | None = None,
    deps: AgentDeps = None
) -> RunResult[ResultData]

Run the agent with a user prompt in async mode.

Parameters:

Name	Type	Description	Default
`user_prompt`	`str`	User input to start/continue the conversation.	required
`message_history`	`list[Message] \| None`	History of the conversation so far.	`None`
`model`	`Model \| KnownModelName \| None`	Optional model to use for this run, required if `model` was not set when creating the agent.	`None`
`deps`	`AgentDeps`	Optional dependencies to use for this run.	`None`

Returns:

Type	Description
`RunResult[ResultData]`	The result of the run.

Source code in pydantic_ai/agent.py

async def run(
    self,
    user_prompt: str,
    *,
    message_history: list[_messages.Message] | None = None,
    model: models.Model | models.KnownModelName | None = None,
    deps: AgentDeps = None,
) -> result.RunResult[ResultData]:
    """Run the agent with a user prompt in async mode.

    Args:
        user_prompt: User input to start/continue the conversation.
        message_history: History of the conversation so far.
        model: Optional model to use for this run, required if `model` was not set when creating the agent.
        deps: Optional dependencies to use for this run.

    Returns:
        The result of the run.
    """
    model_used, custom_model, agent_model = await self._get_agent_model(model)

    deps = self._get_deps(deps)

    new_message_index, messages = await self._prepare_messages(deps, user_prompt, message_history)
    self.last_run_messages = messages

    for retriever in self._retrievers.values():
        retriever.reset()

    cost = result.Cost()

    with _logfire.span(
        'agent run {prompt=}',
        prompt=user_prompt,
        agent=self,
        custom_model=custom_model,
        model_name=model_used.name(),
    ) as run_span:
        run_step = 0
        while True:
            run_step += 1
            with _logfire.span('model request {run_step=}', run_step=run_step) as model_req_span:
                model_response, request_cost = await agent_model.request(messages)
                model_req_span.set_attribute('response', model_response)
                model_req_span.set_attribute('cost', request_cost)
                model_req_span.message = f'model request -> {model_response.role}'

            messages.append(model_response)
            cost += request_cost

            with _logfire.span('handle model response') as handle_span:
                either = await self._handle_model_response(model_response, deps)

                if isinstance(either, _MarkFinalResult):
                    # we have a final result, end the conversation
                    result_data = either.data
                    run_span.set_attribute('all_messages', messages)
                    run_span.set_attribute('cost', cost)
                    handle_span.set_attribute('result', result_data)
                    handle_span.message = 'handle model response -> final result'
                    return result.RunResult(messages, new_message_index, result_data, cost)
                else:
                    # continue the conversation
                    tool_responses = either
                    handle_span.set_attribute('tool_responses', tool_responses)
                    response_msgs = ' '.join(m.role for m in tool_responses)
                    handle_span.message = f'handle model response -> {response_msgs}'
                    messages.extend(tool_responses)
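
Stripped of tracing and cost accounting, the loop in `run` alternates between asking the model for a response and either returning a final result or appending tool responses and asking again. Here is a simplified sketch of that control flow under stated assumptions: `FakeModel`, `Final`, `handle` and `run_loop` are hypothetical, and messages are plain strings rather than the library's message types.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass


@dataclass
class Final:
    """Marks a response as the final result (plays the role of `_MarkFinalResult`)."""

    data: str


class FakeModel:
    """Hypothetical model: requests one tool call, then answers in plain text."""

    def __init__(self) -> None:
        self.requests = 0

    async def request(self, messages: list[str]) -> str:
        self.requests += 1
        return 'call:lookup' if self.requests == 1 else 'text:done'


async def handle(response: str) -> Final | list[str]:
    """Return a final result, or the messages to send back to the model."""
    kind, _, payload = response.partition(':')
    if kind == 'text':
        return Final(payload)
    return [f'tool-return:{payload}']


async def run_loop(model: FakeModel, prompt: str) -> tuple[str, list[str]]:
    messages = [f'user:{prompt}']
    while True:
        response = await model.request(messages)
        messages.append(response)
        either = await handle(response)
        if isinstance(either, Final):
            return either.data, messages  # final result, end the conversation
        messages.extend(either)  # otherwise continue the conversation


data, history = asyncio.run(run_loop(FakeModel(), 'hi'))
print(data)
#> done
print(history)
#> ['user:hi', 'call:lookup', 'tool-return:lookup', 'text:done']
```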

run_sync

run_sync(
    user_prompt: str,
    *,
    message_history: list[Message] | None = None,
    model: Model | KnownModelName | None = None,
    deps: AgentDeps = None
) -> RunResult[ResultData]

Run the agent with a user prompt synchronously.

This is a convenience method that wraps self.run with asyncio.run().

Parameters:

Name	Type	Description	Default
`user_prompt`	`str`	User input to start/continue the conversation.	required
`message_history`	`list[Message] \| None`	History of the conversation so far.	`None`
`model`	`Model \| KnownModelName \| None`	Optional model to use for this run, required if `model` was not set when creating the agent.	`None`
`deps`	`AgentDeps`	Optional dependencies to use for this run.	`None`

Returns:

Type	Description
`RunResult[ResultData]`	The result of the run.

Source code in pydantic_ai/agent.py

def run_sync(
    self,
    user_prompt: str,
    *,
    message_history: list[_messages.Message] | None = None,
    model: models.Model | models.KnownModelName | None = None,
    deps: AgentDeps = None,
) -> result.RunResult[ResultData]:
    """Run the agent with a user prompt synchronously.

    This is a convenience method that wraps `self.run` with `asyncio.run()`.

    Args:
        user_prompt: User input to start/continue the conversation.
        message_history: History of the conversation so far.
        model: Optional model to use for this run, required if `model` was not set when creating the agent.
        deps: Optional dependencies to use for this run.

    Returns:
        The result of the run.
    """
    return asyncio.run(self.run(user_prompt, message_history=message_history, model=model, deps=deps))
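
Because `run_sync` wraps the coroutine with `asyncio.run()`, it can only be used where no event loop is already running; `asyncio.run()` raises `RuntimeError` when called from inside one. The sketch below demonstrates both cases with a hypothetical `run` coroutine standing in for the agent's method.

```python
import asyncio


async def run(prompt: str) -> str:
    """Hypothetical stand-in for an async agent run."""
    await asyncio.sleep(0)
    return prompt.upper()


def run_sync(prompt: str) -> str:
    """Synchronous convenience wrapper, in the same spirit as `run_sync` above."""
    return asyncio.run(run(prompt))


async def call_from_running_loop() -> str:
    """Show what happens when `asyncio.run()` is used inside an event loop."""
    coro = run('hello')
    try:
        asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return 'RuntimeError'
    return 'ok'


print(run_sync('hello'))
#> HELLO
print(asyncio.run(call_from_running_loop()))
#> RuntimeError
```

Inside async code, await `run` directly instead of calling `run_sync`.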

run_stream `async`

run_stream(
    user_prompt: str,
    *,
    message_history: list[Message] | None = None,
    model: Model | KnownModelName | None = None,
    deps: AgentDeps = None
) -> AsyncIterator[
    StreamedRunResult[AgentDeps, ResultData]
]

Run the agent with a user prompt in async mode, returning a streamed response.

Parameters:

Name	Type	Description	Default
`user_prompt`	`str`	User input to start/continue the conversation.	required
`message_history`	`list[Message] \| None`	History of the conversation so far.	`None`
`model`	`Model \| KnownModelName \| None`	Optional model to use for this run, required if `model` was not set when creating the agent.	`None`
`deps`	`AgentDeps`	Optional dependencies to use for this run.	`None`

Returns:

Type	Description
`AsyncIterator[StreamedRunResult[AgentDeps, ResultData]]`	The result of the run.

Source code in pydantic_ai/agent.py

@asynccontextmanager
async def run_stream(
    self,
    user_prompt: str,
    *,
    message_history: list[_messages.Message] | None = None,
    model: models.Model | models.KnownModelName | None = None,
    deps: AgentDeps = None,
) -> AsyncIterator[result.StreamedRunResult[AgentDeps, ResultData]]:
    """Run the agent with a user prompt in async mode, returning a streamed response.

    Args:
        user_prompt: User input to start/continue the conversation.
        message_history: History of the conversation so far.
        model: Optional model to use for this run, required if `model` was not set when creating the agent.
        deps: Optional dependencies to use for this run.

    Returns:
        The result of the run.
    """
    model_used, custom_model, agent_model = await self._get_agent_model(model)

    deps = self._get_deps(deps)

    new_message_index, messages = await self._prepare_messages(deps, user_prompt, message_history)
    self.last_run_messages = messages

    for retriever in self._retrievers.values():
        retriever.reset()

    cost = result.Cost()

    with _logfire.span(
        'agent run stream {prompt=}',
        prompt=user_prompt,
        agent=self,
        custom_model=custom_model,
        model_name=model_used.name(),
    ) as run_span:
        run_step = 0
        while True:
            run_step += 1
            with _logfire.span('model request {run_step=}', run_step=run_step) as model_req_span:
                async with agent_model.request_stream(messages) as model_response:
                    model_req_span.set_attribute('response_type', model_response.__class__.__name__)
                    # We want to end the "model request" span here, but we can't exit the context manager
                    # in the traditional way
                    model_req_span.__exit__(None, None, None)

                    with _logfire.span('handle model response') as handle_span:
                        either = await self._handle_streamed_model_response(model_response, deps)

                        if isinstance(either, _MarkFinalResult):
                            result_stream = either.data
                            run_span.set_attribute('all_messages', messages)
                            handle_span.set_attribute('result_type', result_stream.__class__.__name__)
                            handle_span.message = 'handle model response -> final result'
                            yield result.StreamedRunResult(
                                messages,
                                new_message_index,
                                cost,
                                result_stream,
                                self._result_schema,
                                deps,
                                self._result_validators,
                            )
                            return
                        else:
                            tool_responses = either
                            handle_span.set_attribute('tool_responses', tool_responses)
                            response_msgs = ' '.join(m.role for m in tool_responses)
                            handle_span.message = f'handle model response -> {response_msgs}'
                            messages.extend(tool_responses)
                            # the model_response should have been fully streamed by now, we can add its cost
                            cost += model_response.cost()
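
Since `run_stream` is decorated with `@asynccontextmanager`, callers enter it with `async with` and then iterate over the yielded object, even though the annotated return type is `AsyncIterator[...]`. The following sketch shows that calling convention with stdlib pieces only; `fake_chunks` and this toy `run_stream` are hypothetical and yield plain text chunks instead of a `StreamedRunResult`.

```python
import asyncio
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager


async def fake_chunks() -> AsyncIterator[str]:
    """Hypothetical model stream producing text in pieces."""
    for chunk in ('The ', 'answer ', 'is 42'):
        await asyncio.sleep(0)
        yield chunk


@asynccontextmanager
async def run_stream(prompt: str) -> AsyncIterator[AsyncIterator[str]]:
    """Yield the stream once; clean it up when the `async with` block exits."""
    stream = fake_chunks()
    try:
        yield stream
    finally:
        await stream.aclose()


async def main() -> str:
    parts: list[str] = []
    async with run_stream('What is the answer?') as stream:
        async for chunk in stream:
            parts.append(chunk)
    return ''.join(parts)


text = asyncio.run(main())
print(text)
#> The answer is 42
```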

model `instance-attribute`

model: Model | KnownModelName | None

The default model configured for this agent.

override_deps

override_deps(overriding_deps: AgentDeps) -> Iterator[None]

Context manager to temporarily override agent dependencies; this is particularly useful when testing.

Parameters:

Name	Type	Description	Default
`overriding_deps`	`AgentDeps`	The dependencies to use instead of the dependencies passed to the agent run.	required

Source code in pydantic_ai/agent.py

@contextmanager
def override_deps(self, overriding_deps: AgentDeps) -> Iterator[None]:
    """Context manager to temporarily override agent dependencies; this is particularly useful when testing.

    Args:
        overriding_deps: The dependencies to use instead of the dependencies passed to the agent run.
    """
    override_deps_before = self._override_deps
    self._override_deps = _utils.Some(overriding_deps)
    try:
        yield
    finally:
        self._override_deps = override_deps_before
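
The override is stored inside a wrapper object rather than as a bare value, which lets `_get_deps` tell "no override" apart from "overridden with `None`". The sketch below reproduces the save, set and restore pattern on a toy class. `Some` here is a hypothetical stand-in for `_utils.Some`, whose definition is not shown on this page, and `Holder` is invented for illustration.

```python
from __future__ import annotations

from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Iterator


@dataclass
class Some:
    """Wrapper marking that a value is present, even if that value is None."""

    value: Any


class Holder:
    def __init__(self) -> None:
        self._override: Some | None = None

    @contextmanager
    def override(self, value: Any) -> Iterator[None]:
        before = self._override  # remember any outer override
        self._override = Some(value)
        try:
            yield
        finally:
            self._override = before  # restore on exit, even after an exception

    def get(self, passed: Any) -> Any:
        # a Some instance is always truthy, so Some(None) still counts as an override
        if some := self._override:
            return some.value
        return passed


holder = Holder()
print(holder.get('real'))
#> real
with holder.override(None):
    print(holder.get('real'))
    #> None
    with holder.override('test'):
        print(holder.get('real'))
        #> test
print(holder.get('real'))
#> real
```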

override_model

override_model(
    overriding_model: Model | KnownModelName,
) -> Iterator[None]

Context manager to temporarily override the model used by the agent.

Parameters:

Name	Type	Description	Default
`overriding_model`	`Model \| KnownModelName`	The model to use instead of the model passed to the agent run.	required

Source code in pydantic_ai/agent.py

@contextmanager
def override_model(self, overriding_model: models.Model | models.KnownModelName) -> Iterator[None]:
    """Context manager to temporarily override the model used by the agent.

    Args:
        overriding_model: The model to use instead of the model passed to the agent run.
    """
    override_model_before = self._override_model
    self._override_model = _utils.Some(models.infer_model(overriding_model))
    try:
        yield
    finally:
        self._override_model = override_model_before

last_run_messages `class-attribute` `instance-attribute`

last_run_messages: list[Message] | None = None

The messages from the last run, useful when a run raised an exception.

Note: these are not used by the agent, e.g. in future runs, they are just stored for developers' convenience.

system_prompt

system_prompt(
    func: SystemPromptFunc[AgentDeps],
) -> SystemPromptFunc[AgentDeps]

Decorator to register a system prompt function that takes CallContext as its only argument.

Source code in pydantic_ai/agent.py

def system_prompt(
    self, func: _system_prompt.SystemPromptFunc[AgentDeps]
) -> _system_prompt.SystemPromptFunc[AgentDeps]:
    """Decorator to register a system prompt function that takes `CallContext` as its only argument."""
    self._system_prompt_functions.append(_system_prompt.SystemPromptRunner(func))
    return func

retriever_plain

retriever_plain(
    func: RetrieverPlainFunc[RetrieverParams],
) -> Retriever[AgentDeps, RetrieverParams]

retriever_plain(
    *, retries: int | None = None
) -> Callable[
    [RetrieverPlainFunc[RetrieverParams]],
    Retriever[AgentDeps, RetrieverParams],
]

retriever_plain(
    func: RetrieverPlainFunc[RetrieverParams] | None = None,
    /,
    *,
    retries: int | None = None,
) -> Any

Decorator to register a retriever function.

Source code in pydantic_ai/agent.py

def retriever_plain(
    self, func: RetrieverPlainFunc[RetrieverParams] | None = None, /, *, retries: int | None = None
) -> Any:
    """Decorator to register a retriever function."""
    if func is None:

        def retriever_decorator(
            func_: RetrieverPlainFunc[RetrieverParams],
        ) -> _r.Retriever[AgentDeps, RetrieverParams]:
            # noinspection PyTypeChecker
            return self._register_retriever(_utils.Either(right=func_), retries)

        return retriever_decorator
    else:
        return self._register_retriever(_utils.Either(right=func), retries)
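
The positional-only `func` parameter combined with the keyword-only `retries` is what allows the decorator to be used both bare and with arguments. Here is a minimal sketch of that dual-use pattern; `register`, `REGISTRY` and `DEFAULT_RETRIES` are hypothetical, and unlike the method above, which returns a `Retriever` object, this simplified version returns the function unchanged.

```python
from typing import Any, Callable

# name -> (function, retries); a toy registry for illustration only
REGISTRY: dict[str, tuple[Callable[..., Any], int]] = {}
DEFAULT_RETRIES = 1


def register(func=None, /, *, retries=None):
    """Usable as `@register` or as `@register(retries=...)`."""

    def _register(f):
        REGISTRY[f.__name__] = (f, DEFAULT_RETRIES if retries is None else retries)
        return f

    if func is None:
        # called with arguments: return the real decorator
        return _register
    # used bare: the function was passed in directly
    return _register(func)


@register
def roll_die() -> str:
    return '4'


@register(retries=3)
def get_player_name() -> str:
    return 'Adam'


print({name: retries for name, (_, retries) in REGISTRY.items()})
#> {'roll_die': 1, 'get_player_name': 3}
```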

retriever_context

retriever_context(
    func: RetrieverContextFunc[AgentDeps, RetrieverParams]
) -> Retriever[AgentDeps, RetrieverParams]

retriever_context(
    *, retries: int | None = None
) -> Callable[
    [RetrieverContextFunc[AgentDeps, RetrieverParams]],
    Retriever[AgentDeps, RetrieverParams],
]

retriever_context(
    func: (
        RetrieverContextFunc[AgentDeps, RetrieverParams]
        | None
    ) = None,
    /,
    *,
    retries: int | None = None,
) -> Any

Decorator to register a retriever function.

Source code in pydantic_ai/agent.py

def retriever_context(
    self,
    func: RetrieverContextFunc[AgentDeps, RetrieverParams] | None = None,
    /,
    *,
    retries: int | None = None,
) -> Any:
    """Decorator to register a retriever function."""
    if func is None:

        def retriever_decorator(
            func_: RetrieverContextFunc[AgentDeps, RetrieverParams],
        ) -> _r.Retriever[AgentDeps, RetrieverParams]:
            # noinspection PyTypeChecker
            return self._register_retriever(_utils.Either(left=func_), retries)

        return retriever_decorator
    else:
        # noinspection PyTypeChecker
        return self._register_retriever(_utils.Either(left=func), retries)
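
The only difference between the two decorators is which side of the `Either` the function is stored on, which records whether it expects the call context as its first argument. How the library dispatches on that is not shown on this page; the sketch below is one plausible way to do it, using a hypothetical `Tool` wrapper with a boolean flag and a toy `Ctx` class in place of `CallContext`.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Ctx:
    """Toy stand-in for the call context: it only carries the dependencies."""

    deps: Any


@dataclass
class Tool:
    func: Callable[..., Any]
    takes_ctx: bool  # True for "context" functions, False for "plain" ones

    def run(self, deps: Any, **args: Any) -> Any:
        if self.takes_ctx:
            return self.func(Ctx(deps), **args)
        return self.func(**args)


def roulette_wheel(ctx: Ctx, square: int) -> str:
    """Context function: reads the winning square from the dependencies."""
    return 'winner' if square == ctx.deps else 'loser'


def describe_die(sides: int) -> str:
    """Plain function: needs no dependencies."""
    return f'd{sides}'


print(Tool(roulette_wheel, takes_ctx=True).run(18, square=18))
#> winner
print(Tool(describe_die, takes_ctx=False).run(18, sides=6))
#> d6
```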

result_validator

result_validator(
    func: ResultValidatorFunc[AgentDeps, ResultData]
) -> ResultValidatorFunc[AgentDeps, ResultData]

Decorator to register a result validator function.

Source code in pydantic_ai/agent.py

def result_validator(
    self, func: _result.ResultValidatorFunc[AgentDeps, ResultData]
) -> _result.ResultValidatorFunc[AgentDeps, ResultData]:
    """Decorator to register a result validator function."""
    self._result_validators.append(_result.ResultValidator(func))
    return func
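
Registered validators are applied in registration order by `_validate_result`, each receiving the output of the previous one. The sketch below shows that chaining in isolation. It is a simplification: the validators here take only the result data, whereas the library also passes dependencies, the retry count and the tool call, and `Retry`, `VALIDATORS` and `validate` are hypothetical names.

```python
import asyncio


class Retry(Exception):
    """Hypothetical signal that the model should be asked to try again."""


VALIDATORS = []


def result_validator(func):
    """Register a validator and return it unchanged."""
    VALIDATORS.append(func)
    return func


@result_validator
async def strip_whitespace(data: str) -> str:
    return data.strip()


@result_validator
async def must_be_select(data: str) -> str:
    if not data.upper().startswith('SELECT'):
        raise Retry('Please write a SELECT query')
    return data


async def validate(data: str) -> str:
    # each validator sees the value returned by the one before it
    for validator in VALIDATORS:
        data = await validator(data)
    return data


print(asyncio.run(validate('  select 1 ')))
#> select 1
```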

ModelRetry

Bases: Exception

Exception raised when a retriever function should be retried.

The agent will return the message to the model and ask it to try calling the function/tool again.

Source code in pydantic_ai/exceptions.py

class ModelRetry(Exception):
    """Exception raised when a retriever function should be retried.

    The agent will return the message to the model and ask it to try calling the function/tool again.
    """

    message: str
    """The message to return to the model."""

    def __init__(self, message: str):
        self.message = message
        super().__init__(message)

message `instance-attribute`

message: str = message

The message to return to the model.
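
A retriever raises `ModelRetry` when the arguments it received cannot be used, and the exception's `message` is what the model sees before trying again. The sketch below simulates that exchange without a real model. It redefines `ModelRetry` locally so it runs on its own, and `lookup_user` and `call_with_retries` are hypothetical, with a list of successive arguments standing in for the model's attempts.

```python
class ModelRetry(Exception):
    """Local copy of the exception shown above, so the sketch is self-contained."""

    def __init__(self, message: str):
        self.message = message
        super().__init__(message)


def lookup_user(name: str) -> int:
    users = {'John Doe': 123}
    if name not in users:
        raise ModelRetry(f'No user found with name {name!r}, remember to provide their full name')
    return users[name]


def call_with_retries(attempts: list[str], max_retries: int = 1) -> tuple[int, list[str]]:
    """Try each argument in turn, collecting the feedback sent back after failures."""
    feedback: list[str] = []
    retries = 0
    for name in attempts:
        try:
            return lookup_user(name), feedback
        except ModelRetry as exc:
            retries += 1
            if retries > max_retries:
                raise RuntimeError(f'Exceeded maximum retries ({max_retries})') from exc
            feedback.append(exc.message)
    raise RuntimeError('No more attempts')


user_id, sent_back = call_with_retries(['John', 'John Doe'])
print(user_id)
#> 123
print(sent_back)
#> ["No user found with name 'John', remember to provide their full name"]
```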

UserError

Bases: RuntimeError

Error caused by a usage mistake by the application developer — You!

Source code in pydantic_ai/exceptions.py

class UserError(RuntimeError):
    """Error caused by a usage mistake by the application developer — You!"""

    message: str
    """Description of the mistake."""

    def __init__(self, message: str):
        self.message = message
        super().__init__(message)

message `instance-attribute`

message: str = message

Description of the mistake.

UnexpectedModelBehaviour

Bases: RuntimeError

Error caused by unexpected Model behavior, e.g. an unexpected response code.

Source code in pydantic_ai/exceptions.py

class UnexpectedModelBehaviour(RuntimeError):
    """Error caused by unexpected Model behavior, e.g. an unexpected response code."""

    message: str
    """Description of the unexpected behavior."""
    body: str | None
    """The body of the response, if available."""

    def __init__(self, message: str, body: str | None = None):
        self.message = message
        if body is None:
            self.body: str | None = None
        else:
            try:
                self.body = json.dumps(json.loads(body), indent=2)
            except ValueError:
                self.body = body
        super().__init__(message)

    def __str__(self) -> str:
        if self.body:
            return f'{self.message}, body:\n{self.body}'
        else:
            return self.message

message `instance-attribute`

message: str = message

Description of the unexpected behavior.

body `instance-attribute`

body: str | None = dumps(loads(body), indent=2)

The body of the response, if available.
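
As the constructor above shows, `body` is pretty-printed only when it parses as JSON; any other text is kept verbatim and `None` stays `None`. The sketch below extracts that normalisation into a standalone function, `format_body`, which is a hypothetical name used here for illustration.

```python
from __future__ import annotations

import json


def format_body(body: str | None) -> str | None:
    """Pretty-print JSON bodies; leave anything else untouched."""
    if body is None:
        return None
    try:
        return json.dumps(json.loads(body), indent=2)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return body


print(format_body('{"error": "rate limit"}'))
"""
{
  "error": "rate limit"
}
"""
print(format_body('<html>Bad Gateway</html>'))
#> <html>Bad Gateway</html>
```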
