Anthropic

Install

To use AnthropicModel, install either pydantic-ai, or pydantic-ai-slim with the anthropic optional group:

pip install "pydantic-ai-slim[anthropic]"
uv add "pydantic-ai-slim[anthropic]"
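
To confirm the optional dependency is present before constructing a model, a quick check along these lines may help. This is a minimal sketch: is_installed is a hypothetical helper, and it assumes the anthropic optional group provides an importable top-level package named anthropic.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == '__main__':
    # Assumption: the extra installs the `anthropic` SDK alongside `pydantic_ai`.
    for name in ('pydantic_ai', 'anthropic'):
        status = 'found' if is_installed(name) else 'missing'
        print(f'{name}: {status}')
```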

Configuration

To use Anthropic through its API, generate an API key at console.anthropic.com/settings/keys.

AnthropicModelName contains a list of available Anthropic models.

Environment variable

Once you have the API key, you can set it as an environment variable:

export ANTHROPIC_API_KEY='your-api-key'

You can then use AnthropicModel by name:

With Pydantic AI Gateway:

from pydantic_ai import Agent

agent = Agent('gateway/anthropic:claude-sonnet-4-5')
...

Directly to the provider API:

from pydantic_ai import Agent

agent = Agent('anthropic:claude-sonnet-4-5')
...

Or initialise the model directly with just the model name:

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel

model = AnthropicModel('claude-sonnet-4-5')
agent = Agent(model)
...

provider argument

You can provide a custom Provider via the provider argument:

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

model = AnthropicModel(
    'claude-sonnet-4-5', provider=AnthropicProvider(api_key='your-api-key')
)
agent = Agent(model)
...
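
In practice you may prefer not to hard-code the key when passing it to the provider. The sketch below reads it from the ANTHROPIC_API_KEY environment variable and fails early if it is missing. load_anthropic_api_key and build_agent are hypothetical helper names, not part of Pydantic AI; the pydantic_ai imports are deferred into build_agent so the key-loading logic can be exercised on its own.

```python
import os
from collections.abc import Mapping
from typing import Optional


def load_anthropic_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from the environment, or raise if it is unset or blank."""
    source = os.environ if env is None else env
    key = source.get('ANTHROPIC_API_KEY', '').strip()
    if not key:
        raise RuntimeError('ANTHROPIC_API_KEY is not set')
    return key


def build_agent():
    """Build an Agent with an explicit provider (hypothetical helper)."""
    # Imported here so the helper above works without pydantic_ai installed.
    from pydantic_ai import Agent
    from pydantic_ai.models.anthropic import AnthropicModel
    from pydantic_ai.providers.anthropic import AnthropicProvider

    provider = AnthropicProvider(api_key=load_anthropic_api_key())
    return Agent(AnthropicModel('claude-sonnet-4-5', provider=provider))
```

Passing a mapping to load_anthropic_api_key instead of relying on os.environ makes it straightforward to check the missing-key path in a unit test.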

Custom HTTP Client

You can customize the AnthropicProvider with a custom httpx.AsyncClient:

from httpx import AsyncClient

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

custom_http_client = AsyncClient(timeout=30)
model = AnthropicModel(
    'claude-sonnet-4-5',
    provider=AnthropicProvider(api_key='your-api-key', http_client=custom_http_client),
)
agent = Agent(model)
...

Prompt Caching

Anthropic supports prompt caching to reduce costs by caching parts of your prompts. Pydantic AI provides three ways to use prompt caching:

  1. Cache User Messages with CachePoint: Insert a CachePoint marker in your user messages to cache everything before it
  2. Cache System Instructions: Enable the AnthropicModelSettings.anthropic_cache_instructions model setting to cache your system prompt
  3. Cache Tool Definitions: Enable the AnthropicModelSettings.anthropic_cache_tool_definitions model setting to cache your tool definitions

You can combine all three strategies for maximum savings:

With Pydantic AI Gateway:

from pydantic_ai import Agent, CachePoint, RunContext
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
    'gateway/anthropic:claude-sonnet-4-5',
    system_prompt='Detailed instructions...',
    model_settings=AnthropicModelSettings(
        anthropic_cache_instructions=True,
        anthropic_cache_tool_definitions=True,
    ),
)

@agent.tool
def search_docs(ctx: RunContext, query: str) -> str:
    """Search documentation."""
    return f'Results for {query}'

async def main():
    # First call - writes to cache
    result1 = await agent.run([
        'Long context from documentation...',
        CachePoint(),
        'First question'
    ])

    # Subsequent calls - read from cache (90% cost reduction)
    result2 = await agent.run([
        'Long context from documentation...',  # Same content
        CachePoint(),
        'Second question'
    ])
    print(f'First: {result1.output}')
    print(f'Second: {result2.output}')

Directly to the provider API:

from pydantic_ai import Agent, CachePoint, RunContext
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
    'anthropic:claude-sonnet-4-5',
    system_prompt='Detailed instructions...',
    model_settings=AnthropicModelSettings(
        anthropic_cache_instructions=True,
        anthropic_cache_tool_definitions=True,
    ),
)

@agent.tool
def search_docs(ctx: RunContext, query: str) -> str:
    """Search documentation."""
    return f'Results for {query}'

async def main():
    # First call - writes to cache
    result1 = await agent.run([
        'Long context from documentation...',
        CachePoint(),
        'First question'
    ])

    # Subsequent calls - read from cache (90% cost reduction)
    result2 = await agent.run([
        'Long context from documentation...',  # Same content
        CachePoint(),
        'Second question'
    ])
    print(f'First: {result1.output}')
    print(f'Second: {result2.output}')
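
A cache read can only happen when the content before the marker is identical between calls, so it can be worth checking that two prompts really share the same prefix. The sketch below does this with the standard library only. CACHE_MARKER is a hypothetical stand-in for CachePoint() so the example runs without pydantic_ai, and cacheable_prefix and prefix_fingerprint are illustrative names; the sketch looks only at user content, not at the system prompt or tool definitions.

```python
import hashlib

# Hypothetical stand-in for CachePoint(); any unique sentinel works for this sketch.
CACHE_MARKER = object()


def cacheable_prefix(parts: list) -> list:
    """Return the parts that precede the first marker (empty if there is none)."""
    for index, part in enumerate(parts):
        if part is CACHE_MARKER:
            return parts[:index]
    return []


def prefix_fingerprint(parts: list) -> str:
    """Hash the cacheable prefix so two prompts can be compared cheaply."""
    digest = hashlib.sha256()
    for part in cacheable_prefix(parts):
        encoded = str(part).encode('utf-8')
        # A length prefix keeps ['ab', 'c'] distinct from ['a', 'bc'].
        digest.update(len(encoded).to_bytes(8, 'big'))
        digest.update(encoded)
    return digest.hexdigest()


first = ['Long context from documentation...', CACHE_MARKER, 'First question']
second = ['Long context from documentation...', CACHE_MARKER, 'Second question']
changed = ['Long context from documentation!', CACHE_MARKER, 'Second question']

print(prefix_fingerprint(first) == prefix_fingerprint(second))   # True
print(prefix_fingerprint(first) == prefix_fingerprint(changed))  # False
```

Only the text before the marker affects the fingerprint, which is why the two different questions still compare equal while a one-character change to the shared context does not.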

Access cache usage statistics via result.usage():

With Pydantic AI Gateway:

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
    'gateway/anthropic:claude-sonnet-4-5',
    system_prompt='Instructions...',
    model_settings=AnthropicModelSettings(
        anthropic_cache_instructions=True
    ),
)

async def main():
    result = await agent.run('Your question')
    usage = result.usage()
    print(f'Cache write tokens: {usage.cache_write_tokens}')
    print(f'Cache read tokens: {usage.cache_read_tokens}')

Directly to the provider API:

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
    'anthropic:claude-sonnet-4-5',
    system_prompt='Instructions...',
    model_settings=AnthropicModelSettings(
        anthropic_cache_instructions=True
    ),
)

async def main():
    result = await agent.run('Your question')
    usage = result.usage()
    print(f'Cache write tokens: {usage.cache_write_tokens}')
    print(f'Cache read tokens: {usage.cache_read_tokens}')
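
The two counters can be turned into a quick diagnostic of whether a run wrote to the cache, read from it, or did neither. This is a minimal sketch: UsageLike is a hypothetical stand-in that only mirrors the two attributes used above, and describe_cache_activity is an illustrative helper, not part of Pydantic AI.

```python
from dataclasses import dataclass


@dataclass
class UsageLike:
    """Hypothetical stand-in exposing the two cache counters shown above."""
    cache_write_tokens: int = 0
    cache_read_tokens: int = 0


def describe_cache_activity(usage) -> str:
    """Classify a run by its cache counters: 'read', 'write', 'read+write' or 'none'."""
    wrote = usage.cache_write_tokens > 0
    read = usage.cache_read_tokens > 0
    if read and wrote:
        return 'read+write'
    if read:
        return 'read'
    if wrote:
        return 'write'
    return 'none'


print(describe_cache_activity(UsageLike(cache_write_tokens=1200)))  # write
print(describe_cache_activity(UsageLike(cache_read_tokens=1200)))   # read
print(describe_cache_activity(UsageLike()))                         # none
```

Because the helper only reads the two attributes, the object returned by result.usage() in the example above could be passed in place of UsageLike. A result of 'none' on a repeated call would suggest the prefix changed or caching was not applied to that request.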