Anthropic Chat Models

Anthropic is an AI safety and research company. It provides the Claude family of Large Language Models, designed with constitutional AI principles for safe and controllable output.

This extension allows you to integrate Claude models into your Quarkus applications via the Anthropic API.

Prerequisites

To use Anthropic models, you need an API key. Follow the steps on the Claude documentation portal to request access and retrieve your credentials.

Dependency

To enable Anthropic LLM integration in your project, add the following dependency:

<dependency>
  <groupId>io.quarkiverse.langchain4j</groupId>
  <artifactId>quarkus-langchain4j-anthropic</artifactId>
  <version>1.8.4</version>
</dependency>

Even better, if you use the Quarkus platform BOM (the default for generated projects), add the Quarkus LangChain4j BOM and all dependency versions will align:

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>${quarkus.platform.group-id}</groupId>
                <artifactId>${quarkus.platform.artifact-id}</artifactId>
                <version>${quarkus.platform.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
            <dependency>
                <groupId>${quarkus.platform.group-id}</groupId>
                <artifactId>quarkus-langchain4j-bom</artifactId> (1)
                <version>${quarkus.platform.version}</version> (2)
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-anthropic</artifactId>
        (3)
      </dependency>
    </dependencies>
1 In your dependencyManagement section, add the quarkus-langchain4j-bom
2 Inherit the version from your platform version
3 Voilà, no need for version alignment anymore

If no other LLM extension is installed, AI Services will automatically use the configured Anthropic chat model.

Configuration

Set your API key in the application.properties file:

quarkus.langchain4j.anthropic.api-key=...

You can also set it using the environment variable:

QUARKUS_LANGCHAIN4J_ANTHROPIC_API_KEY=...

By default, the extension uses the claude-3-haiku-20240307 model. You can specify a different model explicitly using:

quarkus.langchain4j.anthropic.chat-model.model-name=claude-opus-4-20250514

Refer to Anthropic’s model catalog for available versions, such as:

  • claude-sonnet-4-20250514

  • claude-3-opus-20240229

  • claude-3-haiku-20240307
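Putting the key and the model name together, a minimal application.properties might look like this (reading the key from an environment variable, as shown here, is one common option):

quarkus.langchain4j.anthropic.api-key=${ANTHROPIC_API_KEY}
quarkus.langchain4j.anthropic.chat-model.model-name=claude-sonnet-4-20250514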

Usage

You can inject the chat model directly:

@Inject ChatModel chatModel;

Or declare an AI service interface:

@RegisterAiService
public interface Assistant {
    String chat(String input);
}
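An AI service declared this way is a regular CDI bean, so it can be injected wherever needed. The REST resource below is an illustrative sketch (the class name and path are not part of the extension):

@Path("/assistant")
public class AssistantResource {

    @Inject
    Assistant assistant;

    @GET
    public String ask(@QueryParam("q") String question) {
        return assistant.chat(question);
    }
}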

Advanced Tool Use Features

These features enable more efficient tool orchestration, improved accuracy, and reduced token consumption.

All features are optional and can be enabled independently.

Tool Search Tool

Anthropic’s Tool Search Tool allows Claude to search for and access thousands of tools without consuming its context window.

The Tool Search Tool lets Claude dynamically discover tools instead of loading all definitions upfront. You provide all your tool definitions to the API, but mark tools with defer_loading: true to make them discoverable on-demand. Deferred tools aren’t loaded into Claude’s context initially. Claude only sees the Tool Search Tool itself plus any tools with defer_loading: false (your most critical, frequently-used tools).

When Claude needs specific capabilities, it searches for relevant tools. The Tool Search Tool returns references to matching tools, which get expanded into full definitions in Claude’s context.

To enable this feature, set the following property:

quarkus.langchain4j.anthropic.chat-model.tool-search.enabled=true

You can optionally specify the search type (the default is regex; bm25 is also available):

quarkus.langchain4j.anthropic.chat-model.tool-search.type=bm25

Under the hood, when Tool Search is enabled, the Quarkus extension automatically:

  • Adds the AnthropicServerTool (either regex or bm25 variant) to the model request.

  • Configures the toolMetadataKeysToSend to include defer_loading, allowing the model to see which tools can be discovered on-demand.

  • Includes the required advanced-tool-use-2025-11-20 beta header in all requests.

When tool search is enabled, you can mark specific tools to be discovered on-demand using the defer_loading metadata:

@Tool(name = "get_weather", value = "Get the weather for a city", metadata = "{\"defer_loading\": true}")
public String getWeather(String city) {
    // ...
}

Programmatic Tool Calling

Anthropic’s Programmatic Tool Calling allows Claude to invoke tools in a code execution environment, reducing the impact on the model’s context window.

Programmatic Tool Calling enables Claude to orchestrate tools through code rather than through individual API round-trips. Instead of Claude requesting tools one at a time with each result being returned to its context, Claude writes code that calls multiple tools, processes their outputs, and controls what information actually enters its context window.

Claude excels at writing code, and by letting it express orchestration logic in Python rather than through natural-language tool invocations, you get more reliable, precise control flow. Loops, conditionals, data transformations, and error handling are all explicit in code rather than implicit in Claude’s reasoning.

To enable this feature, set the following property:

quarkus.langchain4j.anthropic.chat-model.programmatic-tool-calling.enabled=true

Under the hood, when Programmatic Tool Calling is enabled, the extension automatically:

  • Adds the Code Execution server tool

  • Sends the "allowed_callers" metadata key with tool definitions

  • Includes the required beta header in all requests

Tools that should be callable from generated code must include the allowed_callers metadata:

@Tool(
    name = "get_weather",
    value = "Get the weather for a city",
    metadata = "{\"allowed_callers\": [\"code_execution_20250825\"]}"
)
public String getWeather(String city) {
    // ...
}

Tool Use Examples

This feature provides a standard way to demonstrate how to use a given tool effectively.

Tool Use Examples let you provide sample tool calls directly in your tool definitions. Instead of relying on schema alone, you show Claude concrete usage patterns.

For example, format ambiguity: if a date is passed as a string, should it use "2024-11-06", "Nov 6, 2024", or "2024-11-06T00:00:00Z"?

To enable this feature, set the following property:

quarkus.langchain4j.anthropic.chat-model.tool-use-examples.enabled=true

Under the hood, when Tool Use Examples is enabled, the extension automatically:

  • Sends the "input_examples" metadata key with tool definitions

  • Includes the required beta header in all requests

Tools that include examples must define the input_examples metadata:

private static final String METADATA = """
        {
          "input_examples": [
            {
              "title": "Login page returns 500 error",
              "priority": "critical",
              "labels": ["bug", "authentication"],
              "date": "2024-11-06"
            },
            {
              "title": "Update API documentation",
              "date": "2024-12-01"
            },
            {
              "title": "Brainstorming session"
            }
          ]
        }
        """;

@Tool(name = "create_ticket", value = "Create a support ticket", metadata = METADATA)
public String createTicket(String title, String priority, String date) {
    return "TICKET-123";
}

Configuration Reference

Configuration property fixed at build time - All other configuration properties are overridable at runtime

Configuration property

Type

Default

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_ENABLED

boolean

true

Base URL of the Anthropic API

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_BASE_URL

string

https://api.anthropic.com/v1/

Anthropic API key

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_API_KEY

string

dummy

The Anthropic version

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_VERSION

string

2023-06-01

If set to true, the "anthropic-beta" header will never be sent

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_DISABLE_BETA_HEADER

boolean

false

Timeout for Anthropic calls

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_TIMEOUT

Duration 

10s

Whether the Anthropic client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_LOG_REQUESTS

boolean

false

Whether the Anthropic client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_LOG_RESPONSES

boolean

false

Whether the Anthropic client should log requests as cURL commands

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_LOG_REQUESTS_CURL

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the Anthropic provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_ENABLE_INTEGRATION

boolean

true

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_MODEL_NAME

string

claude-3-haiku-20240307

What sampling temperature to use, between 0.0 and 1.0. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

It is generally recommended to set this or the top-k property but not both.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_TEMPERATURE

double

0.7

The maximum number of tokens to generate in the completion.

The token count of your prompt plus max_tokens cannot exceed the model’s context length

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_MAX_TOKENS

int

1024

Double (0.0-1.0). Nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

It is generally recommended to set this or the temperature property but not both.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_TOP_P

double

1.0

Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_TOP_K

int

40

The maximum number of times to retry. 1 means exactly one attempt, with retrying disabled.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_MAX_RETRIES

int

1

The custom text sequences that will cause the model to stop generating

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_STOP_SEQUENCES

list of string

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_LOG_RESPONSES

boolean

false

Cache system messages to reduce costs for repeated prompts. Requires minimum 1024 tokens (Claude Opus/Sonnet) or 2048-4096 tokens (Haiku). Supported models: Claude Opus 4.1, Sonnet 4.5, Haiku 4.5, and later models.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_CACHE_SYSTEM_MESSAGES

boolean

false

Cache tool definitions to reduce costs. Requires minimum 1024 tokens (Claude Opus/Sonnet) or 2048-4096 tokens (Haiku). Supported models: Claude Opus 4.1, Sonnet 4.5, Haiku 4.5, and later models.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_CACHE_TOOLS

boolean

false

The thinking type to enable Claude’s reasoning process

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_THINKING_TYPE

string

The token budget for the model’s thinking process

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_THINKING_BUDGET_TOKENS

int

Whether thinking results should be returned in the response

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_THINKING_RETURN_THINKING

boolean

false

Whether previously stored thinking should be sent in follow-up requests

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_THINKING_SEND_THINKING

boolean

true

Enable interleaved thinking for Claude 4 models, allowing reasoning between tool calls. Requires Claude 4 model (e.g., claude-opus-4-20250514) and thinking.type: enabled.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_THINKING_INTERLEAVED

boolean

false

Enable Anthropic’s Tool Search Tool for on-demand tool discovery. When enabled, this automatically adds the tool search server tool, sets the required beta header, and enables the "defer_loading" tool metadata key. Tools annotated with @Tool(metadata = "{\"defer_loading\": true}") will be discovered on demand instead of loaded upfront.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_TOOL_SEARCH_ENABLED

boolean

false

The type of tool search to use. Available types: "regex" (default) or "bm25".

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_TOOL_SEARCH_TYPE

string

regex

Enable Anthropic’s Programmatic Tool Calling via the Code Execution server tool. When enabled, this automatically adds the code execution server tool, the "allowed_callers" key is sent with tool definitions, and the required beta header is set. Claude can orchestrate multiple tool calls from within generated Python code, keeping intermediate results out of the context window rather than accumulating them in the conversation, significantly reducing token consumption.

Tools that should be callable from code must include: @Tool(metadata = "{\"allowed_callers\": [\"code_execution_20250825\"]}")

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_PROGRAMMATIC_TOOL_CALLING_ENABLED

boolean

false

Enable Anthropic’s Tool Use Examples feature. When enabled, the "input_examples" key is sent with tool definitions, and the required beta header is set. Providing concrete input examples alongside tool schemas helps Claude learn correct parameter usage, formats, and conventions that cannot be expressed in JSON Schema alone.

Tools with examples must include: @Tool(metadata = "{\"input_examples\": [{…}, …]}")

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_TOOL_USE_EXAMPLES_ENABLED

boolean

false

Named model config

Type

Default

Base URL of the Anthropic API

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__BASE_URL

string

https://api.anthropic.com/v1/

Anthropic API key

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__API_KEY

string

dummy

The Anthropic version

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__VERSION

string

2023-06-01

If set to true, the "anthropic-beta" header will never be sent

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__DISABLE_BETA_HEADER

boolean

false

Timeout for Anthropic calls

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__TIMEOUT

Duration 

10s

Whether the Anthropic client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__LOG_REQUESTS

boolean

false

Whether the Anthropic client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__LOG_RESPONSES

boolean

false

Whether the Anthropic client should log requests as cURL commands

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__LOG_REQUESTS_CURL

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the Anthropic provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__ENABLE_INTEGRATION

boolean

true

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_MODEL_NAME

string

claude-3-haiku-20240307

What sampling temperature to use, between 0.0 and 1.0. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

It is generally recommended to set this or the top-k property but not both.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_TEMPERATURE

double

0.7

The maximum number of tokens to generate in the completion.

The token count of your prompt plus max_tokens cannot exceed the model’s context length

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_MAX_TOKENS

int

1024

Double (0.0-1.0). Nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

It is generally recommended to set this or the temperature property but not both.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_TOP_P

double

1.0

Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_TOP_K

int

40

The maximum number of times to retry. 1 means exactly one attempt, with retrying disabled.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_MAX_RETRIES

int

1

The custom text sequences that will cause the model to stop generating

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_STOP_SEQUENCES

list of string

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_LOG_RESPONSES

boolean

false

Cache system messages to reduce costs for repeated prompts. Requires minimum 1024 tokens (Claude Opus/Sonnet) or 2048-4096 tokens (Haiku). Supported models: Claude Opus 4.1, Sonnet 4.5, Haiku 4.5, and later models.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_CACHE_SYSTEM_MESSAGES

boolean

false

Cache tool definitions to reduce costs. Requires minimum 1024 tokens (Claude Opus/Sonnet) or 2048-4096 tokens (Haiku). Supported models: Claude Opus 4.1, Sonnet 4.5, Haiku 4.5, and later models.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_CACHE_TOOLS

boolean

false

The thinking type to enable Claude’s reasoning process

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_THINKING_TYPE

string

The token budget for the model’s thinking process

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_THINKING_BUDGET_TOKENS

int

Whether thinking results should be returned in the response

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_THINKING_RETURN_THINKING

boolean

false

Whether previously stored thinking should be sent in follow-up requests

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_THINKING_SEND_THINKING

boolean

true

Enable interleaved thinking for Claude 4 models, allowing reasoning between tool calls. Requires Claude 4 model (e.g., claude-opus-4-20250514) and thinking.type: enabled.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_THINKING_INTERLEAVED

boolean

false

Enable Anthropic’s Tool Search Tool for on-demand tool discovery. When enabled, this automatically adds the tool search server tool, sets the required beta header, and enables the "defer_loading" tool metadata key. Tools annotated with @Tool(metadata = "{\"defer_loading\": true}") will be discovered on demand instead of loaded upfront.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_TOOL_SEARCH_ENABLED

boolean

false

The type of tool search to use. Available types: "regex" (default) or "bm25".

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_TOOL_SEARCH_TYPE

string

regex

Enable Anthropic’s Programmatic Tool Calling via the Code Execution server tool. When enabled, this automatically adds the code execution server tool, the "allowed_callers" key is sent with tool definitions, and the required beta header is set. Claude can orchestrate multiple tool calls from within generated Python code, keeping intermediate results out of the context window rather than accumulating them in the conversation, significantly reducing token consumption.

Tools that should be callable from code must include: @Tool(metadata = "{\"allowed_callers\": [\"code_execution_20250825\"]}")

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_PROGRAMMATIC_TOOL_CALLING_ENABLED

boolean

false

Enable Anthropic’s Tool Use Examples feature. When enabled, the "input_examples" key is sent with tool definitions, and the required beta header is set. Providing concrete input examples alongside tool schemas helps Claude learn correct parameter usage, formats, and conventions that cannot be expressed in JSON Schema alone.

Tools with examples must include: @Tool(metadata = "{\"input_examples\": [{…}, …]}")

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_TOOL_USE_EXAMPLES_ENABLED

boolean

false

About the Duration format

To write duration values, use the standard java.time.Duration format. See the Duration#parse() Java API documentation for more information.

You can also use a simplified format, starting with a number:

  • If the value is only a number, it represents time in seconds.

  • If the value is a number followed by ms, it represents time in milliseconds.

In other cases, the simplified format is translated to the java.time.Duration format for parsing:

  • If the value is a number followed by h, m, or s, it is prefixed with PT.

  • If the value is a number followed by d, it is prefixed with P.
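For example, each of the following is a valid value for the timeout property (the variants are shown independently; the comments are explanatory):

# 30 seconds (a plain number is interpreted as seconds)
quarkus.langchain4j.anthropic.timeout=30
# 500 milliseconds
quarkus.langchain4j.anthropic.timeout=500ms
# 2 minutes (translated to PT2M before parsing)
quarkus.langchain4j.anthropic.timeout=2m
# full java.time.Duration syntax
quarkus.langchain4j.anthropic.timeout=PT1M30S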