AI Services Reference

AI Services provide a declarative mechanism for interacting with LLMs, abstracting complexities behind annotated interfaces.

Overview

AI Services act as the core connection between your application and Large Language Models (LLMs). They simplify the integration by encapsulating interactions declaratively within annotated Java interfaces, removing boilerplate and complexities typically associated with manual LLM integrations.

@RegisterAiService
public interface GreetingService {
    @UserMessage("Greet the user named {name}")
    String greet(String name);
}

Inject and use the generated service:

@Inject GreetingService service;

String greeting = service.greet("Quarkus");

You do not need to implement the interface. Quarkus LangChain4j automatically generates the implementation.

Annotations Reference

@RegisterAiService

Marks an interface as an AI Service managed by CDI, creating a bean for interactions with an LLM.

@RegisterAiService(
    modelName = "my-model",
    tools = {MyTool.class},
    chatMemoryProviderSupplier = CustomMemoryProvider.class,
    retrievalAugmentor = MyRetrieverSupplier.class
)

Attributes:

Attribute Description Default Note

Attribute	Description	Default	Note
`modelName`	Specifies the named LLM model configuration.	`"<default>"`	If not set, uses the default model
`tools`	Array of tool classes (CDI beans) accessible to the LLM.	Empty array	If not set, no tools are available. Also, we recommend using `@ToolBox` to define a toolbox on each method instead.
`toolProviderSupplier`	Configures a supplier that can add tools dynamically.		It is possible to use combine `tools` and the `toolProviderSupplier` to add tools dynamically at runtime.
`chatMemoryProviderSupplier`	Supplier class for chat memory provider.	Unset	If not set, the memory is managed using CDI-scope.
`retrievalAugmentor`	Supplier class for Retrieval Augmentor (for RAG).	Unset	See Retrieved Augmentation Generation (RAG) for details.
`moderationModelSupplier`	Supplier for moderation model (optional).	Unset	If not set, no moderation is applied. You can also use guardrails to control content.
`toolHallucinationStrategy`	Strategy to handle hallucinated tool calls.	Unset	If not set, no special handling is applied. See the Function Calling for details.
`streamingChatLanguageModelSupplier`	Supplier for streaming chat model (optional).	Unset	If not set, it uses the streaming support of the configured model.
`chatLanguageModelSupplier`	Supplier for chat model (optional).	Unset	If not set, it uses the streaming support of the configured model.

modelName

Specifies the named LLM model configuration.

"<default>"

If not set, uses the default model

tools

Array of tool classes (CDI beans) accessible to the LLM.

Empty array

If not set, no tools are available. Also, we recommend using @ToolBox to define a toolbox on each method instead.

toolProviderSupplier

Configures a supplier that can add tools dynamically.

It is possible to use combine tools and the toolProviderSupplier to add tools dynamically at runtime.

chatMemoryProviderSupplier

Supplier class for chat memory provider.

Unset

If not set, the memory is managed using CDI-scope.

retrievalAugmentor

Supplier class for Retrieval Augmentor (for RAG).

Unset

See Retrieved Augmentation Generation (RAG) for details.

moderationModelSupplier

Supplier for moderation model (optional).

Unset

If not set, no moderation is applied. You can also use guardrails to control content.

toolHallucinationStrategy

Strategy to handle hallucinated tool calls.

Unset

If not set, no special handling is applied. See the Function Calling for details.

streamingChatLanguageModelSupplier

Supplier for streaming chat model (optional).

Unset

If not set, it uses the streaming support of the configured model.

chatLanguageModelSupplier

Supplier for chat model (optional).

Unset

If not set, it uses the streaming support of the configured model.

In most cases, using an empty @RegisterAiService is sufficient, as it defaults to the configured model:

@RegisterAiService
public interface MyAiService {
    @UserMessage("Answer the following question: {input}")
    String process(String input);
}

@SystemMessage

The @SystemMessage annotation defines initial instructions or context for the LLM interaction. The system message can be set on the AI Service interface itself or on individual methods. In the first case, it applies to all methods in the service.

@SystemMessage("You are an assistant helping with users.")

Attributes:

value: prompt template.
delimiter: line delimiter for multi-line templates ("\n" by default).
fromResource: load message from a resource file.

@UserMessage

The @UserMessage annotation defines user instructions or prompts sent to the LLM. It can be applied to methods in the AI Service interface, allowing dynamic parameters to be included in the prompt. Method parameters can be used to fill in the prompt template.

@UserMessage("""
Analyze sentiment: {text}.
Respond with POSITIVE or NEGATIVE.
""")
public Sentiment analyzeSentiment(String text);

Attributes:

value: prompt template.
delimiter: line delimiter for multi-line templates ("\n" by default).
fromResource: load message from a resource file.

The @UserMessage can also be used on method parameters:

public String answer(@UserMessage question);

If the method has a single parameter, the @UserMessage annotation can be omitted, as it is applied by default.

@MemoryId

Use @MemoryId on method parameters to uniquely identify memory contexts. It’s crucial for distinguishing between different users or conversational sessions when using persistent or multi-user memory.

String chat(@MemoryId String userId, @UserMessage String message);

Always ensure unique memory IDs to avoid mixing conversational contexts.

@ImageUrl

Marks a parameter providing the URL of an image for LLM processing.

String describe(@ImageUrl String imageUrl);

Note that Quarkus LangChain4J also supports the Image type for more complex image handling, such as base64 encoding or MIME types.

Chat Memory Management

AI Services manage conversational state using ChatMemory. This chat memory is used to maintain context across multiple interactions with the LLM. It can distinguish between different users or sessions using the @MemoryId annotation. Also, the memory can be persisted in various ways, such as in a database or in-memory store.

Default memory provider: MessageWindowChatMemory with 10-message window.
Memory is request-scoped by default; configure using chatMemoryProviderSupplier.

To customize the memory provider, you can specify a custom class implementing ChatMemoryProvider:

@RegisterAiService(chatMemoryProviderSupplier = MyCustomMemoryProvider.class)

See the [Memory Management] reference for detailed customization.

LLM Response Mapping

AI Services map LLM responses to method return types:

Direct JSON mapping:

@UserMessage("Evaluate review: {review}. Respond in JSON.")
ReviewResponse evaluate(String review);

Quarkus automatically deserializes responses into ReviewResponse. Depending on the model, it forces the response format to match the expected type. It inserts the expected schema into the prompt automatically.

Dynamic schema with {response_schema} placeholder:

@UserMessage("""
Classify text: {text}. Schema: {response_schema}
""")

In this case, the schema is not automatically inserted.

You can disable automatic schema insertion via configuration:

quarkus.langchain4j.response-schema=false

Streaming Responses

AI Services can stream responses from the LLM token-by-token. Define the method to return a reactive stream (Multi<String>):

Multi<String> chat(@UserMessage String userMessage);

You can consume the stream internally:

chat(userMessage).subscribe().with(
    token -> { /* process token */ },
    error -> { /* handle error */ },
    () -> { /* on completion */ }
);

Or expose the stream directly in REST endpoints:

@GET
public Multi<String> stream(@QueryParam("question") String question) {
    return aiService.chat(question);
}

Configuring the chat model

While LLMs are the base AI models, the chat language model builds upon them, enabling chat-like interactions. If you have a single chat language model, no specific configuration is required.

However, when multiple model providers are present in the application (such as OpenAI, Azure OpenAI, HuggingFace, etc.) each model needs to be given a name, which is then referenced by the AI service like below. The same also applies when you want to use different models that use the same provider (e.g. via OpenAI-APIs).

@RegisterAiService(modelName="m1")

@Inject
@ModelName("m1")
EmbeddingModel embeddingModel;

The configuration of the various models could look like so:

# ensure that the model with the name 'm1', is provided by OpenAI
quarkus.langchain4j.m1.chat-model.provider=openai
# ensure that the model with the name 'm2', is provided by HuggingFace
quarkus.langchain4j.m2.chat-model.provider=huggingface
# ensure embedding model with the name 'm3', is provided by OpenAI again
quarkus.langchain4j.m3.chat-model.provider=openai


# configure the various aspects of each model
quarkus.langchain4j.openai.m1.api-key=sk-...
quarkus.langchain4j.huggingface.m2.api-key=sk-...
quarkus.langchain4j.openai.m3.api-key=sk-...
quarkus.langchain4j.openai.m3.embedding-model.model-name=text-emb...

Tools Integration

Integrate function calling (methods) callable by the LLM:

Define tools with @Tool:

@ApplicationScoped
public class CustomerService {

    @Tool("Fetch customer name by ID")
    public String getName(long id) { /*...*/ }
}

Configure tools:

@RegisterAiService(tools = CustomerService.class)

or, (recommended) use a toolbox:

@RegisterAiService
public interface MyAiService {
    @UserMessage("Answer the following question: {question}")
    @ToolBox(CustomerService.class)
    String answer(String question);
}

More details on function calling and tool integration can be found in the Function Calling guide.

Document Retrieval (RAG)

Integrate document retrieval using RetrievalAugmentor:

@RegisterAiService(retrievalAugmentor = MyRetrievalAugmentorSupplier.class)

If there is only one, no need to specify the retrievalAugmentor attribute, as it defaults to the configured retrieval augmentor.

Detailed setup in the RAG reference guide.

Moderation

By default, @RegisterAiService annotated interfaces don’t moderate content. However, users can opt in to having the LLM moderate content by annotating the method with @Moderate.

@Moderate
String respond(@UserMessage String message);

For moderation to work, the following criteria need to be met:

A CDI bean for dev.langchain4j.model.moderation.ModerationModel must be configured (the quarkus-langchain4j-openai and quarkus-langchain4j-azure-openai provide one out of the box)
or the moderationModelSupplier attribute of the @RegisterAiService annotation must be set to a custom moderation model supplier.

import dev.langchain4j.model.moderation.ModerationModel;

public class MyCustomModerationSupplier implements Supplier<ModerationModel> {

    @Override
    public ModerationModel get(){
        // ...
    }

}

Working with Images

AI Services support image processing and generation:

Describing images using an image URL:

@UserMessage("Describe the content of this image.")
String describe(@ImageUrl String imageUrl);

For local or base64-encoded images, use the Image class directly:

String describe(Image image);

Create the image instance like this:

Image image = Image.builder()
    .base64Data(encodeFileToBase64(someImage))
    .mimeType("image/png")
    .build();

Generating images:

Image generate(String prompt);

Limitations for image generation:

Memory and Retrieval Augmentors aren’t supported.
Limited guardrails (input only).

To summarize:

Image description:

String describe(@ImageUrl String imageUrl);

Image generation:

Image generate(String prompt);

AI Services Reference

Overview

Annotations Reference

@RegisterAiService

@SystemMessage

@UserMessage

@MemoryId

@ImageUrl

Chat Memory Management

LLM Response Mapping

Streaming Responses

Configuring the chat model

Tools Integration

Document Retrieval (RAG)

Moderation

Working with Images

Related Guides