AI Services Reference
AI Services provide a declarative mechanism for interacting with LLMs, abstracting complexities behind annotated interfaces.
Overview
AI Services act as the core connection between your application and Large Language Models (LLMs). They simplify the integration by encapsulating interactions declaratively within annotated Java interfaces, removing boilerplate and complexities typically associated with manual LLM integrations.
@RegisterAiService
public interface GreetingService {
@UserMessage("Greet the user named {name}")
String greet(String name);
}
Inject and use the generated service:
@Inject GreetingService service;
String greeting = service.greet("Quarkus");
You do not need to implement the interface. Quarkus LangChain4j automatically generates the implementation.
Annotations Reference
@RegisterAiService
Marks an interface as an AI Service managed by CDI, creating a bean for interactions with an LLM.
@RegisterAiService(
modelName = "my-model",
tools = {MyTool.class},
chatMemoryProviderSupplier = CustomMemoryProvider.class,
retrievalAugmentor = MyRetrieverSupplier.class
)
Attributes:
| Attribute | Description | Default | Note |
|---|---|---|---|
| `modelName` | Specifies the named LLM model configuration. | `<default>` | If not set, uses the default model. |
| `tools` | Array of tool classes (CDI beans) accessible to the LLM. | Empty array | If not set, no tools are available. Also, we recommend using `@ToolBox` instead (see Tools Integration below). |
| `toolProviderSupplier` | Configures a supplier that can add tools dynamically. | Unset | It is possible to combine it with the `tools` attribute. |
| `chatMemoryProviderSupplier` | Supplier class for the chat memory provider. | Unset | If not set, the memory is managed using CDI scopes. |
| `retrievalAugmentor` | Supplier class for the retrieval augmentor (for RAG). | Unset | See Retrieval-Augmented Generation (RAG) for details. |
| `moderationModelSupplier` | Supplier for the moderation model (optional). | Unset | If not set, no moderation is applied. You can also use guardrails to control content. |
| `toolHallucinationStrategy` | Strategy to handle hallucinated tool calls. | Unset | If not set, no special handling is applied. See the Function Calling guide for details. |
| `streamingChatLanguageModelSupplier` | Supplier for the streaming chat model (optional). | Unset | If not set, it uses the streaming support of the configured model. |
| `chatLanguageModelSupplier` | Supplier for the chat model (optional). | Unset | If not set, it uses the configured model. |
In most cases, using an empty `@RegisterAiService` is sufficient, as it defaults to the configured model:
@RegisterAiService
public interface MyAiService {
@UserMessage("Answer the following question: {input}")
String process(String input);
}
@SystemMessage
The `@SystemMessage` annotation defines initial instructions or context for the LLM interaction.
The system message can be set on the AI Service interface itself or on individual methods. In the first case, it applies to all methods in the service, as shown in the example after the attribute list.
@SystemMessage("You are an assistant helping users.")
Attributes:
- `value`: the prompt template.
- `delimiter`: the line delimiter for multi-line templates (`"\n"` by default).
- `fromResource`: load the message from a resource file.
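For example, a system message declared on the interface applies to every method. A minimal sketch; the service name and persona are illustrative:

@RegisterAiService
@SystemMessage("You are a concise assistant helping users.")
public interface SupportService {

    @UserMessage("Answer the following question: {question}")
    String answer(String question);
}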
@UserMessage
The `@UserMessage` annotation defines user instructions or prompts sent to the LLM.
It can be applied to methods in the AI Service interface, allowing dynamic parameters to be included in the prompt.
Method parameters can be used to fill in the prompt template.
@UserMessage("""
Analyze sentiment: {text}.
Respond with POSITIVE or NEGATIVE.
""")
public Sentiment analyzeSentiment(String text);
Attributes:
- `value`: the prompt template.
- `delimiter`: the line delimiter for multi-line templates (`"\n"` by default).
- `fromResource`: load the message from a resource file.
The `@UserMessage` annotation can also be used on method parameters:
String answer(@UserMessage String question);
If the method has a single parameter, the `@UserMessage` annotation can be omitted, as it is applied by default.
@MemoryId
Use `@MemoryId` on method parameters to uniquely identify memory contexts.
It is crucial for distinguishing between different users or conversational sessions when using persistent or multi-user memory.
String chat(@MemoryId String userId, @UserMessage String message);
Always ensure unique memory IDs to avoid mixing conversational contexts.
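For illustration, assuming the `chat` method above is exposed on an injected service called `assistant` (IDs and messages are made up):

assistant.chat("alice", "My name is Alice.");
assistant.chat("bob", "My name is Bob.");
// Each memory ID keeps its own history, so this is answered from Alice's context only
String answer = assistant.chat("alice", "What is my name?");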
Chat Memory Management
AI Services manage conversational state using `ChatMemory`.
This chat memory is used to maintain context across multiple interactions with the LLM.
It can distinguish between different users or sessions using the `@MemoryId` annotation.
The memory can also be persisted in various ways, such as in a database or an in-memory store.
- Default memory provider: `MessageWindowChatMemory` with a 10-message window.
- Memory is request-scoped by default; configure it using `chatMemoryProviderSupplier`.
To customize the memory provider, specify a supplier class that provides a `ChatMemoryProvider`:
@RegisterAiService(chatMemoryProviderSupplier = MyCustomMemoryProvider.class)
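A minimal sketch of such a supplier, assuming a fixed 20-message window per conversation (the class name and window size are illustrative):

import java.util.function.Supplier;

import dev.langchain4j.memory.chat.ChatMemoryProvider;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;

public class MyCustomMemoryProvider implements Supplier<ChatMemoryProvider> {

    @Override
    public ChatMemoryProvider get() {
        // Create a separate 20-message window for each memory ID
        return memoryId -> MessageWindowChatMemory.withMaxMessages(20);
    }
}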
See the Memory Management reference for detailed customization.
LLM Response Mapping
AI Services map LLM responses to method return types:
- Direct JSON mapping:

@UserMessage("Evaluate review: {review}. Respond in JSON.")
ReviewResponse evaluate(String review);

Quarkus automatically deserializes the response into `ReviewResponse`. Depending on the model, it forces the response format to match the expected type and inserts the expected schema into the prompt automatically.
- Dynamic schema with the `{response_schema}` placeholder:

@UserMessage("""
    Classify text: {text}.
    Schema: {response_schema}
    """)

In this case, the schema is not appended automatically; it replaces the placeholder in the prompt.
You can disable automatic schema insertion via configuration:
quarkus.langchain4j.response-schema=false
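For illustration, `ReviewResponse` from the first example could be a simple Java record; this type and its fields are hypothetical:

public record ReviewResponse(Sentiment sentiment, String summary) {
    public enum Sentiment { POSITIVE, NEGATIVE }
}

The LLM's JSON reply is then deserialized into this record, with the record components acting as the JSON keys.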
Streaming Responses
AI Services can stream responses from the LLM token by token. Define the method to return a reactive stream (`Multi<String>`):
Multi<String> chat(@UserMessage String userMessage);
You can consume the stream internally:
chat(userMessage).subscribe().with(
token -> { /* process token */ },
error -> { /* handle error */ },
() -> { /* on completion */ }
);
Or expose the stream directly in REST endpoints:
@GET
public Multi<String> stream(@QueryParam("question") String question) {
return aiService.chat(question);
}
Configuring the chat model
While LLMs are the base AI models, the chat language model builds on them to enable chat-style interactions. If you have a single chat language model, no specific configuration is required.
However, when multiple model providers are present in the application (such as OpenAI, Azure OpenAI, HuggingFace, etc.), each model needs to be given a name, which the AI service then references, as shown below. The same applies when you want to use different models from the same provider (e.g. via OpenAI-compatible APIs).
@RegisterAiService(modelName="m1")
or
@Inject
@ModelName("m1")
EmbeddingModel embeddingModel;
The configuration of the various models could look like so:
# ensure that the model with the name 'm1' is provided by OpenAI
quarkus.langchain4j.m1.chat-model.provider=openai
# ensure that the model with the name 'm2' is provided by HuggingFace
quarkus.langchain4j.m2.chat-model.provider=huggingface
# ensure that the embedding model with the name 'm3' is provided by OpenAI again
quarkus.langchain4j.m3.embedding-model.provider=openai
# configure the various aspects of each model
quarkus.langchain4j.openai.m1.api-key=sk-...
quarkus.langchain4j.huggingface.m2.api-key=sk-...
quarkus.langchain4j.openai.m3.api-key=sk-...
quarkus.langchain4j.openai.m3.embedding-model.model-name=text-emb...
Tools Integration
Integrate function calling by exposing methods (tools) that the LLM can invoke:
- Define tools with `@Tool`:

@ApplicationScoped
public class CustomerService {

    @Tool("Fetch customer name by ID")
    public String getName(long id) { /*...*/ }
}

- Configure the tools on the service:

@RegisterAiService(tools = CustomerService.class)

or, as recommended, use a toolbox:
@RegisterAiService
public interface MyAiService {
@UserMessage("Answer the following question: {question}")
@ToolBox(CustomerService.class)
String answer(String question);
}
More details on function calling and tool integration can be found in the Function Calling guide.
Document Retrieval (RAG)
Integrate document retrieval using a `RetrievalAugmentor`:
@RegisterAiService(retrievalAugmentor = MyRetrievalAugmentorSupplier.class)
If only one retrieval augmentor is available, there is no need to specify the `retrievalAugmentor` attribute, as the configured retrieval augmentor is used by default.
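A minimal sketch of such a supplier, assuming a `ContentRetriever` bean is already available and the supplier itself can be resolved as a CDI bean (class and field names are illustrative):

import java.util.function.Supplier;

import dev.langchain4j.rag.DefaultRetrievalAugmentor;
import dev.langchain4j.rag.RetrievalAugmentor;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

@ApplicationScoped
public class MyRetrievalAugmentorSupplier implements Supplier<RetrievalAugmentor> {

    @Inject
    ContentRetriever contentRetriever;

    @Override
    public RetrievalAugmentor get() {
        // Wrap the retriever in the default augmentor implementation
        return DefaultRetrievalAugmentor.builder()
                .contentRetriever(contentRetriever)
                .build();
    }
}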
Detailed setup in the RAG reference guide.
Moderation
By default, `@RegisterAiService` annotated interfaces don't moderate content.
However, users can opt in to having the LLM moderate content by annotating the method with `@Moderate`.
@Moderate
String respond(@UserMessage String message);
For moderation to work, one of the following criteria must be met:
- A CDI bean for `dev.langchain4j.model.moderation.ModerationModel` must be configured (the `quarkus-langchain4j-openai` and `quarkus-langchain4j-azure-openai` extensions provide one out of the box), or
- the `moderationModelSupplier` attribute of the `@RegisterAiService` annotation must be set to a custom moderation model supplier.
import java.util.function.Supplier;

import dev.langchain4j.model.moderation.ModerationModel;

public class MyCustomModerationSupplier implements Supplier<ModerationModel> {

    @Override
    public ModerationModel get() {
        // ...
    }
}
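The supplier is then referenced from the `moderationModelSupplier` attribute described above; for example (the interface name is illustrative):

@RegisterAiService(moderationModelSupplier = MyCustomModerationSupplier.class)
public interface ModeratedService {

    @Moderate
    String respond(@UserMessage String message);
}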
Working with Images
AI Services support image processing and generation:
Describing images using an image URL:
@UserMessage("Describe the content of this image.")
String describe(@ImageUrl String imageUrl);
For local or base64-encoded images, use the `Image` class directly:
String describe(Image image);
Create the image instance like this:
Image image = Image.builder()
.base64Data(encodeFileToBase64(someImage))
.mimeType("image/png")
.build();
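The `encodeFileToBase64` helper above is not part of the API; a possible sketch using only the JDK (the method name matches the snippet, the rest is an assumption):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper matching the snippet above
static String encodeFileToBase64(Path file) throws IOException {
    // Read the raw image bytes and encode them as a base64 string
    return Base64.getEncoder().encodeToString(Files.readAllBytes(file));
}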
Generating images:
Image generate(String prompt);
Limitations for image generation: not every provider or model supports generating images; a model with image-generation capabilities must be configured.
To summarize:
- Image description: `String describe(@ImageUrl String imageUrl);`
- Image generation: `Image generate(String prompt);`