Guardrails - Controlling the Chaos
Quarkus-specific implementation deprecation
The Quarkus-specific guardrail implementation has been removed in favor of the upstream LangChain4j implementation. If you are currently using the Quarkus-specific guardrail implementation, you MUST migrate to the upstream implementation. In most cases the switch is simply a matter of changing package imports. In fact, the upstream implementation was backported from the Quarkus implementation!
Guardrail Scopes
Guardrails MUST be CDI beans. They can be in any CDI scope, including request scope, application scope, or session scope.
The scope of the guardrail is important as it defines the lifecycle of the guardrail, especially when the guardrail is stateful.
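To make the lifecycle point concrete, here is a minimal, dependency-free sketch of a stateful check. It is not a real guardrail: `RejectionCounter` is a hypothetical class, no CDI container or LangChain4j type is involved, and the sharing of instances is something you would simulate by hand.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stateful check; plain Java, no CDI container involved. */
final class RejectionCounter {

    // State lives exactly as long as the instance does.
    // AtomicInteger is used because a shared instance may be called concurrently.
    private final AtomicInteger rejections = new AtomicInteger();

    /** Returns true when the text passes; counts every rejection. */
    boolean validate(String text) {
        boolean ok = text != null && !text.isBlank();
        if (!ok) {
            rejections.incrementAndGet();
        }
        return ok;
    }

    /** Number of rejections seen by this particular instance. */
    int rejections() {
        return rejections.get();
    }
}
```

A single instance reused for every call behaves like an application-scoped bean: its counter keeps growing across requests and must be thread-safe. Creating a new instance per request, as a request-scoped bean would be, resets the counter each time.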
Output Guardrails configuration
By default, Quarkus LangChain4j limits the number of retries to 3 (the default in upstream LangChain4j is 2).
This is configurable using the quarkus.langchain4j.guardrails.max-retries configuration property:
quarkus.langchain4j.guardrails.max-retries=5
Setting quarkus.langchain4j.guardrails.max-retries to 0 disables retries.
Configuration can also be set directly in the @OutputGuardrails annotation, which overrides the defaults for that specific operation.
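The effect of a retry limit can be pictured with a small, dependency-free model. This is only an illustration, not the Quarkus or LangChain4j implementation: `RetryBudgetSketch` and `Outcome` are hypothetical names, and the sketch assumes the limit counts additional attempts after the first call, which may differ from how the extension counts them.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Illustrative model only; not the Quarkus or LangChain4j implementation. */
final class RetryBudgetSketch {

    /** The accepted text (null if none passed) and how many model calls were made. */
    record Outcome(String accepted, int attempts) {}

    /**
     * Calls the model once, then calls it again while the guardrail rejects
     * the output, for at most maxRetries additional attempts (an assumption).
     * With maxRetries == 0 a rejected output is never retried.
     */
    static Outcome call(Supplier<String> model, Predicate<String> guardrail, int maxRetries) {
        int attempts = 0;
        while (attempts <= maxRetries) {
            String output = model.get();
            attempts++;
            if (guardrail.test(output)) {
                return new Outcome(output, attempts);
            }
        }
        return new Outcome(null, attempts); // budget exhausted, nothing passed
    }
}
```

In this model, a limit of 0 means a rejected response fails immediately, while a limit of 3 allows the model to be asked again up to three times before giving up.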
Output Guardrails for Streamed Responses
Output guardrails can be applied to methods that return Multi or TokenStream.
By default, Quarkus will automatically assemble the full response before executing the guardrail chain.
Keep in mind that this may have a performance impact when handling large responses.
To control when the guardrail chain is invoked during streaming, configure an accumulator:
@UserMessage("...")
@OutputGuardrails(MyGuardrail.class)
@OutputGuardrailAccumulator(PassThroughAccumulator.class) // Defines the accumulator
Multi<String> ask();
The @OutputGuardrailAccumulator annotation allows you to specify a custom accumulator.
The accumulator must implement the io.quarkiverse.langchain4j.guardrails.OutputTokenAccumulator interface and be exposed as a CDI bean.
The following is an example of a pass-through accumulator that does not accumulate tokens:
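import jakarta.enterprise.context.ApplicationScoped;
import io.quarkiverse.langchain4j.guardrails.OutputTokenAccumulator;
import io.smallrye.mutiny.Multi;

@ApplicationScoped
public class PassThroughAccumulator implements OutputTokenAccumulator {
    @Override
    public Multi<String> accumulate(Multi<String> tokens) {
        return tokens; // Passes the tokens through without accumulating
    }
}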
You can create accumulators based on various criteria, such as the number of tokens, a specific separator, or time intervals.
When an accumulator is set, the output guardrail chain is invoked for each item emitted by the Multi returned by the accumulate method.
In the case of a retry, the accumulator is called again with the new response, restarting the stream from the beginning. The same behavior applies for reprompts.
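As an illustration of the separator-based criterion mentioned above, the following dependency-free sketch groups tokens into sentence-sized chunks. It works on plain lists to stay self-contained; `SentenceChunker` is a hypothetical name, and in a real accumulator the same logic would have to be applied to the Multi<String> inside accumulate so that each emitted chunk is what the guardrail chain receives.

```java
import java.util.ArrayList;
import java.util.List;

/** Dependency-free sketch of separator-based accumulation; names are hypothetical. */
final class SentenceChunker {

    /**
     * Groups streamed tokens into chunks that end at '.', '!' or '?'.
     * Trailing text without a terminator is emitted as a final chunk.
     */
    static List<String> chunk(List<String> tokens) {
        List<String> chunks = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        for (String token : tokens) {
            buffer.append(token);
            if (endsSentence(token)) {
                chunks.add(buffer.toString());
                buffer.setLength(0); // start collecting the next sentence
            }
        }
        if (buffer.length() > 0) {
            chunks.add(buffer.toString()); // flush the incomplete tail
        }
        return chunks;
    }

    /** True when the token, ignoring trailing whitespace, ends with a sentence terminator. */
    private static boolean endsSentence(String token) {
        String trimmed = token.stripTrailing();
        if (trimmed.isEmpty()) {
            return false;
        }
        char last = trimmed.charAt(trimmed.length() - 1);
        return last == '.' || last == '!' || last == '?';
    }
}
```

With this kind of grouping, the guardrail chain would run once per sentence rather than once per token or once for the whole response. The terminator check is deliberately naive, so abbreviations such as "e.g." would split a sentence early.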
Going Further
- AI Services Reference – Learn how to declare and customize AI services, including memory and streaming behaviors.
- Retrieval-Augmented Generation (RAG) – Understand how augmentation results are passed into guardrails and used to detect hallucinations.
- Messages and Memory – See how memory is stored, scoped, and accessed by guardrails.
- Testing LLM Applications – Explore how to unit test your guardrails using AssertJ-based custom assertions.