Guardrails - Controlling the Chaos

Quarkus-specific implementation deprecation

The Quarkus-specific guardrail implementation has been removed in favor of the upstream LangChain4j implementation. If you are currently using the Quarkus-specific guardrail implementation, you MUST migrate to the upstream implementation.

In most cases, migrating only requires changing package imports from io.quarkiverse.langchain4j.guardrails to dev.langchain4j.guardrail or dev.langchain4j.service.guardrail. There are a few differences in the APIs, but the class and interface names are mostly the same.

In fact, the upstream implementation was derived from the Quarkus implementation!

Guardrail Scopes

Guardrails MUST be CDI beans. They can be in any CDI scope, including request scope, application scope, or session scope.

The scope matters because it defines the guardrail's lifecycle, which is especially important when the guardrail is stateful.
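To make the effect of scope on a stateful guardrail concrete, here is a minimal, library-free sketch. It does not use CDI or the LangChain4j guardrail API: CountingGuardrail and lastCountAfterThreeRequests are hypothetical names, a single shared instance stands in for an application-scoped bean, and a fresh instance per call stands in for a request-scoped bean.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class GuardrailScopeSketch {

    /** Hypothetical stateful guardrail: counts how many outputs it has validated. */
    static final class CountingGuardrail {
        // AtomicInteger because a shared instance may be called from several threads.
        private final AtomicInteger seen = new AtomicInteger();

        int validate(String output) {
            return seen.incrementAndGet();
        }
    }

    /** Simulates three requests, obtaining the guardrail from the provider each time. */
    static int lastCountAfterThreeRequests(Supplier<CountingGuardrail> provider) {
        int last = 0;
        for (int request = 0; request < 3; request++) {
            last = provider.get().validate("response " + request);
        }
        return last;
    }

    public static void main(String[] args) {
        CountingGuardrail shared = new CountingGuardrail();
        // Application-scope analogue: every request sees the same instance.
        System.out.println(lastCountAfterThreeRequests(() -> shared)); // prints 3
        // Request-scope analogue: each request gets a fresh instance.
        System.out.println(lastCountAfterThreeRequests(CountingGuardrail::new)); // prints 1
    }
}
```

With the shared instance, state accumulates across requests (and must be thread-safe); with a fresh instance per request, state is discarded after each call.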

Output Guardrails configuration

By default, Quarkus LangChain4j limits the number of retries to 3 (the default in upstream LangChain4j is 2). This is configurable using the quarkus.langchain4j.guardrails.max-retries configuration property:

quarkus.langchain4j.guardrails.max-retries=5

Setting quarkus.langchain4j.guardrails.max-retries to 0 disables retries.

Configuration can also be set directly in the @OutputGuardrails annotation, which overrides the defaults for that specific operation.

Output Guardrails for Streamed Responses

Output guardrails can be applied to methods that return Multi or TokenStream. By default, Quarkus will automatically assemble the full response before executing the guardrail chain. Keep in mind that this may have a performance impact when handling large responses.

To control when the guardrail chain is invoked during streaming, configure an accumulator:

@UserMessage("...")
@OutputGuardrails(MyGuardrail.class)
@OutputGuardrailAccumulator(PassThroughAccumulator.class) // Defines the accumulator
Multi<String> ask();

The @OutputGuardrailAccumulator annotation allows you to specify a custom accumulator. The accumulator must implement the io.quarkiverse.langchain4j.guardrails.OutputTokenAccumulator interface and be exposed as a CDI bean. The following is an example of a pass-through accumulator that does not accumulate tokens:

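import jakarta.enterprise.context.ApplicationScoped;

import io.quarkiverse.langchain4j.guardrails.OutputTokenAccumulator;
import io.smallrye.mutiny.Multi;

@ApplicationScoped
public class PassThroughAccumulator implements OutputTokenAccumulator {

    @Override
    public Multi<String> accumulate(Multi<String> tokens) {
        return tokens; // Passes the tokens through without accumulating
    }
}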

You can create accumulators based on various criteria, such as the number of tokens, a specific separator, or time intervals.

When an accumulator is set, the output guardrail chain is invoked for each item emitted by the Multi returned by the accumulate method.
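As an illustration of separator-based accumulation and per-item guardrail invocation, here is a minimal, library-free sketch. It models the token stream as a plain List rather than a Multi, and the names SeparatorAccumulatorSketch, accumulate, and rejected are hypothetical. The guardrail is a simple predicate standing in for the real guardrail chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class SeparatorAccumulatorSketch {

    /** Groups tokens into chunks, emitting a chunk whenever a token ends with the separator. */
    static List<String> accumulate(List<String> tokens, String separator) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String token : tokens) {
            current.append(token);
            if (token.endsWith(separator)) {
                chunks.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            chunks.add(current.toString()); // flush trailing tokens when the stream ends
        }
        return chunks;
    }

    /** Runs the stand-in guardrail once per chunk and collects the chunks it rejects. */
    static List<String> rejected(List<String> chunks, Predicate<String> guardrail) {
        List<String> failures = new ArrayList<>();
        for (String chunk : chunks) {
            if (!guardrail.test(chunk)) {
                failures.add(chunk);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("The", " sky", " is", " blue", ".",
                " The", " password", " is", " 1234", ".");
        List<String> chunks = accumulate(tokens, ".");
        System.out.println(chunks.size()); // prints 2: one chunk per sentence
        // Stand-in guardrail: a chunk passes only if it does not mention "password".
        System.out.println(rejected(chunks, c -> !c.contains("password")).size()); // prints 1
    }
}
```

Here the guardrail runs twice, once per sentence, instead of once per token or once for the whole response. A real accumulator would express the same grouping over the Multi passed to accumulate.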

In the case of a retry, the accumulator is called again with the new response, restarting the stream from the beginning. The same behavior applies for reprompts.

Going Further

AI Services Reference – Learn how to declare and customize AI services, including memory and streaming behaviors.
Retrieval-Augmented Generation (RAG) – Understand how augmentation results are passed into guardrails and used to detect hallucinations.
Messages and Memory – See how memory is stored, scoped, and accessed by guardrails.
Testing LLM Applications – Explore how to unit test your guardrails using AssertJ-based custom assertions.