Kafka Streams Processor

This extension eases the implementation of event-driven architecture streaming microservices based on Kafka Streams’ Processor API. It avoids Kafka Streams' users some boilerplate code to have a dedicated topology and a proper production readiness (health, observability, error handling). All that is left to develop is a simple Processor class. The extension takes care of the rest.

Streaming processor

A streaming microservice has some requirements:

Kafka Streams processors are not thread-safe, so a processor scope is @jakarta.enterprise.context.Dependent by default.
An incoming message, from an input topic, can be acknowledged only when it has been processed and its result message has been produced and stored in a Kafka output topic.
If an incoming message could not be acknowledged (e.g: a microservice crash) then the incoming message will be redelivered (at-least-once guarantee).
Incoming messages are processed in the order of arrival (ordering guarantee)

There are currently several bootstraps or libraries that support writing stream processor-based applications on different broker technologies. This extension’s processor solution is based on Kafka Streams' Processor API.

Kafka Streams

Kafka Streams is an opensource client library that allows to write streaming applications.

It is enabled by the quarkus-kafka-streams extension:

pom.xml

<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-kafka-streams</artifactId>
    <scope>runtime</scope>
</dependency>

Only one threading model is supported: 1 thread per topic-partition (see details)

The number of partitions on the source topic determines up to how many threads can consume in parallel.

Synchronous/blocking programing must be used to ensure guarantee of execution and ordering.

There is a proposal under discussion for adding asynchronous processing to Kafka Streams.

A simplified Processor API

On top of Kafka Streams, the quarkus-kafka-streams-processor extension adds features to help you to write streaming microservices.

The following dependency needs to be added:

<project>
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>io.quarkiverse.kafkastreamsprocessor</groupId>
          <artifactId>quarkus-kafka-streams-processor-bom</artifactId>
          <version>${quarkus.kafkastreamsprocessor.version}</version>
          <type>pom</type>
          <scope>import</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.kafkastreamsprocessor</groupId>
        <artifactId>quarkus-kafka-streams-processor-api</artifactId>
      </dependency>
      <dependency>
        <groupId>io.quarkiverse.kafkastreamsprocessor</groupId>
        <artifactId>quarkus-kafka-streams-processor</artifactId>
        <scope>runtime</scope>
      </dependency>
    </dependencies>
</project>

With the extension, it is expected to define an annotated Kafka Streams' Processor. It is discovered as a CDI bean and instrumented by the Quarkus runtime. The difference here with the fully fledged Kafka Streams Processor API is that there is no need to build a custom Topology for each new microservice. It is built by the Quarkus runtime.

Example:

PingProcessor.java

@Slf4j
@Processor (1)
public class PingProcessor extends ContextualProcessor<String, Ping, String, Ping> { (2)

  @Override
  public void process(Record<String, Ping> record) { (3)
    resultMessage = countChars(record.value().getMessage());
    Ping pong = Ping.newBuilder().setMessage(resultMessage).build();
    context().forward(record.withValue(pong)); (4)
  }

  /**
   * Counts the nb of characters and returns it as string.
   */
  private String countChars(String input) {
    return String.valueOf(input.length());
  }
}

1	Processors are beans managed by CDI, and as such need to be annotated with `@io.quarkiverse.kafkastreamsprocessor.api.Processor` so that any other CDI dependency may be injected. Kafka Streams' processors are not thread-safe, so a processor scope is `@jakarta.enterprise.context.Dependent` by default.
2	Directly extend `org.apache.kafka.streams.processor.api.ContextualProcessor`, specifying the type of keys and values. For keys, only type `String` is supported. For values, see supported types.
3	The `process` method is called for each incoming message. Any processing must be done synchronously.
4	The `outputrecord` method allows to produce an outgoing message. Internally Kafka Streams links the acknowledgment on the outgoing message with the commit of the associated incoming message.

Topology builder

In short (see core concepts), in Kafka Streams a Topology is made of:

Sources: 1 or more input topics
Processors: a graph of 1 or more record handlers
Stores (optional): for stateful use cases, the processor must be assigned one or multiple stores to keep the state.
Global Stores (optional): for stateful use cases, when it’s replicated across all instances of the application.
Sinks: 1 or more output topics

The extension will automatically build the following topology:

Scan for a CDI bean of type org.apache.kafka.streams.processor.api.Processor<KIn, VIn, KOut, VOut> (single processor). The Kafka 2 API org.apache.kafka.streams.processor.Processor<K, V> is no longer supported.
Register a source for a single topic in kafkastreamsprocessor.input.topic, String key serializer and Protobuf value deserializer based on the declared type <V> of the processor. If multiple topics are declared the configuration should be kafkastreamsprocessor.input.topics. Also, to manipulate topics coming from different sources the property should include the source name in the declaration as kafkastreamsprocessor.input.sources.<sources>.topic.
Register as many sinks as topics listed in properties matching kafkastreamsprocessor.output.sinks.<sink>.topic, String key serializer and Protobuf value serializer. Sink name is taken from the properties name. If no kafkastreamsprocessor.output.sinks.<sink>.topic properties are defined and a single topic is defined in kafkastreamsprocessor.output.topic, a default sink is used.

Configuration

Please consider two kind of properties:

First class citizen properties

First class citizen properties are a set of properties prefixed with quarkus.kafka-streams. They are consumed by the Quarkus kafka-streams extension itself (used at build time and overridable, for some of them, at runtime).

In Kafka Streams Processor the following configuration keys are required:

kafkastreamsprocessor.input.topic, kafkastreamsprocessor.input.topics or kafkastreamsprocessor.input.sources.<source>.topic
kafkastreamsprocessor.output.topic or kafkastreamsprocessor.output.sinks.<sink>.topic
quarkus.kafka-streams.topics
quarkus.kafka-streams.application-id

Refer to Kafka Streams extension’s configuration reference for a full list.

All the properties used in the extension are recapped here

Pass-through properties

Due to the important number of properties that can be set to configure a Kafka Streams application, it’s impossible for Quarkus extension to know them all.

But it’s possible to use pass-through properties in the configuration of your application, using the name of property but prefixed with kafka-streams.

# First citizen property
quarkus.kafka-streams.bootstrap-servers=localhost:9092

# Pass-through properties
kafka-streams.producer.linger.ms=0
kafka-streams.compression.type=gzip

Override pass-through properties

Overriding pass-through properties at runtime is a little bit hacky because you need to prefix the property with quarkus. Then, you can pass it as a Java system property.

Passing the property as an environment variable with Microprofile Config convention IS NOT working.

Fo example, the possible way to override a pass-through property in a Kafka Streams Processor microservice would be to use the corresponding system property in the JVM commandline:

-Dquarkus.kafka-streams.producer.linger.ms=50

Data serialization

Data serialization refers to the process of converting complex data structures or objects into a format that can be easily transmitted over a network. The extension’s processors support different serialization formats. By analyzing the types of declared input and output payloads of the processor, it’s possible to infer the type of data to serialize and set up the correct serializer and deserializer.

As of today, the extension’s processors support three first-class citizen formats and one customization feature:

JSON objects
Protobuf generated classes
Other data format can be supported through the Serializer/Deserializer customization

JSON is the default serialization format. If the payload cannot be serialized to JSON, an error is thrown. Depending on the Global DLQ Error management the message will be produced in the DLQ, and the incoming message will be acknowledged in the input topic.

JSON

JSON is a first class citizen serialization format, the SDK will use JSON as default serialization format. To used JSON, a processor needs the generated data model as value type. The data model is read from kafka, or written to kafka, with the data model’s JSON textual representation.

PojoProcessor.java

@Slf4j
@Processor (1)
public class PojoProcessor extends ContextualProcessor<String, SamplePojo, String, SamplePojo> { (2)
    @Override
    public void process(Record<String, SamplePojo> record) { (3)
        String reversedMsg = new StringBuilder(record.value().getStringField()).reverse().toString();
        log.info("Received value {} sending back {} in response", record.value().getStringField(), reversedMsg);
        SamplePojo pojo = new SamplePojo(reversedMsg, record.value().getNumericalField() + 37,
                !record.value().getBooleanField());
        context().forward(record.withValue(pojo)); (4)
    }
}

1	Your Processor is declared with the annotation as for a regular processor.
2	The handled value type, is a class that can be serialized in JSON.
3	Same data model value in the process() method.
4	Similarly, a data model value can be sent down to the output topic.

In order to customize the JSON serialization, the SDK relies on the ObjectMapper customization from Quarkus

Serialization to Protobuf and its caveats

Protobuf values and String are the first class citizen data types in Kafka Streams Processor messaging.

To ease development with Protobuf, various libraries support code generation from .proto files or OpenAPI descriptors, for instance openapi-generator.

The main problem with this approach is that certain details of the serialization logic are implementation-dependent and may have impacts on the usability of the generated APIs.

One such case in the handling of optional elements, i.e. how to detect in business code if an element was absent, or present but with an empty value.

There are multiple solutions to this problem, e.g.:

Use built-in primitive types
Use custom Nullable types instead of primitives
Use built-in wrapper types instead of primitives
Use optional in proto3

Serde customization

The SDK provides the interface io.quarkiverse.kafkastreamsprocessor.api.configuration.ConfigurationCustomizer which allows you to specify the serializer and deserializer you need. You need to declare a bean implementing that interface.

It is available through the following dependency:

pom.xml

<dependency>
  <groupId>io.quarkiverse.kafkastreamsprocessor</groupId>
  <artifactId>quarkus-kafka-streams-processor-api</artifactId>
</dependency>

Multiple customizers can be defined, and their execution order controlled through @Priority annotations.

Then the processor should be declared with the custom data model as value type. The data model is read from kafka, or written to kafka, with a custom serialization format provided by sink and source serializer/deserializer that need to be known at build time.

PojoProcessor.java

@Slf4j
@Processor (1)
public class PojoProcessor extends ContextualProcessor<String, SamplePojo, String, SamplePojo> { (2)
    @Override
    public void process(Record<String, SamplePojo> record) { (3)
        String reversedMsg = new StringBuilder(record.value().getStringField()).reverse().toString();
        log.info("Received value {} sending back {} in response", record.value().getStringField(), reversedMsg);
        SamplePojo pojo = new SamplePojo(reversedMsg, record.value().getNumericalField() + 37,
                !record.value().getBooleanField());
        context().forward(record.withValue(pojo)); (4)
    }
}

1	Your Processor is declared with the annotation as for a regular processor.
2	The handled value type, in this example, is a simple data model, nothing fancy.
3	Same data model value in the process() method.
4	Similarly, a data model value can be sent down to the output topic.

And then you inject your custom serde with:

SampleConfigurationCustomizer.java

@Dependent
public class SampleConfigurationCustomizer implements ConfigurationCustomizer {
  @Override
  public void fillConfiguration(Configuration configuration) {
      configuration.setSinkValueSerializer(new MyCustomSerializer<SamplePojo>());
    configuration.setSourceValueSerde(new MyCustomSerde(SamplePojo));
  }
}

The io.quarkiverse.kafkastreamsprocessor.api.configuration.ConfigurationCustomizer implementation provided sets the required Serializer and Deserializer into the method’s io.quarkiverse.kafkastreamsprocessor.api.configuration.Configuration object. In this example, we set a custom serialization through the usage of Kafka Serdes.

CloudEvent support

Cloud Events are a specification for describing event data in a common way. Events are everywhere, yet event publishers tend to describe events differently. The motivations are consistency, accessibility and portability.

Activation

The activation is done separately for input and output:

kafkastreamsprocessor.input.is-cloud-event to true so consumed messages are parsed as CloudEvent.
kafkastreamsprocessor.output.is-cloud-event to true so produced messages are encapsulated as CloudEvent.

API

After activation of the input, you will be able in your processor to access the metadata that was sent along with the CloudEvent through the injection of the CloudEventContextHandler:

@Processor
public class MyProcessor implements ContextualProcessor<String, String, String, String> {
  @Inject
  CloudEventContextHandler handler;

  @Override
  public void process(Record<String, String> record) {
    if (handler.getIncomingContext().getSource().getHost().contains("myhost.io")) { (1)
      // ignoring record
    }
    context.forward(record);
  }
}

1	Accessing some metadata of the `CloudEvent` using `CloudEventContextHandler#getIncomingContext()`

If you activated CloudEvent as output, you will have to provide the metadata for it. This can be done using the CloudEventContextHandler again:

@Processor
public class MyProcessor implements ContextualProcessor<String, String, String, String> {
  @Inject
  CloudEventContextHandler handler;

  @Override
  public void process(Record<String, String> record) {
    cloudEventContextHandler.setOutgoingContext( (1)
          cloudEventContextHandler.contextBuilder() (2)
              .withExtension("extension", "value")
              .build());
    context.forward(record);
  }
}

1 Setting the CloudEvent metadata used to build the outgoing CloudEvent object using CloudEventContextHandler#setOutgoingContext()

Using a builder provider by the handler to avoid having exceptions because no source, no type and no id have been provided. If no id is defined, a UUID is generated. For source and type, it is taken from configuration:

kafkastreamsprocessor.output.cloud-events-source with a default source for all outgoing CloudEvent
kafkastreamsprocessor.output.cloud-events-type with a default type for all outgoing CloudEvent

Serialization customization

The CloudEventSerializer can be configured with some key/value configuration. This can be customized with kafkastreamsprocessor.output.cloud-event-serializer-config. The potential values are quickly documented in the Javadoc of clientevents-kafka but follows a more pieced-together version for convenience.

cloudevents.serializer.encoding: Potential values are:
- STRUCTURED: Structured mode. The payload and the metadata are all serialized as JSON and used as the payload of the Kafka message.
- BINARY: Binary mode (Default). The metadata is inserted as headers to the Kafka message.
cloudevents.serializer.event_format: if encoding (previous key) is set to STRUCTURED, what will be the data format of the payload. The available formats provided by the cloud event Java SDK can be found here. Potential values:
- application/cloudevents+json: JSON format with Jackson. You will need the cloudevents-json-jackson artifact on the classpath.
- application/cloudevents+protobuf: Protobuf format. You will need the cloudevents-protobuf artifact on the classpath.
- application/cloudevents+xml: XML format using JAXP. You will need the cloudevents-xml artifact on the classpath.
- application/cloudevents+avrocompact: the Avro Compact format. You will need the cloudevents-avro-compact artifact on the classpath.

Error management

Strategy for error management

When an exception occurs while processing a record, different strategies have been defined:

"continue" (default) : drop the message and continue processing
"fail" (not implemented yet) : fail and stop processing more message
"dead-letter-queue" : send the message to the DLQ and continue processing

DLQ monitoring

Such exception occurs are counted by a custom metric of this extension. If the dead-letter-queue strategy is chosen, the messages produced to the DLQ topics are counted.

Dead-letter-queue strategy

In the case of a dead-letter-queue strategy, the DLQ topic is added to the readiness probe to ensure the ability to produce the poisonous message to this topic.

This local DLQ is meant only to store the poisonous message that the application could not consume successfully. For error happening at production time, one can use a global DLQ defined for this purpose.

When a microservice cannot produce to a downstream Kafka broker and that the error cannot be managed by the application, the message will be produced in a global DLQ.

The following value must be set in the application.properties file

application.properties

kafkastreamsprocessor.global-dlq.topic=<global-dlq-topic>

By default, the maximum size for the message is java.lang.Integer.MAX_VALUE for the global DLQ. It can be modified with:

application.properties

kafkastreamsprocessor.global-dlq.max-message-size=3000

All the properties used in the extension are recapped here

For tracking purpose following headers are added to the record:

dead-letter-reason: The reason of the failure if any (Exception message)
dead-letter-cause: The cause of the failure if any (Exception cause)
dead-letter-topic: The original topic of the record, in this case the output topic where production failed
dead-letter-partition: The original partition of the record

Retries

In order to improve resiliency and to avoid losing messages due to the temporary unavailability of an external system (database, HTTP endpoint, …), the Kafka Streams Processor SDK has a retry logic for the exceptions raised during the process of the message.

The retry relies on the at-least-once guarantees of messaging: the processing must already support being called multiple times with the same message, and must make sure any external interaction is idempotent.

A RetryDecorator has been introduced inside the topology definition. Its responsibility is to retry indefinitely the @io.quarkiverse.kafkastreamsprocessor.api.Processor retryableProcess method when a io.quarkiverse.kafkastreamsprocessor.api.exception.RetryableException is raised.

The default behavior can be overridden via the following configuration:

application.properties

kafkastreamsprocessor.retry.max-retries=-1
kafkastreamsprocessor.retry.retry-on=io.quarkiverse.kafkastreamsprocessor.api.exception.RetryableException

As this implementation is based on Eclipse Microprofile Fault Tolerance it’s also possible to override some default policy (i.e.: maxDuration, jitter…) under the prefix kafkastreamsprocessor.retry..

You can also use the Fault Tolerance @org.eclipse.microprofile.faulttolerance.Retry annotation for some other logic in your code.

PingerService.java

public class PingerService {
    @Retry (1)
    @Override
    public Response ping(String message) {
    }
}

1	Define the method to retry with `org.eclipse.microprofile.faulttolerance.Retry` annotation

application.properties

# Override max retry default value
io.quarkiverse.kafkastreamsprocessor.example.PingerService/Retry/maxRetries=3

Monitoring

Kafka metrics cardinality

Users have to pay attention to some specificities of Kafka Streams metrics: some metrics have labels whose values are dynamic. There’s a possible impact on Prometheus because this could drastically increase the amount of data needed to store the time series. To understand how IDs label are computed, one can read the following documentation on KafkaStreams runtime information

To mitigate the impact, it is a good practice to fix arbitrary prefix to dynamic labels but still, the variability cannot be removed entirely.

As an example, all Kafka metrics have in common the label client_id, the resulting computation of the label by the SDK is:

Example metric

kafka_consumer_fetch_manager_records_consumed_total{client_id="application-name-StreamThread-1-consumer",kafka_version="3.6.3",topic="ping-events",} 1.0

The metric client_id value is formed of :

quarkus-kafka-streams-processor-simple-sample which is the client id defined in kafka-streams.client.id config property
StreamThread-1-consumer which is computed by KafkaStreams and depends on the number of threads.

So it is a good practice to set kafka-streams.client.id with the name of the microservice.

quarkus-kafka-streams-processor metric list

This list includes the additional metrics, on top of the Kafka Streams and the general Kafka ones, that this extension instruments:

Table 1. Messaging metrics
Micrometer metric name	Type	Description
`kafkastreamsprocessor.processor.errors`	Counter	Total number of errors encountered during Kafka Streams message processing

Table 2. Punctuation metrics
Micrometer metric name	Type	Description
`kafkastreamsprocessor.punctuation.errors`	Counter	The number of times the execution of a `Punctuator` failed with an exception since the start of the microservice.

Table 3. Dead Letter Queue Metrics
Micrometer metric name	Type	Description
`kafkastreamsprocessor.dlq.sent`	Counter	The number of messages sent to DLQ. This metrics differs from the `kafkastreamsprocessor.processor.errors` in that the deserialization errors are taken into account. If no DLQ strategy is chosen, then this metric will be 0. Messages too large for the DLQ will be counted both in this metrics and in `kafkastreamsprocessor.global.dlq.sent`.
`kafkastreamsprocessor.global.dlq.sent`	Counter	The number of messages sent to global DLQ.

A comparison between Reactive Messaging Kafka and Kafka Streams

These two technologies can be used to create streaming microservices to be used in Event-Driven architecture applications.

As explained in the Streaming processor introduction, a streaming microservice has some requirements:

An incoming message, from an input topic, can be acknowledged only when it has been processed and its result message has been produced and stored in a Kafka output topic.
If an incoming message could not be acknowledged (e.g: a microservice crash) then the incoming message will be redelivered (at-least-once guarantee).
Incoming messages are processed in the order of arrival (ordering guarantee)

In the Kafka Streams bootstrap, the guarantee of ordering from the source topic has a direct impact on the message consumption and the scalability model. As explained in the Kafka Streams, the number of partitions of the source topic determines how many threads can consume in parallel.

In other terms, for an input topic, it’s not possible to process more that x messages in parallel where x is the number of partitions of the source topic.

Starting asynchronous processing in a Kafka Streams microservice would lead to loss of messages: acknowledgment of the incoming messages will be de-synchronized from the production of messages once outgoing messages have been processed.

Reactive messaging has a different architecture and threading model compared to Kafka Streams. But by default, when processing messages it comes with the same guarantees seen above.

However, it’s possible to alter the behaviour of the message processing with annotations to improve, in a drastic manner, the concurrency of a microservice. The purpose of increasing concurrency is to be able to cope with streaming microservice that need to call slow remote APIs, this is the bridging from event-driven to service-oriented architecture.

Concurrent processing in Reactive Messaging

  @Incoming("ping-events") (1)
  @Outgoing("pong-events") (2)
  @Blocking(ordered = false) (3)
  public Ping process(Ping ping) {
    return api.remoteCall();
  }

1	`@Incoming` is declaring this method as a subscriber for the channel named `ping-events`
2	`@Outgoing` is declaring this method as a producer for the channel named `pong-events`
3	`@io.smallrye.reactive.messaging.annotations.Blocking` Indicates that this method is running out of the processing thread, inside a worker thread and the order of the messages is not important. Note that once the execution is finished the result is handled back to the processing thread.

Guarantee of delivery in concurrent processing

In Kafka Streams, performing asynchronous calls would lead to the possibility of messages not being redelivered. What about the guarantee of delivery when processing concurrent messages in Reactive Messaging?

It’s up to the acknowledgment policy to commit incoming messages when it has been processed concurrently. The default strategy is throttled and guarantees at-least-once delivery

throttled keeps track of received messages and commits to the next offset after the latest acked message in sequence. This strategy guarantees at-least-once delivery even if the channel performs asynchronous processing.

Effects of throttled.unprocessed-record-max-age.ms on the liveness probe

When using the throttled strategy, the commit of the sequence of messages will happen only when all messages have been processed (ordered or not).

If the processing method does not return before reaching the throttled.unprocessed-record-max-age.ms timeout, the microservice will be considered as unhealthy and the liveness probe will fail and the microservice will be restarted. It means that, for example, if a remote API takes too long to respond for some queries, and its timeout is configured with a value above throttled.unprocessed-record-max-age.ms, then messages will be received again after the restart, acting like a poison pill.

Processors summary

Here’s a recap of the guarantees offered by the different way of processing messages.

Processor	Guarantee of order	At-least-once delivery	Max concurrency
Default Kafka Streams	Yes	Yes	number of partitions
Kafka Streams asynchronous	No	No	number of threads
Default Reactive Messaging	Yes	Yes	number of partitions
Reactive Messaging with @Blocking	Yes	Yes	number of partitions
Reactive Messaging with @Blocking(order=false)	No	Yes	number of threads

Processor

Guarantee of order

At-least-once delivery

Max concurrency

Default Kafka Streams

Yes

number of partitions

Kafka Streams asynchronous

number of threads

Default Reactive Messaging

Yes

number of partitions

Reactive Messaging with @Blocking

Yes

number of partitions

Reactive Messaging with @Blocking(order=false)

Yes

number of threads

Stateful EDA

Introduction

To support stateful requirements, Kafka Streams' Processor needs to implement a State-Store. By default, no State-Store is linked to the io.quarkiverse.kafkastreamsprocessor.api.Processor but the application can override this configuration.

The quarkus-kafka-streams-processor extension provides support for both local and global state stores in Kafka Streams applications.

A local state store is scoped to a single application instance and is associated with a specific partition. It holds data that is only accessible to that instance and its assigned partition, ensuring the state is not shared with others. This is useful for storing state that is specific to the processing done by a particular instance and partition.

A global state store is replicated across all instances of the application. It contains data that is shared and accessible from any instance, ensuring that every node has a complete and consistent view of the store’s contents. Unlike local state stores, a global state store does not care about partition assignment: every instance receives all records, regardless of partition.

In both cases, you need to add the following dependency to your pom.xml

pom.xml

<dependency>
  <groupId>io.quarkiverse.kafkastreamsprocessor</groupId>
  <artifactId>quarkus-kafka-streams-processor-api</artifactId>
</dependency>

Local State-Store

The SDK provides the interface io.quarkiverse.kafkastreamsprocessor.api.configuration.ConfigurationCustomizer which allows you to specify the State-Store you need. You need to declare a bean implementing that interface.

Multiple customizers can be defined, and their execution order controlled through @Priority annotations.

In this example, a processor is implemented that interacts with a State-Store to persist and update data during stream processing.

StateStoreProcessor.java

@Slf4j
@Processor (1)
public class PingProcessor extends ContextualProcessor<String, Ping, String, Ping> {
    private KeyValueStore<String, String> pingData;

    @Override
    public void init(ProcessorContext<String, Ping> context) {
        super.init(context);
        pingData = context.getStateStore("ping-data"); (2)
        context.schedule(Duration.ofMillis(1L), PunctuationType.STREAM_TIME, new DuplicateValuePunctuator(pingData));
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void process(Record<String, Ping> record) {
        log.info("Process the message: {}", record.value().getMessage());

        String previousValue = pingData.get(record.key());
        pingData.put(record.key(), record.value().getMessage());

        if (previousValue == null) {
            context().forward(
                    record.withValue(Ping.newBuilder().setMessage("Store initialization OK for " + record.key()).build()));
        } else {
            context().forward(record.withValue(
                    Ping.newBuilder().setMessage("Previous value for " + record.key() + " is " + previousValue).build()));
        }
    }
}

1	Your Processor is declared with the annotation as for a regular processor.
2	The definition and initialization of your state store.

To configure a state store, implement the ConfigurationCustomizer interface and register it as a bean. In the fillConfiguration method, add your desired state store definitions to the provided Configuration object. This allows you to customize the state stores used by your processors according to your application’s requirements.

SampleConfigurationCustomizer.java

@Dependent (1)
public class SampleConfigurationCustomizer implements ConfigurationCustomizer {  (2)
    @Override
    public void fillConfiguration(Configuration configuration) {
        List<StoreConfiguration> storeConfigurations = new ArrayList<>(); (3)
        // Add a key value store for indexes
        StoreBuilder<KeyValueStore<String, String>> storeBuilder = Stores.keyValueStoreBuilder( (4)
                Stores.persistentKeyValueStore("ping-data"),
                Serdes.String(),
                Serdes.String());
        storeConfigurations.add(new StoreConfiguration(storeBuilder));
        configuration.setStoreConfigurations(storeConfigurations);
    }
}

1	Annotate your class with `@Dependent` to ensure it is managed by the CDI container.
2	Implement the `ConfigurationCustomizer` interface to customize the configuration of your Kafka Streams application.
3	Create a list of `StoreConfiguration` objects to define the state stores you want to use in your application.
4	Use the `Stores` utility class to create a `StoreBuilder` for your state store, specifying the store type and key/value serdes.

Global State-Store

Global state stores follow a key principle: they use two distinct processor types - global processors and business processors. Global processors are responsible for writing data to the global state store, while business processors should only read from it.

The SDK provides the interface io.quarkiverse.kafkastreamsprocessor.api.configuration.ConfigurationCustomizer which allows you to define your own configuration for global state stores. A key API for global state stores is the io.quarkiverse.kafkastreamsprocessor.api.configuration.store..GlobalStoreConfiguration that is used to add global state store to your topology.

In this example, a processor is implemented that retrieves data from a global state store and processes it.

BusinessProcessor.java

@Slf4j
@Processor
public class PingProcessor extends ContextualProcessor<String, Ping, String, Ping> {
    private KeyValueStore<String, String> pingData;

    @Override
    public void init(ProcessorContext<String, Ping> context) {
        super.init(context);
        pingData = context.getStateStore("ping-data");
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void process(Record<String, Ping> record) {
        String storedValue = pingData.get(record.key()); (1)

        log.info("Retrieve the value for key from global data store: {}", record.key());

        context().forward(record.withValue(
                Ping.newBuilder().setMessage("Stored value for " + record.key() + " is " + storedValue)
                        .build()));
    }
}

1	Retrieve the value from the global state store using the key from the incoming record.

SampleConfigurationCustomizer.java

@Dependent (1)
public class SampleConfigurationCustomizer implements ConfigurationCustomizer { (2)
    @Override
    public void fillConfiguration(Configuration configuration) {
        List<GlobalStoreConfiguration> globalStoreConfigurations = new ArrayList<>(); (3)

        StoreBuilder<KeyValueStore<String, String>> storeBuilder = Stores.keyValueStoreBuilder( (4)
                Stores.persistentKeyValueStore("ping-data"),
                Serdes.String(),
                Serdes.String())
                .withLoggingDisabled(); (5)

        globalStoreConfigurations.add(new GlobalStoreConfiguration<String, String>( (6)
                storeBuilder,
                new StringDeserializer(),
                new StringDeserializer(),
                () -> new CustomStoreProcessor("ping-data"))); (7)
        configuration.setGlobalStoreConfigurations(globalStoreConfigurations);
    }
}

1	Annotate your class with `@Dependent` to ensure it is managed by the CDI container.
2	Implement the `ConfigurationCustomizer` interface to customize the configuration of your Kafka Streams application.
3	Create a list of `GlobalStoreConfiguration` objects to define the global state stores you want to use in your application.
4	Use the `Stores` utility class to create a `StoreBuilder` for your global state store, specifying the store type and key/value serdes. Here, the store key and value are both of type `String`.
5	Having logging disabled is mandatory, otherwise you will get an exception at runtime.
6	Add a new `GlobalStoreConfiguration` to the list, specifying the store builder, key and value deserializers, and a supplier for the global processor.
7	The supplier provides an instance of the `CustomStoreProcessor`, which is responsible for processing records and storing them in the global state store. In the next step you will find the declaration of `CustomStoreProcessor` that is used to process records and store them in capitalized form. If you don’t specify a custom global processor it will use `io.quarkiverse.kafkastreamsprocessor.impl.configuration.store.DefaultGlobalStateStoreProcessor` by default, which simply store the records in the global state store without any forwarding.

Do not annotate the CustomStoreProcessor with @Processor. This class is intended solely as a global store processor, not as a standard functional processor.

GlobalStateStoreProcessor.java

    public class CustomStoreProcessor extends ContextualProcessor<String, String, Void, Void> { (1)

        private final String storeName;

        private KeyValueStore<String, String> store;

        CustomStoreProcessor(String storeName) { (2)
            this.storeName = storeName;
        }

        @Override
        public void init(ProcessorContext<Void, Void> context) {
            super.init(context);
            // Initialize the store
            this.store = context.getStateStore(storeName);
        }

        @Override
        public void process(Record<String, String> record) {
            // Process the record and store it in capitalized form
            store.put(record.key(), record.value().toUpperCase()); (3)
        }
    }

1	In this example, the processor is only meant to store records in a global state store, so it does not need to forward any records, therefore the output key and value types are `Void`.
2	The constructor takes the name of the store as a parameter, which is used to retrieve the store in the `init` method.
3	The `process` method retrieves the record’s key and value, processes the value by converting it to uppercase, and stores it in the global state store using the key.

The following property must be set in your application.properties to associate a global state store with its Kafka topic:

application.properties

kafkastreamsprocessor.global-stores.<store-name>.topic=<global-data-topic>

Replace <store-name> with the name of your global state store and <global-data-topic> with the Kafka topic containing the data to be loaded into the store. Make sure that the store name matches the one defined in the GlobalStoreConfiguration and that you include the topic to the list of topics, i.e. quarkus.kafka-streams.topics=topicA,topicB,<global-data-topic>.

Punctuation

Kafka Streams allows you to define Punctuator that are sort of scheduled tasks that Kafka Streams triggers (Kafka Streams documentation). One key issue with Punctuators is that they do not support Exceptions:

a checked Exception cannot be thrown as the method signature does not allow it
a RuntimeException because Kafka Streams does not catch it. It basically crashes your whole microservice.

To work around the latter point and increase stability, the quarkus-kafka-streams-processor extension wraps a Punctuator before it is added in Kafka Streams to catch the RuntimeException and log an error instead. Exceptions are also counted with a dedicated metric.

Custom decorators

The extension proposes some capabilities to customize more finely the behaviour of the different layers of decoration in addition to the ones that this extension brings to the table.

Processor decorator

The following decoration layer is already extensively used in this extension’s source code and allows to use composition around the main processor class you have to define. The pattern to implement is:

@Dependent (1)
@Priority(150) (2)
public class ProcessorDecorator extends AbstractProcessorDecorator { (3)
    @Override
    public void process(Record record) { (4)
      // use bean before
      getDelegate().process(record);
      // use bean after
    }
}

1	We have to mark the bean `Dependent` so it is instantiated at every use. Indeed, `KStreamProcessorSupplier` needs to return a new `Processor` instance everytime it is called, by Kafka Streams' specification.
2	We add a `Priority`, to control exactly the order of decorators. The priority is to be set based on the priorities of the existing decorators which are: ProcessorDecoratorPriorities.java /** * Priority of the decorator in charge of tracing, creating a span around the * {@link ContextualProcessor#process(Record)} method. / public static final int TRACING = 100; /* * Priority of the decorator in charge or initializing a "request context" for the duration of the processing of the * {@link ContextualProcessor#process(Record)} method. It is closed afterward. / public static final int CDI_REQUEST_SCOPE = 200; /* * Priority of the decorator that will handle exception and potentially redirect the message in a dead letter queue * topic, if configured. / public static final int DLQ = 300; /* * Priority of the decorator in charge of measuring the processing time and the number of exceptions thrown. / public static final int METRICS = 400; /* * Priority of the decorator in charge of injecting all {@link DecoratedPunctuator} configured by the framework and * your custom potential additions. / public static final int PUNCTUATOR_DECORATION = 500; /* * Priority for the decorator that wraps the {@link org.apache.kafka.streams.processor.api.ProcessorContext} to * intercept calls to its <code>forward</code> methods. / public static final int CONTEXT_FORWARD = 550; /* * Priority of the decorator in charge of implementing a form of fault tolerance by means of calling again the * {@link ContextualProcessor#process(Record)} method. */ public static final int RETRY = 600;
3	We remove the generic types from the class signature, because CDI does not like generics in beans.
4	Example of override of process method and call to underlying decorator.

Such a decorator will automatically been taken into account by CDI. The priority will control at which point your decorator will be called among all other decorators.

Output record interceptor

A Processor usually ends with the forwarding of an outgoing message. For that purpose, the ProcessorContext#foward methods are to be used.

From quarkus-kafka-streams-processor 4.1, calls to those methods can be intercepted. If you define a bean of type OutputRecordInterceptor, it will be taken up by the framework and executed. There is an order to multiple instances of interceptors, driven by the value returned by OutputRecordInterceptor#priority.

Example of interceptor:

@ApplicationScoped
public class MyOutputRecordInterceptor implements OutputRecordInterceptor {
    @Override
    public Record interceptOutputRecord(Record record) {
        record.headers().add("MyHeader", "MyValue".getBytes(StandardCharsets.UTF_8));
        return record;
    }

    @Override
    public int priority() {
        return 150; (1)
    }
}

1	Priority of this interceptor. Default priority is 100. References for priorities of existing interceptors this library provides: `/** * Priority of the interceptor that will inject the tracing headers for propagation. */ public static final int TRACING = 100;` Lower priority are executed first.

Producer interceptor

Kafka Streams already has the concept of a ProducerInterceptor. ProducerInterceptor are called after the key and value serialization, at the difference with OutputRecordInterceptor which are called before.

But as the rest of Kafka Streams SPI, it is based on a class name and a default constructor for instantiation. It does not support CDI resolution.

This is why this extension’s API defines a ProducerOnSendInterceptor interface that is instrumented through CDI. Example of usage:

MyProducerInterceptor.java

@ApplicationScoped (1)
public class HeaderAddingProducerInterceptor implements ProducerOnSendInterceptor { (2)
    @Override
    public int priority() { (3)
        return 200;
    }

    @Override
    public ProducerRecord<byte[], byte[]> onSend(ProducerRecord<byte[], byte[]> producerRecord) { (4)
        producerRecord.headers().remove("header");
        producerRecord.headers().add("header", "headerValue".getBytes(StandardCharsets.UTF_8));
        return producerRecord;
    }
}

1	Producer interceptors are discovered by CDI by the `ApplicationScoped` annotation
2	The interceptor class should extend `ProducerOnSendInterceptor`. `ProducerOnSendInterceptor` extends `ProducerInterceptor<byte[], byte[]>` and overrides some of its methods with default implementations to exempt their forced implementations further down the line.
3	This is not mandatory. But it allows to control in which order multiple interceptor that would defined are called. The default priority is `100`.
4	The intercepted method call which allow to modify the ProducerRecord before it is finally sent to Kafka. In this example, we replace a header named `header` with another value `headervalue`.

Punctuator decorator

A Kafka Streams Punctuator is a callback to use with ProcessorContext#schedule(…). It allows to schedule a periodic operation, depending on batch of incoming messages or on a timely manner.

We propose in the extension a way to decorate any Punctuator a microservice would create.

The extension does not give the capability to narrow the decoration on only a given instance or class of Punctuator.

The decoration happens through the implementation of a DecoratedPunctuator interface with a bit the same idea as for the Processor decorator: usage of the CDI Decorator feature. Example of decoration:

MyDecoratedPunctuator.java

@Decorator (1)
@Priority(150) (2)
public class CdiRequestContextPunctuatorDecorator implements DecoratedPunctuator { (3)
    @lombok.experimental.Delegate(excludes = Excludes.class)
    private final DecoratedPunctuator delegate; (4)

    private final MyBean myBean;

    @Inject
    public CdiRequestContextPunctuatorDecorator(@Delegate DecoratedPunctuator delegate, MyBean myBean) { (5)
        this.delegate = delegate;
        this.myBean = myBean;
    }

    @Override
    public void punctuate(long timestamp) { (6)
        // use before punctuate the myBean reference
        delegate.punctuate(timestamp);
        // use after punctuate the myBean reference
    }

    private interface Excludes {
        void punctuate(long timestamp);
    }
}

1	Decorator annotation to profit from the decorator feature of CDI
2	Force the instantiation of the decorator with the Priority annotation. Indeed, otherwise the decorator is not taken into account by Quarkus. The priority is to be set based on the priorities of the existing decorators which are: PunctuatorDecoratorPriorities.java `/** * Priority of the {@link DecoratedPunctuator} that enabled a "request context" for the duration of the * {@link Punctuator#punctuate(long)} processing. / public static final int CDI_REQUEST_SCOPE = 100; /* * Priority of the {@link DecoratedPunctuator} that catches punctuation exception to avoid making the entire * microservice crash and counts those exceptions in a metric. */ public static final int METRICS = 200;`
3	The decorator should extend the `DecoratedPunctuator` interface defined by this extension. It won’t be considered otherwise.
4	Delegate reference to use when decorating methods. It is annotated with Delegate annotation from Lombok to generate passthrough decorated methods that this Decorator class won’t decorate. The selection is done through a blacklist of method signatures gathered in a private `Excludes` interface declared at the end of the class.
5	Injection constructor which must have a delegate argument annotated with the `Delegate` annotation from CDI. Is showcased here also the injection of another bean to be used in this decorator.
6	Example of decorated method, here the main `punctuate` method of the `Punctuator` interface.

Extension configuration reference

Configuration property fixed at build time - All other configuration properties are overridable at runtime

Configuration property	Type	Default
`kafkastreamsprocessor.input.topic` Single topic to listen to. If you need more than one, use `topics()` or `sources()`. Environment variable: `KAFKASTREAMSPROCESSOR_INPUT_TOPIC`	string
`kafkastreamsprocessor.input.topics` List of topics to listen to. If you need only one, use `topic()` Environment variable: `KAFKASTREAMSPROCESSOR_INPUT_TOPICS`	list of string
`kafkastreamsprocessor.input.sources."sources".topics` To which topics will KafkaStreams connect to for this source. Environment variable: `KAFKASTREAMSPROCESSOR_INPUT_SOURCES__SOURCES__TOPICS`	list of string	required
`kafkastreamsprocessor.input.is-cloud-event` Whether cloud events are used on the input Environment variable: `KAFKASTREAMSPROCESSOR_INPUT_IS_CLOUD_EVENT`	boolean	`false`
`kafkastreamsprocessor.input.cloud-event-deserializer-config."cloud-event-deserializer-config"` Allows to inject custom configuration for the CloudEventDeserializer. As of now, only one configuration property is supported: `cloudevents.datamapper` and its value must be a reference to a CloudEventDataMapper, which with the String value type here, cannot be passed. But the Cloud Events Java SDK could add other configuration properties later on, and this map makes `quarkus-kafka-streams-processor` more future-proof. Environment variable: `KAFKASTREAMSPROCESSOR_INPUT_CLOUD_EVENT_DESERIALIZER_CONFIG__CLOUD_EVENT_DESERIALIZER_CONFIG_`	Map<String,String>
`kafkastreamsprocessor.output.topic` The processor is mono-output, we designate one topic Environment variable: `KAFKASTREAMSPROCESSOR_OUTPUT_TOPIC`	string
`kafkastreamsprocessor.output.sinks."sinks".topic` The topic associated to this sink Environment variable: `KAFKASTREAMSPROCESSOR_OUTPUT_SINKS__SINKS__TOPIC`	string	required
`kafkastreamsprocessor.output.is-cloud-event` Whether cloud events are used on the output Environment variable: `KAFKASTREAMSPROCESSOR_OUTPUT_IS_CLOUD_EVENT`	boolean	`false`
`kafkastreamsprocessor.output.cloud-event-serializer-config."cloud-event-serializer-config"` Allows to inject custom configuration for the CloudEventSerializer. The potential values are documented in the cloudevents Java SDK (linked thereafter) and in the quarkus-kafka-streams-processor userguide Environment variable: `KAFKASTREAMSPROCESSOR_OUTPUT_CLOUD_EVENT_SERIALIZER_CONFIG__CLOUD_EVENT_SERIALIZER_CONFIG_`	Map<String,String>
`kafkastreamsprocessor.output.cloud-events-type` Allows to define the type field of the CloudEvent for all the configured sinks. It is used only if `OutputConfig#isCloudEvent()` is `true`. Environment variable: `KAFKASTREAMSPROCESSOR_OUTPUT_CLOUD_EVENTS_TYPE`	string
`kafkastreamsprocessor.output.cloud-events-source` Allows to define the source field of the CloudEvent for all the configured sinks. It is used only if `OutputConfig#isCloudEvent()` is `true`. Environment variable: `KAFKASTREAMSPROCESSOR_OUTPUT_CLOUD_EVENTS_SOURCE`	URI
`kafkastreamsprocessor.output.cloud-events-spec-version` Version of the CloudEvents spec to use Environment variable: `KAFKASTREAMSPROCESSOR_OUTPUT_CLOUD_EVENTS_SPEC_VERSION`	`v03`, `v1`	`v1`
`kafkastreamsprocessor.global-stores."global-stores".topic` The name of the global state store topic Environment variable: `KAFKASTREAMSPROCESSOR_GLOBAL_STORES__GLOBAL_STORES__TOPIC`	string	required
`kafkastreamsprocessor.dlq.topic` Topic to use as dead-letter-queue Environment variable: `KAFKASTREAMSPROCESSOR_DLQ_TOPIC`	string
`kafkastreamsprocessor.dlq.cloud-event-serializer-config."cloud-event-serializer-config"` Allows to inject custom configuration for the CloudEventSerializer that will be used to serialize the input cloud event to the DLQ. The possible keys that can be used are documented in `CloudEventOutputConfig#cloudEventSerializerConfig()`. Environment variable: `KAFKASTREAMSPROCESSOR_DLQ_CLOUD_EVENT_SERIALIZER_CONFIG__CLOUD_EVENT_SERIALIZER_CONFIG_`	Map<String,String>
`kafkastreamsprocessor.global-dlq.topic` Global Dead letter Queue to produce error messages not managed by the application Environment variable: `KAFKASTREAMSPROCESSOR_GLOBAL_DLQ_TOPIC`	string
`kafkastreamsprocessor.global-dlq.max-message-size` Global Dead letter Queue maximum message size in bytes. Default is 2147483647 bytes (which means about 2GB). Environment variable: `KAFKASTREAMSPROCESSOR_GLOBAL_DLQ_MAX_MESSAGE_SIZE`	int	`2147483647`
`kafkastreamsprocessor.error-strategy` Kafka error handling strategy. Possible values are: `continue`: (default) drop the message and continue processing `dead-letter-queue`: send the message to the DLQ and continue processing `fail`: (not implemented yet) fail and stop processing more message Environment variable: `KAFKASTREAMSPROCESSOR_ERROR_STRATEGY`	string	`continue`
`kafkastreamsprocessor.retry.max-retries` Max number of retries. Environment variable: `KAFKASTREAMSPROCESSOR_RETRY_MAX_RETRIES`	int	`-1`
`kafkastreamsprocessor.retry.delay` The delay between retries. Environment variable: `KAFKASTREAMSPROCESSOR_RETRY_DELAY`	long	`0`
`kafkastreamsprocessor.retry.delay-unit` The unit for `delay()`. Default is milliseconds. Environment variable: `KAFKASTREAMSPROCESSOR_RETRY_DELAY_UNIT`	`nanos`, `micros`, `millis`, `seconds`, `minutes`, `hours`, `half-days`, `days`, `weeks`, `months`, `years`, `decades`, `centuries`, `millennia`, `eras`, `forever`	`millis`
`kafkastreamsprocessor.retry.max-duration` The max duration. Environment variable: `KAFKASTREAMSPROCESSOR_RETRY_MAX_DURATION`	long	`180000`
`kafkastreamsprocessor.retry.duration-unit` The duration unit for `max-duration()`. Milliseconds by default. Environment variable: `KAFKASTREAMSPROCESSOR_RETRY_DURATION_UNIT`	`nanos`, `micros`, `millis`, `seconds`, `minutes`, `hours`, `half-days`, `days`, `weeks`, `months`, `years`, `decades`, `centuries`, `millennia`, `eras`, `forever`	`millis`
`kafkastreamsprocessor.retry.jitter` Jitter value to randomly vary retry delays for. Environment variable: `KAFKASTREAMSPROCESSOR_RETRY_JITTER`	long	`200`
`kafkastreamsprocessor.retry.jitter-delay-unit` The delay unit for `jitter()`. Default is milliseconds. Environment variable: `KAFKASTREAMSPROCESSOR_RETRY_JITTER_DELAY_UNIT`	`nanos`, `micros`, `millis`, `seconds`, `minutes`, `hours`, `half-days`, `days`, `weeks`, `months`, `years`, `decades`, `centuries`, `millennia`, `eras`, `forever`	`millis`
`kafkastreamsprocessor.retry.retry-on` The list of exception types that should trigger a retry. Default is the extension’s RetryableException Environment variable: `KAFKASTREAMSPROCESSOR_RETRY_RETRY_ON`	list of string	`io.quarkiverse.kafkastreamsprocessor.api.exception.RetryableException`
`kafkastreamsprocessor.retry.abort-on` The list of exception types that should not trigger a retry. Default is empty list Environment variable: `KAFKASTREAMSPROCESSOR_RETRY_ABORT_ON`	list of string

Configuration property

Type

Default

kafkastreamsprocessor.input.topic

Single topic to listen to.

If you need more than one, use topics() or sources().

Environment variable: KAFKASTREAMSPROCESSOR_INPUT_TOPIC

string

kafkastreamsprocessor.input.topics

List of topics to listen to.

If you need only one, use topic()

Environment variable: KAFKASTREAMSPROCESSOR_INPUT_TOPICS

list of string

kafkastreamsprocessor.input.sources."sources".topics

To which topics will KafkaStreams connect to for this source.

Environment variable: KAFKASTREAMSPROCESSOR_INPUT_SOURCES__SOURCES__TOPICS

list of string

required

kafkastreamsprocessor.input.is-cloud-event

Whether cloud events are used on the input

Environment variable: KAFKASTREAMSPROCESSOR_INPUT_IS_CLOUD_EVENT

boolean

false

kafkastreamsprocessor.input.cloud-event-deserializer-config."cloud-event-deserializer-config"

Allows to inject custom configuration for the CloudEventDeserializer.

As of now, only one configuration property is supported: cloudevents.datamapper and its value must be a reference to a CloudEventDataMapper, which with the String value type here, cannot be passed.

But the Cloud Events Java SDK could add other configuration properties later on, and this map makes quarkus-kafka-streams-processor more future-proof.

Environment variable: KAFKASTREAMSPROCESSOR_INPUT_CLOUD_EVENT_DESERIALIZER_CONFIG__CLOUD_EVENT_DESERIALIZER_CONFIG_

Map<String,String>

kafkastreamsprocessor.output.topic

The processor is mono-output, we designate one topic

Environment variable: KAFKASTREAMSPROCESSOR_OUTPUT_TOPIC

string

kafkastreamsprocessor.output.sinks."sinks".topic

The topic associated to this sink

Environment variable: KAFKASTREAMSPROCESSOR_OUTPUT_SINKS__SINKS__TOPIC

string

required

kafkastreamsprocessor.output.is-cloud-event

Whether cloud events are used on the output

Environment variable: KAFKASTREAMSPROCESSOR_OUTPUT_IS_CLOUD_EVENT

boolean

false

kafkastreamsprocessor.output.cloud-event-serializer-config."cloud-event-serializer-config"

Allows to inject custom configuration for the CloudEventSerializer.

The potential values are documented in the cloudevents Java SDK (linked thereafter) and in the quarkus-kafka-streams-processor userguide

Environment variable: KAFKASTREAMSPROCESSOR_OUTPUT_CLOUD_EVENT_SERIALIZER_CONFIG__CLOUD_EVENT_SERIALIZER_CONFIG_

Map<String,String>

kafkastreamsprocessor.output.cloud-events-type

Allows to define the type field of the CloudEvent for all the configured sinks.

It is used only if OutputConfig#isCloudEvent() is true.

Environment variable: KAFKASTREAMSPROCESSOR_OUTPUT_CLOUD_EVENTS_TYPE

string

kafkastreamsprocessor.output.cloud-events-source

Allows to define the source field of the CloudEvent for all the configured sinks.

It is used only if OutputConfig#isCloudEvent() is true.

Environment variable: KAFKASTREAMSPROCESSOR_OUTPUT_CLOUD_EVENTS_SOURCE

URI

kafkastreamsprocessor.output.cloud-events-spec-version

Version of the CloudEvents spec to use

Environment variable: KAFKASTREAMSPROCESSOR_OUTPUT_CLOUD_EVENTS_SPEC_VERSION

v03, v1

v1

kafkastreamsprocessor.global-stores."global-stores".topic

The name of the global state store topic

Environment variable: KAFKASTREAMSPROCESSOR_GLOBAL_STORES__GLOBAL_STORES__TOPIC

string

required

kafkastreamsprocessor.dlq.topic

Topic to use as dead-letter-queue

Environment variable: KAFKASTREAMSPROCESSOR_DLQ_TOPIC

string

kafkastreamsprocessor.dlq.cloud-event-serializer-config."cloud-event-serializer-config"

Allows to inject custom configuration for the CloudEventSerializer that will be used to serialize the input cloud event to the DLQ.

The possible keys that can be used are documented in CloudEventOutputConfig#cloudEventSerializerConfig().

Environment variable: KAFKASTREAMSPROCESSOR_DLQ_CLOUD_EVENT_SERIALIZER_CONFIG__CLOUD_EVENT_SERIALIZER_CONFIG_

Map<String,String>

kafkastreamsprocessor.global-dlq.topic

Global Dead letter Queue to produce error messages not managed by the application

Environment variable: KAFKASTREAMSPROCESSOR_GLOBAL_DLQ_TOPIC

string

kafkastreamsprocessor.global-dlq.max-message-size

Global Dead letter Queue maximum message size in bytes.

Default is 2147483647 bytes (which means about 2GB).

Environment variable: KAFKASTREAMSPROCESSOR_GLOBAL_DLQ_MAX_MESSAGE_SIZE

int

2147483647

kafkastreamsprocessor.error-strategy

Kafka error handling strategy.

Possible values are:

continue: (default) drop the message and continue processing
dead-letter-queue: send the message to the DLQ and continue processing
fail: (not implemented yet) fail and stop processing more message

Environment variable: KAFKASTREAMSPROCESSOR_ERROR_STRATEGY

string

continue

kafkastreamsprocessor.retry.max-retries

Max number of retries.

Environment variable: KAFKASTREAMSPROCESSOR_RETRY_MAX_RETRIES

int

-1

kafkastreamsprocessor.retry.delay

The delay between retries.

Environment variable: KAFKASTREAMSPROCESSOR_RETRY_DELAY

long

0

kafkastreamsprocessor.retry.delay-unit

The unit for delay().

Default is milliseconds.

Environment variable: KAFKASTREAMSPROCESSOR_RETRY_DELAY_UNIT

nanos, micros, millis, seconds, minutes, hours, half-days, days, weeks, months, years, decades, centuries, millennia, eras, forever

millis

kafkastreamsprocessor.retry.max-duration

The max duration.

Environment variable: KAFKASTREAMSPROCESSOR_RETRY_MAX_DURATION

long

180000

kafkastreamsprocessor.retry.duration-unit

The duration unit for max-duration().

Milliseconds by default.

Environment variable: KAFKASTREAMSPROCESSOR_RETRY_DURATION_UNIT

nanos, micros, millis, seconds, minutes, hours, half-days, days, weeks, months, years, decades, centuries, millennia, eras, forever

millis

kafkastreamsprocessor.retry.jitter

Jitter value to randomly vary retry delays for.

Environment variable: KAFKASTREAMSPROCESSOR_RETRY_JITTER

long

200

kafkastreamsprocessor.retry.jitter-delay-unit

The delay unit for jitter().

Default is milliseconds.

Environment variable: KAFKASTREAMSPROCESSOR_RETRY_JITTER_DELAY_UNIT

nanos, micros, millis, seconds, minutes, hours, half-days, days, weeks, months, years, decades, centuries, millennia, eras, forever

millis

kafkastreamsprocessor.retry.retry-on

The list of exception types that should trigger a retry.

Default is the extension’s RetryableException

Environment variable: KAFKASTREAMSPROCESSOR_RETRY_RETRY_ON

list of string

io.quarkiverse.kafkastreamsprocessor.api.exception.RetryableException

kafkastreamsprocessor.retry.abort-on

The list of exception types that should not trigger a retry.

Default is empty list

Environment variable: KAFKASTREAMSPROCESSOR_RETRY_ABORT_ON

list of string

Configuration from other extension

Configuration property fixed at build time - All other configuration properties are overridable at runtime

Configuration property	Type	Default
`quarkus.kafka-streams.topics` A comma-separated list of topic names. The micro-service will only be ready once all these topics are present in the Kafka cluster. Environment variable: `QUARKUS_KAFKA_STREAMS_TOPIC`	list of string
`quarkus.kafka-streams.application-id` A unique identifier for this Kafka Streams application. It is used as Kafka consumer group id, so multiple replicas of micro-service balance their partition workload. Environment variable: `QUARKUS_KAFKA_STREAMS_APPLICATION_ID`	string	${quarkus.application.name}
`kafka-streams.producer.linger.ms` It is the number of milliseconds a producer is willing to wait before sending a batch out. Environment variable: `KAFKA_STREAMS_PRODUCER_LINGER_MS`	int	0
`kafka-streams.compression.type` The compression type for all data generated by the producer. Environment variable: `KAFKA_STREAMS_COMPRESSION_TYPE`	Possible values: none gzip snappy lz4 zstd	none

Configuration property

Type

Default

quarkus.kafka-streams.topics

A comma-separated list of topic names.

The micro-service will only be ready once all these topics are present in the Kafka cluster.

Environment variable: QUARKUS_KAFKA_STREAMS_TOPIC

list of string

quarkus.kafka-streams.application-id

A unique identifier for this Kafka Streams application.

It is used as Kafka consumer group id, so multiple replicas of micro-service balance their partition workload.

Environment variable: QUARKUS_KAFKA_STREAMS_APPLICATION_ID