Amazon Kinesis Client
Kinesis Data Streams is a managed service provided by Amazon Web Services (AWS) that scales elastically for real-time processing of streaming big data.
You can find more information about Kinesis at AWS Kinesis Data Streams Service API Reference.
The Kinesis extension is based on AWS Java SDK 2.x. It’s a major rewrite of the 1.x code base that offers two programming models (Blocking & Async). |
The Quarkus extension supports two programming models:
-
Blocking access using URL Connection HTTP client (by default) or the Apache HTTP Client
-
Asynchronous programming based on JDK’s
CompletableFuture
objects and the Netty HTTP client.
In this guide, we see how you can get your REST services to use Kinesis locally and on AWS.
Prerequisites
To complete this guide, you need:
-
JDK 17+ installed with
JAVA_HOME
configured appropriately -
an IDE
-
Apache Maven 3.8.1+
-
An AWS Account to access the Kinesis service
-
Docker for your system to run Kinesis locally for testing purposes
Provision Kinesis locally via Dev Services
The easiest way to start working with Kinesis is to run a local instance using Dev Services.
Provision Kinesis locally manually
You can also set up a local version of Kinesis manually, first start a LocalStack container:
docker run --rm --name local-kinesis --publish 4592:4592 -e SERVICES=kinesis -e START_WEB=0 -d localstack/localstack:3.7.2
This starts a Kinesis instance that is accessible on port 4592
.
Create an AWS profile for your local instance using AWS CLI:
$ aws configure --profile localstack
AWS Access Key ID [None]: test-key
AWS Secret Access Key [None]: test-secret
Default region name [None]: us-east-1
Default output format [None]:
Creating the Maven project
First, we need a new project. Create a new project with the following command:
mvn io.quarkus.platform:quarkus-maven-plugin:3.17.0:create \
-DprojectGroupId=org.acme \
-DprojectArtifactId=amazon-kinesis-quickstart \
-DclassName="org.acme.kinesis.QuarkusKinesisSyncResource" \
-Dpath="/sync" \
-Dextensions="resteasy-reactive-jackson,amazon-kinesis"
cd amazon-kinesis-quickstart
This command generates a Maven structure importing the RESTEasy Reactive/JAX-RS and Amazon Kinesis Client extensions.
After this, the amazon-kinesis
extension has been added to your pom.xml
as well as the Mutiny support for RESTEasy.
Configuring Kinesis clients
Both Kinesis clients (sync and async) are configurable via the application.properties
file that can be provided in the src/main/resources
directory.
Additionally, you need to add to the classpath a proper implementation of the sync client. By default the extension uses the URL connection HTTP client, so
you need to add a URL connection client dependency to the pom.xml
file:
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>url-connection-client</artifactId>
</dependency>
If you want to use the Apache HTTP client instead, configure it as follows:
quarkus.kinesis.sync-client.type=apache
And add the following dependency to the application pom.xml
:
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>apache-client</artifactId>
</dependency>
If you want to use the AWS CRT-based HTTP client instead, configure it as follows:
quarkus.kinesis.sync-client.type=aws-crt
And add the following dependency to the application pom.xml
:
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>aws-crt-client</artifactId>
</dependency>
If you’re going to use a local Kinesis instance, configure it as follows:
quarkus.kinesis.endpoint-override=http://localhost:4592 (1)
quarkus.kinesis.aws.region=us-east-1 (2)
quarkus.kinesis.aws.credentials.type=static (3)
quarkus.kinesis.aws.credentials.static-provider.access-key-id=test-key
quarkus.kinesis.aws.credentials.static-provider.secret-access-key=test-secret
1 | Override the Kinesis client to use LocalStack instead of the actual AWS service |
2 | Localstack defaults to us-east-1 |
3 | The static credentials provider lets you set the access-key-id and secret-access-key directly |
If you want to work with an AWS account, you can simply remove or comment out all Amazon Kinesis related properties. By default, the ST client extension will use the default
credentials provider chain that looks for credentials in this order:
-
Java System Properties -
aws.accessKeyId
andaws.secretAccessKey
-
Environment Variables -
AWS_ACCESS_KEY_ID
andAWS_SECRET_ACCESS_KEY
-
Credential profiles file at the default location (
~/.aws/credentials
) shared by all AWS SDKs and the AWS CLI -
Credentials delivered through the Amazon ECS if the
AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
environment variable is set and the security manager has permission to access the variable, -
Instance profile credentials delivered through the Amazon EC2 metadata service
And the region from your AWS CLI profile will be used.
Next steps
Packaging
Packaging your application is as simple as ./mvnw clean package
.
It can then be run with java -Dparameters.path=/quarkus/is/awesome/ -jar target/quarkus-app/quarkus-run.jar
.
With GraalVM installed, you can also create a native executable binary: ./mvnw clean package -Dnative
.
Depending on your system, that will take some time.
Going asynchronous
Thanks to the AWS SDK v2.x used by the Quarkus extension, you can use the asynchronous programming model out of the box.
We need to add the Netty HTTP client dependency to the pom.xml
:
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>netty-nio-client</artifactId>
</dependency>
If you want to use the AWS CRT-based HTTP client instead, configure it as follows:
quarkus.kinesis.async-client.type=aws-crt
And add the following dependency to the application pom.xml
:
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>aws-crt-client</artifactId>
</dependency>
Configuration Reference
Configuration property fixed at build time - All other configuration properties are overridable at runtime
Configuration property |
Type |
Default |
---|---|---|
List of execution interceptors that will have access to read and modify the request and response objects as they are processed by the AWS SDK. The list should consists of class names which implements Environment variable: |
list of string |
|
OpenTelemetry AWS SDK instrumentation will be enabled if the OpenTelemetry extension is present and this value is true. Environment variable: |
boolean |
|
Type of the sync HTTP client implementation Environment variable: |
|
|
Type of the async HTTP client implementation Environment variable: |
|
|
If a local AWS stack should be used. (default to true) If this is true and endpoint-override is not configured then a local AWS stack will be started and will be used instead of the given configuration. For all services but Cognito, the local AWS stack will be provided by LocalStack. Otherwise, it will be provided by Moto Environment variable: |
boolean |
|
Indicates if the LocalStack container managed by Dev Services is shared. When shared, Quarkus looks for running containers using label-based service discovery. If a matching container is found, it is used, and so a second one is not started. Otherwise, Dev Services starts a new container. The discovery uses the Sharing is not supported for the Cognito extension. Environment variable: |
boolean |
|
Indicates if shared LocalStack services managed by Dev Services should be isolated. When true, the service will be started in its own container and the value of the Environment variable: |
boolean |
|
The value of the This property is used when you need multiple shared LocalStack instances. Environment variable: |
string |
|
Generic properties that are pass for additional container configuration. Environment variable: |
Map<String,String> |
|
Type |
Default |
|
The endpoint URI with which the SDK should communicate. If not specified, an appropriate endpoint to be used for the given service and region. Environment variable: |
||
The amount of time to allow the client to complete the execution of an API call. This timeout covers the entire client execution except for marshalling. This includes request handler execution, all HTTP requests including retries, unmarshalling, etc. This value should always be positive, if present. Environment variable: |
||
The amount of time to wait for the HTTP request to complete before giving up and timing out. This value should always be positive, if present. Environment variable: |
||
Whether the Quarkus thread pool should be used for scheduling tasks such as async retry attempts and timeout task. When disabled, the default sdk behavior is to create a dedicated thread pool for each client, resulting in competition for CPU resources among these thread pools. Environment variable: |
boolean |
|
Type |
Default |
|
An Amazon Web Services region that hosts the given service. It overrides region provider chain with static value of region with which the service client should communicate. If not set, region is retrieved via the default providers chain in the following order:
See Environment variable: |
Region |
|
Configure the credentials provider that should be used to authenticate with AWS. Available values:
Environment variable: |
|
|
Type |
Default |
|
Whether this provider should fetch credentials asynchronously in the background. If this is Environment variable: |
boolean |
|
Whether the provider should reuse the last successful credentials provider in the chain. Reusing the last successful credentials provider will typically return credentials faster than searching through the chain. Environment variable: |
boolean |
|
Type |
Default |
|
AWS Access key id Environment variable: |
string |
|
AWS Secret access key Environment variable: |
string |
|
AWS Session token Environment variable: |
string |
|
Type |
Default |
|
The name of the profile that should be used by this credentials provider. If not specified, the value in Environment variable: |
string |
|
Type |
Default |
|
Whether the provider should fetch credentials asynchronously in the background. If this is true, threads are less likely to block when credentials are loaded, but additional resources are used to maintain the provider. Environment variable: |
boolean |
|
The amount of time between when the credentials expire and when the credentials should start to be refreshed. This allows the credentials to be refreshed *before* they are reported to expire. Environment variable: |
|
|
The maximum size of the output that can be returned by the external process before an exception is raised. Environment variable: |
|
|
The command that should be executed to retrieve credentials. Command and parameters are seperated list entries. Environment variable: |
list of string |
|
Type |
Default |
|
The name of custom AwsCredentialsProvider bean. Environment variable: |
string |
|
Type |
Default |
|
The maximum amount of time to establish a connection before timing out. Environment variable: |
|
|
The amount of time to wait for data to be transferred over an established, open connection before the connection is timed out. Environment variable: |
|
|
TLS key managers provider type. Available providers:
Environment variable: |
|
|
Path to the key store. Environment variable: |
path |
|
Key store type. See the KeyStore section in the https://docs.oracle.com/javase/8/docs/technotes/guides/security/StandardNames.html#KeyStore[Java Cryptography Architecture Standard Algorithm Name Documentation] for information about standard keystore types. Environment variable: |
string |
|
Key store password Environment variable: |
string |
|
TLS trust managers provider type. Available providers:
Environment variable: |
|
|
Path to the key store. Environment variable: |
path |
|
Key store type. See the KeyStore section in the https://docs.oracle.com/javase/8/docs/technotes/guides/security/StandardNames.html#KeyStore[Java Cryptography Architecture Standard Algorithm Name Documentation] for information about standard keystore types. Environment variable: |
string |
|
Key store password Environment variable: |
string |
|
Type |
Default |
|
The amount of time to wait when acquiring a connection from the pool before giving up and timing out. Environment variable: |
|
|
The maximum amount of time that a connection should be allowed to remain open while idle. Environment variable: |
|
|
The maximum amount of time that a connection should be allowed to remain open, regardless of usage frequency. Environment variable: |
||
The maximum number of connections allowed in the connection pool. Each built HTTP client has its own private connection pool. Environment variable: |
int |
|
Whether the client should send an HTTP expect-continue handshake before each request. Environment variable: |
boolean |
|
Whether the idle connections in the connection pool should be closed asynchronously. When enabled, connections left idling for longer than Environment variable: |
boolean |
|
Configure whether to enable or disable TCP KeepAlive. Environment variable: |
boolean |
|
Enable HTTP proxy Environment variable: |
boolean |
|
The endpoint of the proxy server that the SDK should connect through. Currently, the endpoint is limited to a host and port. Any other URI components will result in an exception being raised. Environment variable: |
||
The username to use when connecting through a proxy. Environment variable: |
string |
|
The password to use when connecting through a proxy. Environment variable: |
string |
|
For NTLM proxies - the Windows domain name to use when authenticating with the proxy. Environment variable: |
string |
|
For NTLM proxies - the Windows workstation name to use when authenticating with the proxy. Environment variable: |
string |
|
Whether to attempt to authenticate preemptively against the proxy server using basic authentication. Environment variable: |
boolean |
|
The hosts that the client is allowed to access without going through the proxy. Environment variable: |
list of string |
|
Type |
Default |
|
The maximum amount of time that a connection should be allowed to remain open while idle. Environment variable: |
|
|
The maximum number of allowed concurrent requests. Environment variable: |
int |
|
Enable HTTP proxy Environment variable: |
boolean |
|
The endpoint of the proxy server that the SDK should connect through. Currently, the endpoint is limited to a host and port. Any other URI components will result in an exception being raised. Environment variable: |
||
The username to use when connecting through a proxy. Environment variable: |
string |
|
The password to use when connecting through a proxy. Environment variable: |
string |
|
Type |
Default |
|
The maximum number of allowed concurrent requests. For HTTP/1.1 this is the same as max connections. For HTTP/2 the number of connections that will be used depends on the max streams allowed per connection. Environment variable: |
int |
|
The maximum number of pending acquires allowed. Once this exceeds, acquire tries will be failed. Environment variable: |
int |
|
The amount of time to wait for a read on a socket before an exception is thrown. Specify Environment variable: |
|
|
The amount of time to wait for a write on a socket before an exception is thrown. Specify Environment variable: |
|
|
The amount of time to wait when initially establishing a connection before giving up and timing out. Environment variable: |
|
|
The amount of time to wait when acquiring a connection from the pool before giving up and timing out. Environment variable: |
|
|
The maximum amount of time that a connection should be allowed to remain open, regardless of usage frequency. Environment variable: |
||
The maximum amount of time that a connection should be allowed to remain open while idle. Currently has no effect if Environment variable: |
|
|
Whether the idle connections in the connection pool should be closed. When enabled, connections left idling for longer than Environment variable: |
boolean |
|
Configure whether to enable or disable TCP KeepAlive. Environment variable: |
boolean |
|
The HTTP protocol to use. Environment variable: |
|
|
The SSL Provider to be used in the Netty client. Default is Environment variable: |
|
|
The maximum number of concurrent streams for an HTTP/2 connection. This setting is only respected when the HTTP/2 protocol is used. Environment variable: |
long |
|
The initial window size for an HTTP/2 stream. This setting is only respected when the HTTP/2 protocol is used. Environment variable: |
int |
|
Sets the period that the Netty client will send This setting is only respected when the HTTP/2 protocol is used. Environment variable: |
|
|
Enable HTTP proxy. Environment variable: |
boolean |
|
The endpoint of the proxy server that the SDK should connect through. Currently, the endpoint is limited to a host and port. Any other URI components will result in an exception being raised. Environment variable: |
||
The hosts that the client is allowed to access without going through the proxy. Environment variable: |
list of string |
|
TLS key managers provider type. Available providers:
Environment variable: |
|
|
Path to the key store. Environment variable: |
path |
|
Key store type. See the KeyStore section in the https://docs.oracle.com/javase/8/docs/technotes/guides/security/StandardNames.html#KeyStore[Java Cryptography Architecture Standard Algorithm Name Documentation] for information about standard keystore types. Environment variable: |
string |
|
Key store password Environment variable: |
string |
|
TLS trust managers provider type. Available providers:
Environment variable: |
|
|
Path to the key store. Environment variable: |
path |
|
Key store type. See the KeyStore section in the https://docs.oracle.com/javase/8/docs/technotes/guides/security/StandardNames.html#KeyStore[Java Cryptography Architecture Standard Algorithm Name Documentation] for information about standard keystore types. Environment variable: |
string |
|
Key store password Environment variable: |
string |
|
Enable the custom configuration of the Netty event loop group. Environment variable: |
boolean |
|
Number of threads to use for the event loop group. If not set, the default Netty thread count is used (which is double the number of available processors unless the Environment variable: |
int |
|
The thread name prefix for threads created by this thread factory used by event loop group. The prefix will be appended with a number unique to the thread factory and a number unique to the thread. If not specified it defaults to Environment variable: |
string |
|
Whether the default thread pool should be used to complete the futures returned from the HTTP client request. When disabled, futures will be completed on the Netty event loop thread. Environment variable: |
boolean |
|
About the Duration format
To write duration values, use the standard You can also use a simplified format, starting with a number:
In other cases, the simplified format is translated to the
|
About the MemorySize format
A size configuration option recognizes strings in this format (shown as a regular expression): If no suffix is given, assume bytes. |