OpenAI
OpenAI stands as a pioneering AI research organization, famous for its groundbreaking Large Language Models (LLMs) like GPT-3 and GPT-4, setting new benchmarks in natural language understanding and generation.
OpenAI’s LLMs offer extensive support for:
- Tools, facilitating seamless interaction between the LLM and applications (see the sketch below).
- Document retrievers, enabling the transmission of pertinent content to the LLM.
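For example, a tool can be as simple as a CDI bean method annotated with LangChain4j’s @Tool; the CalculatorTool class below is an illustrative sketch, not part of the extension:
import dev.langchain4j.agent.tool.Tool;
import jakarta.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class CalculatorTool {

    // The description helps the model decide when to invoke the tool
    @Tool("Adds two integers")
    public int add(int a, int b) {
        return a + b;
    }
}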
Using OpenAI Models
To employ OpenAI LLMs, integrate the following dependency into your project:
<dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-openai</artifactId>
    <version>0.20.3</version>
</dependency>
If no other LLM extension is installed, AI Services will automatically utilize the configured OpenAI model.
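For instance, an AI Service is a plain annotated interface; the Poet interface and its prompt below are illustrative:
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;

@RegisterAiService
public interface Poet {

    // Prompt templates use Qute-style {parameter} placeholders
    @UserMessage("Write a short poem about {topic}.")
    String poem(String topic);
}
Injecting Poet into any bean and calling poem(...) routes the request to the configured OpenAI chat model.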
Configuration
Configuring OpenAI models mandates an API key, obtainable by creating an account on the OpenAI platform.
The API key can be set in the application.properties file:
quarkus.langchain4j.openai.api-key=sk-...
Alternatively, leverage the QUARKUS_LANGCHAIN4J_OPENAI_API_KEY environment variable.
Several configuration properties are available:
Configuration property fixed at build time - All other configuration properties are overridable at runtime.
Each property can also be set through its corresponding environment variable, obtained by uppercasing the property name and replacing every non-alphanumeric character with an underscore (for example, quarkus.langchain4j.openai.api-key becomes QUARKUS_LANGCHAIN4J_OPENAI_API_KEY).

Property | Description | Type | Default
---|---|---|---
quarkus.langchain4j.openai.chat-model.enabled | Whether the chat model should be enabled | boolean | true
quarkus.langchain4j.openai.embedding-model.enabled | Whether the embedding model should be enabled | boolean | true
quarkus.langchain4j.openai.moderation-model.enabled | Whether the moderation model should be enabled | boolean | true
quarkus.langchain4j.openai.image-model.enabled | Whether the image model should be enabled | boolean | true
quarkus.langchain4j.openai.base-url | Base URL of the OpenAI API | string | https://api.openai.com/v1/
quarkus.langchain4j.openai.api-key | OpenAI API key | string |
quarkus.langchain4j.openai.organization-id | OpenAI Organization ID (https://platform.openai.com/docs/api-reference/organization-optional) | string |
quarkus.langchain4j.openai.timeout | Timeout for OpenAI calls | Duration | 10s
quarkus.langchain4j.openai.max-retries | The maximum number of times to retry; 1 means exactly one attempt, with retrying disabled | int | 1
quarkus.langchain4j.openai.log-requests | Whether the OpenAI client should log requests | boolean | false
quarkus.langchain4j.openai.log-responses | Whether the OpenAI client should log responses | boolean | false
quarkus.langchain4j.openai.enable-integration | Whether to enable the integration | boolean | true
quarkus.langchain4j.openai.proxy-type | The proxy type | string | HTTP
quarkus.langchain4j.openai.proxy-host | The proxy host | string |
quarkus.langchain4j.openai.proxy-port | The proxy port | int | 3128
quarkus.langchain4j.openai.chat-model.model-name | Model name to use | string |
quarkus.langchain4j.openai.chat-model.temperature | What sampling temperature to use, between 0 and 2. Higher values mean the model takes more risks: 0.9 is good for more creative applications, while 0 (argmax sampling) suits tasks with a well-defined answer. It is recommended to alter this or top-p, but not both | double | 1.0
quarkus.langchain4j.openai.chat-model.top-p | An alternative to sampling with temperature, called nucleus sampling, where the model considers the tokens comprising the top-p probability mass; 0.1 means only the tokens comprising the top 10% probability mass are considered. It is recommended to alter this or temperature, but not both | double | 1.0
quarkus.langchain4j.openai.chat-model.max-tokens | The maximum number of tokens to generate in the completion. The token count of your prompt plus max_tokens cannot exceed the model’s context length. Most models have a context length of 2048 tokens (except for the newest models, which support 4096) | int |
quarkus.langchain4j.openai.chat-model.presence-penalty | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics | double | 0
quarkus.langchain4j.openai.chat-model.frequency-penalty | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim | double | 0
quarkus.langchain4j.openai.chat-model.log-requests | Whether chat model requests should be logged | boolean | false
quarkus.langchain4j.openai.chat-model.log-responses | Whether chat model responses should be logged | boolean | false
quarkus.langchain4j.openai.chat-model.response-format | The response format the model should use. Some models are not compatible with some response formats; make sure to review the OpenAI documentation | string |
quarkus.langchain4j.openai.chat-model.stop | The list of stop words to use | list of string |
quarkus.langchain4j.openai.embedding-model.model-name | Model name to use | string | text-embedding-ada-002
quarkus.langchain4j.openai.embedding-model.log-requests | Whether embedding model requests should be logged | boolean | false
quarkus.langchain4j.openai.embedding-model.log-responses | Whether embedding model responses should be logged | boolean | false
quarkus.langchain4j.openai.embedding-model.user | A unique identifier representing your end-user, which can help OpenAI monitor and detect abuse | string |
quarkus.langchain4j.openai.moderation-model.model-name | Model name to use | string | text-moderation-latest
quarkus.langchain4j.openai.moderation-model.log-requests | Whether moderation model requests should be logged | boolean | false
quarkus.langchain4j.openai.moderation-model.log-responses | Whether moderation model responses should be logged | boolean | false
quarkus.langchain4j.openai.image-model.model-name | Model name to use | string | dall-e-3
quarkus.langchain4j.openai.image-model.persist | Whether the generated images will be saved to disk. Persisting is disabled by default, but it is implicitly enabled when the persist directory is set | boolean | false
quarkus.langchain4j.openai.image-model.persist-directory | The path where the generated images will be persisted to disk. This only applies if persisting is enabled | path |
quarkus.langchain4j.openai.image-model.response-format | The format in which the generated images are returned. Must be one of url or b64_json | string |
quarkus.langchain4j.openai.image-model.size | The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024 when the model is dall-e-2, and one of 1024x1024, 1792x1024, or 1024x1792 when the model is dall-e-3 | string | 1024x1024
quarkus.langchain4j.openai.image-model.quality | The quality of the image that will be generated. This param is only supported when the model is dall-e-3 | string | standard
quarkus.langchain4j.openai.image-model.number | The number of images to generate. Must be between 1 and 10. When the model is dall-e-3, only n=1 is supported | int | 1
quarkus.langchain4j.openai.image-model.style | The style of the generated images. Must be one of vivid or natural. This param is only supported when the model is dall-e-3 | string | vivid
quarkus.langchain4j.openai.image-model.user | A unique identifier representing your end-user, which can help OpenAI monitor and detect abuse | string |
quarkus.langchain4j.openai.image-model.log-requests | Whether image model requests should be logged | boolean | false
quarkus.langchain4j.openai.image-model.log-responses | Whether image model responses should be logged | boolean | false
The same set of properties (minus the enabled flags) is also available for named model configurations, using the quarkus.langchain4j.openai."model-name" prefix; for example, quarkus.langchain4j.openai.m1.chat-model.temperature configures the temperature of the model named m1.
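As an illustration, a typical application.properties setup might look like the following; the model choice and values here are examples only:
quarkus.langchain4j.openai.api-key=sk-...
quarkus.langchain4j.openai.timeout=60s
quarkus.langchain4j.openai.chat-model.model-name=gpt-4o
quarkus.langchain4j.openai.chat-model.temperature=0.2
quarkus.langchain4j.openai.log-requests=true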
About the Duration format
To write duration values, use the standard java.time.Duration format. See the Duration#parse() Java API documentation for more information. You can also use a simplified format, starting with a number:
- If the value is only a number, it represents time in seconds.
- If the value is a number followed by ms, it represents time in milliseconds.
In other cases, the simplified format is translated to the java.time.Duration format for parsing:
- If the value is a number followed by h, m, or s, it is prefixed with PT.
- If the value is a number followed by d, it is prefixed with P.
Document Retriever
When utilizing OpenAI models, the recommended practice is to leverage the OpenAiEmbeddingModel. If no other LLM extension is installed, retrieve the embedding model as follows:
@Inject EmbeddingModel model; // Injects the OpenAiEmbeddingModel
The OpenAiEmbeddingModel transmits the document to OpenAI for embedding computation.
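For example, the injected model can compute a vector for a piece of text; the EmbeddingService bean below is an illustrative sketch:
import dev.langchain4j.model.embedding.EmbeddingModel;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

@ApplicationScoped
public class EmbeddingService {

    @Inject
    EmbeddingModel embeddingModel; // backed by OpenAiEmbeddingModel here

    public float[] embed(String text) {
        // Sends the text to OpenAI and returns the resulting embedding vector
        return embeddingModel.embed(text).content().vector();
    }
}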
Azure OpenAI
Applications can leverage Azure’s version of the OpenAI services simply by using the quarkus-langchain4j-azure-openai extension instead of the quarkus-langchain4j-openai extension.
When this extension is used, the following configuration properties are required:
quarkus.langchain4j.azure-openai.resource-name=
quarkus.langchain4j.azure-openai.deployment-name=
# And one of the below depending on your scenario
quarkus.langchain4j.azure-openai.api-key=
quarkus.langchain4j.azure-openai.ad-token=
In the case of Azure, the api-key and ad-token properties are mutually exclusive. The api-key property should be used when the Azure OpenAI service is configured to use API keys, while the ad-token property should be used when it is configured to use Azure Active Directory tokens. In both cases, the key is placed in the Authorization header when communicating with the Azure OpenAI service.
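For example, in the API-key scenario the configuration might look like this, where the resource and deployment names are placeholders:
quarkus.langchain4j.azure-openai.resource-name=my-resource
quarkus.langchain4j.azure-openai.deployment-name=gpt-4o
quarkus.langchain4j.azure-openai.api-key=...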
Advanced usage
The quarkus-langchain4j-openai and quarkus-langchain4j-azure-openai extensions use a REST client under the hood to make the REST calls required by LangChain4j. This client is, however, available for use in a Quarkus application in the same way as any other REST client.
An example usage is the following:
import java.net.URI;
import java.net.URISyntaxException;

import jakarta.enterprise.context.ApplicationScoped;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.core.MediaType;

import org.jboss.resteasy.reactive.RestStreamElementType;

import dev.ai4j.openai4j.completion.CompletionChoice;
import dev.ai4j.openai4j.completion.CompletionRequest;
import io.quarkiverse.langchain4j.openai.common.OpenAiRestApi;
import io.quarkus.rest.client.reactive.QuarkusRestClientBuilder;
import io.smallrye.mutiny.Multi;

@Path("restApi")
@ApplicationScoped
public class QuarkusRestApiResource {

    private final OpenAiRestApi restApi;
    private final String token;

    public QuarkusRestApiResource() throws URISyntaxException {
        this.restApi = QuarkusRestClientBuilder.newBuilder()
                .baseUri(new URI("https://api.openai.com/v1/"))
                .build(OpenAiRestApi.class);
        this.token = "sometoken"; // replace with a real API key
    }

    @GET
    @Path("language/streaming")
    @RestStreamElementType(MediaType.TEXT_PLAIN)
    public Multi<String> languageStreaming() {
        return restApi.streamingCompletion(
                createCompletionRequest("Write a short 1 paragraph funny poem about Enterprise Java"), token, null)
                .map(r -> {
                    if (r.choices() != null && r.choices().size() == 1) {
                        CompletionChoice choice = r.choices().get(0);
                        var text = choice.text();
                        if (text != null) {
                            return text;
                        }
                    }
                    return "";
                });
    }

    private static CompletionRequest createCompletionRequest(String prompt) {
        return CompletionRequest.builder()
                .model("gpt-3.5-turbo-instruct") // example completion-capable model
                .prompt(prompt)
                .build();
    }
}
This example streams OpenAI’s response back to the user’s browser as Server-Sent Events.
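With the application running on the default Quarkus HTTP port, the endpoint is then reachable at http://localhost:8080/restApi/language/streaming, for example from curl or an EventSource in the browser.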
Dynamic Authorization Headers
There are cases where one may need to provide dynamic authorization headers to be passed to OpenAI endpoints (whether OpenAI itself or Azure OpenAI).
There are two ways to achieve this:
Using a ResteasyReactiveClientRequestFilter annotated with @Provider
As the underlying HTTP communication relies on the Quarkus REST client, it is possible to register a filter that is invoked for every OpenAI request and sets the headers accordingly.
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import jakarta.ws.rs.ext.Provider;

import org.jboss.resteasy.reactive.client.spi.ResteasyReactiveClientRequestContext;
import org.jboss.resteasy.reactive.client.spi.ResteasyReactiveClientRequestFilter;

@Provider
@ApplicationScoped
public class RequestFilter implements ResteasyReactiveClientRequestFilter {

    @Inject
    MyAuthorizationService myAuthorizationService;

    @Override
    public void filter(ResteasyReactiveClientRequestContext requestContext) {
        /*
         * All requests will be filtered here, therefore make sure that you make
         * the necessary checks to avoid putting the Authorization header in
         * requests that do not need it.
         */
        requestContext.getHeaders().putSingle("Authorization", ...);
    }
}
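For instance, the filter body might only decorate requests that actually target the OpenAI host; the host check and the MyAuthorizationService.getToken() call below are illustrative assumptions:
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import jakarta.ws.rs.ext.Provider;

import org.jboss.resteasy.reactive.client.spi.ResteasyReactiveClientRequestContext;
import org.jboss.resteasy.reactive.client.spi.ResteasyReactiveClientRequestFilter;

@Provider
@ApplicationScoped
public class OpenAiAuthFilter implements ResteasyReactiveClientRequestFilter {

    @Inject
    MyAuthorizationService myAuthorizationService; // hypothetical token service

    @Override
    public void filter(ResteasyReactiveClientRequestContext requestContext) {
        // Guard: only decorate calls that target the OpenAI API host
        if ("api.openai.com".equals(requestContext.getUri().getHost())) {
            requestContext.getHeaders().putSingle(
                    "Authorization", "Bearer " + myAuthorizationService.getToken());
        }
    }
}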
Using ModelAuthProvider
An even simpler approach consists of implementing the ModelAuthProvider interface and providing an implementation of the getAuthorization method.
This is useful when you need to provide different authorization headers for different OpenAI models. The @ModelName annotation can be used to specify the model name in this scenario.
import io.quarkiverse.langchain4j.ModelName;
import io.quarkiverse.langchain4j.auth.ModelAuthProvider;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

@ApplicationScoped
@ModelName("my-model-name") // you can omit this if you have only one model or want to use the default model
public class TestClass implements ModelAuthProvider {

    @Inject
    MyTokenProviderService tokenProviderService;

    @Override
    public String getAuthorization(Input input) {
        /*
         * The `input` contains information about the request
         * about to be sent to the remote model endpoint.
         */
        return "Bearer " + tokenProviderService.getToken();
    }
}