Azure OpenAI Embedding Models
Azure OpenAI supports the development of Retrieval-Augmented Generation (RAG) applications by offering embedding models that transform text into high-dimensional vector representations. These embeddings enable similarity search, semantic retrieval, and other vector-based operations.
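To make "similarity search" concrete, the following is a minimal sketch of cosine similarity over raw embedding vectors, written in plain Java with no library calls. The class and method names are hypothetical and exist only for illustration; in a real application the vectors would come from the embedding model and the search would usually be delegated to a vector store.

```java
/** Illustrative only: cosine similarity over raw embedding vectors. */
final class CosineSimilarityExample {

    /** Returns the cosine of the angle between a and b, a value in [-1, 1]. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("A zero vector has no direction");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the index of the candidate vector most similar to the query. */
    static int mostSimilar(float[] query, float[][] candidates) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < candidates.length; i++) {
            double score = cosine(query, candidates[i]);
            if (score > bestScore) {
                bestScore = score;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy 3-dimensional vectors; real embeddings have hundreds or thousands of dimensions.
        float[] query = {1.0f, 0.0f, 1.0f};
        float[][] documents = {
            {0.0f, 1.0f, 0.0f},
            {2.0f, 0.0f, 2.0f},
            {1.0f, 1.0f, 0.0f}
        };
        System.out.println("Best match index: " + mostSimilar(query, documents));
    }
}
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common choice for comparing text embeddings.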
Prerequisites
Azure OpenAI Account and API Key
To use Azure OpenAI models in your Quarkus application, configure your Azure credentials and endpoint.
- Obtain your Azure OpenAI endpoint, resource name, deployment name, and either an api-key or an Azure AD access token from the Azure Portal.
- Configure your application.properties with the necessary details:
quarkus.langchain4j.azure-openai.resource-name=
quarkus.langchain4j.azure-openai.deployment-name=
# And one of the below depending on your scenario
quarkus.langchain4j.azure-openai.api-key=
quarkus.langchain4j.azure-openai.ad-token=
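The resource name and deployment name matter because Azure OpenAI REST URLs conventionally follow the pattern https://{resource-name}.{domain-name}/openai/deployments/{deployment-name}. The sketch below only illustrates how those values combine. The class and method are hypothetical, the domain openai.azure.com is an assumption, and the extension's own endpoint resolution may differ.

```java
import java.net.URI;

/** Illustrative only: how resource, domain, and deployment names form an endpoint. */
final class AzureOpenAiEndpointSketch {

    /** Builds a deployment endpoint following the conventional Azure OpenAI URL pattern. */
    static URI deploymentEndpoint(String resourceName, String domainName, String deploymentName) {
        requireNonBlank(resourceName, "resourceName");
        requireNonBlank(domainName, "domainName");
        requireNonBlank(deploymentName, "deploymentName");
        return URI.create(String.format(
                "https://%s.%s/openai/deployments/%s",
                resourceName, domainName, deploymentName));
    }

    private static void requireNonBlank(String value, String name) {
        if (value == null || value.isBlank()) {
            throw new IllegalArgumentException(name + " must not be blank");
        }
    }

    public static void main(String[] args) {
        // "my-resource" and "my-embedding" are placeholder values.
        // "openai.azure.com" is assumed to be the usual domain.
        URI endpoint = deploymentEndpoint("my-resource", "openai.azure.com", "my-embedding");
        System.out.println(endpoint);
    }
}
```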
Azure OpenAI Quarkus Extension
To use Azure OpenAI embedding models in your Quarkus application, add the quarkus-langchain4j-azure-openai extension:
<dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-azure-openai</artifactId>
    <version>1.0.2</version>
</dependency>
If no other LLM extension is present, AI Services will automatically select the configured Azure OpenAI embedding model.
This extension also provides support for Azure OpenAI chat and image models. See the corresponding sections for details.
Configuration
Configuration property fixed at build time - All other configuration properties are overridable at runtime
| Configuration property | Type | Default |
|---|---|---|
| Whether the model should be enabled | boolean | |
| Whether the model should be enabled | boolean | |
| Whether the model should be enabled | boolean | |
| Whether the model should be enabled | boolean | |
| The name of your Azure OpenAI Resource. You’re required to first deploy a model before you can make calls. This and | string | |
| The domain name of your Azure OpenAI Resource. You’re required to first deploy a model before you can make calls. This and | string | |
| The name of your model deployment. You’re required to first deploy a model before you can make calls. This and | string | |
| The endpoint for the Azure OpenAI resource. If not specified, then | string | |
| The Azure AD token to use for this operation. If present, then the requests towards OpenAI will include this in the Authorization header. Note that this property overrides the functionality of | string | |
| The API version to use for this operation. This follows the YYYY-MM-DD format | string | |
| Azure OpenAI API key | string | |
| Timeout for OpenAI calls | | |
| The maximum number of times to retry. 1 means exactly one attempt, with retrying disabled. | int | |
| Whether the OpenAI client should log requests | boolean | |
| Whether the OpenAI client should log responses | boolean | |
| Whether to enable the integration. Defaults to | boolean | |
| This property will override the | string | |
| This property will override the | string | |
| This property will override the | string | |
| This property will override the | string | |
| The Azure AD token to use for this operation. If present, then the requests towards OpenAI will include this in the Authorization header. Note that this property overrides the functionality of | string | |
| The API version to use for this operation. This follows the YYYY-MM-DD format | string | |
| Azure OpenAI API key | string | |
| What sampling temperature to use, with values between 0 and 2. Higher values mean the model will take more risks. A value of 0.9 is good for more creative applications, while 0 (argmax sampling) is good for ones with a well-defined answer. It is recommended to alter this or topP, but not both. | double | |
| An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with topP probability mass. 0.1 means only the tokens comprising the top 10% probability mass are considered. It is recommended to alter this or temperature, but not both. | double | |
| The maximum number of tokens to generate in the completion. The token count of your prompt plus max_tokens can’t exceed the model’s context length. Most models have a context length of 2048 tokens (except for the newest models, which support 4096). | int | |
| Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics. | double | |
| Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim. | double | |
| Whether chat model requests should be logged | boolean | |
| Whether chat model responses should be logged | boolean | |
| The response format the model should use. Some models are not compatible with some response formats; make sure to review the OpenAI documentation. | string | |
| This property will override the | string | |
| This property will override the | string | |
| This property will override the | string | |
| This property will override the | string | |
| The Azure AD token to use for this operation. If present, then the requests towards OpenAI will include this in the Authorization header. Note that this property overrides the functionality of | string | |
| The API version to use for this operation. This follows the YYYY-MM-DD format | string | |
| Azure OpenAI API key | string | |
| Whether embedding model requests should be logged | boolean | |
| Whether embedding model responses should be logged | boolean | |
| This property will override the | string | |
| This property will override the | string | |
| This property will override the | string | |
| This property will override the | string | |
| The Azure AD token to use for this operation. If present, then the requests towards OpenAI will include this in the Authorization header. Note that this property overrides the functionality of | string | |
| The API version to use for this operation. This follows the YYYY-MM-DD format | string | |
| Azure OpenAI API key | string | |
| Model name to use | string | |
| Configure whether the generated images will be saved to disk. By default, persisting is disabled, but it is implicitly enabled when | boolean | |
| The path where the generated images will be persisted to disk. This only applies if | path | |
| The format in which the generated images are returned. Must be one of | string | |
| The size of the generated images. Must be one of | string | |
| The quality of the image that will be generated. This param is only supported for when the model is | string | |
| The number of images to generate. Must be between 1 and 10. When the model is dall-e-3, only n=1 is supported. | int | |
| The style of the generated images. Must be one of This param is only supported for when the model is | string | |
| A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. | string | |
| Whether image model requests should be logged | boolean | |
| Whether image model responses should be logged | boolean | |
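The retry setting in the table counts total attempts, so a value of 1 means a single call with no retry. The sketch below illustrates that reading with a hypothetical helper. It is not the extension's retry implementation, and real clients typically add backoff between attempts.

```java
import java.util.function.Supplier;

/** Illustrative only: "maximum retries" interpreted as the total number of attempts. */
final class MaxRetriesSketch {

    /**
     * Invokes the call up to maxRetries times and returns the first successful result.
     * With maxRetries == 1 the call is attempted exactly once.
     */
    static <T> T callWithMaxRetries(int maxRetries, Supplier<T> call) {
        if (maxRetries < 1) {
            throw new IllegalArgumentException("maxRetries must be at least 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                lastFailure = e; // remember the failure and try again if attempts remain
            }
        }
        throw lastFailure; // every attempt failed
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical flaky call: fails twice, then succeeds.
        String result = callWithMaxRetries(3, () -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure");
            }
            return "ok";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```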
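About the Duration format
To write duration values, use the standard java.time.Duration format. See the Duration#parse() Java API documentation for more information.
You can also use a simplified format, starting with a number:
- If the value is only a number, it represents time in seconds.
- If the value is a number followed by ms, it represents time in milliseconds.
In other cases, the simplified format is translated to the java.time.Duration format for parsing:
- If the value is a number followed by h, m, or s, it is prefixed with PT.
- If the value is a number followed by d, it is prefixed with P.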
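As a rough illustration of how such duration values could be interpreted, the sketch below translates the simplified format into a java.time.Duration. It assumes the usual Quarkus convention: a bare number is seconds, a number followed by ms is milliseconds, a number followed by h, m, or s is prefixed with PT, and a number followed by d is prefixed with P. The helper is hypothetical and is not the converter Quarkus itself uses.

```java
import java.time.Duration;

/** Illustrative only: translating a simplified duration string to java.time.Duration. */
final class DurationFormatSketch {

    static Duration parse(String value) {
        String v = value.trim();
        if (v.matches("\\d+")) {
            // Only a number: interpreted as seconds.
            return Duration.ofSeconds(Long.parseLong(v));
        }
        if (v.matches("\\d+ms")) {
            // Number followed by "ms": interpreted as milliseconds.
            return Duration.ofMillis(Long.parseLong(v.substring(0, v.length() - 2)));
        }
        if (v.matches("\\d+[hms]")) {
            // Number followed by h, m, or s: prefix with PT (time-based ISO-8601 part).
            return Duration.parse("PT" + v);
        }
        if (v.matches("\\d+d")) {
            // Number followed by d: prefix with P (day-based ISO-8601 part).
            return Duration.parse("P" + v);
        }
        // Anything else is expected to already be in the standard format, e.g. PT1H30M.
        return Duration.parse(v);
    }

    public static void main(String[] args) {
        for (String sample : new String[] {"10", "500ms", "2m", "1d", "PT1H30M"}) {
            System.out.println(sample + " -> " + parse(sample));
        }
    }
}
```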
You can configure multiple Azure OpenAI embedding models in your application using named configurations:
# Default configuration
quarkus.langchain4j.azure-openai.embedding-model.model-name=text-embedding-3-large
# Custom configuration (under 'my-retriever')
quarkus.langchain4j.azure-openai.my-retriever.embedding-model.model-name=text-embedding-ada-002
In your RAG implementation, select a model using the @ModelName annotation:
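import dev.langchain4j.model.embedding.EmbeddingModel;
import io.quarkiverse.langchain4j.ModelName;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
@ApplicationScoped
public class EmbeddingModels { // hypothetical bean, for illustration only
    @Inject EmbeddingModel defaultEmbeddingModel; // default configuration
    @Inject @ModelName("my-retriever") EmbeddingModel namedEmbeddingModel; // 'my-retriever' configuration
}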