Configuration property fixed at build time. All other configuration properties are overridable at runtime.

LangChain4j Easy RAG

Type

Default

Path to the directory containing the documents to be ingested. This is either an absolute or relative path in the filesystem. A relative path is resolved against the current working directory at runtime.

Environment variable: QUARKUS_LANGCHAIN4J_EASY_RAG_PATH

string

required

Whether path() represents a filesystem reference or a classpath reference.

Environment variable: QUARKUS_LANGCHAIN4J_EASY_RAG_PATH_TYPE

filesystem, classpath

filesystem

Matcher used for filtering which files from the directory should be ingested. This uses the java.nio.file.FileSystem path matcher syntax. Example: glob:**.txt to recursively match all files with the .txt extension. The default is glob:**, recursively matching all files.

Environment variable: QUARKUS_LANGCHAIN4J_EASY_RAG_PATH_MATCHER

string

glob:**
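
For example, to ingest only Markdown files from a documents directory (the path is illustrative):

```properties
quarkus.langchain4j.easy-rag.path=src/main/resources/docs
quarkus.langchain4j.easy-rag.path-matcher=glob:**.md
```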

Whether to recursively ingest documents from subdirectories.

Environment variable: QUARKUS_LANGCHAIN4J_EASY_RAG_RECURSIVE

boolean

true

Maximum segment size when splitting documents, in tokens.

Environment variable: QUARKUS_LANGCHAIN4J_EASY_RAG_MAX_SEGMENT_SIZE

int

300

Maximum overlap (in tokens) when splitting documents.

Environment variable: QUARKUS_LANGCHAIN4J_EASY_RAG_MAX_OVERLAP_SIZE

int

30

Maximum number of results to return when querying the retrieval augmentor.

Environment variable: QUARKUS_LANGCHAIN4J_EASY_RAG_MAX_RESULTS

int

5

The minimum score for results to return when querying the retrieval augmentor.

Environment variable: QUARKUS_LANGCHAIN4J_EASY_RAG_MIN_SCORE

double

The strategy to decide whether document ingestion into the store should happen at startup or not. The default is ON. Changing to OFF generally only makes sense if running against a persistent embedding store that was already populated. When set to MANUAL, it is expected that the application will inject and call the io.quarkiverse.langchain4j.easyrag.EasyRagManualIngestion bean to trigger the ingestion when desired.

Environment variable: QUARKUS_LANGCHAIN4J_EASY_RAG_INGESTION_STRATEGY

on, off, manual

on
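
When the strategy is manual, ingestion is triggered from application code via the io.quarkiverse.langchain4j.easyrag.EasyRagManualIngestion bean named above. A minimal sketch, assuming the bean exposes an ingest() trigger method (the REST endpoint is illustrative):

```java
import io.quarkiverse.langchain4j.easyrag.EasyRagManualIngestion;
import jakarta.inject.Inject;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;

@Path("/ingestion")
public class IngestionResource {

    @Inject
    EasyRagManualIngestion ingestion; // provided by the extension

    @POST
    public void trigger() {
        // ingest() is an assumption based on the bean's documented purpose:
        // it starts document ingestion on demand.
        ingestion.ingest();
    }
}
```

This pairs with quarkus.langchain4j.easy-rag.ingestion-strategy=manual in application.properties.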

Whether or not to reuse embeddings. Defaults to false.

Environment variable: QUARKUS_LANGCHAIN4J_EASY_RAG_REUSE_EMBEDDINGS_ENABLED

boolean

false

The file path to load/save embeddings, when quarkus.langchain4j.easy-rag.reuse-embeddings.enabled is set to true.

Defaults to easy-rag-embeddings.json in the current directory.

Environment variable: QUARKUS_LANGCHAIN4J_EASY_RAG_REUSE_EMBEDDINGS_FILE

string

easy-rag-embeddings.json
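
For example, to reuse embeddings across restarts and store them in the build output directory (the file location is illustrative):

```properties
quarkus.langchain4j.easy-rag.reuse-embeddings.enabled=true
quarkus.langchain4j.easy-rag.reuse-embeddings.file=target/easy-rag-embeddings.json
```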

LangChain4j Model Context Protocol client

Type

Default

Whether the MCP extension should automatically generate a ToolProvider that is wired up to all the configured MCP clients. The default is true if at least one MCP client is configured, false otherwise.

Environment variable: QUARKUS_LANGCHAIN4J_MCP_GENERATE_TOOL_PROVIDER

boolean

true

File containing the MCP servers configuration in the Claude Desktop format. This configuration can only be used to configure stdio transport type MCP servers.

This file is read at build time, which means that the set of MCP servers the client will use is determined at build time. However, the specific configuration of each MCP server can be overridden at runtime.

Environment variable: QUARKUS_LANGCHAIN4J_MCP_CONFIG_FILE

string
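
The Claude Desktop format lists servers under an mcpServers object, each with a command and arguments. A minimal sketch (the server name and command are illustrative), referenced via quarkus.langchain4j.mcp.config-file:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
    }
  }
}
```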

Whether the MCP extension should automatically register a health check for configured MCP clients. The default is true if at least one MCP client is configured, false otherwise.

Environment variable: QUARKUS_LANGCHAIN4J_MCP_HEALTH_ENABLED

boolean

true

Whether resources should be exposed as MCP tools.

Environment variable: QUARKUS_LANGCHAIN4J_MCP_EXPOSE_RESOURCES_AS_TOOLS

boolean

false

Configured MCP clients

Type

Default

Transport type

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__TRANSPORT_TYPE

stdio, http, streamable-http, websocket

stdio

The URL of the SSE endpoint. This only applies to MCP clients using the HTTP transport.

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__URL

string

The command to execute to spawn the MCP server process. This only applies to MCP clients using the STDIO transport.

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__COMMAND

list of string
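
For example, an stdio client that spawns a local server process and an HTTP client that connects to a running SSE endpoint (the client names, command, and URL are illustrative; lists are comma-separated in properties form):

```properties
quarkus.langchain4j.mcp.filesystem.transport-type=stdio
quarkus.langchain4j.mcp.filesystem.command=npx,-y,@modelcontextprotocol/server-filesystem,/tmp

quarkus.langchain4j.mcp.weather.transport-type=http
quarkus.langchain4j.mcp.weather.url=http://localhost:8080/mcp/sse
```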

Environment variables for the spawned MCP server process. This only applies to MCP clients using the STDIO transport.

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__ENVIRONMENT__ENV_VAR_

Map<String,String>

Whether to log requests

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__LOG_REQUESTS

boolean

false

Whether to log responses

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__LOG_RESPONSES

boolean

false

Whether to prefer MicroProfile health checks. Applies to MCP HTTP clients only.

If this property is enabled, an HTTP GET call is made to an MCP server's MicroProfile Health endpoint. The MicroProfile Health endpoint URL is calculated by stripping the path component from the url() property and appending the microprofile-health-check-path() value.

If the MicroProfile health check returns an HTTP 404 or other error status, the default MCP client health check, which opens a Streamable HTTP or SSE transport channel, is attempted instead.

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__MICROPROFILE_HEALTH_CHECK

boolean

false

Relative path of an MCP Server MicroProfile Health endpoint. This property is effective only when the microprofile-health-check() property is enabled.

The MicroProfile Health endpoint URL is calculated by stripping the path component from the url() property and appending the value of this property.

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__MICROPROFILE_HEALTH_CHECK_PATH

string

/q/health
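
For example, with the illustrative settings below, the health check would call http://localhost:8080/q/health instead of opening a transport channel:

```properties
quarkus.langchain4j.mcp.weather.url=http://localhost:8080/mcp/sse
quarkus.langchain4j.mcp.weather.microprofile-health-check=true
# Health URL = base URL (http://localhost:8080) + /q/health, the default path
```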

Timeout for tool executions performed by the MCP client

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__TOOL_EXECUTION_TIMEOUT

Duration 

60s

Timeout for resource-related operations (retrieving a list of resources as well as the actual contents of resources).

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__RESOURCES_TIMEOUT

Duration 

60s

Timeout for pinging the MCP server process to check if it’s still alive. If a ping times out, the client’s health check will start failing.

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__PING_TIMEOUT

Duration 

10s

The initial list of MCP roots that the client can present to the server. The list can be later updated programmatically during runtime. The list is formatted as key-value pairs separated by commas. For example: workspace1=/path/to/workspace1,workspace2=/path/to/workspace2

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__ROOTS

list of string

The name of the TLS configuration (bucket) used for client authentication in the TLS registry. This does not have any effect when the stdio transport is used.

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__TLS_CONFIGURATION_NAME

string

Whether to cache the tool list obtained from the MCP server. When set to true (the default), the tool list is cached until the server notifies of changes or the cache is manually evicted. When false, the client always fetches a fresh tool list from the server. This is useful when using MCP servers that don’t support tool list change notifications.

Environment variable: QUARKUS_LANGCHAIN4J_MCP__CLIENT_NAME__CACHE_TOOL_LIST

boolean

Configured MCP registry clients

Type

Default

The base URL of the MCP registry, without the API version segment. The default value points at the official registry (https://registry.modelcontextprotocol.io).

Environment variable: QUARKUS_LANGCHAIN4J_MCP_REGISTRY_CLIENT__REGISTRY_CLIENT_NAME__BASE_URL

string

https://registry.modelcontextprotocol.io

Whether to log requests

Environment variable: QUARKUS_LANGCHAIN4J_MCP_REGISTRY_CLIENT__REGISTRY_CLIENT_NAME__LOG_REQUESTS

boolean

false

Whether to log responses

Environment variable: QUARKUS_LANGCHAIN4J_MCP_REGISTRY_CLIENT__REGISTRY_CLIENT_NAME__LOG_RESPONSES

boolean

false

The name of the TLS configuration (bucket) that this MCP registry client will use.

Environment variable: QUARKUS_LANGCHAIN4J_MCP_REGISTRY_CLIENT__REGISTRY_CLIENT_NAME__TLS_CONFIGURATION_NAME

string

The read timeout for the MCP registry’s underlying http client

Environment variable: QUARKUS_LANGCHAIN4J_MCP_REGISTRY_CLIENT__REGISTRY_CLIENT_NAME__READ_TIMEOUT

Duration 

10s

The connect timeout for the MCP registry’s underlying http client

Environment variable: QUARKUS_LANGCHAIN4J_MCP_REGISTRY_CLIENT__REGISTRY_CLIENT_NAME__CONNECT_TIMEOUT

Duration 

10s

LangChain4j Neo4j embedding store

Type

Default

Dimension of the embeddings that will be stored in the Neo4j store.

Environment variable: QUARKUS_LANGCHAIN4J_NEO4J_DIMENSION

int

required

Label for the created nodes.

Environment variable: QUARKUS_LANGCHAIN4J_NEO4J_LABEL

string

Document

Name of the property to store the embedding vectors.

Environment variable: QUARKUS_LANGCHAIN4J_NEO4J_EMBEDDING_PROPERTY

string

embedding

Name of the property to store embedding IDs.

Environment variable: QUARKUS_LANGCHAIN4J_NEO4J_ID_PROPERTY

string

id

Prefix to be added to the metadata keys. By default, no prefix is used.

Environment variable: QUARKUS_LANGCHAIN4J_NEO4J_METADATA_PREFIX

string

Name of the property to store the embedding text.

Environment variable: QUARKUS_LANGCHAIN4J_NEO4J_TEXT_PROPERTY

string

text

Name of the index to be created for vector search.

Environment variable: QUARKUS_LANGCHAIN4J_NEO4J_INDEX_NAME

string

vector

Name of the database to connect to.

Environment variable: QUARKUS_LANGCHAIN4J_NEO4J_DATABASE_NAME

string

neo4j

The query to use when retrieving embeddings. This query has to return the following columns:

  • metadata

  • score

  • column of the same name as the 'id-property' value

  • column of the same name as the 'text-property' value

  • column of the same name as the 'embedding-property' value

Environment variable: QUARKUS_LANGCHAIN4J_NEO4J_RETRIEVAL_QUERY

string

RETURN properties(node) AS metadata, node.${quarkus.langchain4j.neo4j.id-property} AS ${quarkus.langchain4j.neo4j.id-property}, node.${quarkus.langchain4j.neo4j.text-property} AS ${quarkus.langchain4j.neo4j.text-property}, node.${quarkus.langchain4j.neo4j.embedding-property} AS ${quarkus.langchain4j.neo4j.embedding-property}, score
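
Because the default query resolves the configured property names through property expressions, renaming the node properties automatically adjusts the retrieval query. A sketch with illustrative values:

```properties
quarkus.langchain4j.neo4j.dimension=384
quarkus.langchain4j.neo4j.label=Chunk
# The default retrieval query now returns node.content AS content
quarkus.langchain4j.neo4j.text-property=content
```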

LangChain4j Pinecone embedding store

Type

Default

The API key to Pinecone.

Environment variable: QUARKUS_LANGCHAIN4J_PINECONE_API_KEY

string

required

Environment name, e.g. gcp-starter or northamerica-northeast1-gcp.

Environment variable: QUARKUS_LANGCHAIN4J_PINECONE_ENVIRONMENT

string

required

ID of the project.

Environment variable: QUARKUS_LANGCHAIN4J_PINECONE_PROJECT_ID

string

required

Name of the index within the project. If the index doesn’t exist, it will be created.

Environment variable: QUARKUS_LANGCHAIN4J_PINECONE_INDEX_NAME

string

required

Dimension of the embeddings in the index. This is required only if the index doesn’t exist yet and needs to be created.

Environment variable: QUARKUS_LANGCHAIN4J_PINECONE_DIMENSION

int

The type of the pod to use. This is only used if the index doesn’t exist yet and needs to be created. The format is one of s1, p1, or p2, followed by . and one of x1, x2, x4, or x8 (for example, s1.x1).

Environment variable: QUARKUS_LANGCHAIN4J_PINECONE_POD_TYPE

string

s1.x1

The timeout duration for the index to become ready. Only relevant if the index doesn’t exist yet and needs to be created. If not specified, 1 minute will be used.

Environment variable: QUARKUS_LANGCHAIN4J_PINECONE_INDEX_READINESS_TIMEOUT

Duration 

The namespace.

Environment variable: QUARKUS_LANGCHAIN4J_PINECONE_NAMESPACE

string

The name of the field that contains the text segment.

Environment variable: QUARKUS_LANGCHAIN4J_PINECONE_TEXT_FIELD_NAME

string

text

The timeout duration for the Pinecone client. If not specified, 5 seconds will be used.

Environment variable: QUARKUS_LANGCHAIN4J_PINECONE_TIMEOUT

Duration 
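
Putting the required Pinecone options together, a sketch with illustrative values:

```properties
quarkus.langchain4j.pinecone.api-key=${PINECONE_API_KEY}
quarkus.langchain4j.pinecone.environment=gcp-starter
quarkus.langchain4j.pinecone.project-id=my-project
quarkus.langchain4j.pinecone.index-name=embeddings
# Only needed if the index doesn't exist yet and must be created
quarkus.langchain4j.pinecone.dimension=384
```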

LangChain4j Tavily Web Search Engine

Type

Default

Base URL of the Tavily API

Environment variable: QUARKUS_LANGCHAIN4J_TAVILY_BASE_URL

string

https://api.tavily.com

API key for the Tavily API

Environment variable: QUARKUS_LANGCHAIN4J_TAVILY_API_KEY

string

required

Maximum number of results to return

Environment variable: QUARKUS_LANGCHAIN4J_TAVILY_MAX_RESULTS

int

5

The timeout duration for Tavily requests.

Environment variable: QUARKUS_LANGCHAIN4J_TAVILY_TIMEOUT

Duration 

60s

Whether requests to Tavily should be logged

Environment variable: QUARKUS_LANGCHAIN4J_TAVILY_LOG_REQUESTS

boolean

false

Whether responses from Tavily should be logged

Environment variable: QUARKUS_LANGCHAIN4J_TAVILY_LOG_RESPONSES

boolean

false

The search depth to use. This can be "basic" or "advanced". Basic is the default.

Environment variable: QUARKUS_LANGCHAIN4J_TAVILY_SEARCH_DEPTH

basic, advanced

basic

Include a short answer to the original query. Default is false.

Environment variable: QUARKUS_LANGCHAIN4J_TAVILY_INCLUDE_ANSWER

boolean

false

Include the cleaned and parsed HTML content of each search result. Default is false.

Environment variable: QUARKUS_LANGCHAIN4J_TAVILY_INCLUDE_RAW_CONTENT

boolean

false

A list of domains to specifically include in the search results. By default all domains are included.

Environment variable: QUARKUS_LANGCHAIN4J_TAVILY_INCLUDE_DOMAINS

list of string

empty list

A list of domains to specifically exclude from the search results. By default no domains are excluded.

Environment variable: QUARKUS_LANGCHAIN4J_TAVILY_EXCLUDE_DOMAINS

list of string

empty list
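
Putting the Tavily options together, a sketch with illustrative values:

```properties
quarkus.langchain4j.tavily.api-key=${TAVILY_API_KEY}
quarkus.langchain4j.tavily.search-depth=advanced
quarkus.langchain4j.tavily.include-answer=true
quarkus.langchain4j.tavily.include-domains=example.com,docs.example.com
```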

Quarkus LangChain4j - Anthropic

Type

Default

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_ENABLED

boolean

true

Base URL of the Anthropic API

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_BASE_URL

string

https://api.anthropic.com/v1/

Anthropic API key

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_API_KEY

string

dummy

The Anthropic version

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_VERSION

string

2023-06-01

Timeout for Anthropic calls

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_TIMEOUT

Duration 

10s

Whether the Anthropic client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_LOG_REQUESTS

boolean

false

Whether the Anthropic client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the Anthropic provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_ENABLE_INTEGRATION

boolean

true

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_MODEL_NAME

string

claude-3-haiku-20240307

What sampling temperature to use, between 0.0 and 1.0. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

It is generally recommended to set this or the top-k property but not both.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_TEMPERATURE

double

0.7

The maximum number of tokens to generate in the completion.

The token count of your prompt plus max_tokens cannot exceed the model’s context length

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_MAX_TOKENS

int

1024

Double (0.0-1.0). Nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

It is generally recommended to set this or the temperature property but not both.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_TOP_P

double

1.0

Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_TOP_K

int

40

The maximum number of times to retry. 1 means exactly one attempt, with retrying disabled.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_MAX_RETRIES

int

1

The custom text sequences that will cause the model to stop generating

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_STOP_SEQUENCES

list of string

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_LOG_RESPONSES

boolean

false

Cache system messages to reduce costs for repeated prompts. Requires minimum 1024 tokens (Claude Opus/Sonnet) or 2048-4096 tokens (Haiku). Supported models: Claude Opus 4.1, Sonnet 4.5, Haiku 4.5, and later models.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_CACHE_SYSTEM_MESSAGES

boolean

false

Cache tool definitions to reduce costs. Requires minimum 1024 tokens (Claude Opus/Sonnet) or 2048-4096 tokens (Haiku). Supported models: Claude Opus 4.1, Sonnet 4.5, Haiku 4.5, and later models.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_CACHE_TOOLS

boolean

false

The thinking type to enable Claude’s reasoning process

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_THINKING_TYPE

string

The token budget for the model’s thinking process

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_THINKING_BUDGET_TOKENS

int

Whether thinking results should be returned in the response

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_THINKING_RETURN_THINKING

boolean

false

Whether previously stored thinking should be sent in follow-up requests

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_THINKING_SEND_THINKING

boolean

true

Enable interleaved thinking for Claude 4 models, allowing reasoning between tool calls. Requires Claude 4 model (e.g., claude-opus-4-20250514) and thinking.type: enabled.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC_CHAT_MODEL_THINKING_INTERLEAVED

boolean

false
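
Putting the thinking-related options together, a sketch of an extended-thinking setup. The dotted property names are derived from the environment variables above and the thinking.type reference in the description, so treat the exact spelling as an assumption; the token budget is illustrative:

```properties
quarkus.langchain4j.anthropic.chat-model.model-name=claude-opus-4-20250514
quarkus.langchain4j.anthropic.chat-model.thinking.type=enabled
quarkus.langchain4j.anthropic.chat-model.thinking.budget-tokens=4096
quarkus.langchain4j.anthropic.chat-model.thinking.interleaved=true
```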

Named model config

Type

Default

Base URL of the Anthropic API

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__BASE_URL

string

https://api.anthropic.com/v1/

Anthropic API key

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__API_KEY

string

dummy

The Anthropic version

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__VERSION

string

2023-06-01

Timeout for Anthropic calls

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__TIMEOUT

Duration 

10s

Whether the Anthropic client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__LOG_REQUESTS

boolean

false

Whether the Anthropic client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the Anthropic provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__ENABLE_INTEGRATION

boolean

true

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_MODEL_NAME

string

claude-3-haiku-20240307

What sampling temperature to use, between 0.0 and 1.0. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

It is generally recommended to set this or the top-k property but not both.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_TEMPERATURE

double

0.7

The maximum number of tokens to generate in the completion.

The token count of your prompt plus max_tokens cannot exceed the model’s context length

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_MAX_TOKENS

int

1024

Double (0.0-1.0). Nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

It is generally recommended to set this or the temperature property but not both.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_TOP_P

double

1.0

Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_TOP_K

int

40

The maximum number of times to retry. 1 means exactly one attempt, with retrying disabled.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_MAX_RETRIES

int

1

The custom text sequences that will cause the model to stop generating

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_STOP_SEQUENCES

list of string

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_LOG_RESPONSES

boolean

false

Cache system messages to reduce costs for repeated prompts. Requires minimum 1024 tokens (Claude Opus/Sonnet) or 2048-4096 tokens (Haiku). Supported models: Claude Opus 4.1, Sonnet 4.5, Haiku 4.5, and later models.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_CACHE_SYSTEM_MESSAGES

boolean

false

Cache tool definitions to reduce costs. Requires minimum 1024 tokens (Claude Opus/Sonnet) or 2048-4096 tokens (Haiku). Supported models: Claude Opus 4.1, Sonnet 4.5, Haiku 4.5, and later models.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_CACHE_TOOLS

boolean

false

The thinking type to enable Claude’s reasoning process

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_THINKING_TYPE

string

The token budget for the model’s thinking process

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_THINKING_BUDGET_TOKENS

int

Whether thinking results should be returned in the response

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_THINKING_RETURN_THINKING

boolean

false

Whether previously stored thinking should be sent in follow-up requests

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_THINKING_SEND_THINKING

boolean

true

Enable interleaved thinking for Claude 4 models, allowing reasoning between tool calls. Requires Claude 4 model (e.g., claude-opus-4-20250514) and thinking.type: enabled.

Environment variable: QUARKUS_LANGCHAIN4J_ANTHROPIC__MODEL_NAME__CHAT_MODEL_THINKING_INTERLEAVED

boolean

false

Quarkus LangChain4j - Chroma

Type

Default

If DevServices has been explicitly enabled or disabled. DevServices is generally enabled by default, unless there is an existing configuration present.

When DevServices is enabled, Quarkus will attempt to automatically configure and start a Chroma database when running in Dev or Test mode and when Docker is running.

Environment variable: QUARKUS_LANGCHAIN4J_CHROMA_DEVSERVICES_ENABLED

boolean

true

The container image name to use, for container based DevServices providers.

Environment variable: QUARKUS_LANGCHAIN4J_CHROMA_DEVSERVICES_IMAGE_NAME

string

ghcr.io/chroma-core/chroma:1.3.0

Optional fixed port the dev service will listen to.

If not defined, the port will be chosen randomly.

Environment variable: QUARKUS_LANGCHAIN4J_CHROMA_DEVSERVICES_PORT

int

Indicates if the Chroma server managed by Quarkus Dev Services is shared. When shared, Quarkus looks for running containers using label-based service discovery. If a matching container is found, it is used, and so a second one is not started. Otherwise, Dev Services for Chroma starts a new container.

The discovery uses the quarkus-dev-service-chroma label. The value is configured using the service-name property.

Container sharing is only used in dev mode.

Environment variable: QUARKUS_LANGCHAIN4J_CHROMA_DEVSERVICES_SHARED

boolean

true

The value of the quarkus-dev-service-chroma label attached to the started container. This property is used when shared is set to true. In this case, before starting a container, Dev Services for Chroma looks for a container with the quarkus-dev-service-chroma label set to the configured value. If found, it will use this container instead of starting a new one. Otherwise, it starts a new container with the quarkus-dev-service-chroma label set to the specified value.

This property is used when you need multiple shared Chroma servers.

Environment variable: QUARKUS_LANGCHAIN4J_CHROMA_DEVSERVICES_SERVICE_NAME

string

chroma

Environment variables that are passed to the container.

Environment variable: QUARKUS_LANGCHAIN4J_CHROMA_DEVSERVICES_CONTAINER_ENV__CONTAINER_ENV_

Map<String,String>

URL where the Chroma database is listening for requests

Environment variable: QUARKUS_LANGCHAIN4J_CHROMA_URL

string

required

The collection name.

Environment variable: QUARKUS_LANGCHAIN4J_CHROMA_COLLECTION_NAME

string

default

The timeout duration for the Chroma client. If not specified, 5 seconds will be used.

Environment variable: QUARKUS_LANGCHAIN4J_CHROMA_TIMEOUT

Duration 

Whether requests to Chroma should be logged

Environment variable: QUARKUS_LANGCHAIN4J_CHROMA_LOG_REQUESTS

boolean

false

Whether responses from Chroma should be logged

Environment variable: QUARKUS_LANGCHAIN4J_CHROMA_LOG_RESPONSES

boolean

false

The Chroma API version to use. V1 is deprecated (Chroma 0.x) and its support will be removed in the future. Please use Chroma 1.x which uses the V2 API.

Environment variable: QUARKUS_LANGCHAIN4J_CHROMA_API_VERSION

v1, v2

v2
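
A minimal Chroma setup against an externally managed server (the URL is illustrative; with Dev Services, the URL is configured automatically):

```properties
quarkus.langchain4j.chroma.url=http://localhost:8000
quarkus.langchain4j.chroma.collection-name=my-collection
quarkus.langchain4j.chroma.api-version=v2
```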

Quarkus LangChain4j - Cohere

Type

Default

Whether the scoring model should be enabled.

Environment variable: QUARKUS_LANGCHAIN4J_COHERE_SCORING_MODEL_ENABLED

boolean

true

Base URL of the Cohere API.

Environment variable: QUARKUS_LANGCHAIN4J_COHERE_BASE_URL

string

https://api.cohere.com/

Cohere API key.

Environment variable: QUARKUS_LANGCHAIN4J_COHERE_API_KEY

string

dummy

Timeout for Cohere calls.

Environment variable: QUARKUS_LANGCHAIN4J_COHERE_TIMEOUT

Duration 

30s

Reranking model to use. The current list of supported models can be found in the Cohere docs

Environment variable: QUARKUS_LANGCHAIN4J_COHERE_SCORING_MODEL_MODEL_NAME

string

rerank-multilingual-v2.0

Timeout for Cohere calls

Environment variable: QUARKUS_LANGCHAIN4J_COHERE_SCORING_MODEL_TIMEOUT

Duration 

30s

Whether embedding model requests should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_COHERE_SCORING_MODEL_LOG_REQUESTS

boolean

false

Whether embedding model responses should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_COHERE_SCORING_MODEL_LOG_RESPONSES

boolean

false

The maximum number of times to retry. 1 means exactly one attempt, with retrying disabled.

Environment variable: QUARKUS_LANGCHAIN4J_COHERE_SCORING_MODEL_MAX_RETRIES

int

1

Named model config

Type

Default

Base URL of the Cohere API.

Environment variable: QUARKUS_LANGCHAIN4J_COHERE__MODEL_NAME__BASE_URL

string

https://api.cohere.com/

Cohere API key.

Environment variable: QUARKUS_LANGCHAIN4J_COHERE__MODEL_NAME__API_KEY

string

dummy

Timeout for Cohere calls.

Environment variable: QUARKUS_LANGCHAIN4J_COHERE__MODEL_NAME__TIMEOUT

Duration 

30s

Reranking model to use. The current list of supported models can be found in the Cohere docs

Environment variable: QUARKUS_LANGCHAIN4J_COHERE__MODEL_NAME__SCORING_MODEL_MODEL_NAME

string

rerank-multilingual-v2.0

Timeout for Cohere calls

Environment variable: QUARKUS_LANGCHAIN4J_COHERE__MODEL_NAME__SCORING_MODEL_TIMEOUT

Duration 

30s

Whether embedding model requests should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_COHERE__MODEL_NAME__SCORING_MODEL_LOG_REQUESTS

boolean

false

Whether embedding model responses should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_COHERE__MODEL_NAME__SCORING_MODEL_LOG_RESPONSES

boolean

false

The maximum number of times to retry. 1 means exactly one attempt, with retrying disabled.

Environment variable: QUARKUS_LANGCHAIN4J_COHERE__MODEL_NAME__SCORING_MODEL_MAX_RETRIES

int

1

Quarkus LangChain4j - Core

Type

Default

Configure the type of ChatMemory that will be used by default by the default ChatMemoryProvider bean.

The extension provides a default bean that configures ChatMemoryProvider for use with AI services registered with RegisterAiService. This bean uses the quarkus.langchain4j.chat-memory configuration to set things up, and it also depends on the presence of a bean of type ChatMemoryStore (for which the extension also provides a default in the form of InMemoryChatMemoryStore).

If token-window is used, then the application must also provide a bean of type TokenCountEstimator.

Users can choose to provide their own ChatMemoryStore bean or even their own ChatMemoryProvider bean if full control over the details is needed.

Environment variable: QUARKUS_LANGCHAIN4J_CHAT_MEMORY_TYPE

message-window, token-window

message-window

If DevServices has been explicitly enabled or disabled. DevServices is generally enabled by default, unless there is an existing configuration present.

When DevServices is enabled Quarkus will attempt to automatically serve a model if there are any matching ones.

Environment variable: QUARKUS_LANGCHAIN4J_DEVSERVICES_ENABLED

boolean

true

The default port where the inference server listens for requests

Environment variable: QUARKUS_LANGCHAIN4J_DEVSERVICES_PORT

int

11434

Instructs Ollama to preload a model in order to get faster response times

Environment variable: QUARKUS_LANGCHAIN4J_DEVSERVICES_PRELOAD

boolean

true

Configuration property to enable or disable the use of the {response schema} placeholder in the @SystemMessage/@UserMessage.

Environment variable: QUARKUS_LANGCHAIN4J_RESPONSE_SCHEMA

boolean

true

The maximum number of messages the configured MessageWindowChatMemory will hold

Environment variable: QUARKUS_LANGCHAIN4J_CHAT_MEMORY_MEMORY_WINDOW_MAX_MESSAGES

int

10

The maximum number of tokens the configured TokenWindowChatMemory will hold

Environment variable: QUARKUS_LANGCHAIN4J_CHAT_MEMORY_TOKEN_WINDOW_MAX_TOKENS

int

1000
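
For example, the default message-window memory can be widened, or the token-window variant selected instead (which, as noted above, requires a TokenCountEstimator bean):

```properties
quarkus.langchain4j.chat-memory.type=message-window
quarkus.langchain4j.chat-memory.memory-window.max-messages=20

# Alternative, requires a TokenCountEstimator bean:
# quarkus.langchain4j.chat-memory.type=token-window
# quarkus.langchain4j.chat-memory.token-window.max-tokens=2000
```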

Whether clients should log requests

Environment variable: QUARKUS_LANGCHAIN4J_LOG_REQUESTS

boolean

false

Whether clients should log responses

Environment variable: QUARKUS_LANGCHAIN4J_LOG_RESPONSES

boolean

false

Global timeout for requests to LLM APIs

Environment variable: QUARKUS_LANGCHAIN4J_TIMEOUT

Duration 

10s

Global temperature for LLM APIs

Environment variable: QUARKUS_LANGCHAIN4J_TEMPERATURE

double

Configures the maximum number of retries for the guardrail. Set it to 0 to disable retries.

Environment variable: QUARKUS_LANGCHAIN4J_GUARDRAILS_MAX_RETRIES

int

3

If enabled, the prompt is included on the generated spans

Environment variable: QUARKUS_LANGCHAIN4J_TRACING_INCLUDE_PROMPT

boolean

false

If enabled, the completion is included on the generated spans

Environment variable: QUARKUS_LANGCHAIN4J_TRACING_INCLUDE_COMPLETION

boolean

false

If enabled, tool call arguments are included on the generated spans

Environment variable: QUARKUS_LANGCHAIN4J_TRACING_INCLUDE_TOOL_ARGUMENTS

boolean

false

If enabled, tool call results are included on the generated spans

Environment variable: QUARKUS_LANGCHAIN4J_TRACING_INCLUDE_TOOL_RESULT

boolean

false

Default model config

Type

Default

The model provider to use

Environment variable: QUARKUS_LANGCHAIN4J_CHAT_MODEL_PROVIDER

string

The model provider to use

Environment variable: QUARKUS_LANGCHAIN4J_SCORING_MODEL_PROVIDER

string

The model provider to use

Environment variable: QUARKUS_LANGCHAIN4J_EMBEDDING_MODEL_PROVIDER

string

The model provider to use

Environment variable: QUARKUS_LANGCHAIN4J_MODERATION_MODEL_PROVIDER

string

The model provider to use

Environment variable: QUARKUS_LANGCHAIN4J_IMAGE_MODEL_PROVIDER

string

The model provider to use

Environment variable: QUARKUS_LANGCHAIN4J__MODEL_NAME__CHAT_MODEL_PROVIDER

string

The model provider to use

Environment variable: QUARKUS_LANGCHAIN4J__MODEL_NAME__SCORING_MODEL_PROVIDER

string

The model provider to use

Environment variable: QUARKUS_LANGCHAIN4J__MODEL_NAME__EMBEDDING_MODEL_PROVIDER

string

The model provider to use

Environment variable: QUARKUS_LANGCHAIN4J__MODEL_NAME__MODERATION_MODEL_PROVIDER

string

The model provider to use

Environment variable: QUARKUS_LANGCHAIN4J__MODEL_NAME__IMAGE_MODEL_PROVIDER

string
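
These properties select which provider backs each model type, for the default model and for named models. A sketch (the provider identifiers and the model name are illustrative and require the corresponding extensions to be present):

```properties
# Default models
quarkus.langchain4j.chat-model.provider=openai
quarkus.langchain4j.embedding-model.provider=ollama

# Named model "summarizer"
quarkus.langchain4j.summarizer.chat-model.provider=anthropic
```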

Quarkus LangChain4j - Hugging Face

Type

Default

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_CHAT_MODEL_ENABLED

boolean

true

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_EMBEDDING_MODEL_ENABLED

boolean

true

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_MODERATION_MODEL_ENABLED

boolean

true

HuggingFace API key

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_API_KEY

string

dummy

Timeout for HuggingFace calls

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_TIMEOUT

Duration 

10s

The URL of the inference endpoint for the chat model.

When using a deployed inference endpoint, this is the URL of that endpoint. When using a local Hugging Face model, it is the URL of the local model.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_CHAT_MODEL_INFERENCE_ENDPOINT_URL

URL

https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct

Float (0.0-100.0). The temperature of the sampling operation. 1.0 means regular sampling, 0.0 means always taking the highest score, and values approaching 100.0 get closer to uniform probability.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_CHAT_MODEL_TEMPERATURE

double

${quarkus.langchain4j.temperature:1.0}

Int (0-250). The number of new tokens to generate. This does not include the input length; it is an estimate of the size of the generated text you want. Each new token slows down the request, so balance response time against the length of generated text.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_CHAT_MODEL_MAX_NEW_TOKENS

int

If set to false, the returned results will not contain the original query, which makes them easier to use for prompting.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_CHAT_MODEL_RETURN_FULL_TEXT

boolean

false

If the model is not ready, wait for it instead of receiving a 503 error. This limits the number of requests required to complete your inference. It is advised to set this flag to true only after receiving a 503 error, as it confines hanging in your application to known places.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_CHAT_MODEL_WAIT_FOR_MODEL

boolean

true

Whether to use sampling; if disabled, greedy decoding is used instead.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_CHAT_MODEL_DO_SAMPLE

boolean

The number of highest probability vocabulary tokens to keep for top-k-filtering.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_CHAT_MODEL_TOP_K

int

If set to less than 1, only the most probable tokens with probabilities that add up to top_p or higher are kept for generation.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_CHAT_MODEL_TOP_P

double

The parameter for repetition penalty. 1.0 means no penalty.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_CHAT_MODEL_REPETITION_PENALTY

double

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_CHAT_MODEL_LOG_RESPONSES

boolean

false

The URL of the inference endpoint for the embedding.

When using a deployed inference endpoint, this is the URL of that endpoint. When using a local Hugging Face model, it is the URL of the local model.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_EMBEDDING_MODEL_INFERENCE_ENDPOINT_URL

URL

https://api-inference.huggingface.co/pipeline/feature-extraction/sentence-transformers/all-MiniLM-L6-v2

If the model is not ready, wait for it instead of receiving a 503 error. This limits the number of requests required to complete your inference. It is advised to set this flag to true only after receiving a 503 error, as it confines hanging in your application to known places.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_EMBEDDING_MODEL_WAIT_FOR_MODEL

boolean

true

Whether the HuggingFace client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_LOG_REQUESTS

boolean

false

Whether the HuggingFace client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the Hugging Face provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE_ENABLE_INTEGRATION

boolean

true

Named model config

Type

Default

HuggingFace API key

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__API_KEY

string

dummy

Timeout for HuggingFace calls

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__TIMEOUT

Duration 

10s

The URL of the inference endpoint for the chat model.

When using a deployed inference endpoint, this is the URL of that endpoint. When using a local Hugging Face model, it is the URL of the local model.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__CHAT_MODEL_INFERENCE_ENDPOINT_URL

URL

https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct

Float (0.0-100.0). The temperature of the sampling operation. 1.0 means regular sampling, 0.0 means always taking the highest score, and values approaching 100.0 get closer to uniform probability.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__CHAT_MODEL_TEMPERATURE

double

${quarkus.langchain4j.temperature:1.0}

Int (0-250). The number of new tokens to generate. This does not include the input length; it is an estimate of the size of the generated text you want. Each new token slows down the request, so balance response time against the length of generated text.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__CHAT_MODEL_MAX_NEW_TOKENS

int

If set to false, the returned results will not contain the original query, which makes them easier to use for prompting.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__CHAT_MODEL_RETURN_FULL_TEXT

boolean

false

If the model is not ready, wait for it instead of receiving a 503 error. This limits the number of requests required to complete your inference. It is advised to set this flag to true only after receiving a 503 error, as it confines hanging in your application to known places.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__CHAT_MODEL_WAIT_FOR_MODEL

boolean

true

Whether to use sampling; if disabled, greedy decoding is used instead.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__CHAT_MODEL_DO_SAMPLE

boolean

The number of highest probability vocabulary tokens to keep for top-k-filtering.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__CHAT_MODEL_TOP_K

int

If set to less than 1, only the most probable tokens with probabilities that add up to top_p or higher are kept for generation.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__CHAT_MODEL_TOP_P

double

The parameter for repetition penalty. 1.0 means no penalty.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__CHAT_MODEL_REPETITION_PENALTY

double

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__CHAT_MODEL_LOG_RESPONSES

boolean

false

The URL of the inference endpoint for the embedding.

When using a deployed inference endpoint, this is the URL of that endpoint. When using a local Hugging Face model, it is the URL of the local model.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__EMBEDDING_MODEL_INFERENCE_ENDPOINT_URL

URL

https://api-inference.huggingface.co/pipeline/feature-extraction/sentence-transformers/all-MiniLM-L6-v2

If the model is not ready, wait for it instead of receiving a 503 error. This limits the number of requests required to complete your inference. It is advised to set this flag to true only after receiving a 503 error, as it confines hanging in your application to known places.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__EMBEDDING_MODEL_WAIT_FOR_MODEL

boolean

true

Whether the HuggingFace client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__LOG_REQUESTS

boolean

false

Whether the HuggingFace client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the Hugging Face provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_HUGGINGFACE__MODEL_NAME__ENABLE_INTEGRATION

boolean

true

Quarkus LangChain4j - Infinispan embedding store

Type

Default

The name of the Infinispan client to use. These clients are configured by means of the infinispan-client extension. If unspecified, it will use the default Infinispan client.

Environment variable: QUARKUS_LANGCHAIN4J_INFINISPAN_CLIENT_NAME

string

The dimension of the embedding vectors. This has to be the same as the dimension of vectors produced by the embedding model that you use. For example, AllMiniLmL6V2QuantizedEmbeddingModel produces vectors of dimension 384. OpenAI’s text-embedding-ada-002 produces vectors of dimension 1536.

Environment variable: QUARKUS_LANGCHAIN4J_INFINISPAN_DIMENSION

long

required

Name of the cache that will be used in Infinispan when searching for related embeddings. If this cache doesn’t exist, it will be created.

Environment variable: QUARKUS_LANGCHAIN4J_INFINISPAN_CACHE_NAME

string

embeddings-cache

The maximum distance. The distance between vectors measures how close or far apart two embeddings are.

Environment variable: QUARKUS_LANGCHAIN4J_INFINISPAN_DISTANCE

int

3
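
A minimal Infinispan store setup; 384 matches AllMiniLmL6V2QuantizedEmbeddingModel, as noted above:

```properties
# Must match the vector dimension of the embedding model in use
quarkus.langchain4j.infinispan.dimension=384
quarkus.langchain4j.infinispan.cache-name=embeddings-cache
```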

Quarkus LangChain4j - Milvus embedding store

Type

Default

Whether Dev Services for Milvus are enabled or not.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_ENABLED

boolean

true

Container image for Milvus.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_MILVUS_IMAGE_NAME

string

docker.io/milvusdb/milvus:v2.3.16

Optional fixed port the Milvus dev service will listen to. If not defined, the port will be chosen randomly.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_PORT

int

Indicates if the Dev Service containers managed by Quarkus for Milvus are shared.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_SHARED

boolean

true

Service label to apply to created Dev Services containers.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_SERVICE_NAME

string

milvus

The host of the Milvus server.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_HOST

string

required

The port of the Milvus server.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_PORT

int

required

The authentication token for the Milvus server.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_TOKEN

string

The username for the Milvus server.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_USERNAME

string

The password for the Milvus server.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_PASSWORD

string

The timeout duration for the Milvus client. If not specified, 5 seconds will be used.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_TIMEOUT

Duration 

Name of the database.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DB_NAME

string

default

Create the collection if it does not exist yet.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_CREATE_COLLECTION

boolean

true

Name of the collection.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_COLLECTION_NAME

string

embeddings

Dimension of the vectors. Only applicable when the collection has yet to be created.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DIMENSION

int

Name of the field that contains the ID of the vector.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_PRIMARY_FIELD

string

id

Name of the field that contains the text from which the vector was calculated.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_TEXT_FIELD

string

text

Name of the field that contains JSON metadata associated with the text.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_METADATA_FIELD

string

metadata

Name of the field to store the vector in.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_VECTOR_FIELD

string

vector

Description of the collection.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DESCRIPTION

string

The index type to use for the collection.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_INDEX_TYPE

none, flat, ivf-flat, ivf-sq8, ivf-pq, hnsw, hnsw-sq, hnsw-pq, hnsw-prq, diskann, autoindex, scann, gpu-ivf-flat, gpu-ivf-pq, gpu-brute-force, gpu-cagra, bin-flat, bin-ivf-flat, trie, stl-sort, inverted, bitmap, sparse-inverted-index, sparse-wand

flat

The metric type to use for searching.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_METRIC_TYPE

none, l2, ip, cosine, hamming, jaccard

cosine

The consistency level.

Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_CONSISTENCY_LEVEL

strong, session, bounded, eventually

eventually
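
A minimal Milvus setup, with illustrative host and port values:

```properties
quarkus.langchain4j.milvus.host=localhost
quarkus.langchain4j.milvus.port=19530
quarkus.langchain4j.milvus.collection-name=embeddings
quarkus.langchain4j.milvus.metric-type=cosine
```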

Quarkus LangChain4j - Mistral AI

Type

Default

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_ENABLED

boolean

true

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_EMBEDDING_MODEL_ENABLED

boolean

true

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_MODERATION_MODEL_ENABLED

boolean

true

Base URL of Mistral API

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_BASE_URL

string

https://api.mistral.ai/v1/

Mistral API key

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_API_KEY

string

dummy

Timeout for Mistral calls

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_TIMEOUT

Duration 

10s

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_MODEL_NAME

string

mistral-tiny

What sampling temperature to use, between 0.0 and 1.0. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

It is generally recommended to set this or the top-k property but not both.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_TEMPERATURE

double

0.7

The maximum number of tokens to generate in the completion.

The token count of your prompt plus max_tokens cannot exceed the model’s context length

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_MAX_TOKENS

int

Double (0.0-1.0). Nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

It is generally recommended to set this or the temperature property but not both.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_TOP_P

double

1.0

Whether to inject a safety prompt before all conversations

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_SAFE_PROMPT

boolean

The seed to use for random sampling. If set, repeated calls will generate deterministic results.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_RANDOM_SEED

int

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_LOG_RESPONSES

boolean

false

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_EMBEDDING_MODEL_MODEL_NAME

string

mistral-embed

Whether embedding model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_EMBEDDING_MODEL_LOG_REQUESTS

boolean

false

Whether embedding model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_EMBEDDING_MODEL_LOG_RESPONSES

boolean

false

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_MODERATION_MODEL_MODEL_NAME

string

mistral-moderation-latest

Whether moderation model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_MODERATION_MODEL_LOG_REQUESTS

boolean

false

Whether moderation model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_MODERATION_MODEL_LOG_RESPONSES

boolean

false

Whether the Mistral client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_LOG_REQUESTS

boolean

false

Whether the Mistral client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the Mistral AI provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_ENABLE_INTEGRATION

boolean

true

Named model config

Type

Default

Base URL of Mistral API

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__BASE_URL

string

https://api.mistral.ai/v1/

Mistral API key

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__API_KEY

string

dummy

Timeout for Mistral calls

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__TIMEOUT

Duration 

10s

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_MODEL_NAME

string

mistral-tiny

What sampling temperature to use, between 0.0 and 1.0. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

It is generally recommended to set this or the top-k property but not both.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_TEMPERATURE

double

0.7

The maximum number of tokens to generate in the completion.

The token count of your prompt plus max_tokens cannot exceed the model’s context length

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_MAX_TOKENS

int

Double (0.0-1.0). Nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

It is generally recommended to set this or the temperature property but not both.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_TOP_P

double

1.0

Whether to inject a safety prompt before all conversations

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_SAFE_PROMPT

boolean

The seed to use for random sampling. If set, repeated calls will generate deterministic results.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_RANDOM_SEED

int

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_LOG_RESPONSES

boolean

false

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__EMBEDDING_MODEL_MODEL_NAME

string

mistral-embed

Whether embedding model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__EMBEDDING_MODEL_LOG_REQUESTS

boolean

false

Whether embedding model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__EMBEDDING_MODEL_LOG_RESPONSES

boolean

false

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__MODERATION_MODEL_MODEL_NAME

string

mistral-moderation-latest

Whether moderation model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__MODERATION_MODEL_LOG_REQUESTS

boolean

false

Whether moderation model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__MODERATION_MODEL_LOG_RESPONSES

boolean

false

Whether the Mistral client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__LOG_REQUESTS

boolean

false

Whether the Mistral client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the Mistral AI provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__ENABLE_INTEGRATION

boolean

true

Quarkus LangChain4j - Ollama

Type

Default

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_ENABLED

boolean

true

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_ENABLED

boolean

true

If Dev Services for Ollama has been explicitly enabled or disabled. Dev Services are generally enabled by default, unless there is an existing configuration present.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_DEVSERVICES_ENABLED

boolean

true

The Ollama container image to use.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_DEVSERVICES_IMAGE_NAME

string

ollama/ollama:latest

Model to use

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_MODEL_ID

string

llama3.2

Model to use. According to Ollama docs, the default value is nomic-embed-text

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_MODEL_ID

string

nomic-embed-text

Base URL where the Ollama server is running

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_BASE_URL

string

If set, the named TLS configuration with the configured name will be applied to the REST Client

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_TLS_CONFIGURATION_NAME

string

Timeout for Ollama calls

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_TIMEOUT

Duration 

10s
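
A minimal Ollama setup pointing at a locally running instance (11434 is the default port noted in the Dev Services section above):

```properties
quarkus.langchain4j.ollama.base-url=http://localhost:11434
quarkus.langchain4j.ollama.chat-model.model-id=llama3.2
quarkus.langchain4j.ollama.timeout=30s
```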

Whether the Ollama client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_LOG_REQUESTS

boolean

false

Whether the Ollama client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the Ollama provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_ENABLE_INTEGRATION

boolean

true

The temperature of the model. Increasing the temperature will make the model answer with more variability. A lower temperature will make the model answer more conservatively.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_TEMPERATURE

double

${quarkus.langchain4j.temperature:0.8}

Maximum number of tokens to predict when generating text

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_NUM_PREDICT

int

Sets the stop sequences to use. When one of these patterns is encountered, the LLM stops generating text and returns.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_STOP

list of string

Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_TOP_P

double

0.9

Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_TOP_K

int

40

Sets the seed used for generation. With a static number, the result is always the same; with a random number, the result varies. Example:

```java
Random random = new Random();
int x = random.nextInt(Integer.MAX_VALUE);
```

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_SEED

int

The format to return a response in. Format can be json or a JSON schema.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_FORMAT

string

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_LOG_RESPONSES

boolean

false

The temperature of the model. Increasing the temperature will make the model answer with more variability. A lower temperature will make the model answer more conservatively.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_TEMPERATURE

double

${quarkus.langchain4j.temperature:0.8}

Maximum number of tokens to predict when generating text

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_NUM_PREDICT

int

128

Sets the stop sequences to use. When one of these sequences is encountered, the LLM stops generating text and returns.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_STOP

list of string

Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_TOP_P

double

0.9

Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_TOP_K

int

40

Whether embedding model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_LOG_REQUESTS

boolean

false

Whether embedding model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_LOG_RESPONSES

boolean

false

Named model config

Type

Default
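
In the environment variables below, __MODEL_NAME__ is a placeholder for the name of the model configuration. The mapping from a property name to its environment variable follows the MicroProfile Config rule sketched below; mapping the quoted form of the name (as rendered in the property documentation) is what produces the double underscores. The my-model name is illustrative:

`// Replace every character that is not alphanumeric with '_' and upper-case the result.
static String toEnvVar(String propertyName) {
    return propertyName.replaceAll("[^A-Za-z0-9]", "_").toUpperCase();
}

// toEnvVar("quarkus.langchain4j.ollama.\"my-model\".chat-model.model-id")
// -> "QUARKUS_LANGCHAIN4J_OLLAMA__MY_MODEL__CHAT_MODEL_MODEL_ID"`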

Model to use

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_MODEL_ID

string

llama3.2

Model to use. According to Ollama docs, the default value is nomic-embed-text

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_MODEL_ID

string

nomic-embed-text

Base URL where the Ollama server is running

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__BASE_URL

string

If set, the named TLS configuration with the configured name will be applied to the REST Client

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__TLS_CONFIGURATION_NAME

string

Timeout for Ollama calls

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__TIMEOUT

Duration 

10s

Whether the Ollama client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__LOG_REQUESTS

boolean

false

Whether the Ollama client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the Ollama provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__ENABLE_INTEGRATION

boolean

true

The temperature of the model. Increasing the temperature will make the model answer with more variability. A lower temperature will make the model answer more conservatively.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_TEMPERATURE

double

${quarkus.langchain4j.temperature:0.8}

Maximum number of tokens to predict when generating text

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_NUM_PREDICT

int

Sets the stop sequences to use. When one of these sequences is encountered, the LLM stops generating text and returns.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_STOP

list of string

Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_TOP_P

double

0.9

Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_TOP_K

int

40

With a static number, the result is always the same. With a random number, the result varies. Example:

`Random random = new Random();
int x = random.nextInt(Integer.MAX_VALUE);`

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_SEED

int

The format to return a response in. Format can be json or a JSON schema.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_FORMAT

string

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_LOG_RESPONSES

boolean

false

The temperature of the model. Increasing the temperature will make the model answer with more variability. A lower temperature will make the model answer more conservatively.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_TEMPERATURE

double

${quarkus.langchain4j.temperature:0.8}

Maximum number of tokens to predict when generating text

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_NUM_PREDICT

int

128

Sets the stop sequences to use. When one of these sequences is encountered, the LLM stops generating text and returns.

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_STOP

list of string

Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_TOP_P

double

0.9

Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_TOP_K

int

40

Whether embedding model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_LOG_REQUESTS

boolean

false

Whether embedding model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_LOG_RESPONSES

boolean

false

Quarkus LangChain4j - OpenAI

Type

Default

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_ENABLED

boolean

true

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_EMBEDDING_MODEL_ENABLED

boolean

true

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_MODERATION_MODEL_ENABLED

boolean

true

Whether the model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_IMAGE_MODEL_ENABLED

boolean

true

Base URL of OpenAI API

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_BASE_URL

string

https://api.openai.com/v1/

If set, the named TLS configuration with the configured name will be applied to the REST Client

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_TLS_CONFIGURATION_NAME

string

OpenAI API key

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_API_KEY

string

dummy

The OpenAI organization ID.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_ORGANIZATION_ID

string

Timeout for OpenAI calls

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_TIMEOUT

Duration 

10s

The maximum number of times to retry. 1 means exactly one attempt, with retrying disabled.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_MAX_RETRIES

int

1

Whether the OpenAI client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_LOG_REQUESTS

boolean

false

Whether the OpenAI client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the OpenAI provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_ENABLE_INTEGRATION

boolean

true

The Proxy type

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_PROXY_TYPE

string

HTTP

The Proxy host

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_PROXY_HOST

string

The Proxy port

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_PROXY_PORT

int

3128

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_MODEL_NAME

string

gpt-4o-mini

What sampling temperature to use, with values between 0 and 2. Higher values mean the model will take more risks. A value of 0.9 is good for more creative applications, while 0 (argmax sampling) is good for ones with a well-defined answer. It is recommended to alter this or topP, but not both.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_TEMPERATURE

double

${quarkus.langchain4j.temperature:1.0}
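
The ${quarkus.langchain4j.temperature:1.0} default is a configuration expression: the model-specific property falls back to the global temperature and finally to 1.0. A sketch of that resolution chain using plain MicroProfile Config (the extension performs this resolution for you; the manual lookup is only illustrative):

`import org.eclipse.microprofile.config.Config;
import org.eclipse.microprofile.config.ConfigProvider;

Config config = ConfigProvider.getConfig();
// Model-specific property first, then the global temperature, then 1.0.
double temperature = config
        .getOptionalValue("quarkus.langchain4j.openai.chat-model.temperature", Double.class)
        .or(() -> config.getOptionalValue("quarkus.langchain4j.temperature", Double.class))
        .orElse(1.0);`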

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with topP probability mass. 0.1 means only the tokens comprising the top 10% probability mass are considered. It is recommended to alter this or temperature, but not both.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_TOP_P

double

1.0

An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_MAX_COMPLETION_TOKENS

int

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_PRESENCE_PENALTY

double

0

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_FREQUENCY_PENALTY

double

0

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_LOG_RESPONSES

boolean

false

The response format the model should use. Some models are not compatible with some response formats, make sure to review OpenAI documentation.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_RESPONSE_FORMAT

string

Whether responses follow JSON Schema for Structured Outputs

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_STRICT_JSON_SCHEMA

boolean

The list of stop words to use.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_STOP

list of string

Constrains effort on reasoning for reasoning models. Currently supported values are minimal, low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

Note: The gpt-5-pro model defaults to (and only supports) high reasoning effort.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_REASONING_EFFORT

string

Specifies the processing type used for serving the request.

If set to auto, then the request will be processed with the service tier configured in the Project settings. If set to default, then the request will be processed with the standard pricing and performance for the selected model. If set to flex or priority, then the request will be processed with the corresponding service tier. When not set, the default behavior is auto.

When the service tier parameter is set, the response body will include the service_tier value based on the processing mode actually used to serve the request. This response value may be different from the value set in the parameter.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_CHAT_MODEL_SERVICE_TIER

string

default

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_EMBEDDING_MODEL_MODEL_NAME

string

text-embedding-ada-002

Whether embedding model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_EMBEDDING_MODEL_LOG_REQUESTS

boolean

false

Whether embedding model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_EMBEDDING_MODEL_LOG_RESPONSES

boolean

false

A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_EMBEDDING_MODEL_USER

string

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_MODERATION_MODEL_MODEL_NAME

string

omni-moderation-latest

Whether moderation model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_MODERATION_MODEL_LOG_REQUESTS

boolean

false

Whether moderation model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_MODERATION_MODEL_LOG_RESPONSES

boolean

false

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_IMAGE_MODEL_MODEL_NAME

string

dall-e-3

Configure whether the generated images will be saved to disk. By default, persisting is disabled, but it is implicitly enabled when quarkus.langchain4j.openai.image-model.persist-directory is set and this property is not set to false

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_IMAGE_MODEL_PERSIST

boolean

false

The path where the generated images will be persisted to disk. This only applies if quarkus.langchain4j.openai.image-model.persist is not set to false.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_IMAGE_MODEL_PERSIST_DIRECTORY

path

${java.io.tmpdir}/dall-e-images

The format in which the generated images are returned.

Must be one of url or b64_json

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_IMAGE_MODEL_RESPONSE_FORMAT

string

url
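
When b64_json is selected, the image payload is returned Base64-encoded rather than as a URL. A minimal decoding sketch (the b64Json variable stands for the field extracted from the response; the file name is illustrative):

`import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Decode the Base64 payload and write the raw image bytes to disk.
byte[] imageBytes = Base64.getDecoder().decode(b64Json);
Files.write(Path.of("generated.png"), imageBytes);`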

The size of the generated images.

Must be one of 1024x1024, 1792x1024, or 1024x1792 when the model is dall-e-3.

Must be one of 256x256, 512x512, or 1024x1024 when the model is dall-e-2.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_IMAGE_MODEL_SIZE

string

1024x1024

The quality of the image that will be generated.

hd creates images with finer details and greater consistency across the image.

This parameter is only supported when the model is dall-e-3.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_IMAGE_MODEL_QUALITY

string

standard

The number of images to generate.

Must be between 1 and 10.

When the model is dall-e-3, only n=1 is supported.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_IMAGE_MODEL_NUMBER

int

1

The style of the generated images.

Must be one of vivid or natural. Vivid causes the model to lean towards generating hyper-real and dramatic images. Natural causes the model to produce more natural, less hyper-real looking images.

This parameter is only supported when the model is dall-e-3.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_IMAGE_MODEL_STYLE

string

vivid

A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_IMAGE_MODEL_USER

string

Whether image model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_IMAGE_MODEL_LOG_REQUESTS

boolean

false

Whether image model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI_IMAGE_MODEL_LOG_RESPONSES

boolean

false

Named model config

Type

Default

Base URL of OpenAI API

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__BASE_URL

string

https://api.openai.com/v1/

If set, the named TLS configuration with the configured name will be applied to the REST Client

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__TLS_CONFIGURATION_NAME

string

OpenAI API key

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__API_KEY

string

dummy

The OpenAI organization ID.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__ORGANIZATION_ID

string

Timeout for OpenAI calls

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__TIMEOUT

Duration 

10s

The maximum number of times to retry. 1 means exactly one attempt, with retrying disabled.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__MAX_RETRIES

int

1

Whether the OpenAI client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__LOG_REQUESTS

boolean

false

Whether the OpenAI client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the OpenAI provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__ENABLE_INTEGRATION

boolean

true

The Proxy type

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__PROXY_TYPE

string

HTTP

The Proxy host

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__PROXY_HOST

string

The Proxy port

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__PROXY_PORT

int

3128

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__CHAT_MODEL_MODEL_NAME

string

gpt-4o-mini

What sampling temperature to use, with values between 0 and 2. Higher values mean the model will take more risks. A value of 0.9 is good for more creative applications, while 0 (argmax sampling) is good for ones with a well-defined answer. It is recommended to alter this or topP, but not both.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__CHAT_MODEL_TEMPERATURE

double

${quarkus.langchain4j.temperature:1.0}

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with topP probability mass. 0.1 means only the tokens comprising the top 10% probability mass are considered. It is recommended to alter this or temperature, but not both.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__CHAT_MODEL_TOP_P

double

1.0

An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__CHAT_MODEL_MAX_COMPLETION_TOKENS

int

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__CHAT_MODEL_PRESENCE_PENALTY

double

0

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__CHAT_MODEL_FREQUENCY_PENALTY

double

0

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__CHAT_MODEL_LOG_RESPONSES

boolean

false

The response format the model should use. Some models are not compatible with some response formats, make sure to review OpenAI documentation.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__CHAT_MODEL_RESPONSE_FORMAT

string

Whether responses follow JSON Schema for Structured Outputs

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__CHAT_MODEL_STRICT_JSON_SCHEMA

boolean

The list of stop words to use.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__CHAT_MODEL_STOP

list of string

Constrains effort on reasoning for reasoning models. Currently supported values are minimal, low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

Note: The gpt-5-pro model defaults to (and only supports) high reasoning effort.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__CHAT_MODEL_REASONING_EFFORT

string

Specifies the processing type used for serving the request.

If set to auto, then the request will be processed with the service tier configured in the Project settings. If set to default, then the request will be processed with the standard pricing and performance for the selected model. If set to flex or priority, then the request will be processed with the corresponding service tier. When not set, the default behavior is auto.

When the service tier parameter is set, the response body will include the service_tier value based on the processing mode actually used to serve the request. This response value may be different from the value set in the parameter.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__CHAT_MODEL_SERVICE_TIER

string

default

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__EMBEDDING_MODEL_MODEL_NAME

string

text-embedding-ada-002

Whether embedding model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__EMBEDDING_MODEL_LOG_REQUESTS

boolean

false

Whether embedding model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__EMBEDDING_MODEL_LOG_RESPONSES

boolean

false

A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__EMBEDDING_MODEL_USER

string

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__MODERATION_MODEL_MODEL_NAME

string

omni-moderation-latest

Whether moderation model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__MODERATION_MODEL_LOG_REQUESTS

boolean

false

Whether moderation model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__MODERATION_MODEL_LOG_RESPONSES

boolean

false

Model name to use

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__IMAGE_MODEL_MODEL_NAME

string

dall-e-3

Configure whether the generated images will be saved to disk. By default, persisting is disabled, but it is implicitly enabled when quarkus.langchain4j.openai.image-model.persist-directory is set and this property is not set to false

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__IMAGE_MODEL_PERSIST

boolean

false

The path where the generated images will be persisted to disk. This only applies if quarkus.langchain4j.openai.image-model.persist is not set to false.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__IMAGE_MODEL_PERSIST_DIRECTORY

path

${java.io.tmpdir}/dall-e-images

The format in which the generated images are returned.

Must be one of url or b64_json

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__IMAGE_MODEL_RESPONSE_FORMAT

string

url

The size of the generated images.

Must be one of 1024x1024, 1792x1024, or 1024x1792 when the model is dall-e-3.

Must be one of 256x256, 512x512, or 1024x1024 when the model is dall-e-2.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__IMAGE_MODEL_SIZE

string

1024x1024

The quality of the image that will be generated.

hd creates images with finer details and greater consistency across the image.

This parameter is only supported when the model is dall-e-3.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__IMAGE_MODEL_QUALITY

string

standard

The number of images to generate.

Must be between 1 and 10.

When the model is dall-e-3, only n=1 is supported.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__IMAGE_MODEL_NUMBER

int

1

The style of the generated images.

Must be one of vivid or natural. Vivid causes the model to lean towards generating hyper-real and dramatic images. Natural causes the model to produce more natural, less hyper-real looking images.

This parameter is only supported when the model is dall-e-3.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__IMAGE_MODEL_STYLE

string

vivid

A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__IMAGE_MODEL_USER

string

Whether image model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__IMAGE_MODEL_LOG_REQUESTS

boolean

false

Whether image model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_OPENAI__MODEL_NAME__IMAGE_MODEL_LOG_RESPONSES

boolean

false

Quarkus LangChain4j - pgvector

Type

Default

The name of the configured Postgres datasource to use for this store. If not set, the default datasource from the Agroal extension will be used.

Environment variable: QUARKUS_LANGCHAIN4J_PGVECTOR_DATASOURCE

string

The table name for storing embeddings

Environment variable: QUARKUS_LANGCHAIN4J_PGVECTOR_TABLE

string

embeddings

The dimension of the embedding vectors. This has to be the same as the dimension of vectors produced by the embedding model that you use. For example, AllMiniLmL6V2QuantizedEmbeddingModel produces vectors of dimension 384. OpenAI’s text-embedding-ada-002 produces vectors of dimension 1536.

Environment variable: QUARKUS_LANGCHAIN4J_PGVECTOR_DIMENSION

int

required
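
If you are unsure which dimension your embedding model produces, one way to determine it is to embed a probe string and read the vector size. A sketch using the injected LangChain4j EmbeddingModel (the bean name and probe text are illustrative):

`import dev.langchain4j.model.embedding.EmbeddingModel;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

@ApplicationScoped
public class DimensionProbe {

    @Inject
    EmbeddingModel embeddingModel; // provided by the configured embedding provider

    // quarkus.langchain4j.pgvector.dimension must equal this number.
    public int dimension() {
        return embeddingModel.embed("probe").content().dimension();
    }
}`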

Whether to use an index.

Environment variable: QUARKUS_LANGCHAIN4J_PGVECTOR_USE_INDEX

boolean

false

The index list size (the number of lists for the IVFFlat index).

Environment variable: QUARKUS_LANGCHAIN4J_PGVECTOR_INDEX_LIST_SIZE

int

0

Whether the table should be created if not already existing.

Environment variable: QUARKUS_LANGCHAIN4J_PGVECTOR_CREATE_TABLE

boolean

true

Whether the table should be dropped prior to being created.

Environment variable: QUARKUS_LANGCHAIN4J_PGVECTOR_DROP_TABLE_FIRST

boolean

false

Whether the PG vector extension should be created on startup. By default, in dev or test environments this value is overridden to true

Environment variable: QUARKUS_LANGCHAIN4J_PGVECTOR_REGISTER_VECTOR_PG_EXTENSION

boolean

false

Metadata type:

  • COLUMN_PER_KEY: for static metadata, when you know in advance the list of metadata fields. In this case, you should also override the quarkus.langchain4j.pgvector.metadata.column-definitions property to define the right columns.

  • COMBINED_JSON: For dynamic metadata, when you don’t know the list of metadata fields that will be used.

  • COMBINED_JSONB: Same as JSON, but stored in binary form. Optimized for querying large datasets. In this case, you should also override the quarkus.langchain4j.pgvector.metadata.column-definitions property to change the type of the metadata column to JSONB.

Default value: COMBINED_JSON

Environment variable: QUARKUS_LANGCHAIN4J_PGVECTOR_METADATA_STORAGE_MODE

column-per-key, combined-json, combined-jsonb

combined-json

Metadata Definition: SQL definition of metadata field(s). By default, "metadata JSON NULL" is configured. This is only suitable if using the JSON metadata type.

If using the JSONB metadata type, this should in most cases be set to metadata JSONB NULL.

If using the COLUMNS metadata type, this should be a list of columns, one column for each desired metadata field. Example: condominium_id uuid null, user uuid null

Environment variable: QUARKUS_LANGCHAIN4J_PGVECTOR_METADATA_COLUMN_DEFINITIONS

list of string

metadata JSON NULL

Metadata Indexes, list of fields to use as index.

For instance:

  • JSON: with JSON metadata, indexes are not allowed, so this property must be empty. To use indexes, switch to JSONB metadata.

  • JSONB: (metadata->'key'), (metadata->'name'), (metadata->'age')

  • COLUMNS: key, name, age

Environment variable: QUARKUS_LANGCHAIN4J_PGVECTOR_METADATA_INDEXES

list of string

The index type to use for the metadata indexes.

Environment variable: QUARKUS_LANGCHAIN4J_PGVECTOR_METADATA_INDEX_TYPE

string

BTREE

Quarkus LangChain4j - Redis embedding store

Type

Default

The name of the Redis client to use. These clients are configured by means of the redis-client extension. If unspecified, it will use the default Redis client.

Environment variable: QUARKUS_LANGCHAIN4J_REDIS_CLIENT_NAME

string

The dimension of the embedding vectors. This has to be the same as the dimension of vectors produced by the embedding model that you use. For example, AllMiniLmL6V2QuantizedEmbeddingModel produces vectors of dimension 384. OpenAI’s text-embedding-ada-002 produces vectors of dimension 1536.

Environment variable: QUARKUS_LANGCHAIN4J_REDIS_DIMENSION

long

required

Name of the index that will be used in Redis when searching for related embeddings. If this index doesn’t exist, it will be created.

Environment variable: QUARKUS_LANGCHAIN4J_REDIS_INDEX_NAME

string

embedding-index

Names of fields that will store textual metadata associated with embeddings. NOTE: Filtering based on textual metadata fields is not supported at the moment.

Environment variable: QUARKUS_LANGCHAIN4J_REDIS_TEXTUAL_METADATA_FIELDS

list of string

Names of fields that will store numeric metadata associated with embeddings.

Environment variable: QUARKUS_LANGCHAIN4J_REDIS_NUMERIC_METADATA_FIELDS

list of string

Metric used to compute the distance between two vectors.

Environment variable: QUARKUS_LANGCHAIN4J_REDIS_DISTANCE_METRIC

l2, ip, cosine

cosine

Name of the key that will be used to store the embedding vector.

Environment variable: QUARKUS_LANGCHAIN4J_REDIS_VECTOR_FIELD_NAME

string

vector

Name of the key that will be used to store the embedded text.

Environment variable: QUARKUS_LANGCHAIN4J_REDIS_SCALAR_FIELD_NAME

string

scalar

Prefix to be applied to all keys by the embedding store. Embeddings are stored in Redis under a key that is the concatenation of this prefix and the embedding ID.

If the configured prefix does not end with :, it will be appended automatically to follow the Redis convention.

Environment variable: QUARKUS_LANGCHAIN4J_REDIS_PREFIX

string

embedding:
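
A sketch of the resulting key layout (identifiers are illustrative):

`// With the default prefix and an embedding ID of "42", the Redis key is "embedding:42".
String prefix = "embedding";
String normalized = prefix.endsWith(":") ? prefix : prefix + ":";
String key = normalized + "42";`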

Algorithm used to index the embedding vectors.

Environment variable: QUARKUS_LANGCHAIN4J_REDIS_VECTOR_ALGORITHM

flat, hnsw

hnsw

Quarkus LangChain4j - Watsonx

Type

Default

Whether the model should be enabled.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_ENABLED

boolean

true

Whether the embedding model should be enabled.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_EMBEDDING_MODEL_ENABLED

boolean

true

Whether the scoring model should be enabled.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_SCORING_MODEL_ENABLED

boolean

true

Specifies the mode of interaction with the LLM.

This property allows you to choose between two modes of operation:

  • chat: prompts are automatically enriched with the specific tags defined by the model

  • generation: prompts require manual specification of tags

Allowable values: [chat, generation]

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_MODE

string

chat

Specifies the base URL of the watsonx.ai API.

A list of all available URLs is provided in the IBM Watsonx.ai documentation at this link.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_BASE_URL

string

IBM Cloud API key.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_API_KEY

string

Timeout for watsonx.ai calls.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_TIMEOUT

Duration 

10s

The version date for the API of the form YYYY-MM-DD.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_VERSION

string

2025-04-23

The space that contains the resource.

Either space_id or project_id has to be given.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_SPACE_ID

string

The project that contains the resource.

Either space_id or project_id has to be given.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_PROJECT_ID

string

Whether the watsonx.ai client should log requests.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_LOG_REQUESTS

boolean

false

Whether the watsonx.ai client should log responses.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the watsonx.ai provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_ENABLE_INTEGRATION

boolean

true

Base URL of the IAM Authentication API.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_IAM_BASE_URL

URL

https://iam.cloud.ibm.com

Timeout for IAM authentication calls.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_IAM_TIMEOUT

Duration 

10s

Grant type for the IAM Authentication API.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_IAM_GRANT_TYPE

string

urn:ibm:params:oauth:grant-type:apikey

Base URL of the Cloud Object Storage API.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_TEXT_EXTRACTION_BASE_URL

string

required

The ID of the connection asset that contains the credentials required to access the data.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_TEXT_EXTRACTION_DOCUMENT_REFERENCE_CONNECTION

string

required

The name of the bucket containing the input document.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_TEXT_EXTRACTION_DOCUMENT_REFERENCE_BUCKET_NAME

string

required

The ID of the connection asset used to store the extracted results.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_TEXT_EXTRACTION_RESULTS_REFERENCE_CONNECTION

string

required

The name of the bucket where the output files will be written.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_TEXT_EXTRACTION_RESULTS_REFERENCE_BUCKET_NAME

string

required

Whether the Cloud Object Storage client should log requests.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_TEXT_EXTRACTION_LOG_REQUESTS

boolean

false

Whether the Cloud Object Storage client should log responses.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_TEXT_EXTRACTION_LOG_RESPONSES

boolean

false

Specifies the model to use for the chat completion.

A list of all available models is provided in the IBM watsonx.ai documentation at this link.

To use a model, locate the API model ID column in the table and copy the corresponding model ID.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_MODEL_NAME

string

meta-llama/llama-4-maverick-17b-128e-instruct-fp8
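
Once a chat model is configured, it typically backs an AI service interface. A minimal sketch (the interface name and prompt texts are illustrative):

`import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;

// Calls made through this interface are served by the configured chat model.
@RegisterAiService
public interface Assistant {

    @SystemMessage("You are a concise assistant.")
    String chat(@UserMessage String question);
}`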

Specifies how the model should choose which tool to call during a request.

This value can be:

  • auto: The model decides whether and which tool to call automatically.

  • required: The model must call one of the available tools.

If toolChoiceName is set, this value is ignored.

Setting this value influences the tool-calling behavior of the model when no specific tool is required.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_TOOL_CHOICE

auto, required, none

Specifies the name of a specific tool that the model must call.

When set, the model will be forced to call the specified tool. The name must exactly match one of the available tools defined for the service.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_TOOL_CHOICE_NAME

string
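
For reference, a tool is typically exposed as an annotated method on a CDI bean; the tool name defaults to the method name, which is what tool-choice-name must match. A sketch (names and the stub body are illustrative):

`import dev.langchain4j.agent.tool.Tool;
import jakarta.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class WeatherTools {

    // With tool-choice set to required, the model must call one of the registered
    // tools; tool-choice-name=currentTemperature would force this specific one.
    @Tool("Returns the current temperature for a city")
    public double currentTemperature(String city) {
        return 21.0; // illustrative stub
    }
}`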

Positive values penalize new tokens based on their existing frequency in the generated text, reducing the likelihood of the model repeating the same lines verbatim.

Possible values: -2 < value < 2

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_FREQUENCY_PENALTY

double

0

Specifies whether to return the log probabilities of the output tokens.

If set to true, the response will include the log probability of each output token in the content of the message.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_LOGPROBS

boolean

false

An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.

Possible values: 0 ≤ value ≤ 20

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_TOP_LOGPROBS

int

The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model’s context length. Set to 0 for the model’s configured max generated tokens.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_MAX_TOKENS

int

1024

Specifies how many chat completion choices to generate for each input message.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_N

int

1

Applies a penalty to new tokens based on whether they already appear in the generated text so far, encouraging the model to introduce new topics rather than repeat itself.

Possible values: -2 < value < 2

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_PRESENCE_PENALTY

double

0

Random number generator seed to use in sampling mode for experimental repeatability.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_SEED

int

Defines one or more stop sequences that will cause the model to stop generating further tokens if any of them are encountered in the output.

This allows control over where the model should end its response. If a stop sequence is encountered before the minimum number of tokens has been generated, it will be ignored.

Possible values: 0 ≤ number of items ≤ 4

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_STOP

list of string

Specifies the sampling temperature to use in the generation process.

Higher values (e.g. 0.8) make the output more random and diverse, while lower values (e.g. 0.2) make the output more focused and deterministic.

Possible values: 0 < value < 2

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_TEMPERATURE

double

${quarkus.langchain4j.temperature:1.0}

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

Possible values: 0 < value < 1

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_TOP_P

double

1

Specifies the desired format for the model’s output.

Allowable values: [text, json_object, json_schema]

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_RESPONSE_FORMAT

string

Whether chat model requests should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_CHAT_MODEL_LOG_RESPONSES

boolean

false

The id of the model to be used.

All available models are listed in the IBM Watsonx.ai documentation at the following link.

To use a model, locate the API model_id column in the table and copy the corresponding model ID.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_MODEL_NAME

string

meta-llama/llama-4-maverick-17b-128e-instruct-fp8

Represents the strategy used for picking the tokens during generation of the output text. During text generation when parameter value is set to greedy, each successive token corresponds to the highest probability token given the text that has already been generated. This strategy can lead to repetitive results especially for longer output sequences. The alternative sample strategy generates text by picking subsequent tokens based on the probability distribution of possible next tokens defined by (i.e., conditioned on) the already-generated text and the top_k and top_p parameters.

Allowable values: [sample, greedy]

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_DECODING_METHOD

string

greedy

Represents the factor of exponential decay. Larger values correspond to more aggressive decay.

Possible values: > 1

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_LENGTH_PENALTY_DECAY_FACTOR

double

A number of generated tokens after which this should take effect.

Possible values: ≥ 0

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_LENGTH_PENALTY_START_INDEX

int

The maximum number of new tokens to be generated. The maximum supported value for this field depends on the model being used. How the "token" is defined depends on the tokenizer and vocabulary size, which in turn depends on the model. Often the tokens are a mix of full words and sub-words. Depending on the user's plan, and on the model being used, there may be an enforced maximum number of new tokens.

Possible values: ≥ 0

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_MAX_NEW_TOKENS

int

200

If stop sequences are given, they are ignored until minimum tokens are generated.

Possible values: ≥ 0

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_MIN_NEW_TOKENS

int

0

Random number generator seed to use in sampling mode for experimental repeatability.

Possible values: ≥ 1

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_RANDOM_SEED

int

Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored.

Possible values: 0 ≤ number of items ≤ 6

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_STOP_SEQUENCES

list of string

A value used to modify the next-token probabilities in sampling mode. Values less than 1.0 sharpen the probability distribution, resulting in "less random" output. Values greater than 1.0 flatten the probability distribution, resulting in "more random" output. A value of 1.0 has no effect.

Possible values: 0 ≤ value ≤ 2

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_TEMPERATURE

double

${quarkus.langchain4j.temperature:1.0}

The number of highest probability vocabulary tokens to keep for top-k-filtering. Only applies for sampling mode. When decoding_strategy is set to sample, only the top_k most likely tokens are considered as candidates for the next generated token.

Possible values: 1 ≤ value ≤ 100

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_TOP_K

int

Similar to top_k except the candidates to generate the next token are the most likely tokens with probabilities that add up to at least top_p. Also known as nucleus sampling. A value of 1.0 is equivalent to disabled.

Possible values: 0 < value ≤ 1

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_TOP_P

double

Represents the penalty applied to tokens that have already been generated or belong to the context. The value 1.0 means that there is no penalty.

Possible values: 1 ≤ value ≤ 2

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_REPETITION_PENALTY

double

Represents the maximum number of input tokens accepted. This can be used to avoid requests failing due to input being longer than configured limits. If the text is truncated, then it truncates the start of the input (on the left), so the end of the input will remain the same. If this value exceeds the maximum sequence length (refer to the documentation to find this value for the model) then the call will fail if the total number of tokens exceeds the maximum sequence length. Zero means don’t truncate.

Possible values: ≥ 0

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_TRUNCATE_INPUT_TOKENS

int

Pass false to omit matched stop sequences from the end of the output text. The default is true, meaning that the output will end with the stop sequence text when matched.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_INCLUDE_STOP_SEQUENCE

boolean

Whether generation model requests should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_LOG_REQUESTS

boolean

false

Whether generation model responses should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_LOG_RESPONSES

boolean

false

Delimiter used to concatenate the ChatMessage elements into a single string. By setting this property, you can define your preferred way of concatenating messages to ensure that the prompt is structured in the correct way.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_GENERATION_MODEL_PROMPT_JOINER

string
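
A sketch of the joiner's effect (the message strings are illustrative):

`import java.util.List;

// In generation mode, chat messages are concatenated into a single prompt
// using the configured delimiter (here "\n").
String prompt = String.join("\n", List.of(
        "You are a concise assistant.",
        "What is the capital of France?"));`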

Specifies the ID of the model to be used.

A list of all available models is provided in the IBM watsonx.ai documentation at this link.

To use a model, locate the API model ID column in the table and copy the corresponding model ID.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_EMBEDDING_MODEL_MODEL_NAME

string

ibm/granite-embedding-278m-multilingual

Specifies the maximum number of input tokens accepted. This can be used to prevent requests from failing due to input exceeding the configured token limits.

If the input exceeds the specified token limit, the input will be truncated from the end (right side), ensuring that the start of the input remains intact. If the provided value exceeds the model’s maximum sequence length (refer to the documentation for the model’s maximum sequence length), the request will fail if the total number of tokens exceeds the maximum limit.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_EMBEDDING_MODEL_TRUNCATE_INPUT_TOKENS

int

Whether embedding model requests should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_EMBEDDING_MODEL_LOG_REQUESTS

boolean

false

Whether embedding model responses should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_EMBEDDING_MODEL_LOG_RESPONSES

boolean

false

The id of the model to be used.

All available models are listed in the IBM Watsonx.ai documentation at the following link.

To use a model, locate the API model_id column in the table and copy the corresponding model ID.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_SCORING_MODEL_MODEL_NAME

string

cross-encoder/ms-marco-minilm-l-12-v2

Specifies the maximum number of input tokens accepted. This helps to avoid requests failing due to input exceeding the configured token limits.

If the input exceeds the specified token limit, the text will be truncated from the end (right side), ensuring that the start of the input remains intact. If the provided value exceeds the model’s maximum sequence length (refer to the documentation for the model’s maximum sequence length), the request will fail if the total number of tokens exceeds the maximum limit.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_SCORING_MODEL_TRUNCATE_INPUT_TOKENS

int

Whether scoring model requests should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_SCORING_MODEL_LOG_REQUESTS

boolean

false

Whether scoring model responses should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_SCORING_MODEL_LOG_RESPONSES

boolean

false

Base URL for the built-in service.

All available URLs are listed in the IBM Watsonx.ai documentation at the following link.

Note: If empty, the URL is automatically calculated based on the watsonx.base-url value.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_BUILT_IN_SERVICE_BASE_URL

string

IBM Cloud API key.

If empty, the API key inherits the value from the watsonx.api-key property.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_BUILT_IN_SERVICE_API_KEY

string

Timeout for built-in tools APIs.

If empty, the timeout inherits the value from the watsonx.timeout property.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_BUILT_IN_SERVICE_TIMEOUT

Duration 

10s

Whether the built-in REST client should log requests.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_BUILT_IN_SERVICE_LOG_REQUESTS

boolean

false

Whether the built-in REST client should log responses.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_BUILT_IN_SERVICE_LOG_RESPONSES

boolean

false

Maximum number of search results.

Possible values: 1 < value < 20

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX_BUILT_IN_SERVICE_GOOGLE_SEARCH_MAX_RESULTS

int

10

Named model config

Type

Default

Specifies the mode of interaction with the LLM.

This property allows you to choose between two modes of operation:

  • chat: prompts are automatically enriched with the specific tags defined by the model

  • generation: prompts require manual specification of tags

Allowable values: [chat, generation]

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__MODE

string

chat

Specifies the base URL of the watsonx.ai API.

A list of all available URLs is provided in the IBM Watsonx.ai documentation at this link.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__BASE_URL

string

IBM Cloud API key.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__API_KEY

string

Timeout for watsonx.ai calls.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__TIMEOUT

Duration 

10s

The version date for the API of the form YYYY-MM-DD.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__VERSION

string

2025-04-23

The space that contains the resource.

Either space_id or project_id has to be given.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__SPACE_ID

string

The project that contains the resource.

Either space_id or project_id has to be given.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__PROJECT_ID

string

Whether the watsonx.ai client should log requests.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__LOG_REQUESTS

boolean

false

Whether the watsonx.ai client should log responses.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the watsonx.ai provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__ENABLE_INTEGRATION

boolean

true

Base URL of the IAM Authentication API.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__IAM_BASE_URL

URL

https://iam.cloud.ibm.com

Timeout for IAM authentication calls.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__IAM_TIMEOUT

Duration 

10s

Grant type for the IAM Authentication API.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__IAM_GRANT_TYPE

string

urn:ibm:params:oauth:grant-type:apikey

Base URL of the Cloud Object Storage API.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__TEXT_EXTRACTION_BASE_URL

string

required

The ID of the connection asset that contains the credentials required to access the data.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__TEXT_EXTRACTION_DOCUMENT_REFERENCE_CONNECTION

string

required

The name of the bucket containing the input document.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__TEXT_EXTRACTION_DOCUMENT_REFERENCE_BUCKET_NAME

string

required

The ID of the connection asset used to store the extracted results.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__TEXT_EXTRACTION_RESULTS_REFERENCE_CONNECTION

string

required

The name of the bucket where the output files will be written.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__TEXT_EXTRACTION_RESULTS_REFERENCE_BUCKET_NAME

string

required

Whether the Cloud Object Storage client should log requests.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__TEXT_EXTRACTION_LOG_REQUESTS

boolean

false

Whether the Cloud Object Storage client should log responses.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__TEXT_EXTRACTION_LOG_RESPONSES

boolean

false

Specifies the model to use for the chat completion.

A list of all available models is provided in the IBM watsonx.ai documentation at this link.

To use a model, locate the API model ID column in the table and copy the corresponding model ID.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_MODEL_NAME

string

meta-llama/llama-4-maverick-17b-128e-instruct-fp8

Specifies how the model should choose which tool to call during a request.

This value can be:

  • auto: The model decides whether and which tool to call automatically.

  • required: The model must call one of the available tools.

If toolChoiceName is set, this value is ignored.

Setting this value influences the tool-calling behavior of the model when no specific tool is required.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_TOOL_CHOICE

auto, required, none

Specifies the name of a specific tool that the model must call.

When set, the model will be forced to call the specified tool. The name must exactly match one of the available tools defined for the service.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_TOOL_CHOICE_NAME

string

Positive values penalize new tokens based on their existing frequency in the generated text, reducing the likelihood of the model repeating the same lines verbatim.

Possible values: -2 < value < 2

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_FREQUENCY_PENALTY

double

0

Specifies whether to return the log probabilities of the output tokens.

If set to true, the response will include the log probability of each output token in the content of the message.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_LOGPROBS

boolean

false

An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.

Possible values: 0 ≤ value ≤ 20

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_TOP_LOGPROBS

int

The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model’s context length. Set to 0 for the model’s configured max generated tokens.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_MAX_TOKENS

int

1024

Specifies how many chat completion choices to generate for each input message.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_N

int

1

Applies a penalty to new tokens based on whether they already appear in the generated text so far, encouraging the model to introduce new topics rather than repeat itself.

Possible values: -2 < value < 2

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_PRESENCE_PENALTY

double

0

Random number generator seed to use in sampling mode for experimental repeatability.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_SEED

int

Defines one or more stop sequences that will cause the model to stop generating further tokens if any of them are encountered in the output.

This allows control over where the model should end its response. If a stop sequence is encountered before the minimum number of tokens has been generated, it will be ignored.

Possible values: 0 ≤ number of items ≤ 4

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_STOP

list of string

Specifies the sampling temperature to use in the generation process.

Higher values (e.g. 0.8) make the output more random and diverse, while lower values (e.g. 0.2) make the output more focused and deterministic.

Possible values: 0 < value < 2

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_TEMPERATURE

double

${quarkus.langchain4j.temperature:1.0}

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

Possible values: 0 < value < 1

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_TOP_P

double

1

Specifies the desired format for the model’s output.

Allowable values: [text, json_object, json_schema]

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_RESPONSE_FORMAT

string

Whether chat model requests should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__CHAT_MODEL_LOG_RESPONSES

boolean

false

The id of the model to be used.

All available models are listed in the IBM Watsonx.ai documentation at the following link.

To use a model, locate the API model_id column in the table and copy the corresponding model ID.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_MODEL_NAME

string

meta-llama/llama-4-maverick-17b-128e-instruct-fp8

Represents the strategy used for picking the tokens during generation of the output text. During text generation when parameter value is set to greedy, each successive token corresponds to the highest probability token given the text that has already been generated. This strategy can lead to repetitive results especially for longer output sequences. The alternative sample strategy generates text by picking subsequent tokens based on the probability distribution of possible next tokens defined by (i.e., conditioned on) the already-generated text and the top_k and top_p parameters.

Allowable values: [sample, greedy]

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_DECODING_METHOD

string

greedy

Represents the factor of exponential decay for the length penalty. Larger values correspond to more aggressive decay.

Possible values: > 1

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_LENGTH_PENALTY_DECAY_FACTOR

double

The number of generated tokens after which the length penalty takes effect.

Possible values: ≥ 0

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_LENGTH_PENALTY_START_INDEX

int

The maximum number of new tokens to be generated. The maximum supported value for this field depends on the model being used. How a "token" is defined depends on the tokenizer and vocabulary size, which in turn depend on the model; often the tokens are a mix of full words and sub-words. Depending on the user's plan and on the model being used, there may be an enforced maximum number of new tokens.

Possible values: ≥ 0

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_MAX_NEW_TOKENS

int

200

The minimum number of new tokens to be generated. If stop sequences are given, they are ignored until this minimum number of tokens has been generated.

Possible values: ≥ 0

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_MIN_NEW_TOKENS

int

0

Random number generator seed to use in sampling mode for experimental repeatability.

Possible values: ≥ 1

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_RANDOM_SEED

int

Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored.

Possible values: 0 ≤ number of items ≤ 6

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_STOP_SEQUENCES

list of string

A value used to modify the next-token probabilities in sampling mode. Values less than 1.0 sharpen the probability distribution, resulting in "less random" output. Values greater than 1.0 flatten the probability distribution, resulting in "more random" output. A value of 1.0 has no effect.

Possible values: 0 ≤ value ≤ 2

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_TEMPERATURE

double

${quarkus.langchain4j.temperature:1.0}

The number of highest probability vocabulary tokens to keep for top-k-filtering. Only applies for sampling mode. When decoding_strategy is set to sample, only the top_k most likely tokens are considered as candidates for the next generated token.

Possible values: 1 ≤ value ≤ 100

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_TOP_K

int

Similar to top_k except the candidates to generate the next token are the most likely tokens with probabilities that add up to at least top_p. Also known as nucleus sampling. A value of 1.0 is equivalent to disabled.

Possible values: 0 < value ≤ 1

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_TOP_P

double

Represents the penalty for penalizing tokens that have already been generated or belong to the context. The value 1.0 means that there is no penalty.

Possible values: 1 ≤ value ≤ 2

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_REPETITION_PENALTY

double

Represents the maximum number of input tokens accepted. This can be used to avoid requests failing because the input is longer than the configured limits. If the text is truncated, it is truncated from the start (left side), so the end of the input remains the same. If this value exceeds the model's maximum sequence length (refer to the model documentation for that value), the call will fail when the total number of tokens exceeds the maximum sequence length. Zero means do not truncate.

Possible values: ≥ 0

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_TRUNCATE_INPUT_TOKENS

int

Pass false to omit matched stop sequences from the end of the output text. The default is true, meaning that the output will end with the stop sequence text when matched.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_INCLUDE_STOP_SEQUENCE

boolean

Whether generation model requests should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_LOG_REQUESTS

boolean

false

Whether generation model responses should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_LOG_RESPONSES

boolean

false

Delimiter used to concatenate the ChatMessage elements into a single string. By setting this property, you can define your preferred way of concatenating messages to ensure that the prompt is structured in the correct way.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__GENERATION_MODEL_PROMPT_JOINER

string
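Since top_k and top_p only apply in sampling mode, a generation-model configuration usually switches decoding-method at the same time. A sketch under the same assumptions as above (keys inferred from the environment variable names; values illustrative):

    # Switch from the greedy default to sampling
    quarkus.langchain4j.watsonx.generation-model.decoding-method=sample
    quarkus.langchain4j.watsonx.generation-model.temperature=0.7
    quarkus.langchain4j.watsonx.generation-model.top-k=50
    quarkus.langchain4j.watsonx.generation-model.top-p=0.9
    # Fix the seed for reproducible sampling runs
    quarkus.langchain4j.watsonx.generation-model.random-seed=42
    quarkus.langchain4j.watsonx.generation-model.max-new-tokens=400
    quarkus.langchain4j.watsonx.generation-model.stop-sequences=###
    # Join ChatMessage elements with a newline when building the prompt
    quarkus.langchain4j.watsonx.generation-model.prompt-joiner=\n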

Specifies the ID of the model to be used.

A list of all available models is provided in the IBM watsonx.ai documentation.

To use a model, locate the API model ID column in the table and copy the corresponding model ID.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__EMBEDDING_MODEL_MODEL_NAME

string

ibm/granite-embedding-278m-multilingual

Specifies the maximum number of input tokens accepted. This can be used to prevent requests from failing due to input exceeding the configured token limits.

If the input exceeds the specified token limit, the input will be truncated from the end (right side), ensuring that the start of the input remains intact. If the provided value exceeds the model’s maximum sequence length (refer to the documentation for the model’s maximum sequence length), the request will fail if the total number of tokens exceeds the maximum limit.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__EMBEDDING_MODEL_TRUNCATE_INPUT_TOKENS

int

Whether embedding model requests should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__EMBEDDING_MODEL_LOG_REQUESTS

boolean

false

Whether embedding model responses should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__EMBEDDING_MODEL_LOG_RESPONSES

boolean

false

The id of the model to be used.

All available models are listed in the IBM watsonx.ai documentation.

To use a model, locate the API model_id column in the table and copy the corresponding model ID.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__SCORING_MODEL_MODEL_NAME

string

cross-encoder/ms-marco-minilm-l-12-v2

Specifies the maximum number of input tokens accepted. This helps to avoid requests failing due to input exceeding the configured token limits.

If the input exceeds the specified token limit, the text will be truncated from the end (right side), ensuring that the start of the input remains intact. If the provided value exceeds the model’s maximum sequence length (refer to the documentation for the model’s maximum sequence length), the request will fail if the total number of tokens exceeds the maximum limit.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__SCORING_MODEL_TRUNCATE_INPUT_TOKENS

int

Whether scoring model requests should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__SCORING_MODEL_LOG_REQUESTS

boolean

false

Whether scoring model responses should be logged.

Environment variable: QUARKUS_LANGCHAIN4J_WATSONX__MODEL_NAME__SCORING_MODEL_LOG_RESPONSES

boolean

false
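Embedding and scoring models accept the same truncation guard. A hedged sketch for the default model configuration (keys inferred from the environment variable names above; the token limits are illustrative and must not exceed the model's maximum sequence length):

    # Truncate over-long inputs instead of failing the request
    quarkus.langchain4j.watsonx.embedding-model.model-name=ibm/granite-embedding-278m-multilingual
    quarkus.langchain4j.watsonx.embedding-model.truncate-input-tokens=512
    quarkus.langchain4j.watsonx.scoring-model.model-name=cross-encoder/ms-marco-minilm-l-12-v2
    quarkus.langchain4j.watsonx.scoring-model.truncate-input-tokens=512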

Quarkus LangChain4j - Weaviate

Type

Default

If DevServices has been explicitly enabled or disabled. DevServices is generally enabled by default, unless there is an existing configuration present.

When DevServices is enabled, Quarkus will attempt to automatically configure and start a Weaviate server when running in Dev or Test mode and when Docker is running.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_DEVSERVICES_ENABLED

boolean

true

The container image name to use for container-based DevServices providers.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_DEVSERVICES_IMAGE_NAME

string

cr.weaviate.io/semitechnologies/weaviate:1.25.5

Optional fixed port the dev service will listen to.

If not defined, the port will be chosen randomly.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_DEVSERVICES_PORT

int

Indicates if the Weaviate server managed by Quarkus Dev Services is shared. When shared, Quarkus looks for running containers using label-based service discovery. If a matching container is found, it is used, and so a second one is not started. Otherwise, Dev Services for Weaviate starts a new container.

The discovery uses the quarkus-dev-service-weaviate label. The value is configured using the service-name property.

Container sharing is only used in dev mode.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_DEVSERVICES_SHARED

boolean

true

The value of the quarkus-dev-service-weaviate label attached to the started container. This property is used when shared is set to true. In this case, before starting a container, Dev Services for Weaviate looks for a container with the quarkus-dev-service-weaviate label set to the configured value. If found, it will use this container instead of starting a new one. Otherwise, it starts a new container with the quarkus-dev-service-weaviate label set to the specified value.

This property is used when you need multiple shared Weaviate servers.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_DEVSERVICES_SERVICE_NAME

string

weaviate

Environment variables that are passed to the container.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_DEVSERVICES_CONTAINER_ENV__CONTAINER_ENV_

Map<String,String>
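Putting the Dev Services properties together, a sketch with keys inferred from the environment variable names above (the container environment entry is only an example):

    quarkus.langchain4j.weaviate.devservices.enabled=true
    quarkus.langchain4j.weaviate.devservices.image-name=cr.weaviate.io/semitechnologies/weaviate:1.25.5
    # Pin a fixed port instead of a random one
    quarkus.langchain4j.weaviate.devservices.port=8080
    # Reuse one container across dev sessions via the quarkus-dev-service-weaviate label
    quarkus.langchain4j.weaviate.devservices.shared=true
    quarkus.langchain4j.weaviate.devservices.service-name=weaviate
    # Hypothetical environment variable passed through to the container
    quarkus.langchain4j.weaviate.devservices.container-env.QUERY_DEFAULTS_LIMIT=25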

The Weaviate API key to authenticate with.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_API_KEY

string

The scheme of the cluster URL, e.g. "https". Find it under the details of your Weaviate cluster.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_SCHEME

string

http

The host of the Weaviate server.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_HOST

string

localhost

The HTTP port of the Weaviate server. Defaults to 8080.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_PORT

int

8080

The gRPC port of the Weaviate server. Defaults to 50051

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_GRPC_PORT

int

50051

Whether the gRPC connection is secured (TLS).

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_GRPC_SECURE

boolean

false

Whether to use gRPC instead of HTTP, for batch inserts only. HTTP will still be used for search.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_GRPC_USE_FOR_INSERTS

boolean

false
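To connect to an external cluster instead of Dev Services, the connection properties are set explicitly. A hedged sketch (keys inferred from the environment variable names above; the host and API key are placeholders, and the exact grpc.* key shape may differ):

    quarkus.langchain4j.weaviate.api-key=${WEAVIATE_API_KEY}
    quarkus.langchain4j.weaviate.scheme=https
    quarkus.langchain4j.weaviate.host=my-cluster.example.weaviate.network
    quarkus.langchain4j.weaviate.port=443
    quarkus.langchain4j.weaviate.grpc.port=50051
    quarkus.langchain4j.weaviate.grpc.secure=true
    # Batch inserts go over gRPC; search still uses HTTP
    quarkus.langchain4j.weaviate.grpc.use-for-inserts=true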

The object class you want to store, e.g. "MyGreatClass". Must start with an uppercase letter.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_OBJECT_CLASS

string

Default

The name of the field that contains the text of a TextSegment. Default is "text"

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_TEXT_FIELD_NAME

string

text

If true, WeaviateEmbeddingStore will generate a hashed ID based on the provided text segment, which avoids duplicate entries in the DB. If false, a random ID will be generated.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_AVOID_DUPS

boolean

false

Consistency level: ONE, QUORUM (default) or ALL.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_CONSISTENCY_LEVEL

one, quorum, all

quorum

Metadata keys that should be persisted. The default in Weaviate is [], but at least one key must be specified for the EmbeddingStore to work; therefore "tags" is used as the default.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_METADATA_KEYS

list of string

tags

The name of the field where Metadata entries are stored.

Environment variable: QUARKUS_LANGCHAIN4J_WEAVIATE_METADATA_FIELD_NAME

string

_metadata
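A sketch of the storage-schema settings as a group (keys inferred from the environment variable names above; the class name and metadata keys are illustrative):

    # Class names must start with an uppercase letter
    quarkus.langchain4j.weaviate.object-class=MyGreatClass
    quarkus.langchain4j.weaviate.text-field-name=text
    # Hash IDs from the segment text to avoid duplicate entries
    quarkus.langchain4j.weaviate.avoid-dups=true
    quarkus.langchain4j.weaviate.consistency-level=quorum
    # At least one metadata key is required
    quarkus.langchain4j.weaviate.metadata.keys=tags,source
    quarkus.langchain4j.weaviate.metadata.field-name=_metadata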

About the Duration format

To write duration values, use the standard java.time.Duration format. See the Duration#parse() Java API documentation for more information.

You can also use a simplified format, starting with a number:

  • If the value is only a number, it represents time in seconds.

  • If the value is a number followed by ms, it represents time in milliseconds.

In other cases, the simplified format is translated to the java.time.Duration format for parsing:

  • If the value is a number followed by h, m, or s, it is prefixed with PT.

  • If the value is a number followed by d, it is prefixed with P.
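For instance, a hypothetical duration-valued property could be written in any of these forms:

    # Plain number: 10 seconds
    quarkus.example.timeout=10
    # Number + ms: 500 milliseconds
    quarkus.example.timeout=500ms
    # Number + h/m/s is prefixed with PT for parsing (here PT2M = 2 minutes)
    quarkus.example.timeout=2m
    # Number + d is prefixed with P for parsing (here P1D = 1 day)
    quarkus.example.timeout=1d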

About the MemorySize format

A size configuration option recognizes strings in this format (shown as a regular expression): [0-9]+[KkMmGgTtPpEeZzYy]?.

If no suffix is given, assume bytes.
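For instance, with a hypothetical size-valued property:

    # No suffix: 1024 bytes
    quarkus.example.max-size=1024
    # 10 kilobytes
    quarkus.example.max-size=10K
    # 512 megabytes
    quarkus.example.max-size=512M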