Configuration properties fixed at build time cannot be changed at runtime; all other configuration properties are overridable at runtime. Each entry below lists the property description, its environment variable form, its type, and its default value (where one exists).

Whether the chat model should be enabled.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_ENABLED
Type: boolean. Default: true

Whether the embedding model should be enabled.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_ENABLED
Type: boolean. Default: true
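
As a sketch, these flags can be set in application.properties; the property names below are derived from the environment variables above via the standard Quarkus mapping:

    # Keep the chat model (this is already the default)
    quarkus.langchain4j.ollama.chat-model.enabled=true
    # Disable the embedding model if it is not needed
    quarkus.langchain4j.ollama.embedding-model.enabled=false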

Whether Dev Services for Ollama have been explicitly enabled or disabled. Dev Services are enabled by default unless an existing configuration is present.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_DEVSERVICES_ENABLED
Type: boolean. Default: true

The Ollama container image to use.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_DEVSERVICES_IMAGE_NAME
Type: string. Default: ollama/ollama:latest
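
For example, Dev Services can be disabled entirely, or kept but pinned to a fixed image rather than latest (the tag below is only an illustration):

    # Do not start an Ollama container automatically
    quarkus.langchain4j.ollama.devservices.enabled=false
    # Alternatively, keep Dev Services but pin the image
    # (0.5.7 is a hypothetical tag; pick one that exists)
    quarkus.langchain4j.ollama.devservices.image-name=ollama/ollama:0.5.7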

The chat model to use.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_MODEL_ID
Type: string. Default: llama3.2

The embedding model to use. According to the Ollama docs, the default value is nomic-embed-text.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_MODEL_ID
Type: string. Default: nomic-embed-text
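
A minimal model selection in application.properties, assuming both models have been pulled into Ollama:

    # Chat model served by Ollama
    quarkus.langchain4j.ollama.chat-model.model-id=llama3.2
    # Embedding model served by Ollama
    quarkus.langchain4j.ollama.embedding-model.model-id=nomic-embed-text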

Base URL where the Ollama server is running.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_BASE_URL
Type: string

If set, the named TLS configuration with the configured name will be applied to the REST Client.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_TLS_CONFIGURATION_NAME
Type: string

Timeout for Ollama calls.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_TIMEOUT
Type: Duration. Default: 10s
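
Together these allow pointing the client at a remote Ollama instance; the host and the TLS configuration name below are hypothetical:

    # Remote Ollama endpoint instead of the local default
    quarkus.langchain4j.ollama.base-url=https://ollama.internal.example:11434
    # Apply a named TLS configuration to the REST Client
    quarkus.langchain4j.ollama.tls-configuration-name=ollama-tls
    # Give slow models more time than the 10s default
    quarkus.langchain4j.ollama.timeout=60s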

Whether the Ollama client should log requests.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_LOG_REQUESTS
Type: boolean. Default: false

Whether the Ollama client should log responses.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_LOG_RESPONSES
Type: boolean. Default: false

Whether to enable the integration. Defaults to true, which means requests are made to the Ollama provider. Set to false to disable all requests.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_ENABLE_INTEGRATION
Type: boolean. Default: true
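
A typical debugging setup, sketched in application.properties:

    # Log the raw traffic exchanged with Ollama while debugging
    quarkus.langchain4j.ollama.log-requests=true
    quarkus.langchain4j.ollama.log-responses=true
    # In tests, all outbound requests can be switched off instead
    # quarkus.langchain4j.ollama.enable-integration=false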

The temperature of the model. Increasing the temperature makes the model answer with more variability; a lower temperature makes it answer more conservatively.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_TEMPERATURE
Type: double. Default: ${quarkus.langchain4j.temperature:0.8}

Maximum number of tokens to predict when generating text.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_NUM_PREDICT
Type: int

Sets the stop sequences to use. When one of these sequences is encountered, the LLM stops generating text and returns.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_STOP
Type: list of string

Works together with top-k. A higher value (e.g., 0.95) leads to more diverse text, while a lower value (e.g., 0.5) generates more focused and conservative text.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_TOP_P
Type: double. Default: 0.9

Reduces the probability of generating nonsense. A higher value (e.g., 100) gives more diverse answers, while a lower value (e.g., 10) is more conservative.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_TOP_K
Type: int. Default: 40
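
A hedged example of tuning the sampling behavior toward deterministic, tightly scoped answers (the values are illustrative, not recommendations):

    # Low temperature: conservative, repeatable wording
    quarkus.langchain4j.ollama.chat-model.temperature=0.2
    # Cap generation at 256 tokens
    quarkus.langchain4j.ollama.chat-model.num-predict=256
    # Stop as soon as the model emits this sequence
    quarkus.langchain4j.ollama.chat-model.stop=###
    # Narrow the sampling pool
    quarkus.langchain4j.ollama.chat-model.top-p=0.5
    quarkus.langchain4j.ollama.chat-model.top-k=10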

Sets the seed used for sampling. With a fixed seed the model returns the same result for the same prompt; with a random seed the result varies. A random seed can be produced in Java, for example:

    Random random = new Random();
    int x = random.nextInt(Integer.MAX_VALUE);

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_SEED
Type: int

The format to return a response in. The format can be json or a JSON schema.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_FORMAT
Type: string
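
For reproducible output combined with structured responses, a sketch could be:

    # Fixed seed: the same prompt yields the same answer
    quarkus.langchain4j.ollama.chat-model.seed=42
    # Ask Ollama to return well-formed JSON
    quarkus.langchain4j.ollama.chat-model.format=json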

Whether chat model requests should be logged.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_LOG_REQUESTS
Type: boolean. Default: false

Whether chat model responses should be logged.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_CHAT_MODEL_LOG_RESPONSES
Type: boolean. Default: false

The temperature of the model. Increasing the temperature makes the model answer with more variability; a lower temperature makes it answer more conservatively.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_TEMPERATURE
Type: double. Default: ${quarkus.langchain4j.temperature:0.8}

Maximum number of tokens to predict when generating text.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_NUM_PREDICT
Type: int. Default: 128

Sets the stop sequences to use. When one of these sequences is encountered, the LLM stops generating text and returns.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_STOP
Type: list of string

Works together with top-k. A higher value (e.g., 0.95) leads to more diverse text, while a lower value (e.g., 0.5) generates more focused and conservative text.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_TOP_P
Type: double. Default: 0.9

Reduces the probability of generating nonsense. A higher value (e.g., 100) gives more diverse answers, while a lower value (e.g., 10) is more conservative.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_TOP_K
Type: int. Default: 40

Whether embedding model requests should be logged.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_LOG_REQUESTS
Type: boolean. Default: false

Whether embedding model responses should be logged.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA_EMBEDDING_MODEL_LOG_RESPONSES
Type: boolean. Default: false
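
The embedding model accepts the same style of overrides; for instance:

    # Explicit token cap (matches the documented default)
    quarkus.langchain4j.ollama.embedding-model.num-predict=128
    # Trace embedding traffic while debugging
    quarkus.langchain4j.ollama.embedding-model.log-requests=true
    quarkus.langchain4j.ollama.embedding-model.log-responses=true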

Named model configuration

The following properties apply per named model; the MODEL_NAME segment in each environment variable stands for the model name.

The chat model to use.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_MODEL_ID
Type: string. Default: llama3.2

The embedding model to use. According to the Ollama docs, the default value is nomic-embed-text.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_MODEL_ID
Type: string. Default: nomic-embed-text
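
In property form, the MODEL_NAME segment becomes a model name placed between quarkus.langchain4j.ollama and the rest of the key. A sketch with a hypothetical model name my-model:

    # "my-model" is a hypothetical name used for illustration;
    # quoting it is the convention used in the Quarkus config reference
    quarkus.langchain4j.ollama."my-model".chat-model.model-id=llama3.2
    quarkus.langchain4j.ollama."my-model".embedding-model.model-id=nomic-embed-text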

Base URL where the Ollama server is running.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__BASE_URL
Type: string

If set, the named TLS configuration with the configured name will be applied to the REST Client.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__TLS_CONFIGURATION_NAME
Type: string

Timeout for Ollama calls.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__TIMEOUT
Type: Duration. Default: 10s

Whether the Ollama client should log requests.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__LOG_REQUESTS
Type: boolean. Default: false

Whether the Ollama client should log responses.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__LOG_RESPONSES
Type: boolean. Default: false

Whether to enable the integration. Defaults to true, which means requests are made to the Ollama provider. Set to false to disable all requests.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__ENABLE_INTEGRATION
Type: boolean. Default: true

The temperature of the model. Increasing the temperature makes the model answer with more variability; a lower temperature makes it answer more conservatively.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_TEMPERATURE
Type: double. Default: ${quarkus.langchain4j.temperature:0.8}

Maximum number of tokens to predict when generating text.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_NUM_PREDICT
Type: int

Sets the stop sequences to use. When one of these sequences is encountered, the LLM stops generating text and returns.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_STOP
Type: list of string

Works together with top-k. A higher value (e.g., 0.95) leads to more diverse text, while a lower value (e.g., 0.5) generates more focused and conservative text.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_TOP_P
Type: double. Default: 0.9

Reduces the probability of generating nonsense. A higher value (e.g., 100) gives more diverse answers, while a lower value (e.g., 10) is more conservative.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_TOP_K
Type: int. Default: 40

Sets the seed used for sampling. With a fixed seed the model returns the same result for the same prompt; with a random seed the result varies. A random seed can be produced in Java, for example:

    Random random = new Random();
    int x = random.nextInt(Integer.MAX_VALUE);

Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_SEED
Type: int

The format to return a response in. The format can be json or a JSON schema.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_FORMAT
Type: string

Whether chat model requests should be logged.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_LOG_REQUESTS
Type: boolean. Default: false

Whether chat model responses should be logged.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__CHAT_MODEL_LOG_RESPONSES
Type: boolean. Default: false

The temperature of the model. Increasing the temperature makes the model answer with more variability; a lower temperature makes it answer more conservatively.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_TEMPERATURE
Type: double. Default: ${quarkus.langchain4j.temperature:0.8}

Maximum number of tokens to predict when generating text.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_NUM_PREDICT
Type: int. Default: 128

Sets the stop sequences to use. When one of these sequences is encountered, the LLM stops generating text and returns.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_STOP
Type: list of string

Works together with top-k. A higher value (e.g., 0.95) leads to more diverse text, while a lower value (e.g., 0.5) generates more focused and conservative text.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_TOP_P
Type: double. Default: 0.9

Reduces the probability of generating nonsense. A higher value (e.g., 100) gives more diverse answers, while a lower value (e.g., 10) is more conservative.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_TOP_K
Type: int. Default: 40

Whether embedding model requests should be logged.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_LOG_REQUESTS
Type: boolean. Default: false

Whether embedding model responses should be logged.
Environment variable: QUARKUS_LANGCHAIN4J_OLLAMA__MODEL_NAME__EMBEDDING_MODEL_LOG_RESPONSES
Type: boolean. Default: false

About the Duration format

To write duration values, use the standard java.time.Duration format. See the Duration#parse() Java API documentation for more information.

You can also use a simplified format, starting with a number:

  • If the value is only a number, it represents time in seconds.

  • If the value is a number followed by ms, it represents time in milliseconds.

In other cases, the simplified format is translated to the java.time.Duration format for parsing:

  • If the value is a number followed by h, m, or s, it is prefixed with PT.

  • If the value is a number followed by d, it is prefixed with P.
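
For example, all of the following are valid values for the timeout property (the key is repeated here only for illustration; in a real file the last assignment wins):

    # A bare number is interpreted as seconds (30 seconds)
    quarkus.langchain4j.ollama.timeout=30
    # A number followed by ms is milliseconds
    quarkus.langchain4j.ollama.timeout=500ms
    # h, m, or s values are prefixed with PT before parsing (PT2M here)
    quarkus.langchain4j.ollama.timeout=2m
    # d values are prefixed with P before parsing (P1D here)
    quarkus.langchain4j.ollama.timeout=1d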