Mistral

Mistral is a French company that provides open-source large language models (LLMs).

Using Mistral Models

To use Mistral LLMs, add the following dependency to your project:

<dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-mistral-ai</artifactId>
    <version>0.21.0</version>
</dependency>

If no other LLM extension is installed, AI Services automatically use the configured Mistral model.

Configuration

Mistral models require an API key, which you can obtain by creating an account on the Mistral platform.

The API key can be set in the application.properties file:

quarkus.langchain4j.mistralai.api-key=...

Alternatively, set the QUARKUS_LANGCHAIN4J_MISTRALAI_API_KEY environment variable.
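
The environment variable names on this page appear to follow the usual MicroProfile Config convention: every character of the property name that is not a letter or digit becomes an underscore, and the result is upper-cased. The sketch below illustrates that convention only; the class and method names are hypothetical, and the quoted named-model property key in the second call is an assumption, since this page lists only the environment variable form.

```java
import java.util.Locale;

/** Illustrative sketch of the property-name to environment-variable convention. */
final class EnvVarNames {

    private EnvVarNames() {
    }

    /**
     * Replaces every character that is not an ASCII letter or digit with '_'
     * and upper-cases the result.
     */
    static String toEnvVar(String propertyName) {
        return propertyName
                .replaceAll("[^A-Za-z0-9]", "_")
                .toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Property taken from this page.
        System.out.println(toEnvVar("quarkus.langchain4j.mistralai.api-key"));
        // Assumed key for a named configuration called "model-name":
        // the dot and the quote each become '_', hence the double underscore.
        System.out.println(toEnvVar("quarkus.langchain4j.mistralai.\"model-name\".base-url"));
    }
}
```

Under this convention, the double underscores in names such as QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__BASE_URL come from the dot and the quote that surround the model name.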

Several configuration properties are available:

Configuration property fixed at build time - All other configuration properties are overridable at runtime

Configuration property

Type

Default

Whether the chat model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_ENABLED

boolean

true

Whether the embedding model should be enabled

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_EMBEDDING_MODEL_ENABLED

boolean

true

Base URL of Mistral API

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_BASE_URL

string

https://api.mistral.ai/v1/

Mistral API key

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_API_KEY

string

dummy

Timeout for Mistral calls

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_TIMEOUT

Duration

10s

Chat model name to use

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_MODEL_NAME

string

mistral-tiny

What sampling temperature to use, between 0.0 and 1.0. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

It is generally recommended to set this or the top-p property, but not both.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_TEMPERATURE

double

0.7

The maximum number of tokens to generate in the completion.

The token count of your prompt plus max_tokens cannot exceed the model’s context length.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_MAX_TOKENS

int

Nucleus sampling threshold, between 0.0 and 1.0. The model considers only the tokens within the top_p probability mass, so 0.1 means that only the tokens comprising the top 10% probability mass are considered.

It is generally recommended to set this or the temperature property, but not both.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_TOP_P

double

1.0

Whether to inject a safety prompt before all conversations

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_SAFE_PROMPT

boolean

The seed to use for random sampling. If set, repeated calls with the same input generate deterministic results.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_RANDOM_SEED

int

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_CHAT_MODEL_LOG_RESPONSES

boolean

false

Embedding model name to use

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_EMBEDDING_MODEL_MODEL_NAME

string

mistral-embed

Whether embedding model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_EMBEDDING_MODEL_LOG_REQUESTS

boolean

false

Whether embedding model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_EMBEDDING_MODEL_LOG_RESPONSES

boolean

false

Whether the Mistral client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_LOG_REQUESTS

boolean

false

Whether the Mistral client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the Mistral AI provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI_ENABLE_INTEGRATION

boolean

true

Named model config

Type

Default

Base URL of Mistral API

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__BASE_URL

string

https://api.mistral.ai/v1/

Mistral API key

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__API_KEY

string

dummy

Timeout for Mistral calls

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__TIMEOUT

Duration

10s

Chat model name to use

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_MODEL_NAME

string

mistral-tiny

What sampling temperature to use, between 0.0 and 1.0. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

It is generally recommended to set this or the top-p property, but not both.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_TEMPERATURE

double

0.7

The maximum number of tokens to generate in the completion.

The token count of your prompt plus max_tokens cannot exceed the model’s context length.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_MAX_TOKENS

int

Nucleus sampling threshold, between 0.0 and 1.0. The model considers only the tokens within the top_p probability mass, so 0.1 means that only the tokens comprising the top 10% probability mass are considered.

It is generally recommended to set this or the temperature property, but not both.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_TOP_P

double

1.0

Whether to inject a safety prompt before all conversations

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_SAFE_PROMPT

boolean

The seed to use for random sampling. If set, repeated calls with the same input generate deterministic results.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_RANDOM_SEED

int

Whether chat model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_LOG_REQUESTS

boolean

false

Whether chat model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__CHAT_MODEL_LOG_RESPONSES

boolean

false

Embedding model name to use

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__EMBEDDING_MODEL_MODEL_NAME

string

mistral-embed

Whether embedding model requests should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__EMBEDDING_MODEL_LOG_REQUESTS

boolean

false

Whether embedding model responses should be logged

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__EMBEDDING_MODEL_LOG_RESPONSES

boolean

false

Whether the Mistral client should log requests

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__LOG_REQUESTS

boolean

false

Whether the Mistral client should log responses

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__LOG_RESPONSES

boolean

false

Whether to enable the integration. Defaults to true, which means requests are made to the Mistral AI provider. Set to false to disable all requests.

Environment variable: QUARKUS_LANGCHAIN4J_MISTRALAI__MODEL_NAME__ENABLE_INTEGRATION

boolean

true
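
The temperature and top_p descriptions in the configuration reference above can be made concrete with a small sketch. It shows the general technique only (temperature-scaled softmax followed by a nucleus filter), not how the Mistral service implements sampling; the class name, method names, and example numbers are all hypothetical, and the sketch assumes a temperature strictly greater than zero.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch of temperature scaling and nucleus (top-p) filtering. */
final class SamplingSketch {

    private SamplingSketch() {
    }

    /** Softmax over logits divided by the temperature (temperature must be > 0). */
    static double[] softmax(double[] logits, double temperature) {
        double max = Double.NEGATIVE_INFINITY;
        for (double l : logits) {
            max = Math.max(max, l / temperature);
        }
        double sum = 0.0;
        double[] probs = new double[logits.length];
        for (int i = 0; i < logits.length; i++) {
            // Subtracting the maximum keeps Math.exp numerically stable.
            probs[i] = Math.exp(logits[i] / temperature - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    /**
     * Returns the indices of the most probable tokens, in descending order of
     * probability, stopping once their cumulative probability reaches topP.
     */
    static List<Integer> nucleus(double[] probs, double topP) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < probs.length; i++) {
            order.add(i);
        }
        order.sort(Comparator.comparingDouble((Integer i) -> probs[i]).reversed());

        List<Integer> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (int i : order) {
            kept.add(i);
            cumulative += probs[i];
            if (cumulative >= topP) {
                break;
            }
        }
        return kept;
    }
}
```

With the made-up distribution {0.05, 0.5, 0.15, 0.3}, a top_p of 0.1 keeps only the single most likely token, while 0.7 keeps the two most likely ones. Lowering the temperature concentrates probability on the highest-scoring token, which is why lower values give more focused output.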

About the Duration format

To write duration values, use the standard java.time.Duration format. See the Duration#parse() Java API documentation for more information.

You can also use a simplified format, starting with a number:

  • If the value is only a number, it represents time in seconds.

  • If the value is a number followed by ms, it represents time in milliseconds.

In other cases, the simplified format is translated to the java.time.Duration format for parsing:

  • If the value is a number followed by h, m, or s, it is prefixed with PT.

  • If the value is a number followed by d, it is prefixed with P.
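
The rules above can be sketched in plain Java as follows. This is a minimal illustration of the described behavior, not the converter Quarkus actually ships; the class and method names are hypothetical.

```java
import java.time.Duration;

/** Illustrative sketch of the simplified duration format described above. */
final class SimplifiedDurations {

    private SimplifiedDurations() {
    }

    static Duration parse(String value) {
        String v = value.trim();
        if (v.matches("\\d+")) {
            // Only a number: time in seconds.
            return Duration.ofSeconds(Long.parseLong(v));
        }
        if (v.matches("\\d+ms")) {
            // Number followed by "ms": time in milliseconds.
            return Duration.ofMillis(Long.parseLong(v.substring(0, v.length() - 2)));
        }
        if (v.matches("\\d+[hms]")) {
            // Number followed by h, m, or s: prefix with PT.
            return Duration.parse("PT" + v);
        }
        if (v.matches("\\d+d")) {
            // Number followed by d: prefix with P.
            return Duration.parse("P" + v);
        }
        // Anything else is treated as the standard java.time.Duration format.
        return Duration.parse(v);
    }

    public static void main(String[] args) {
        System.out.println(parse("10"));    // same value as the 10s default timeout
        System.out.println(parse("250ms"));
        System.out.println(parse("PT1M30S"));
    }
}
```

Under these rules, the values 10, 10s, and PT10S all denote the same ten-second timeout.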