Milvus Store for Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) relies on a document store that holds document embeddings and can search them by similarity. This guide demonstrates how to use a Milvus database as that document store.

Leveraging the Milvus embedding store

To make use of the Milvus embedding store, you’ll need to include the following dependency:

<dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-milvus</artifactId>
</dependency>

This extension includes a dev service. If a container runtime is available, a Milvus instance starts automatically in dev and test mode.

Once the extension is installed, you can use the Milvus document store with the following code:

package io.quarkiverse.langchain4j.samples;

import static dev.langchain4j.data.document.splitter.DocumentSplitters.recursive;

import java.util.List;

import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

import dev.langchain4j.data.document.Document;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.milvus.MilvusEmbeddingStore;

@ApplicationScoped
public class IngestorExampleWithMilvus {

    /**
     * The embedding store (the database).
     * The bean is provided by the quarkus-langchain4j-milvus extension.
     */
    @Inject
    MilvusEmbeddingStore store;

    /**
     * The embedding model (how the vector of a document is computed).
     * The bean is provided by the LLM (like openai) extension.
     */
    @Inject
    EmbeddingModel embeddingModel;

    public void ingest(List<Document> documents) {
        EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
                .embeddingStore(store)
                .embeddingModel(embeddingModel)
                .documentSplitter(recursive(500, 0))
                .build();
        // Warning - this can take a long time...
        ingestor.ingest(documents);
    }
}
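
The two arguments passed to recursive(500, 0) are the maximum segment size in characters and the overlap between consecutive segments. The sketch below illustrates what those two numbers mean with a deliberately simplified fixed-size chunker. It is not the langchain4j implementation (the real recursive splitter prefers natural boundaries such as paragraphs and sentences), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration only: NOT the langchain4j recursive splitter.
// FixedSizeChunker and split(...) are hypothetical names.
final class FixedSizeChunker {

    /**
     * Splits text into chunks of at most maxChars characters. Each chunk
     * starts with the last overlapChars characters of the previous chunk.
     */
    static List<String> split(String text, int maxChars, int overlapChars) {
        if (maxChars <= 0 || overlapChars < 0 || overlapChars >= maxChars) {
            throw new IllegalArgumentException("require 0 <= overlapChars < maxChars");
        }
        List<String> chunks = new ArrayList<>();
        int step = maxChars - overlapChars; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxChars, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // this window already reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Same numbers as recursive(500, 0): 500 characters per segment, no overlap.
        List<String> chunks = split("x".repeat(1200), 500, 0);
        for (String chunk : chunks) {
            System.out.println(chunk.length());
        }
    }
}
```

A smaller segment size produces more, shorter segments (and therefore more embeddings to compute and store), while a non-zero overlap repeats some text at segment boundaries so that context is not cut off abruptly.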

To get started, only one configuration property is required: quarkus.langchain4j.milvus.dimension. It specifies the dimension of the embeddings that you’re going to store, and its value depends on the embedding model.

To use a remote Milvus instance, you must also set the host and port. When they are set, Dev Services does not start another instance:

quarkus.langchain4j.milvus.host=localhost
quarkus.langchain4j.milvus.port=19530
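
The same settings can also be supplied through environment variables, such as QUARKUS_LANGCHAIN4J_MILVUS_HOST. The names follow the usual MicroProfile Config convention: every character that is not a letter, digit, or underscore becomes an underscore, and the result is upper-cased. The helper below is a minimal sketch of that mapping for illustration. The class and method names are hypothetical, and it is not the code Quarkus itself runs.

```java
import java.util.Locale;

// Hypothetical helper that illustrates the property-to-environment-variable
// naming convention; not part of Quarkus or the extension.
final class EnvVarNames {

    static String toEnvVar(String propertyName) {
        // Replace anything that is not an ASCII letter, digit, or underscore, then upper-case.
        return propertyName.replaceAll("[^A-Za-z0-9_]", "_").toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(toEnvVar("quarkus.langchain4j.milvus.host"));
        System.out.println(toEnvVar("quarkus.langchain4j.milvus.port"));
        System.out.println(toEnvVar("quarkus.langchain4j.milvus.dimension"));
    }
}
```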

Configuration Settings

Customize the behavior of the extension by exploring various configuration options:

Some configuration properties are fixed at build time; all other configuration properties are overridable at runtime.

Each entry below gives the property description, its environment variable, its type, and its default value where one is defined.

Whether Dev Services for Milvus are enabled or not.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_ENABLED
  • Type: boolean
  • Default: true

Container image for Milvus.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_MILVUS_IMAGE_NAME
  • Type: string
  • Default: docker.io/milvusdb/milvus:v2.3.16

Optional fixed port the Milvus dev service will listen to. If not defined, the port will be chosen randomly.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_PORT
  • Type: int

Indicates if the Dev Service containers managed by Quarkus for Milvus are shared.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_SHARED
  • Type: boolean
  • Default: true

Service label to apply to created Dev Services containers.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_SERVICE_NAME
  • Type: string
  • Default: milvus

The URL of the Milvus server.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_HOST
  • Type: string
  • Default: none (required)

The port of the Milvus server.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_PORT
  • Type: int
  • Default: none (required)

The authentication token for the Milvus server.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_TOKEN
  • Type: string

The username for the Milvus server.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_USERNAME
  • Type: string

The password for the Milvus server.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_PASSWORD
  • Type: string

The timeout duration for the Milvus client. If not specified, 5 seconds will be used.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_TIMEOUT
  • Type: Duration

Name of the database.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DB_NAME
  • Type: string
  • Default: default

Create the collection if it does not exist yet.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_CREATE_COLLECTION
  • Type: boolean
  • Default: true

Name of the collection.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_COLLECTION_NAME
  • Type: string
  • Default: embeddings

Dimension of the vectors. Only applicable when the collection has yet to be created.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DIMENSION
  • Type: int

Name of the primary field.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_PRIMARY_FIELD
  • Type: string
  • Default: id

Name of the field to store the vector in.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_VECTOR_FIELD
  • Type: string
  • Default: vector

Description of the collection.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DESCRIPTION
  • Type: string

The index type to use for the collection.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_INDEX_TYPE
  • Type: one of none, flat, ivf-flat, ivf-sq8, ivf-pq, hnsw, diskann, autoindex, scann, gpu-ivf-flat, gpu-ivf-pq, gpu-brute-force, gpu-cagra, bin-flat, bin-ivf-flat, trie, stl-sort, inverted, sparse-inverted-index, sparse-wand
  • Default: flat

The metric type to use for searching.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_METRIC_TYPE
  • Type: one of none, l2, ip, cosine, hamming, jaccard
  • Default: cosine

The consistency level.
  • Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_CONSISTENCY_LEVEL
  • Type: one of strong, bounded, eventually
  • Default: eventually
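
The metric type setting chooses how two vectors are compared during a search. To make the three most common options concrete, the sketch below computes the textbook definitions of l2 (Euclidean distance), ip (inner product), and cosine (cosine similarity) for plain float arrays. It only illustrates the math: the class is hypothetical, and how Milvus computes, scales, or orders these scores internally is not shown here.

```java
// Textbook definitions for illustration; VectorMetrics is a hypothetical class,
// not part of Milvus, langchain4j, or the Quarkus extension.
final class VectorMetrics {

    // "ip": sum of element-wise products. Larger means more similar.
    static double innerProduct(float[] a, float[] b) {
        requireSameDimension(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    // "l2": Euclidean distance. Smaller means more similar.
    static double l2Distance(float[] a, float[] b) {
        requireSameDimension(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = (double) a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // "cosine": inner product of the two vectors after normalizing their lengths.
    static double cosineSimilarity(float[] a, float[] b) {
        double norms = Math.sqrt(innerProduct(a, a)) * Math.sqrt(innerProduct(b, b));
        if (norms == 0.0) {
            throw new IllegalArgumentException("cosine similarity is undefined for a zero vector");
        }
        return innerProduct(a, b) / norms;
    }

    // Vectors can only be compared when they have the same dimension.
    private static void requireSameDimension(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                    "dimension mismatch: " + a.length + " vs " + b.length);
        }
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f};
        float[] b = {2f, 4f, 6f}; // same direction as a, twice the length
        System.out.println("ip     = " + innerProduct(a, b));
        System.out.println("l2     = " + l2Distance(a, b));
        System.out.println("cosine = " + cosineSimilarity(a, b));
    }
}
```

Cosine similarity ignores vector length and compares direction only, which is one reason it is a common choice for text embeddings. The dimension check also shows why the configured vector dimension has to match the embedding model: vectors of different lengths cannot be compared.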

About the Duration format

To write duration values, use the standard java.time.Duration format. See the Duration#parse() Java API documentation for more information.

You can also use a simplified format, starting with a number:

  • If the value is only a number, it represents time in seconds.

  • If the value is a number followed by ms, it represents time in milliseconds.

In other cases, the simplified format is translated to the java.time.Duration format for parsing:

  • If the value is a number followed by h, m, or s, it is prefixed with PT.

  • If the value is a number followed by d, it is prefixed with P.
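
The rules above can be sketched in a few lines of Java. This is an illustration of the described behavior built on java.time.Duration, not the converter that Quarkus actually uses, and the class and method names are hypothetical.

```java
import java.time.Duration;

// Hypothetical helper that mirrors the rules described above; not the Quarkus converter.
final class SimplifiedDuration {

    static Duration parse(String raw) {
        String value = raw.trim();
        if (value.matches("\\d+")) {
            // Only a number: time in seconds.
            return Duration.ofSeconds(Long.parseLong(value));
        }
        if (value.matches("\\d+ms")) {
            // Number followed by ms: time in milliseconds.
            return Duration.ofMillis(Long.parseLong(value.substring(0, value.length() - 2)));
        }
        if (value.matches("\\d+[hms]")) {
            // Number followed by h, m, or s: prefix with PT.
            return Duration.parse("PT" + value);
        }
        if (value.matches("\\d+d")) {
            // Number followed by d: prefix with P.
            return Duration.parse("P" + value);
        }
        // Anything else is treated as the standard java.time.Duration format.
        return Duration.parse(value);
    }

    public static void main(String[] args) {
        for (String input : new String[] {"5", "250ms", "10m", "2d", "PT1H30M"}) {
            System.out.println(input + " -> " + parse(input));
        }
    }
}
```

For example, a value of 10s supplied through QUARKUS_LANGCHAIN4J_MILVUS_TIMEOUT would be read as a ten-second timeout under these rules.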