Milvus Embedding Store

Milvus is a scalable, high-performance vector database optimized for AI and semantic search use cases. This guide explains how to use Milvus as an embedding store in Quarkus LangChain4j for retrieval-augmented generation (RAG) applications.

Overview

The quarkus-langchain4j-milvus extension connects Quarkus LangChain4j to a Milvus instance, so that embedded documents can be stored and then retrieved by vector similarity.

Milvus supports approximate nearest neighbor (ANN) search with various indexing strategies and similarity metrics.

Dependency

To enable Milvus integration in your Quarkus project, add the following dependency:

<dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-milvus</artifactId>
    <version>1.0.2</version>
</dependency>

Dev Services Support

This extension includes Dev Services support. In dev and test mode, a containerized Milvus instance is started automatically.

To configure Dev Services for Milvus, set the vector dimension according to your embedding model:

quarkus.langchain4j.milvus.dimension=384

For example, AllMiniLmL6V2QuantizedEmbeddingModel produces 384-dimensional vectors, and OpenAI text-embedding-ada-002 produces 1536-dimensional vectors.

Connecting to an External Milvus Instance

If you prefer to connect to an existing Milvus instance rather than a Dev Services container, provide its host and port:

quarkus.langchain4j.milvus.host=localhost
quarkus.langchain4j.milvus.port=19530
quarkus.langchain4j.milvus.dimension=384

When a host is defined, Dev Services will not start a container.

Usage Example

Once the extension is configured, you can use Milvus as an embedding store in your ingestion logic:

package io.quarkiverse.langchain4j.samples;

import static dev.langchain4j.data.document.splitter.DocumentSplitters.recursive;

import java.util.List;

import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

import dev.langchain4j.data.document.Document;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.milvus.MilvusEmbeddingStore;

@ApplicationScoped
public class IngestorExampleWithMilvus {

    /**
     * The embedding store (the database).
     * The bean is provided by the quarkus-langchain4j-milvus extension.
     */
    @Inject
    MilvusEmbeddingStore store;

    /**
     * The embedding model (how the vector of a document is computed).
     * The bean is provided by a model provider extension (such as OpenAI).
     */
    @Inject
    EmbeddingModel embeddingModel;

    public void ingest(List<Document> documents) {
        EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
                .embeddingStore(store)
                .embeddingModel(embeddingModel)
                .documentSplitter(recursive(500, 0))
                .build();
        // Warning - this can take a long time...
        ingestor.ingest(documents);
    }
}

This example shows how to store vectorized content and make it retrievable via similarity search.
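The recursive(500, 0) call in the example configures the splitter with a maximum segment size of 500 characters and an overlap of 0 characters. To illustrate what these two numbers control, here is a minimal sketch of fixed-size chunking with overlap in plain Java. It is a deliberate simplification and not LangChain4j's recursive splitter, which tries to split on natural boundaries such as paragraphs and sentences. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only: cuts text into windows of at
// most maxSize characters, where consecutive windows share `overlap` characters.
final class FixedSizeChunker {

    static List<String> split(String text, int maxSize, int overlap) {
        if (maxSize <= 0 || overlap < 0 || overlap >= maxSize) {
            throw new IllegalArgumentException("require maxSize > 0 and 0 <= overlap < maxSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = maxSize - overlap; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }
}
```

With the text "abcdefghij", a size of 4, and an overlap of 2, this sketch yields abcd, cdef, efgh, ghij. With an overlap of 0, as in the ingestion example, the segments do not share any characters.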

Configuration Options

Customize the behavior of the extension using the following configuration options:

Configuration property fixed at build time - All other configuration properties are overridable at runtime

Each entry below gives the description, the environment variable, the type or allowed values, and the default where one exists.

Whether Dev Services for Milvus are enabled or not.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_ENABLED
  Type: boolean
  Default: true

Container image for Milvus.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_MILVUS_IMAGE_NAME
  Type: string
  Default: docker.io/milvusdb/milvus:v2.3.16

Optional fixed port the Milvus dev service will listen on. If not defined, the port will be chosen randomly.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_PORT
  Type: int

Indicates if the Dev Service containers managed by Quarkus for Milvus are shared.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_SHARED
  Type: boolean
  Default: true

Service label to apply to created Dev Services containers.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DEVSERVICES_SERVICE_NAME
  Type: string
  Default: milvus

The URL of the Milvus server.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_HOST
  Type: string
  Default: none (required)

The port of the Milvus server.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_PORT
  Type: int
  Default: none (required)

The authentication token for the Milvus server.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_TOKEN
  Type: string

The username for the Milvus server.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_USERNAME
  Type: string

The password for the Milvus server.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_PASSWORD
  Type: string

The timeout duration for the Milvus client. If not specified, 5 seconds will be used.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_TIMEOUT
  Type: Duration

Name of the database.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DB_NAME
  Type: string
  Default: default

Create the collection if it does not exist yet.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_CREATE_COLLECTION
  Type: boolean
  Default: true

Name of the collection.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_COLLECTION_NAME
  Type: string
  Default: embeddings

Dimension of the vectors. Only applicable when the collection has yet to be created.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DIMENSION
  Type: int

Name of the field that contains the ID of the vector.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_PRIMARY_FIELD
  Type: string
  Default: id

Name of the field that contains the text from which the vector was calculated.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_TEXT_FIELD
  Type: string
  Default: text

Name of the field that contains JSON metadata associated with the text.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_METADATA_FIELD
  Type: string
  Default: metadata

Name of the field to store the vector in.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_VECTOR_FIELD
  Type: string
  Default: vector

Description of the collection.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_DESCRIPTION
  Type: string

The index type to use for the collection.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_INDEX_TYPE
  Allowed values: none, flat, ivf-flat, ivf-sq8, ivf-pq, hnsw, hnsw-sq, hnsw-pq, hnsw-prq, diskann, autoindex, scann, gpu-ivf-flat, gpu-ivf-pq, gpu-brute-force, gpu-cagra, bin-flat, bin-ivf-flat, trie, stl-sort, inverted, bitmap, sparse-inverted-index, sparse-wand
  Default: flat

The metric type to use for searching.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_METRIC_TYPE
  Allowed values: none, l2, ip, cosine, hamming, jaccard
  Default: cosine

The consistency level.
  Environment variable: QUARKUS_LANGCHAIN4J_MILVUS_CONSISTENCY_LEVEL
  Allowed values: strong, session, bounded, eventually
  Default: eventually
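
The environment variable names listed above relate to property names such as quarkus.langchain4j.milvus.dimension through the usual MicroProfile Config convention: every character that is not a letter, digit, or underscore becomes an underscore, and the result is upper-cased. The following plain-Java sketch only illustrates that naming rule. The class and method names are hypothetical, and this is not the code Quarkus itself uses.

```java
// Hypothetical helper illustrating the property-name to environment-variable
// naming convention; not part of Quarkus or the Milvus extension.
final class EnvVarNames {

    static String toEnvVar(String propertyName) {
        StringBuilder sb = new StringBuilder(propertyName.length());
        for (int i = 0; i < propertyName.length(); i++) {
            char c = propertyName.charAt(i);
            boolean asciiLetterOrDigit = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            // Keep letters and digits (upper-cased); everything else becomes '_'.
            sb.append(asciiLetterOrDigit ? Character.toUpperCase(c) : '_');
        }
        return sb.toString();
    }
}
```

For example, toEnvVar("quarkus.langchain4j.milvus.dimension") returns QUARKUS_LANGCHAIN4J_MILVUS_DIMENSION, which matches the entry for the vector dimension.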

About the Duration format

To write duration values, use the standard java.time.Duration format. See the Duration#parse() Java API documentation for more information.

You can also use a simplified format, starting with a number:

  • If the value is only a number, it represents time in seconds.

  • If the value is a number followed by ms, it represents time in milliseconds.

In other cases, the simplified format is translated to the java.time.Duration format for parsing:

  • If the value is a number followed by h, m, or s, it is prefixed with PT.

  • If the value is a number followed by d, it is prefixed with P.
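
The rules above can be sketched in plain Java as follows. This is an illustration of the described behavior under simplifying assumptions, not the converter Quarkus actually uses: it handles only a single number with a single unit suffix, and the class and method names are hypothetical.

```java
import java.time.Duration;

// Hypothetical sketch of the simplified duration format described above.
final class SimplifiedDuration {

    static Duration parse(String value) {
        String v = value.trim();
        if (v.matches("\\d+")) {
            // Only a number: seconds.
            return Duration.ofSeconds(Long.parseLong(v));
        }
        if (v.matches("\\d+ms")) {
            // Number followed by "ms": milliseconds.
            return Duration.ofMillis(Long.parseLong(v.substring(0, v.length() - 2)));
        }
        if (v.matches("\\d+[hms]")) {
            // Number followed by h, m, or s: prefix with PT and parse.
            return Duration.parse("PT" + v);
        }
        if (v.matches("\\d+d")) {
            // Number followed by d: prefix with P and parse.
            return Duration.parse("P" + v);
        }
        // Anything else is assumed to already be in the java.time.Duration format.
        return Duration.parse(v);
    }
}
```

With this sketch, "10" is read as 10 seconds, "500ms" as 500 milliseconds, "2h" as PT2h, and "3d" as P3d. Duration.parse accepts the unit letters in either case, so the lowercase suffixes parse without conversion.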

How It Works

Milvus stores each embedded text segment along with its metadata and vector in a collection. The vector similarity search uses one of several distance metrics:

  • COSINE

  • L2 (Euclidean)

  • IP (Inner Product / dot product)
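
To make the three metrics concrete, here is a minimal plain-Java sketch of their textbook definitions. It is only an illustration: the class and method names are hypothetical, and how Milvus itself computes, scales, or reports these scores is not covered here.

```java
// Hypothetical helper with textbook definitions of the three metrics.
final class SimilarityMetrics {

    /** Inner product (dot product): larger values mean more similar. */
    static double innerProduct(float[] a, float[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    /** Euclidean (L2) distance: smaller values mean more similar. */
    static double l2Distance(float[] a, float[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = (double) a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Cosine similarity: inner product of the two vectors divided by their lengths. */
    static double cosine(float[] a, float[] b) {
        requireSameLength(a, b);
        double normA = Math.sqrt(innerProduct(a, a));
        double normB = Math.sqrt(innerProduct(b, b));
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("cosine similarity is undefined for a zero vector");
        }
        return innerProduct(a, b) / (normA * normB);
    }

    private static void requireSameLength(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
    }
}
```

Cosine ignores vector length and compares direction only, so (3, 4) and (6, 8) have a cosine similarity of 1 even though their L2 distance is 5. For vectors normalized to unit length, ranking by inner product and ranking by cosine give the same order.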

By default, the extension creates the collection and its schema automatically if the collection does not already exist. It communicates with Milvus over gRPC.

Summary

To use Milvus with Quarkus LangChain4j:

  1. Add the required extension

  2. Set the embedding dimension to match your model

  3. Use Dev Services or connect to an external Milvus instance

  4. Ingest documents and perform similarity search using MilvusEmbeddingStore