In-Process Embedding Models

When ingesting documents or implementing the RAG pattern, you need an embedding model: a model that takes a document and returns a vector representation of it. These vectors are stored in a vector database and used to find similar documents.
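To make "similar documents" concrete: vector stores commonly rank documents by the cosine similarity between embedding vectors. The following is a minimal sketch in plain Java, not the extension's own implementation. The class name CosineSimilarityDemo and the toy three-dimensional vectors are hypothetical; real embeddings have many more dimensions.

```java
public class CosineSimilarityDemo {

    // Cosine similarity between two embedding vectors of equal dimension.
    // Values close to 1 mean the vectors point in nearly the same direction.
    public static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("A zero vector has no direction");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy vectors standing in for real embeddings (hypothetical values).
        float[] query = {1.0f, 0.0f, 1.0f};
        float[] docA = {2.0f, 0.0f, 2.0f}; // same direction as the query
        float[] docB = {0.0f, 1.0f, 0.0f}; // orthogonal to the query

        System.out.println("query vs docA: " + cosineSimilarity(query, docA));
        System.out.println("query vs docB: " + cosineSimilarity(query, docB));
    }
}
```

In this sketch docA scores higher than docB against the query, so a similarity search would return docA first. Cosine similarity ignores vector length, which is why docA matches the query even though its components are twice as large.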

LLM providers such as OpenAI or HuggingFace offer remote embedding models. To compute the embedding of a document, you need to send the document to the remote model, which adds a network round trip.

In-process models avoid this overhead by running the model in the same process as the application. This is generally faster, but requires more memory.

You can check the MTEB (Massive Text Embedding Benchmark) leaderboard to select the most appropriate model for your use case.

Supported in-process models

The Quarkus LangChain4j extension provides support for a set of in-process embedding models. They are not included by default: you need to explicitly add the corresponding dependency to your project.

The following table lists the supported models, and the corresponding dependency to add to your project.

| Model Name | Dependency | Vector Dimension | Injected type |
|---|---|---|---|
| all-minilm-l6-v2 (quantized) | dev.langchain4j:langchain4j-embeddings-all-minilm-l6-v2-q:1.0.0-alpha1 | 384 | dev.langchain4j.model.embedding.onnx.allminilml6v2q.AllMiniLmL6V2QuantizedEmbeddingModel |
| all-minilm-l6-v2 | dev.langchain4j:langchain4j-embeddings-all-minilm-l6-v2:1.0.0-alpha1 | 384 | dev.langchain4j.model.embedding.onnx.allminilml6v2.AllMiniLmL6V2EmbeddingModel |
| bge-small-en-v1.5 (quantized) | dev.langchain4j:langchain4j-embeddings-bge-small-en-v15-q:1.0.0-alpha1 | 384 | dev.langchain4j.model.embedding.onnx.bgesmallenv15q.BgeSmallEnV15QuantizedEmbeddingModel |
| bge-small-en-v1.5 | dev.langchain4j:langchain4j-embeddings-bge-small-en-v15:1.0.0-alpha1 | 384 | dev.langchain4j.model.embedding.onnx.bgesmallenv15.BgeSmallEnV15EmbeddingModel |
| bge-small-en (quantized) | dev.langchain4j:langchain4j-embeddings-bge-small-en-q:1.0.0-alpha1 | 384 | dev.langchain4j.model.embedding.onnx.bgesmallenq.BgeSmallEnQuantizedEmbeddingModel |
| bge-small-en | dev.langchain4j:langchain4j-embeddings-bge-small-en:1.0.0-alpha1 | 384 | dev.langchain4j.model.embedding.onnx.bgesmallen.BgeSmallEnEmbeddingModel |
| bge-small-zh (quantized) | dev.langchain4j:langchain4j-embeddings-bge-small-zh-q:1.0.0-alpha1 | 384 | dev.langchain4j.model.embedding.onnx.bgesmallzhq.BgeSmallZhQuantizedEmbeddingModel |
| bge-small-zh | dev.langchain4j:langchain4j-embeddings-bge-small-zh:1.0.0-alpha1 | 384 | dev.langchain4j.model.embedding.onnx.bgesmallzh.BgeSmallZhEmbeddingModel |
| e5-small-v2 (quantized) | dev.langchain4j:langchain4j-embeddings-e5-small-v2-q:1.0.0-alpha1 | 384 | dev.langchain4j.model.embedding.onnx.e5smallv2q.E5SmallV2QuantizedEmbeddingModel |
| e5-small-v2 | dev.langchain4j:langchain4j-embeddings-e5-small-v2:1.0.0-alpha1 | 384 | dev.langchain4j.model.embedding.onnx.e5smallv2.E5SmallV2EmbeddingModel |

Injecting an embedding model

You can inject the model in your application using:

@Inject E5SmallV2QuantizedEmbeddingModel model;

Use the corresponding model type for the model you want to use, and make sure you have added the corresponding dependency to your project.

If your project contains no other embedding model, you can inject the EmbeddingModel interface instead, and the available in-process model is injected automatically:

@Inject EmbeddingModel model;