In-Process Embedding Models
When ingesting documents or implementing the RAG (retrieval-augmented generation) pattern, you need an embedding model: a model that takes a piece of text and returns a vector representation of it. The vector representation is stored in a vector database and is later used to find similar documents.
Providers such as OpenAI or Hugging Face offer remote embedding models: to compute the embedding of a document, you must send the document over the network to the remote model.
In-process models avoid this overhead by running the model in the same process as the application. This is generally faster, but requires more memory.
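To illustrate what "find similar documents" means once the vectors are stored, here is a plain-Java sketch of cosine similarity, a common comparison metric for embedding vectors. The class and method names are illustrative only, not part of the extension's API:

```java
public class CosineSimilarity {

    // Cosine similarity between two embedding vectors:
    // dot(a, b) / (|a| * |b|). Values close to 1.0 mean the documents
    // point in the same direction in embedding space, i.e. are similar.
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy 3-dimensional vectors; the models below produce 384 dimensions.
        float[] doc1 = {1f, 0f, 1f};
        float[] doc2 = {1f, 0f, 1f}; // same direction -> similarity ~ 1.0
        float[] doc3 = {0f, 1f, 0f}; // orthogonal -> similarity 0.0
        System.out.println(cosine(doc1, doc2));
        System.out.println(cosine(doc1, doc3));
    }
}
```

A vector database performs this comparison (or an approximate variant of it) at scale over all stored embeddings.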
You can check the MTEB (Massive Text Embedding Benchmark) leaderboard to select the most appropriate model for your use case.
Supported in-process models
The Quarkus LangChain4j extension provides support for a set of in-process embedding models. They are not included by default, so you need to add an explicit dependency to your project.
The following table lists the supported models and the corresponding dependency to add to your project.
Model Name | Dependency | Vector Dimension | Injected type
---|---|---|---
`all-minilm-l6-v2-q` | `dev.langchain4j:langchain4j-embeddings-all-minilm-l6-v2-q` | 384 | `AllMiniLmL6V2QuantizedEmbeddingModel`
`all-minilm-l6-v2` | `dev.langchain4j:langchain4j-embeddings-all-minilm-l6-v2` | 384 | `AllMiniLmL6V2EmbeddingModel`
`bge-small-en-q` | `dev.langchain4j:langchain4j-embeddings-bge-small-en-q` | 384 | `BgeSmallEnQuantizedEmbeddingModel`
`bge-small-en` | `dev.langchain4j:langchain4j-embeddings-bge-small-en` | 384 | `BgeSmallEnEmbeddingModel`
`bge-small-en-v15-q` | `dev.langchain4j:langchain4j-embeddings-bge-small-en-v15-q` | 384 | `BgeSmallEnV15QuantizedEmbeddingModel`
`bge-small-en-v15` | `dev.langchain4j:langchain4j-embeddings-bge-small-en-v15` | 384 | `BgeSmallEnV15EmbeddingModel`
`bge-small-zh-q` | `dev.langchain4j:langchain4j-embeddings-bge-small-zh-q` | 384 | `BgeSmallZhQuantizedEmbeddingModel`
`bge-small-zh` | `dev.langchain4j:langchain4j-embeddings-bge-small-zh` | 384 | `BgeSmallZhEmbeddingModel`
`e5-small-v2-q` | `dev.langchain4j:langchain4j-embeddings-e5-small-v2-q` | 384 | `E5SmallV2QuantizedEmbeddingModel`
`e5-small-v2` | `dev.langchain4j:langchain4j-embeddings-e5-small-v2` | 384 | `E5SmallV2EmbeddingModel`
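For example, to use a quantized in-process model you would add a dependency along these lines to your `pom.xml`. The coordinates shown here follow LangChain4j's `langchain4j-embeddings-*` artifact naming and are given as an illustration; check the table above and Maven Central for the exact artifact and version you need:

```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-embeddings-all-minilm-l6-v2-q</artifactId>
</dependency>
```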
Injecting an embedding model
You can inject the model in your application using:
@Inject E5SmallV2QuantizedEmbeddingModel model;
Use the model type matching the model you want to use, and make sure you have added the corresponding dependency to your project.
Note that if you do not have any other embedding model in your project, you can inject the EmbeddingModel
interface, and it will be automatically injected with the available model:
@Inject EmbeddingModel model;
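Once injected, the model can be used to embed text. A minimal sketch, assuming the standard LangChain4j `EmbeddingModel` API, where `embed` returns a `Response<Embedding>`:

```java
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.output.Response;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

@ApplicationScoped
public class EmbeddingService {

    @Inject
    EmbeddingModel model;

    // Returns the vector representation of the given text,
    // e.g. 384 floats for the models listed above.
    public float[] embed(String text) {
        Response<Embedding> response = model.embed(text);
        return response.content().vector();
    }
}
```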