Oracle Database Document Store

The Oracle extension allows you to use Oracle Database as a vector database for Retrieval-Augmented Generation (RAG) with Quarkus LangChain4j. It leverages Oracle AI Vector Search, available in Oracle Database 23ai, to store and search vector embeddings using the native VECTOR data type.

Prerequisites

To use Oracle Database as a document store:

  • An Oracle Database 23ai (or later) instance is required, since AI Vector Search is only available from that version.

  • A Quarkus datasource must be configured.

Oracle AI Vector Search is built into the database engine and stores embeddings in a native VECTOR column. Unlike some other stores, the embedding dimension does not need to be configured: it is inferred from the embedding model and the data inserted at runtime.

In dev mode and test mode, the quarkus-langchain4j-oracle extension automatically starts an Oracle Database Free container (gvenzl/oracle-free:23-slim) via Dev Services. Oracle Express Edition (XE) is not supported, as it does not include AI Vector Search.

Dependency

To enable Oracle integration in your Quarkus project, add the following Maven dependency:

<dependency>
  <groupId>io.quarkiverse.langchain4j</groupId>
  <artifactId>quarkus-langchain4j-oracle</artifactId>
  <version>1.10.0</version>
</dependency>

Even better, if you use the Quarkus platform BOM (the default for generated projects), import the Quarkus LangChain4j BOM as well so that all dependency versions stay aligned:

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>${quarkus.platform.group-id}</groupId>
                <artifactId>${quarkus.platform.artifact-id}</artifactId>
                <version>${quarkus.platform.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
            <dependency>
                <groupId>${quarkus.platform.group-id}</groupId>
                <artifactId>quarkus-langchain4j-bom</artifactId> (1)
                <version>${quarkus.platform.version}</version> (2)
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
      <dependency>
        <groupId>io.quarkiverse.langchain4j</groupId>
        <artifactId>quarkus-langchain4j-oracle</artifactId>
        (3)
      </dependency>
    </dependencies>
1 In your dependencyManagement section, add the quarkus-langchain4j-bom
2 Inherit the version from your platform version
3 Voilà, no need for version alignment anymore

This extension requires a configured Quarkus datasource. For configuration details, refer to the Quarkus DataSource Guide.

Embedding Table

The extension manages an embedding table whose creation is controlled by the create-option property:

quarkus.langchain4j.oracle.table=embeddings
quarkus.langchain4j.oracle.create-option=CREATE_IF_NOT_EXISTS

create-option accepts:

  • CREATE_NONE: the table must already exist.

  • CREATE_IF_NOT_EXISTS: create the table only if it is missing (default).

  • CREATE_OR_REPLACE: drop and recreate the table.
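The three values can be read as a small decision table. The following is a minimal sketch of that reading, not the extension's code: the class, the enum, and the action names are hypothetical, and the behavior of CREATE_NONE when the table is missing (nothing is created here) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the three create-option values; not the extension's code.
class CreateOptionSketch {

    enum CreateOption { CREATE_NONE, CREATE_IF_NOT_EXISTS, CREATE_OR_REPLACE }

    /** Returns the schema actions implied by an option, given whether the table exists. */
    static List<String> plan(CreateOption option, boolean tableExists) {
        List<String> actions = new ArrayList<>();
        switch (option) {
            case CREATE_NONE:
                // Assumption: nothing is created; providing the table is up to you.
                break;
            case CREATE_IF_NOT_EXISTS:
                if (!tableExists) {
                    actions.add("CREATE");
                }
                break;
            case CREATE_OR_REPLACE:
                if (tableExists) {
                    actions.add("DROP");
                }
                actions.add("CREATE");
                break;
        }
        return actions;
    }

    public static void main(String[] args) {
        // prints [DROP, CREATE]
        System.out.println(plan(CreateOption.CREATE_OR_REPLACE, true));
    }
}
```

Because CREATE_OR_REPLACE drops the table first, any embeddings already stored in it are lost on startup.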

The column names can be customized if you need to map to an existing schema:

quarkus.langchain4j.oracle.id-column=id
quarkus.langchain4j.oracle.embedding-column=embedding
quarkus.langchain4j.oracle.text-column=text
quarkus.langchain4j.oracle.metadata-column=metadata

Vector Index

By default, searches run as an exact (brute-force) nearest neighbor scan, which is appropriate for small tables. For larger datasets, you can create an IVF (Inverted File) index to enable approximate nearest neighbor search:

quarkus.langchain4j.oracle.vector-index.create-option=CREATE_IF_NOT_EXISTS
quarkus.langchain4j.oracle.vector-index.target-accuracy=95

The remaining IVF parameters (degree-of-parallelism, neighbor-partitions, sample-per-partition, min-vectors-per-partition) are optional and fall back to the database defaults when not set.
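For intuition about what a parameter such as neighbor-partitions controls, here is a toy inverted-file index in plain Java. It is a sketch under simplifying assumptions and says nothing about how Oracle implements IVF internally: the centroids are supplied by hand rather than learned from a sample of the data, squared Euclidean distance is used, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy inverted-file (IVF) index: illustrative only, not Oracle's implementation.
class IvfSketch {

    private final float[][] centroids;                      // one centroid per partition
    private final List<List<float[]>> partitions = new ArrayList<>();

    IvfSketch(float[][] centroids) {
        this.centroids = centroids;
        for (int i = 0; i < centroids.length; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    static double squaredDistance(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Indexes of the centroids, ordered from closest to farthest from v. */
    private List<Integer> centroidsByDistance(float[] v) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < centroids.length; i++) {
            order.add(i);
        }
        order.sort(Comparator.comparingDouble(i -> squaredDistance(centroids[i], v)));
        return order;
    }

    /** Stores the vector in the partition of its closest centroid. */
    void add(float[] v) {
        partitions.get(centroidsByDistance(v).get(0)).add(v);
    }

    /** Scans only the `probes` closest partitions; returns the best vector found, or null. */
    float[] search(float[] query, int probes) {
        float[] best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        List<Integer> order = centroidsByDistance(query);
        for (int p = 0; p < Math.min(probes, order.size()); p++) {
            for (float[] candidate : partitions.get(order.get(p))) {
                double d = squaredDistance(candidate, query);
                if (d < bestDistance) {
                    bestDistance = d;
                    best = candidate;
                }
            }
        }
        return best;
    }
}
```

Scanning fewer partitions is faster, but a query that falls near a partition boundary can miss its true nearest neighbor. That is the recall versus latency trade-off that target-accuracy expresses.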

vector-index.create-option defaults to CREATE_NONE, so no vector index is created unless you set it to CREATE_IF_NOT_EXISTS or CREATE_OR_REPLACE. Leaving it unset keeps exact search in use, which is usually what you want during development.

To force exact search even when an index exists:

quarkus.langchain4j.oracle.exact-search=true

Metadata Indexes

Independently of the vector index, you can create JSON indexes on metadata keys to speed up filtering during search:

quarkus.langchain4j.oracle.metadata-indexes[0].create-option=CREATE_IF_NOT_EXISTS
quarkus.langchain4j.oracle.metadata-indexes[0].keys[0].key=category
quarkus.langchain4j.oracle.metadata-indexes[0].keys[0].type=STRING
quarkus.langchain4j.oracle.metadata-indexes[0].keys[0].order=ASC

This is useful for small tables where exact search is sufficient but metadata filtering still needs to be fast.

Usage Example

Once the extension is installed and configured, you can ingest documents into Oracle using the following code:

package io.quarkiverse.langchain4j.samples;

import static dev.langchain4j.data.document.splitter.DocumentSplitters.recursive;

import java.util.List;

import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

import dev.langchain4j.data.document.Document;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import io.quarkiverse.langchain4j.oracle.QuarkusOracleEmbeddingStore;

@ApplicationScoped
public class IngestorExampleWithOracle {

    /**
     * The embedding store (the database).
     * The bean is provided by the quarkus-langchain4j-oracle extension.
     */
    @Inject
    QuarkusOracleEmbeddingStore store;

    /**
     * The embedding model (how the vector of a document is computed).
     * The bean is provided by the model provider extension (for example, OpenAI).
     */
    @Inject
    EmbeddingModel embeddingModel;

    public void ingest(List<Document> documents) {
        EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
                .embeddingStore(store)
                .embeddingModel(embeddingModel)
                .documentSplitter(recursive(500, 0))
                .build();
        // Warning - this can take a long time...
        ingestor.ingest(documents);
    }
}

This example shows how to embed and persist documents using the Oracle store, enabling similarity search during RAG queries.
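In the example, recursive(500, 0) sets a maximum segment size of 500 characters and an overlap of 0 characters between consecutive segments. To show what these two numbers control, here is a deliberately simplified chunker that cuts at fixed character offsets. It is a stand-in for illustration only: the real recursive splitter prefers natural boundaries such as paragraphs and sentences, and the class and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a document splitter: fixed-size character windows.
class ChunkingSketch {

    static List<String> split(String text, int maxSize, int overlap) {
        if (maxSize <= 0 || overlap < 0 || overlap >= maxSize) {
            throw new IllegalArgumentException("require 0 <= overlap < maxSize");
        }
        List<String> segments = new ArrayList<>();
        int step = maxSize - overlap;   // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxSize, text.length());
            segments.add(text.substring(start, end));
            if (end == text.length()) {
                break;                  // the last window reached the end of the text
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        // prints [abcd, efgh, ij]
        System.out.println(split("abcdefghij", 4, 0));
    }
}
```

Each segment is embedded and stored separately, so a smaller segment size produces more rows and more calls to the embedding model.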

Configuration

Customize the behavior of the extension using the following configuration options:

Configuration property fixed at build time - All other configuration properties are overridable at runtime

Configuration property

Type

Default

Whether the default (unnamed) Oracle embedding store should be enabled. Set to false when you only want to use named stores.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_DEFAULT_STORE_ENABLED

boolean

true

The name of the configured Oracle datasource to use for the default store. If not set, the default datasource from the Agroal extension will be used.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_DATASOURCE

string

The table name for storing embeddings.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_TABLE

string

embeddings

Whether to create the embedding table if it does not already exist, replace it, or do nothing.

  • CREATE_NONE: the table must already exist

  • CREATE_IF_NOT_EXISTS: create the table if it does not exist

  • CREATE_OR_REPLACE: drop and recreate the table

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_CREATE_OPTION

create-none, create-if-not-exists, create-or-replace

create-if-not-exists

Custom name for the id column. Defaults to id.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_ID_COLUMN

string

Custom name for the embedding column. Defaults to embedding.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_EMBEDDING_COLUMN

string

Custom name for the text column. Defaults to text.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_TEXT_COLUMN

string

Custom name for the metadata column. Defaults to metadata.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_METADATA_COLUMN

string

Whether to use exact search (brute force) instead of approximate nearest neighbor search.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_EXACT_SEARCH

boolean

false

Named store configurations

Type

Default

Whether this named Oracle embedding store should be enabled. Set to false to skip bean creation for this named store while keeping its configuration.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__ENABLED

boolean

true

The name of the configured Oracle datasource to use for this named store. If not set, the default datasource from the Agroal extension will be used.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__DATASOURCE

string

The table name for storing embeddings.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__TABLE

string

embeddings

Whether to create the embedding table if it does not already exist, replace it, or do nothing.

  • CREATE_NONE: the table must already exist

  • CREATE_IF_NOT_EXISTS: create the table if it does not exist

  • CREATE_OR_REPLACE: drop and recreate the table

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__CREATE_OPTION

create-none, create-if-not-exists, create-or-replace

create-if-not-exists

Custom name for the id column. Defaults to id.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__ID_COLUMN

string

Custom name for the embedding column. Defaults to embedding.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__EMBEDDING_COLUMN

string

Custom name for the text column. Defaults to text.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__TEXT_COLUMN

string

Custom name for the metadata column. Defaults to metadata.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__METADATA_COLUMN

string

Whether to use exact search (brute force) instead of approximate nearest neighbor search.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__EXACT_SEARCH

boolean

false

Configuration for the IVF vector index used for approximate nearest neighbor search

Type

Default

Whether to create the IVF vector index.

  • CREATE_NONE: do not create an index

  • CREATE_IF_NOT_EXISTS: create the index if it does not exist

  • CREATE_OR_REPLACE: drop and recreate the index

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__VECTOR_INDEX_CREATE_OPTION

create-none, create-if-not-exists, create-or-replace

create-none

The target accuracy percentage (0-100) for the IVF vector index. Higher values improve recall at the cost of search latency.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__VECTOR_INDEX_TARGET_ACCURACY

int

-1

The degree of parallelism for IVF vector index creation. Higher values speed up index creation on multi-core systems.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__VECTOR_INDEX_DEGREE_OF_PARALLELISM

int

-1

The number of neighbor partitions in the IVF index. This controls how the vector space is divided during index creation.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__VECTOR_INDEX_NEIGHBOR_PARTITIONS

int

-1

The number of samples per partition used when building the IVF index.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__VECTOR_INDEX_SAMPLE_PER_PARTITION

int

-1

The minimum number of vectors per partition in the IVF index.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__VECTOR_INDEX_MIN_VECTORS_PER_PARTITION

int

-1

JSON metadata index configurations

Type

Default

Whether this is a unique index.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__METADATA_INDEXES_I__UNIQUE

boolean

false

Whether to create a bitmap index instead of a B-tree index. Bitmap indexes are more efficient for low-cardinality columns.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__METADATA_INDEXES_I__BITMAP

boolean

false

Whether to create the metadata index.

  • CREATE_NONE: do not create the index

  • CREATE_IF_NOT_EXISTS: create the index if it does not exist

  • CREATE_OR_REPLACE: drop and recreate the index

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__METADATA_INDEXES_I__CREATE_OPTION

create-none, create-if-not-exists, create-or-replace

create-if-not-exists

The JSON metadata key name to index.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__METADATA_INDEXES_I__KEYS_I__KEY

string

required

The SQL type of the indexed metadata key. Allowed values: STRING, INTEGER, LONG, FLOAT, DOUBLE.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__METADATA_INDEXES_I__KEYS_I__TYPE

string

STRING

The sort order for this key in the index. Allowed values: ASC, DESC.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE__STORE_NAME__METADATA_INDEXES_I__KEYS_I__ORDER

string

ASC

Configuration for the IVF vector index used for approximate nearest neighbor search

Type

Default

Whether to create the IVF vector index.

  • CREATE_NONE: do not create an index

  • CREATE_IF_NOT_EXISTS: create the index if it does not exist

  • CREATE_OR_REPLACE: drop and recreate the index

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_VECTOR_INDEX_CREATE_OPTION

create-none, create-if-not-exists, create-or-replace

create-none

The target accuracy percentage (0-100) for the IVF vector index. Higher values improve recall at the cost of search latency.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_VECTOR_INDEX_TARGET_ACCURACY

int

-1

The degree of parallelism for IVF vector index creation. Higher values speed up index creation on multi-core systems.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_VECTOR_INDEX_DEGREE_OF_PARALLELISM

int

-1

The number of neighbor partitions in the IVF index. This controls how the vector space is divided during index creation.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_VECTOR_INDEX_NEIGHBOR_PARTITIONS

int

-1

The number of samples per partition used when building the IVF index.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_VECTOR_INDEX_SAMPLE_PER_PARTITION

int

-1

The minimum number of vectors per partition in the IVF index.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_VECTOR_INDEX_MIN_VECTORS_PER_PARTITION

int

-1

JSON metadata index configurations

Type

Default

Whether this is a unique index.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_METADATA_INDEXES_I__UNIQUE

boolean

false

Whether to create a bitmap index instead of a B-tree index. Bitmap indexes are more efficient for low-cardinality columns.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_METADATA_INDEXES_I__BITMAP

boolean

false

Whether to create the metadata index.

  • CREATE_NONE: do not create the index

  • CREATE_IF_NOT_EXISTS: create the index if it does not exist

  • CREATE_OR_REPLACE: drop and recreate the index

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_METADATA_INDEXES_I__CREATE_OPTION

create-none, create-if-not-exists, create-or-replace

create-if-not-exists

The JSON metadata key name to index.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_METADATA_INDEXES_I__KEYS_I__KEY

string

required

The SQL type of the indexed metadata key. Allowed values: STRING, INTEGER, LONG, FLOAT, DOUBLE.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_METADATA_INDEXES_I__KEYS_I__TYPE

string

STRING

The sort order for this key in the index. Allowed values: ASC, DESC.

Environment variable: QUARKUS_LANGCHAIN4J_ORACLE_METADATA_INDEXES_I__KEYS_I__ORDER

string

ASC
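The environment variable names listed above follow the MicroProfile Config naming rule: every character that is not a letter, a digit, or an underscore becomes an underscore, and the result is upper-cased. The sketch below applies that rule to a property name; the class name is hypothetical, and it illustrates only the naming, not how Quarkus resolves configuration. In a real name, the I and STORE_NAME placeholders are replaced by the index and the store name.

```java
import java.util.Locale;

// Property-name to environment-variable mapping following the MicroProfile Config rule.
class EnvVarNames {

    static String toEnvVar(String property) {
        // Replace every character that is not a letter, digit, or underscore with '_',
        // then upper-case the result.
        return property.replaceAll("[^A-Za-z0-9_]", "_").toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // prints QUARKUS_LANGCHAIN4J_ORACLE_METADATA_INDEXES_0__KEYS_0__KEY
        System.out.println(toEnvVar("quarkus.langchain4j.oracle.metadata-indexes[0].keys[0].key"));
    }
}
```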

How It Works

The Oracle extension stores each ingested text segment (a whole document, or one chunk of it when a splitter is used) as a row in an Oracle table. Each row contains:

  • The original text content

  • Optional metadata, stored as JSON

  • The vector embedding, stored in a native VECTOR(*, FLOAT32) column

During retrieval, a similarity search is performed using the native VECTOR_DISTANCE function, optionally accelerated by the IVF index and filtered by metadata when a filter is provided.
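Conceptually, an exact search orders every row by its distance to the query vector and keeps the closest ones. The following in-memory sketch mirrors that idea in plain Java. It assumes cosine distance, which this page does not confirm as the metric used by the extension, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// In-memory illustration of an exact (brute-force) nearest neighbor search.
class ExactSearchSketch {

    /** Cosine distance: 1 - cos(angle). Assumes non-zero vectors of equal length. */
    static double cosineDistance(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the indexes of the k rows closest to the query, closest first. */
    static List<Integer> topK(float[][] rows, float[] query, int k) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < rows.length; i++) {
            order.add(i);
        }
        // Every row is compared with the query, so the cost grows with the table size.
        order.sort(Comparator.comparingDouble(i -> cosineDistance(rows[i], query)));
        return new ArrayList<>(order.subList(0, Math.min(k, order.size())));
    }
}
```

The linear cost of this scan is why an approximate index becomes attractive as the table grows.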

The extension manages schema and index creation automatically according to the configured create-option values.

Named Stores

You can configure multiple named Oracle stores, each backed by a different datasource. This is useful when your application needs to manage embeddings for different domains or tenants in separate databases.

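To configure a named store, set its properties under the store name (products in this example). A named store is enabled by default, so configuring its datasource and, optionally, its table is enough: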

quarkus.langchain4j.oracle.products.datasource=products-ds
quarkus.langchain4j.oracle.products.table=product_embeddings

To inject a named store, use the @EmbeddingStoreName qualifier:

@Inject
@EmbeddingStoreName("products")
EmbeddingStore<TextSegment> productsStore;

The default store and named stores can coexist. If you only need named stores, disable the default store:

quarkus.langchain4j.oracle.default-store-enabled=false

Summary

To use Oracle Database as a document store with Quarkus LangChain4j:

  • Use an Oracle Database 23ai (or later) instance, which provides AI Vector Search.

  • Add the extension dependency.

  • Configure a datasource.

  • Optionally tune the embedding table, vector index, and metadata indexes.

  • Use QuarkusOracleEmbeddingStore (or inject EmbeddingStore<TextSegment>) to ingest and retrieve embedded documents.