Using web search
Quarkus LangChain4j currently supports the Tavily search engine.
To use it, add the quarkus-langchain4j-tavily extension to your project. You must also provide an API key through the quarkus.langchain4j.tavily.api-key property.
After this, you can inject the search engine into your application using
    @Inject
    WebSearchEngine engine;
and then use it by calling its search method.
If you want to let a chat model use web search on its own, there are generally two recommended ways to accomplish this: either by implementing a tool that uses it, or by using it as a content retriever inside a RAG pipeline. The chatbot-web-search example in the quarkus-langchain4j repository demonstrates using web search as a tool.
Using Web search as a tool
To expose web search as a tool that the LLM can decide to execute (the relevant search results become the return value of the tool execution), you can either use the tool provided by the upstream LangChain4j project, dev.langchain4j.web.search.WebSearchTool, or implement your own tool if that one does not fit your requirements. The chatbot-web-search example demonstrates how to use the provided tool.
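If the provided tool does not fit, a hand-written tool mostly needs to run the search and turn the hits into text the LLM can read. The following is only a minimal sketch of that shape: `CustomWebSearchTool`, `Hit` and `searchWeb` are hypothetical names, and the `Function<String, List<Hit>>` is a stand-in for the injected search engine so that the example compiles without any Quarkus or LangChain4j dependency.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

/**
 * Sketch of a hand-written web search tool. Every type here is a stand-in
 * introduced for illustration; none of them is part of the extension.
 */
final class CustomWebSearchTool {

    /** Stand-in for one search hit: title, URL and a short snippet. */
    record Hit(String title, String url, String snippet) {
    }

    // Stand-in for the injected search engine: maps a query to a list of hits.
    private final Function<String, List<Hit>> engine;
    private final int maxResults;

    CustomWebSearchTool(Function<String, List<Hit>> engine, int maxResults) {
        this.engine = engine;
        this.maxResults = maxResults;
    }

    // In a real application this method would carry the LangChain4j tool
    // annotation so the model can call it; that wiring is omitted here.
    String searchWeb(String query) {
        List<Hit> hits = engine.apply(query);
        if (hits.isEmpty()) {
            return "No results found for: " + query;
        }
        // The returned text is what the model sees as the tool result,
        // so keep it compact: at most maxResults hits, one block per hit.
        return hits.stream()
                .limit(maxResults)
                .map(hit -> hit.title() + "\n" + hit.url() + "\n" + hit.snippet())
                .collect(Collectors.joining("\n\n"));
    }
}
```

Capping the number of hits and returning plain text keeps the tool result small, which matters because the whole return value is fed back into the model's context.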
Using Web search in a RAG pipeline
There is also a provided content retriever, dev.langchain4j.rag.content.retriever.WebSearchContentRetriever, that uses a web search engine to retrieve relevant documents.
For inspiration, the retrieval augmentor that wraps it may look like this:
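    import java.util.Collections;
    import java.util.function.Supplier;
    import jakarta.enterprise.context.ApplicationScoped;
    import jakarta.inject.Inject;
    import dev.langchain4j.model.chat.ChatLanguageModel;
    import dev.langchain4j.rag.DefaultRetrievalAugmentor;
    import dev.langchain4j.rag.RetrievalAugmentor;
    import dev.langchain4j.rag.content.retriever.WebSearchContentRetriever;
    import dev.langchain4j.rag.query.Query;
    import dev.langchain4j.web.search.WebSearchEngine;

    @ApplicationScoped
    public class WebSearchRetrievalAugmentor implements Supplier<RetrievalAugmentor> {

        @Inject
        WebSearchEngine webSearchEngine;

        @Inject
        ChatLanguageModel chatModel;

        @Override
        public RetrievalAugmentor get() {
            return DefaultRetrievalAugmentor.builder()
                    .queryTransformer((question) -> {
                        // before actually querying the engine, we need to transform the
                        // user's question into a suitable search query
                        String query = chatModel.generate("Transform the user's question into a suitable query for the " +
                                "Tavily search engine. The query should yield the results relevant to answering the user's question. " +
                                "User's question: " + question.text());
                        return Collections.singleton(Query.from(query));
                    })
                    // retrieve at most 10 results per query
                    .contentRetriever(new WebSearchContentRetriever(webSearchEngine, 10))
                    .build();
        }
    }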
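The query transformation step can also be pulled out of the lambda so that it can be tested without a real model. The sketch below is one possible way to do that, not the approach used by the example: `SearchQueryRewriter`, `buildPrompt` and `transform` are hypothetical names, the `UnaryOperator<String>` stands in for the chat model call, and plain strings stand in for the `Query` objects. The fallback to the original question when the model returns blank text is an added assumption.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.function.UnaryOperator;

/** Hypothetical helper that isolates the question-to-query rewriting step. */
final class SearchQueryRewriter {

    private static final String INSTRUCTIONS =
            "Transform the user's question into a suitable query for the Tavily search engine. "
            + "The query should yield the results relevant to answering the user's question. ";

    // Stand-in for the chat model: takes a prompt, returns the generated text.
    private final UnaryOperator<String> model;

    SearchQueryRewriter(UnaryOperator<String> model) {
        this.model = model;
    }

    /** Builds the prompt sent to the model for a given user question. */
    static String buildPrompt(String question) {
        return INSTRUCTIONS + "User's question: " + question;
    }

    /** Returns exactly one search query for the given question. */
    Collection<String> transform(String question) {
        String query = model.apply(buildPrompt(question)).trim();
        // Assumption: if the model returns nothing usable, search with the
        // original question rather than with an empty query.
        if (query.isEmpty()) {
            query = question;
        }
        return Collections.singleton(query);
    }
}
```

Inside the augmentor, the lambda would then delegate to `transform` and wrap each returned string with `Query.from`, while a unit test can pass a fixed lambda in place of the model.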
Tavily configuration reference
Configuration property fixed at build time - All other configuration properties are overridable at runtime
Configuration property | Type | Default |
---|---|---|
Base URL of the Tavily API | string | |
API key for the Tavily API | string | required |
Maximum number of results to return | int | |
The timeout duration for Tavily requests. | Duration | |
Whether requests to Tavily should be logged | boolean | |
Whether responses from Tavily should be logged | boolean | |
The search depth to use. This can be "basic" or "advanced". Basic is the default. | | |
Include a short answer to original query. Default is false. | boolean | |
Include the cleaned and parsed HTML content of each search result. Default is false. | boolean | |
A list of domains to specifically include in the search results. Default is [], which includes all domains. | list of string | |
A list of domains to specifically exclude from the search results. Default is [], which doesn’t exclude any domains. | list of string | |
About the Duration format
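To write duration values, use the standard java.time.Duration format. You can also use a simplified format, starting with a number: if the value is only a number, it represents time in seconds; if the value is a number followed by ms, it represents time in milliseconds.
In other cases, the simplified format is translated to the java.time.Duration format for parsing: a number followed by h, m, or s is prefixed with PT, and a number followed by d is prefixed with P.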
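To make the simplified format concrete, here is a small sketch of how such values could be mapped onto java.time.Duration. It is not the Quarkus converter itself: `SimplifiedDuration` is a hypothetical helper, and it assumes the usual convention that a bare number means seconds, a number followed by ms means milliseconds, and any other value starting with a digit is prefixed with PT (or with P when it ends in d) before being parsed.

```java
import java.time.Duration;

/** Hypothetical helper illustrating the simplified duration format. */
final class SimplifiedDuration {

    private SimplifiedDuration() {
    }

    static Duration parse(String raw) {
        String value = raw.trim();
        if (value.isEmpty()) {
            throw new IllegalArgumentException("Duration must not be empty");
        }
        // Only digits: interpret the number as seconds.
        if (value.matches("\\d+")) {
            return Duration.ofSeconds(Long.parseLong(value));
        }
        // Digits followed by "ms": interpret the number as milliseconds.
        if (value.matches("\\d+ms")) {
            String digits = value.substring(0, value.length() - 2);
            return Duration.ofMillis(Long.parseLong(digits));
        }
        // Any other value starting with a digit is translated to the
        // standard format: "P" for day values, "PT" for h, m and s.
        if (Character.isDigit(value.charAt(0))) {
            String prefix = value.endsWith("d") ? "P" : "PT";
            return Duration.parse(prefix + value);
        }
        // Otherwise assume the value is already in the standard format.
        return Duration.parse(value);
    }
}
```

Under these assumptions, `SimplifiedDuration.parse("10")` and `SimplifiedDuration.parse("PT10S")` describe the same ten-second timeout, and `"500ms"` is half a second.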