This article takes a look at langchain4j's Web Search Engine support.

Steps

pom.xml

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-web-search-engine-searchapi</artifactId>
    <version>1.0.0-beta2</version>
</dependency>

example

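    // Assumes an Assistant interface exposing String search(String) and a chat model
    // field named model, both defined elsewhere in the test class.
    @Test
    public void testSearchEngine() {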
        SearchApiWebSearchEngine searchEngine = SearchApiWebSearchEngine.builder()
                .apiKey(System.getenv("SEARCH_API_KEY"))
                .engine("baidu")
                .build();
        WebSearchTool webTool = WebSearchTool.from(searchEngine);

        Assistant assistant = AiServices.builder(Assistant.class)
                .chatLanguageModel(model)
                .tools(webTool)
                .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
                .build();
        String result = assistant.search("今天日期是哪一天"); // "What is today's date?"
        System.out.println(result);
    }
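The output is as follows (the model replies in Chinese: today is Tuesday, March 25, 2025, the 26th day of the second lunar month, followed by three reference sites):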
根据搜索结果,今天的日期为2025年3月25日(星期二),在农历中则是2025年二月廿六。

更多详细信息您可以访问以下网站:
- [易灵算命网](http://old.d02.cn/suanming/nongli.php)
- [今天几号网](https://www.jintianjihao.com/)
- [快乐日历 - 今天什么日子,今天星期几,农历几月几日@快乐家园](http://www.joyurl.cn/)

Source code

WebSearchEngine

dev/langchain4j/web/search/WebSearchEngine.java

public interface WebSearchEngine {

    /**
     * Performs a search query on the web search engine and returns the search results.
     *
     * @param query the search query
     * @return the search results
     */
    default WebSearchResults search(String query) {
        return search(WebSearchRequest.from(query));
    }

    /**
     * Performs a search request on the web search engine and returns the search results.
     *
     * @param webSearchRequest the search request
     * @return the web search results
     */
    WebSearchResults search(WebSearchRequest webSearchRequest);
}
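langchain4j defines the WebSearchEngine interface. It declares a search method that takes a WebSearchRequest and returns WebSearchResults; the String overload is a default method that wraps the query with WebSearchRequest.from(query) and delegates to it.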
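The delegation pattern in this interface can be tried out without the library. The following is a minimal sketch using stand-in types: SketchRequest, SketchSearchEngine and InMemorySearchEngine are hypothetical names invented for illustration and are not langchain4j classes. It only shows that an implementation needs to provide the request-based method, and callers get the String overload for free.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for WebSearchRequest; hypothetical, not the langchain4j class.
class SketchRequest {
    final String searchTerms;

    SketchRequest(String searchTerms) {
        this.searchTerms = searchTerms;
    }

    static SketchRequest from(String searchTerms) {
        return new SketchRequest(searchTerms);
    }
}

// Stand-in for WebSearchEngine; results are plain titles to keep the sketch short.
interface SketchSearchEngine {

    // Convenience overload: wraps the raw query and delegates.
    default List<String> search(String query) {
        return search(SketchRequest.from(query));
    }

    // The only method an implementation has to provide.
    List<String> search(SketchRequest request);
}

// Hypothetical in-memory engine: returns the stored titles that contain the search terms.
class InMemorySearchEngine implements SketchSearchEngine {

    private final List<String> titles;

    InMemorySearchEngine(List<String> titles) {
        this.titles = titles;
    }

    @Override
    public List<String> search(SketchRequest request) {
        List<String> hits = new ArrayList<>();
        for (String title : titles) {
            if (title.contains(request.searchTerms)) {
                hits.add(title);
            }
        }
        return hits;
    }
}
```

Calling search("langchain4j") on an InMemorySearchEngine goes through the default method first and then into the overridden request-based method, which is the same path a real engine implementation would follow.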

WebSearchRequest

dev/langchain4j/web/search/WebSearchRequest.java

public class WebSearchRequest {

    private final String searchTerms;
    private final Integer maxResults;
    private final String language;
    private final String geoLocation;
    private final Integer startPage;
    private final Integer startIndex;
    private final Boolean safeSearch;
    private final Map<String, Object> additionalParams;

    private WebSearchRequest(Builder builder){
        this.searchTerms = ensureNotBlank(builder.searchTerms,"searchTerms");
        this.maxResults = builder.maxResults;
        this.language = builder.language;
        this.geoLocation = builder.geoLocation;
        this.startPage = getOrDefault(builder.startPage,1);
        this.startIndex = builder.startIndex;
        this.safeSearch = getOrDefault(builder.safeSearch,true);
        this.additionalParams = getOrDefault(builder.additionalParams, () -> new HashMap<>());
    }

    //......
}    
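WebSearchRequest defines the properties searchTerms, maxResults, language, geoLocation, startPage, startIndex, safeSearch, and additionalParams. searchTerms must not be blank; startPage defaults to 1, safeSearch to true, and additionalParams to an empty map.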
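To make the defaulting in that constructor concrete, here is a small stand-alone sketch. RequestDefaultsSketch is a hypothetical class that copies only four of the fields and replaces the library helpers (ensureNotBlank, getOrDefault) with plain Java; the exception type it throws is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in that mirrors only the defaulting rules of WebSearchRequest.
class RequestDefaultsSketch {

    final String searchTerms;
    final Integer maxResults;
    final Integer startPage;
    final Boolean safeSearch;
    final Map<String, Object> additionalParams;

    RequestDefaultsSketch(String searchTerms,
                          Integer maxResults,
                          Integer startPage,
                          Boolean safeSearch,
                          Map<String, Object> additionalParams) {
        // Blank search terms are rejected (exception type chosen for this sketch).
        if (searchTerms == null || searchTerms.trim().isEmpty()) {
            throw new IllegalArgumentException("searchTerms cannot be null or blank");
        }
        this.searchTerms = searchTerms;
        // No default: stays null when the caller does not set it.
        this.maxResults = maxResults;
        // Defaults taken from the constructor shown above.
        this.startPage = startPage != null ? startPage : Integer.valueOf(1);
        this.safeSearch = safeSearch != null ? safeSearch : Boolean.TRUE;
        this.additionalParams = additionalParams != null ? additionalParams : new HashMap<>();
    }
}
```

Constructing the sketch with only search terms yields startPage 1, safeSearch true, a null maxResults, and an empty additionalParams map.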

WebSearchResults

dev/langchain4j/web/search/WebSearchResults.java

public class WebSearchResults {

    private final Map<String, Object> searchMetadata;
    private final WebSearchInformationResult searchInformation;
    private final List<WebSearchOrganicResult> results;

    /**
     * Constructs a new instance of WebSearchResults.
     *
     * @param searchInformation The information about the web search.
     * @param results           The list of organic search results.
     */
    public WebSearchResults(WebSearchInformationResult searchInformation, List<WebSearchOrganicResult> results) {
        this(null, searchInformation, results);
    }

    /**
     * Constructs a new instance of WebSearchResults.
     *
     * @param searchMetadata    The metadata associated with the web search.
     * @param searchInformation The information about the web search.
     * @param results           The list of organic search results.
     */
    public WebSearchResults(Map<String, Object> searchMetadata, WebSearchInformationResult searchInformation, List<WebSearchOrganicResult> results) {
        this.searchMetadata = searchMetadata;
        this.searchInformation = ensureNotNull(searchInformation, "searchInformation");
        this.results = results;
    }

    //......
}    
WebSearchResults defines the properties searchMetadata, searchInformation, and results; searchInformation must not be null.

SearchApiWebSearchEngine

dev/langchain4j/web/search/searchapi/SearchApiWebSearchEngine.java

public class SearchApiWebSearchEngine implements WebSearchEngine {

    private static final String DEFAULT_BASE_URL = "https://www.searchapi.io";
    private static final String DEFAULT_ENGINE = "google";

    private final String apiKey;
    private final String engine;
    private final SearchApiClient client;
    private final Map<String, Object> optionalParameters;

    /**
     * @param apiKey             Required - the Search API key for accessing their API
     * @param baseUrl            overrides the default SearchApi base url
     * @param timeout            the timeout duration for API requests
     *                           <p>
     *                           Default value is 30 seconds.
     * @param engine             the engine used by Search API to execute the search
     *                           <p>
     *                           Default engine is Google Search.
     * @param optionalParameters parameters to be passed on every request of this the engine, they can be overridden by the WebSearchRequest additional parameters for matching keys
     *                           <p>
     *                           Check <a href="https://www.searchapi.io">Search API</a> for more information on available parameters for each engine
     */
    @Builder
    public SearchApiWebSearchEngine(String apiKey,
                                    String baseUrl,
                                    Duration timeout,
                                    String engine,
                                    Map<String, Object> optionalParameters) {
        this.apiKey = ensureNotBlank(apiKey, "apiKey");
        this.engine = getOrDefault(engine, DEFAULT_ENGINE);
        this.optionalParameters = getOrDefault(copyIfNotNull(optionalParameters), new HashMap<>());
        this.client = SearchApiClient.builder()
                .timeout(getOrDefault(timeout, ofSeconds(30)))
                .baseUrl(getOrDefault(baseUrl, DEFAULT_BASE_URL))
                .build();
    }

    /**
     * @param webSearchRequest Check <a href="https://www.searchapi.io">Search API</a> for more information on available additional
     *                         parameters for each engine that can be inside the request
     */
    @Override
    public WebSearchResults search(WebSearchRequest webSearchRequest) {
        SearchApiWebSearchRequest request = SearchApiWebSearchRequest.builder()
                .apiKey(apiKey)
                .engine(engine)
                .query(webSearchRequest.searchTerms())
                .optionalParameters(optionalParameters)
                .additionalRequestParameters(webSearchRequest.additionalParams())
                .build();
        SearchApiWebSearchResponse response = client.search(request);
        return toWebSearchResults(response);
    }

    private WebSearchResults toWebSearchResults(SearchApiWebSearchResponse response) {
        List<OrganicResult> organicResults = response.getOrganicResults();
        Long totalResults = getTotalResults(response.getSearchInformation());
        WebSearchInformationResult searchInformation = WebSearchInformationResult.from(
                totalResults,
                getCurrentPage(response.getPagination()),
                null
        );
        Map<String, Object> searchMetadata = getOrDefault(response.getSearchParameters(), new HashMap<>());
        addToMetadata(searchMetadata, response.getSearchMetadata());
        return WebSearchResults.from(
                searchMetadata,
                searchInformation,
                toWebSearchOrganicResults(organicResults));
    }

    private static long getTotalResults(Map<String, Object> searchInformation) {
        if (searchInformation != null && searchInformation.containsKey("total_results")) {
            Object totalResults = searchInformation.get("total_results");
            return totalResults instanceof Integer
                    ? ((Integer) totalResults).longValue()
                    : (Long) totalResults; // changes depending on the amount of total_results
        }
        return 0;
    }

    private Integer getCurrentPage(Map<String, Object> pagination) {
        if (pagination != null && pagination.containsKey("current")) {
            return (Integer) pagination.get("current");
        }
        return null;
    }

    private void addToMetadata(Map<String, Object> metadata, Map<String, Object> dataToAdd) {
        if (dataToAdd != null) {
            metadata.putAll(dataToAdd);
        }
    }

    private List<WebSearchOrganicResult> toWebSearchOrganicResults(List<OrganicResult> organicResults) {
        return organicResults.stream()
                .map(result -> {
                    Map<String, String> metadata = new HashMap<>(2);
                    metadata.put("position", result.getPosition());
                    return WebSearchOrganicResult.from(
                            result.getTitle(),
                            URI.create(result.getLink()),
                            getOrDefault(result.getSnippet(), ""),
                            null,  // by default google custom search api does not return content
                            metadata);
                })
                .collect(Collectors.toList());
    }

    public static WebSearchEngine withApiKey(String apiKey) {
        return builder().apiKey(apiKey).build();
    }
}
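SearchApiWebSearchEngine implements WebSearchEngine and calls the Search API through SearchApiClient. The engine defaults to google and the timeout to 30 seconds; each response is mapped to WebSearchResults, with every organic result's position stored in its metadata.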
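The Javadoc above says that optionalParameters are sent on every request and can be overridden by the additional parameters of a WebSearchRequest for matching keys. The code that performs this merge is not shown in the listing, so the following is only a sketch of how such a merge could look. ParameterMergeSketch and the keys used in the usage note are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper illustrating "engine-level defaults, overridden per request".
class ParameterMergeSketch {

    static Map<String, Object> merge(Map<String, Object> optionalParameters,
                                     Map<String, Object> additionalParams) {
        // Copy into a fresh map so neither input is modified.
        Map<String, Object> merged = new HashMap<>();
        if (optionalParameters != null) {
            merged.putAll(optionalParameters);
        }
        // Applied second, so a per-request value wins on a matching key.
        if (additionalParams != null) {
            merged.putAll(additionalParams);
        }
        return merged;
    }
}
```

For example, merging engine-level defaults {lang=en, page_size=10} with a per-request map {lang=zh} gives {lang=zh, page_size=10}: the request value replaces the default for lang, while page_size is carried over unchanged.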

WebSearchTool

dev/langchain4j/web/search/WebSearchTool.java

public class WebSearchTool {

    private final WebSearchEngine searchEngine;

    public WebSearchTool(WebSearchEngine searchEngine) {
        this.searchEngine = ensureNotNull(searchEngine, "searchEngine");
    }

    /**
     * Runs a search query on the web search engine and returns a pretty-string representation of the search results.
     *
     * @param query the search user query
     * @return a pretty-string representation of the search results
     */
    @Tool("This tool can be used to perform web searches using search engines such as Google, particularly when seeking information about recent events.")
    public String searchWeb(@P("Web search query") String query) {
        WebSearchResults results = searchEngine.search(query);
        return format(results);
    }

    private String format(WebSearchResults results) {
        if (isNullOrEmpty(results.results()))
            return "No results found.";

        return results.results()
                .stream()
                .map(organicResult -> "Title: " + organicResult.title() + "\n"
                        + "Source: " + organicResult.url().toString() + "\n"
                        + (organicResult.content() != null ? "Content:" + "\n" + organicResult.content() : "Snippet:" + "\n" + organicResult.snippet()))
                .collect(Collectors.joining("\n\n"));
    }

    /**
     * Creates a new WebSearchTool with the specified web search engine.
     *
     * @param searchEngine the web search engine to use for searching the web
     * @return a new WebSearchTool
     */
    public static WebSearchTool from(WebSearchEngine searchEngine) {
        return new WebSearchTool(searchEngine);
    }
}
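WebSearchTool holds a WebSearchEngine and exposes a searchWeb method annotated with @Tool. The method calls searchEngine.search(query) and formats the results as plain text, which is what the model receives as the tool result.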
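To see what text the model actually receives, the format logic can be reproduced outside the library. This is a sketch that follows the format method shown above; OrganicResultSketch and ToolFormatSketch are hypothetical stand-ins for WebSearchOrganicResult and WebSearchTool, and the URL is kept as a plain String for brevity.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for WebSearchOrganicResult.
class OrganicResultSketch {
    final String title;
    final String url;
    final String snippet;
    final String content; // may be null, as with the Search API engine above

    OrganicResultSketch(String title, String url, String snippet, String content) {
        this.title = title;
        this.url = url;
        this.snippet = snippet;
        this.content = content;
    }
}

class ToolFormatSketch {

    // Same layout as WebSearchTool.format: one block per result, blank line between blocks.
    static String format(List<OrganicResultSketch> results) {
        if (results == null || results.isEmpty()) {
            return "No results found.";
        }
        return results.stream()
                .map(r -> "Title: " + r.title + "\n"
                        + "Source: " + r.url + "\n"
                        // Prefer full content when present, otherwise fall back to the snippet.
                        + (r.content != null ? "Content:\n" + r.content : "Snippet:\n" + r.snippet))
                .collect(Collectors.joining("\n\n"));
    }
}
```

Since SearchApiWebSearchEngine passes null as the content, results from that engine always take the Snippet branch.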

Summary

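langchain4j defines the WebSearchEngine interface, whose search method takes a WebSearchRequest and returns WebSearchResults. WebSearchTool wraps a WebSearchEngine as a tool so that it can be registered with a model and invoked by it. Several implementations are available: langchain4j-web-search-engine-google-custom provides Google Custom Search, langchain4j-web-search-engine-searchapi supports Search API, langchain4j-community-web-search-engine-searxng supports SearXNG, and langchain4j-web-search-engine-tavily supports Tavily.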

codecraft