This article takes a look at how to implement streaming output with langchain4j and Spring Boot.

Steps

pom.xml

        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-reactor</artifactId>
            <version>1.0.0-beta1</version>
        </dependency>
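
The Ollama auto-configuration discussed below lives in langchain4j-ollama-spring-boot-starter, so that starter presumably needs to be on the classpath as well (version assumed here to match langchain4j-reactor):

        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-ollama-spring-boot-starter</artifactId>
            <version>1.0.0-beta1</version>
        </dependency>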

application.yaml

langchain4j:
  ollama:
    streaming-chat-model:
      base-url: http://localhost:11434
      model-name: deepseek-r1:8b
Once langchain4j.ollama.streaming-chat-model.base-url is set, the starter automatically enables the streaming model and registers a streamingChatModel bean ready for injection.
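
Judging from the builder calls in the auto-configuration source quoted later, the other ChatModelProperties fields can presumably be set under the same prefix (property names inferred by kebab-casing the getters; the values below are purely illustrative):

langchain4j:
  ollama:
    streaming-chat-model:
      base-url: http://localhost:11434
      model-name: deepseek-r1:8b
      temperature: 0.7
      timeout: 60s
      log-requests: true
      log-responses: true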

controller

    import dev.langchain4j.model.chat.StreamingChatLanguageModel;
    import dev.langchain4j.model.chat.response.ChatResponse;
    import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
    import jakarta.servlet.http.HttpServletResponse;
    import lombok.extern.slf4j.Slf4j;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;
    import reactor.core.publisher.Flux;

    @Slf4j
    @RestController
    public class ChatController { // class name assumed; the original post showed only the snippet below

        @Autowired
        StreamingChatLanguageModel streamingChatModel;

        @GetMapping("/stream")
        public Flux<String> stream(@RequestParam("prompt") String prompt, HttpServletResponse response) {
            response.setCharacterEncoding("UTF-8"); // avoid mangling non-ASCII tokens
            // bridge the callback-based streaming API into a reactive Flux
            return Flux.create(sink -> {
                streamingChatModel.chat(prompt, new StreamingChatResponseHandler() {
                    @Override
                    public void onPartialResponse(String partialResponse) {
                        log.info("onPartialResponse:{}", partialResponse);
                        sink.next(partialResponse); // emit each fragment as it arrives
                    }

                    @Override
                    public void onCompleteResponse(ChatResponse completeResponse) {
                        log.info("complete:{}", completeResponse);
                        sink.complete(); // close the stream once the model finishes
                    }

                    @Override
                    public void onError(Throwable error) {
                        sink.error(error); // propagate failures to the subscriber
                    }
                });
            });
        }
    }
StreamingChatLanguageModel's chat method accepts a StreamingChatResponseHandler callback for partial results; bridging that callback into a Flux sink, as above, produces the streaming output.
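
One refinement worth considering (my assumption, not part of the original setup): declaring the text/event-stream media type explicitly, so that clients reliably receive fragments as they are produced rather than a single buffered response. A sketch reusing the same streamingChatModel field, with a hypothetical path and method name:

    // extra import: org.springframework.http.MediaType
    @GetMapping(value = "/stream-sse", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<String> streamSse(@RequestParam("prompt") String prompt) {
        return Flux.create(sink -> streamingChatModel.chat(prompt, new StreamingChatResponseHandler() {
            @Override
            public void onPartialResponse(String partialResponse) {
                sink.next(partialResponse);
            }

            @Override
            public void onCompleteResponse(ChatResponse completeResponse) {
                sink.complete();
            }

            @Override
            public void onError(Throwable error) {
                sink.error(error);
            }
        }));
    }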

Source code

dev/langchain4j/ollama/spring/AutoConfig.java

    @Bean
    @ConditionalOnProperty(PREFIX + ".streaming-chat-model.base-url")
    OllamaStreamingChatModel ollamaStreamingChatModel(
            @Qualifier(OLLAMA_STREAMING_CHAT_MODEL_HTTP_CLIENT_BUILDER) HttpClientBuilder httpClientBuilder,
            Properties properties,
            ObjectProvider<ChatModelListener> listeners
    ) {
        ChatModelProperties chatModelProperties = properties.getStreamingChatModel();
        return OllamaStreamingChatModel.builder()
                .httpClientBuilder(httpClientBuilder)
                .baseUrl(chatModelProperties.getBaseUrl())
                .modelName(chatModelProperties.getModelName())
                .temperature(chatModelProperties.getTemperature())
                .topK(chatModelProperties.getTopK())
                .topP(chatModelProperties.getTopP())
                .repeatPenalty(chatModelProperties.getRepeatPenalty())
                .seed(chatModelProperties.getSeed())
                .numPredict(chatModelProperties.getNumPredict())
                .stop(chatModelProperties.getStop())
                .format(chatModelProperties.getFormat())
                .supportedCapabilities(chatModelProperties.getSupportedCapabilities())
                .timeout(chatModelProperties.getTimeout())
                .customHeaders(chatModelProperties.getCustomHeaders())
                .logRequests(chatModelProperties.getLogRequests())
                .logResponses(chatModelProperties.getLogResponses())
                .listeners(listeners.orderedStream().toList())
                .build();
    }
When langchain4j-ollama-spring-boot-starter's AutoConfig detects that langchain4j.ollama.streaming-chat-model.base-url is configured, it auto-configures an OllamaStreamingChatModel, which implements the StreamingChatLanguageModel interface.
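
For comparison, the same model can also be built by hand outside of Spring. A minimal sketch using only builder methods that appear in the auto-configuration above (class name hypothetical):

    import dev.langchain4j.model.chat.StreamingChatLanguageModel;
    import dev.langchain4j.model.ollama.OllamaStreamingChatModel;

    public class ManualModelDemo {
        public static void main(String[] args) {
            // same values as application.yaml, minus the Spring auto-configuration
            StreamingChatLanguageModel model = OllamaStreamingChatModel.builder()
                    .baseUrl("http://localhost:11434")
                    .modelName("deepseek-r1:8b")
                    .build();
            // model.chat(prompt, handler) can now be called exactly as in the controller
        }
    }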

Summary

langchain4j-ollama-spring-boot-starter provides OllamaStreamingChatModel, an implementation of the StreamingChatLanguageModel interface, but it does not appear to ship any built-in Spring MVC integration, so you have to do the bridging yourself with Flux and StreamingChatResponseHandler.
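
To check that the endpoint really streams incrementally, a quick client-side test helps. A sketch assuming the application listens on localhost:8080 and spring-webflux is on the classpath (class name hypothetical):

    import org.springframework.web.reactive.function.client.WebClient;

    public class StreamClientDemo {
        public static void main(String[] args) {
            WebClient client = WebClient.create("http://localhost:8080");
            client.get()
                    .uri(uriBuilder -> uriBuilder.path("/stream").queryParam("prompt", "hello").build())
                    .retrieve()
                    .bodyToFlux(String.class)     // each element is one partial response
                    .doOnNext(System.out::print)  // fragments should appear one by one
                    .blockLast();                 // keep the demo alive until the stream completes
        }
    }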
