This article takes a look at Spring AI's ChromaVectorStore.

Example

pom.xml

        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-vector-store-chroma</artifactId>
        </dependency>

Configuration

spring:
  ai:
    vectorstore:
      type: chroma
      chroma:
        initialize-schema: true
        collectionName: "test1"
        client:
          host: http://localhost
          port: 8000

Code

    @Test
    public void testAddAndSearch() {
        List<Document> documents = List.of(
                new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
                new Document("The World is Big and Salvation Lurks Around the Corner"),
                new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));

        // Add the documents to the Chroma vector store
        chromaVectorStore.add(documents);

        // Retrieve documents similar to a query
        List<Document> results = this.chromaVectorStore.similaritySearch(SearchRequest.builder().query("Spring").topK(5).build());
        log.info("results:{}", JSON.toJSONString(results));
    }

The output is as follows:

results:[{"contentFormatter":{"excludedEmbedMetadataKeys":[],"excludedInferenceMetadataKeys":[],"metadataSeparator":"\n","metadataTemplate":"{key}: {value}","textTemplate":"{metadata_string}\n\n{content}"},"formattedContent":"distance: 0.4350912\nmeta1: meta1\n\nSpring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!","id":"53ce7adb-07ba-429c-b443-40edffde2c89","metadata":{"distance":0.4350912,"meta1":"meta1"},"score":0.5649088025093079,"text":"Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!"},{"contentFormatter":{"$ref":"$[0].contentFormatter"},"formattedContent":"distance: 0.57093126\n\nThe World is Big and Salvation Lurks Around the Corner","id":"cd79cfe8-8503-4e2e-bb1e-ec0e294bc800","metadata":{"distance":0.57093126},"score":0.42906874418258667,"text":"The World is Big and Salvation Lurks Around the Corner"},{"contentFormatter":{"$ref":"$[0].contentFormatter"},"formattedContent":"distance: 0.59360236\nmeta2: meta2\n\nYou walk forward facing the past and you turn back toward the future.","id":"f266f142-3044-4b09-8178-55abe6ef84c5","metadata":{"distance":0.59360236,"meta2":"meta2"},"score":0.40639764070510864,"text":"You walk forward facing the past and you turn back toward the future."}]
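
A note on reading this output: the hits come back ordered by ascending distance, topK(5) yields only three results (presumably because the collection held only these three documents), and each score looks like 1 - distance (for example 1 - 0.4350912 ≈ 0.5649088). The sketch below reproduces that reading in plain Java. TopKSketch and Hit are hypothetical names used only for illustration, not Spring AI classes, and the score formula is inferred from this output rather than taken from the library source.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Spring AI.
class TopKSketch {

    // One search hit: a short label plus the "distance" value from the metadata.
    static final class Hit {
        final String text;
        final double distance;

        Hit(String text, double distance) {
            this.text = text;
            this.distance = distance;
        }

        // Assumption inferred from the output above: score = 1 - distance.
        double score() {
            return 1.0 - distance;
        }
    }

    // Sort by ascending distance and keep at most topK hits.
    static List<Hit> topK(List<Hit> hits, int topK) {
        return hits.stream()
            .sorted(Comparator.comparingDouble((Hit h) -> h.distance))
            .limit(topK)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Distances copied from the output above; the labels are shortened.
        List<Hit> hits = List.of(
            new Hit("walk", 0.59360236),
            new Hit("rocks", 0.4350912),
            new Hit("world", 0.57093126));

        // topK is 5, but only three hits exist, so three are printed.
        for (Hit hit : topK(hits, 5)) {
            System.out.println(hit.text + " distance=" + hit.distance + " score=" + hit.score());
        }
    }
}
```

The smaller the distance, the higher the score, which is why the "Spring AI rocks!!" document ranks first for the query "Spring".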

Source code

ChromaVectorStoreAutoConfiguration

org/springframework/ai/vectorstore/chroma/autoconfigure/ChromaVectorStoreAutoConfiguration.java

@AutoConfiguration
@ConditionalOnClass({ EmbeddingModel.class, RestClient.class, ChromaVectorStore.class, ObjectMapper.class })
@EnableConfigurationProperties({ ChromaApiProperties.class, ChromaVectorStoreProperties.class })
@ConditionalOnProperty(name = SpringAIVectorStoreTypes.TYPE, havingValue = SpringAIVectorStoreTypes.CHROMA,
        matchIfMissing = true)
public class ChromaVectorStoreAutoConfiguration {

    @Bean
    @ConditionalOnMissingBean(ChromaConnectionDetails.class)
    PropertiesChromaConnectionDetails chromaConnectionDetails(ChromaApiProperties properties) {
        return new PropertiesChromaConnectionDetails(properties);
    }

    @Bean
    @ConditionalOnMissingBean
    public ChromaApi chromaApi(ChromaApiProperties apiProperties,
            ObjectProvider<RestClient.Builder> restClientBuilderProvider, ChromaConnectionDetails connectionDetails,
            ObjectMapper objectMapper) {

        String chromaUrl = String.format("%s:%s", connectionDetails.getHost(), connectionDetails.getPort());

        var chromaApi = new ChromaApi(chromaUrl, restClientBuilderProvider.getIfAvailable(RestClient::builder),
                objectMapper);

        if (StringUtils.hasText(connectionDetails.getKeyToken())) {
            chromaApi.withKeyToken(connectionDetails.getKeyToken());
        }
        else if (StringUtils.hasText(apiProperties.getUsername()) && StringUtils.hasText(apiProperties.getPassword())) {
            chromaApi.withBasicAuthCredentials(apiProperties.getUsername(), apiProperties.getPassword());
        }

        return chromaApi;
    }

    @Bean
    @ConditionalOnMissingBean(BatchingStrategy.class)
    BatchingStrategy chromaBatchingStrategy() {
        return new TokenCountBatchingStrategy();
    }

    @Bean
    @ConditionalOnMissingBean
    public ChromaVectorStore vectorStore(EmbeddingModel embeddingModel, ChromaApi chromaApi,
            ChromaVectorStoreProperties storeProperties, ObjectProvider<ObservationRegistry> observationRegistry,
            ObjectProvider<VectorStoreObservationConvention> customObservationConvention,
            BatchingStrategy chromaBatchingStrategy) {
        return ChromaVectorStore.builder(chromaApi, embeddingModel)
            .collectionName(storeProperties.getCollectionName())
            .initializeSchema(storeProperties.isInitializeSchema())
            .observationRegistry(observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP))
            .customObservationConvention(customObservationConvention.getIfAvailable(() -> null))
            .batchingStrategy(chromaBatchingStrategy)
            .build();
    }

    static class PropertiesChromaConnectionDetails implements ChromaConnectionDetails {

        private final ChromaApiProperties properties;

        PropertiesChromaConnectionDetails(ChromaApiProperties properties) {
            this.properties = properties;
        }

        @Override
        public String getHost() {
            return this.properties.getHost();
        }

        @Override
        public int getPort() {
            return this.properties.getPort();
        }

        @Override
        public String getKeyToken() {
            return this.properties.getKeyToken();
        }

    }

}
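ChromaVectorStoreAutoConfiguration is enabled when spring.ai.vectorstore.type is chroma and, because matchIfMissing = true, also when that property is not set at all. It creates a ChromaApi from ChromaApiProperties (via ChromaConnectionDetails), then builds a ChromaVectorStore from ChromaVectorStoreProperties.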
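
Two details of the chromaApi bean are easy to miss: the base URL is simply host and port joined with a colon, so the host value has to carry the scheme (as in http://localhost), and a key token takes precedence over username/password. Below is a minimal plain-Java sketch of that branching. ChromaApiWiringSketch, Auth, buildUrl, hasText and chooseAuth are hypothetical names for illustration, and hasText is only a simplified stand-in for Spring's StringUtils.hasText.

```java
// Hypothetical sketch mirroring the branching in chromaApi(); not Spring AI code.
class ChromaApiWiringSketch {

    enum Auth { KEY_TOKEN, BASIC, NONE }

    // Same formatting as the auto-configuration: "<host>:<port>".
    static String buildUrl(String host, int port) {
        return String.format("%s:%s", host, port);
    }

    // Simplified stand-in for StringUtils.hasText: non-null and not only whitespace.
    static boolean hasText(String s) {
        return s != null && !s.isBlank();
    }

    // A key token wins; basic auth is used only when both username and password are set.
    static Auth chooseAuth(String keyToken, String username, String password) {
        if (hasText(keyToken)) {
            return Auth.KEY_TOKEN;
        }
        if (hasText(username) && hasText(password)) {
            return Auth.BASIC;
        }
        return Auth.NONE;
    }

    public static void main(String[] args) {
        // Values from the configuration shown earlier in this article.
        System.out.println(buildUrl("http://localhost", 8000));
        // Without the scheme the result is not a usable base URL.
        System.out.println(buildUrl("localhost", 8000));
        System.out.println(chooseAuth("token", "user", "secret"));
        System.out.println(chooseAuth(null, "user", "secret"));
        System.out.println(chooseAuth(null, "user", null));
    }
}
```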

ChromaApiProperties

org/springframework/ai/vectorstore/chroma/autoconfigure/ChromaApiProperties.java

@ConfigurationProperties(ChromaApiProperties.CONFIG_PREFIX)
public class ChromaApiProperties {

    public static final String CONFIG_PREFIX = "spring.ai.vectorstore.chroma.client";

    private String host = "http://localhost";

    private int port = 8000;

    private String keyToken;

    private String username;

    private String password;

    //......
}
ChromaApiProperties binds the spring.ai.vectorstore.chroma.client prefix and exposes the host, port, keyToken, username and password properties. host defaults to http://localhost and port to 8000.

ChromaVectorStoreProperties

org/springframework/ai/vectorstore/chroma/autoconfigure/ChromaVectorStoreProperties.java

@ConfigurationProperties(ChromaVectorStoreProperties.CONFIG_PREFIX)
public class ChromaVectorStoreProperties extends CommonVectorStoreProperties {

    public static final String CONFIG_PREFIX = "spring.ai.vectorstore.chroma";

    private String collectionName = ChromaVectorStore.DEFAULT_COLLECTION_NAME;

    public String getCollectionName() {
        return this.collectionName;
    }

    public void setCollectionName(String collectionName) {
        this.collectionName = collectionName;
    }

}
ChromaVectorStoreProperties binds the spring.ai.vectorstore.chroma prefix. It inherits the initializeSchema property from CommonVectorStoreProperties and adds its own collectionName property.
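
The inheritance can be pictured with two small stand-in classes. This is a rough sketch, not the library source: CommonPropsSketch and ChromaPropsSketch are hypothetical names, and the two default values (false for initializeSchema, "SpringAiCollection" for the collection name) are assumptions about the library that are worth checking against the Spring AI version in use.

```java
// Hypothetical stand-in for CommonVectorStoreProperties.
class CommonPropsSketch {

    // Assumed default: schema initialization is opt-in.
    private boolean initializeSchema = false;

    public boolean isInitializeSchema() {
        return this.initializeSchema;
    }

    public void setInitializeSchema(boolean initializeSchema) {
        this.initializeSchema = initializeSchema;
    }
}

// Hypothetical stand-in for ChromaVectorStoreProperties.
class ChromaPropsSketch extends CommonPropsSketch {

    // Assumed value of ChromaVectorStore.DEFAULT_COLLECTION_NAME.
    private String collectionName = "SpringAiCollection";

    public String getCollectionName() {
        return this.collectionName;
    }

    public void setCollectionName(String collectionName) {
        this.collectionName = collectionName;
    }
}

class ChromaPropsDemo {

    public static void main(String[] args) {
        ChromaPropsSketch props = new ChromaPropsSketch();

        // What property binding would do for the configuration shown in this article.
        props.setInitializeSchema(true);  // spring.ai.vectorstore.chroma.initialize-schema
        props.setCollectionName("test1"); // spring.ai.vectorstore.chroma.collectionName

        System.out.println(props.getCollectionName() + " " + props.isInitializeSchema());
    }
}
```

Both properties live under the same spring.ai.vectorstore.chroma prefix even though they are declared in different classes, which is why initialize-schema and collectionName sit side by side in the YAML.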

Summary

Spring AI provides spring-ai-starter-vector-store-chroma to auto-configure ChromaVectorStore. Note that embeddingDimension defaults to 1536. If you hit the error "cannot unpack non-iterable coroutine object", downgrade Chroma to version 0.6.2.

codecraft