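Easysearch Java SDK 2.0.x User Guide (Part 2)

In the previous article, we introduced the basic usage and bulk operations of the Easysearch Java SDK 2.0.x. This article digs into index management: creating, deleting, opening and closing, refreshing, and rolling over indices, along with the synchronous and asynchronous call styles that the new SDK provides.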

Two ways to build objects in the SDK

1. The traditional Builder style

The most basic approach looks like this:

CreateIndexResponse createResponse = client.indices().create(
    new CreateIndexRequest.Builder()
        .index("my-index")
        .aliases("foo",
            new Alias.Builder().isWriteIndex(true).build()
        )
        .build()
);

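The upside is that it is simple and direct, but it is a little heavy.

2. The lambda expression style

This is the recommended style, concise and elegant: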

CreateIndexResponse createResponse = client.indices()
    .create(c -> c
        .index("my-index")
        .aliases("foo", a -> a
            .isWriteIndex(true)
        )
    );

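The lambda style is not only shorter; its biggest advantage is that you no longer have to remember all those Builder class names. When writing complex queries in particular, the nesting of the code mirrors the structure of the query:

// Naming suggestion: use short names such as b0, b1 to indicate the nesting level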
SearchResponse<Doc> results = client.search(b0 -> b0
    .index("my-index")
    .query(b1 -> b1
        .bool(b2 -> b2
            .must(b3 -> b3
                .match(b4 -> b4
                    .field("title")
                    .query("search")
                )
            )
            .filter(b3 -> b3
                .range(b4 -> b4
                    .field("date")
                    .gte("2024-01-01")
                )
            )
        )
    ),
    Doc.class
);
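
To see why the lambda style needs no Builder class names, here is a minimal sketch of the pattern, assuming the SDK follows the common design in which each method creates the builder itself and hands it to your function. The types IndexRequest and AliasSpec are hypothetical stand-ins written for this illustration; they are not classes from the Easysearch SDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Demo entry point for the sketch below.
final class LambdaBuilderDemo {
    public static void main(String[] args) {
        IndexRequest req = IndexRequest.of(r -> r
            .index("my-index")
            .aliases("foo", a -> a.isWriteIndex(true)));
        System.out.println(req.index + " aliases=" + req.aliases.keySet());
    }
}

// Hypothetical stand-in type; NOT a class from the Easysearch SDK.
final class AliasSpec {
    final boolean writeIndex;

    private AliasSpec(Builder b) {
        this.writeIndex = b.writeIndex;
    }

    static final class Builder {
        private boolean writeIndex;

        Builder isWriteIndex(boolean value) {
            this.writeIndex = value;
            return this;
        }

        AliasSpec build() {
            return new AliasSpec(this);
        }
    }
}

// Hypothetical stand-in type; NOT a class from the Easysearch SDK.
final class IndexRequest {
    final String index;
    final Map<String, AliasSpec> aliases;

    private IndexRequest(Builder b) {
        this.index = b.index;
        this.aliases = b.aliases;
    }

    // Lambda entry point: this method creates the builder, the caller only configures it.
    static IndexRequest of(Function<Builder, Builder> fn) {
        return fn.apply(new Builder()).build();
    }

    static final class Builder {
        private String index;
        private final Map<String, AliasSpec> aliases = new LinkedHashMap<>();

        Builder index(String value) {
            this.index = value;
            return this;
        }

        // Builder-style overload: the caller constructs the AliasSpec itself.
        Builder aliases(String name, AliasSpec alias) {
            this.aliases.put(name, alias);
            return this;
        }

        // Lambda-style overload: builds the AliasSpec and delegates to the overload above.
        Builder aliases(String name, Function<AliasSpec.Builder, AliasSpec.Builder> fn) {
            return aliases(name, fn.apply(new AliasSpec.Builder()).build());
        }

        IndexRequest build() {
            return new IndexRequest(this);
        }
    }
}
```

Because the method supplies the builder instance, the caller never writes new ...Builder() or build(); the compiler infers the builder type from the method signature.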

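Now, back to index management

What is index management?

Simply put, it covers the basic create, read, update, and delete operations on indices. For example:

Create a new index
Delete an index you no longer need
Close or open an index
Refresh so that newly written data becomes searchable right away
Clear the cache
Merge index segments (to speed up searches)

The new SDK is thoughtfully designed in this area: it supports both synchronous and asynchronous calls, and both are pleasant to use.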

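How do you use the synchronous style?

On to the code! The snippet below covers most of the index management operations you will use day to day: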

    String index = "test1";
    // First check whether the index exists
    if (client.indices().exists(r -> r.index(index)).value()) {
        LOGGER.info("Deleting index " + index);
        // If it does, delete it and start over
        DeleteIndexResponse deleteIndexResponse =
            client.indices().delete(new DeleteIndexRequest.Builder().index(index).build());
        LOGGER.info(deleteIndexResponse.toString());
    }

    // Create a new index
    LOGGER.info("Creating index " + index);
    CreateIndexResponse createIndexResponse =
        client.indices().create(req -> req.index(index));

    // Close the index
    CloseIndexResponse closeIndexResponse =
        client.indices().close(req -> req.index(index));

    // Open the index
    OpenResponse openResponse =
        client.indices().open(req -> req.index(index));

    // Refresh so that newly written data becomes searchable right away
    RefreshResponse refreshResponse =
        client.indices().refresh(req -> req.index(index));

    // Flush in-memory data to disk
    FlushResponse flushResponse =
        client.indices().flush(req -> req.index(index));

    // Merge index segments, which can speed up searches
    // maxNumSegments(1L) means merging down to a single segment
    ForcemergeResponse forcemergeResponse =
        client.indices().forcemerge(req -> req.index(index).maxNumSegments(1L));

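As the code shows, these operations are straightforward, and the method names largely tell you what each one does. Each call also returns a Response object describing the result, which makes problems easier to track down.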

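What about the asynchronous style?

Sometimes you don't want to wait for these operations to finish one by one. That is when the asynchronous style comes in: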


    String index = "test1";
    EasysearchAsyncClient asyncClient = SampleClient.createAsyncClient();

    // Chain a series of operations together with CompletableFuture
    asyncClient.indices().exists(req -> req.index(index))
        .thenCompose(exists -> {
            if (exists.value()) {
                LOGGER.info("Deleting index " + index);
                return asyncClient.indices().delete(r -> r.index(index))
                    .thenAccept(deleteResponse -> {
                        LOGGER.info(deleteResponse.toString());
                    });
            }
            return CompletableFuture.completedFuture(null);
        })
        .thenCompose(v -> {
            LOGGER.info("Creating index " + index);
            return asyncClient.indices().create(req -> req.index(index));
        })
        .whenComplete((createResponse, throwable) -> {
            if (throwable != null) {
                LOGGER.error("Something went wrong", throwable);
            } else {
                LOGGER.info("Done!");
            }
        })
        .get(30, TimeUnit.SECONDS); // Wait at most 30 seconds

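The asynchronous style takes a bit more code, but the benefits are clear:

It does not block the main thread while the operations run (the final get call in the example only waits for the overall result)
You can run multiple operations concurrently
Combined with CompletableFuture, it enables many composition patterns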
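
As an illustration of the second point, the sketch below starts several independent operations at once and waits for all of them with CompletableFuture.allOf. It uses only the JDK: fakeIndexCall is a hypothetical stand-in that simulates an asynchronous call, so that the example runs without a cluster. With the real client you would return the future from the corresponding asyncClient.indices() call instead.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

final class ConcurrentOpsSketch {

    // Hypothetical stand-in for an asynchronous SDK call; it only simulates work.
    static CompletableFuture<String> fakeIndexCall(String operation, String index) {
        return CompletableFuture.supplyAsync(() -> operation + ":" + index + ":ok");
    }

    // Starts one operation per index, then gathers the results in input order.
    static List<String> refreshAll(List<String> indices) {
        // All futures are created up front, so the operations can run concurrently.
        List<CompletableFuture<String>> futures = indices.stream()
            .map(index -> fakeIndexCall("refresh", index))
            .collect(Collectors.toList());

        // allOf completes once every future has completed.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

        // Every future is already complete here, so join() returns immediately.
        return futures.stream()
            .map(CompletableFuture::join)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(refreshAll(List.of("logs-1", "logs-2", "logs-3")));
    }
}
```

The key design choice is to create every future before waiting on any of them; calling join() inside the first stream would run the operations one after another.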

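Tips

Which style should you pick?
- For simple scenarios use the synchronous style; the code is simple and direct
- Use the asynchronous style when you need concurrency or don't want to block
Remember to handle exceptions
- For synchronous calls, just use try-catch
- For asynchronous calls, use whenComplete or exceptionally
Performance
- Force merge is resource-intensive, so schedule it for off-peak hours such as late at night
- Refreshing too often hurts write performance, so balance it against your needs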
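
The two exception-handling styles from the tips can be sketched side by side. This is a JDK-only illustration: deleteIndexSync is a hypothetical stand-in for an SDK call that may fail, and the exception type it throws is an assumption made for the demo, not the SDK's actual exception.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

final class ErrorHandlingSketch {

    // Hypothetical stand-in for a synchronous SDK call that may fail.
    static String deleteIndexSync(String index) {
        if (index.isEmpty()) {
            throw new IllegalArgumentException("index name must not be empty");
        }
        return "deleted " + index;
    }

    // Synchronous style: wrap the call in try-catch.
    static String safeDeleteSync(String index) {
        try {
            return deleteIndexSync(index);
        } catch (IllegalArgumentException e) {
            return "failed: " + e.getMessage();
        }
    }

    // Asynchronous style: recover inside the chain with exceptionally.
    static CompletableFuture<String> safeDeleteAsync(String index) {
        return CompletableFuture
            .supplyAsync(() -> deleteIndexSync(index))
            .exceptionally(t -> {
                // A failure thrown inside a stage usually arrives wrapped in CompletionException.
                Throwable cause = (t instanceof CompletionException && t.getCause() != null)
                    ? t.getCause()
                    : t;
                return "failed: " + cause.getMessage();
            });
    }

    public static void main(String[] args) {
        System.out.println(safeDeleteSync("test1"));
        System.out.println(safeDeleteSync(""));
        System.out.println(safeDeleteAsync("").join());
    }
}
```

Use exceptionally when you want to substitute a fallback value; whenComplete, as in the asynchronous example earlier, observes the outcome but leaves the failure in place for later stages.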

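Automatic Rollover

When managing Easysearch indices, we often need to control the size and time span of a single index. The Easysearch Rollover API offers an elegant solution, letting us create a new index based on specific conditions. This section shows how to perform an index rollover with the Java API.

What is Rollover?

Rollover is an index management mechanism: when the existing index meets one or more conditions (such as reaching a certain size, document count, or age), a new index is created. This is especially useful for scenarios such as log management.

Implementation example

First, we need to create an initial index and set up an alias: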

String index = "test-00001";
// Delete the index if it already exists
if (client.indices().exists(r -> r.index(index)).value()) {
    client.indices().delete(new DeleteIndexRequest.Builder().index(index).build());
}

// Create the index and set up the alias
client.indices().create(req -> req
    .index(index)
    .aliases("test_log", a -> a.isWriteIndex(true)));

Configuring Rollover conditions

There are two ways to configure rollover conditions:

Option 1: the Java API

RolloverResponse res = client.indices().rollover(req -> req
    .alias("test_log")
    .conditions(c -> c
        .maxDocs(100L)        // more than 100 documents
        .maxAge(b -> b.time("7d"))  // index older than 7 days
        .maxSize("5gb")));    // index larger than 5 GB

Option 2: JSON configuration

String conditionsJson = """
{
  "conditions": {
    "max_docs": 100,
    "max_age": "7d",
    "max_size": "5gb"
  }
}
""";

RolloverResponse response = client.indices().rollover(req -> req
    .alias("test_log")
    .withJson(new StringReader(conditionsJson))
);

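Rollover conditions explained

max_docs: the maximum number of documents in the index
max_age: the maximum age of the index
max_size: the maximum storage size of the index

The conditions are checked when the rollover request is executed; if any one of them is met, a new index is created. The new index name is derived by incrementing the numeric part of the original index name by 1.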
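
The "any condition" rule and the naming rule can be sketched in plain Java. This is an illustration only, not the server's implementation. Two details are assumptions: that thresholds are inclusive (>=), and that the incremented suffix is zero-padded to six digits as in Elasticsearch's rollover naming, in which case test-00001 would roll over to test-000002. Check the actual name in the RolloverResponse from your cluster.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RolloverSketch {

    // Matches names that end with "-" followed by digits, e.g. "test-00001".
    private static final Pattern NUMBERED = Pattern.compile("^(.*-)(\\d+)$");

    // Rollover applies when ANY threshold is reached (inclusive comparison is an assumption).
    static boolean shouldRollover(long docs, long ageDays, long sizeBytes,
                                  long maxDocs, long maxAgeDays, long maxSizeBytes) {
        return docs >= maxDocs || ageDays >= maxAgeDays || sizeBytes >= maxSizeBytes;
    }

    // Increments the numeric suffix. The six-digit zero padding is an
    // assumption carried over from Elasticsearch's rollover naming.
    static String nextIndexName(String current) {
        Matcher m = NUMBERED.matcher(current);
        if (!m.matches()) {
            throw new IllegalArgumentException("index name must end with -<number>: " + current);
        }
        long next = Long.parseLong(m.group(2)) + 1;
        return m.group(1) + String.format("%06d", next);
    }

    public static void main(String[] args) {
        long fiveGb = 5L * 1024 * 1024 * 1024;
        // 100 documents reaches max_docs even though age and size are far below their limits.
        System.out.println(shouldRollover(100, 1, 1024, 100, 7, fiveGb));
        System.out.println(nextIndexName("test-00001"));
    }
}
```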

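Want to learn more?

Maven coordinates for the client: https://mvnrepository.com/artifact/com.infinilabs/easysearch-client/2.0.2
More detailed documentation and sample code are being added to the official website on an ongoing basis, so stay tuned!

If you have questions or suggestions, feedback is welcome at any time!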

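About Easysearch

INFINI Easysearch is a distributed search database that supports unstructured data retrieval, full-text search, vector search, geolocation queries, composite index queries, multilingual support, aggregation analysis, and more. Easysearch can fully replace Elasticsearch while adding and improving a number of enterprise-grade features. Easysearch helps you build a simple, efficient, and easy-to-use search experience.

Official documentation: https://infinilabs.cn/docs/latest/easysearch

Author: Zhang Lei, head of search engine R&D at INFINI Labs, is well versed in the Elasticsearch and Lucene source code and is currently responsible for the development of the company's Easysearch product and for customer service.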
