eKuiper Newsletter 2022-07｜v1.6.0: Flow Orchestration + Better SQL to Express Business Logic Easily

In midsummer, eKuiper v1.6.0, the second major release of this year, arrives as promised. Development and internal trial polishing of the graph rule API for Flow orchestration ran through the entire release cycle and was completed in July. We also completed a number of improvements to SQL syntax and functions. We hope that combining Flow orchestration with SQL helps users express business logic more easily, covers more usage scenarios, and further reduces the need for, and cost of, custom development. In addition, we improved connections to external systems: rules no longer exit when an EdgeX or MQTT connection is interrupted, and the SQL and TDengine sinks support batch writing.

In previous newsletters, we introduced some of the new features developed for v1.6.0, including protobuf codec support, offline caching and retransmission. This issue introduces the other new features. For a complete list of features, please see the 1.6.0 Release.

Graph Rules API for Flow Orchestration

In previous versions, eKuiper's rule logic was specified by SQL + actions. The benefits of rules based on SQL syntax are numerous:

SQL syntax is widely used, and it is relatively easy for users with technical background to get started.
The SQL syntax is concise and has been widely proven in the database field, and complex rules can be written in very short text.
SQL is a declarative language, and the execution engine needs to parse and generate an execution plan. In this way, the execution engine can optimize the actual execution plan itself without any changes from the user.

SQL is handy when it comes to dealing with data transformation-centric rules. However, in some scenarios, the SQL syntax is not very suitable.

For non-technical personnel, SQL is difficult to use.
Some scenarios are difficult or overly complex to express in SQL. One example is processing an event according to pattern matching: one process runs when the temperature and humidity sensor data is greater than a certain value, and another runs when the temperature is less than a certain value. Overall, Flow covers more scenarios.
Because SQL is highly abstract, it is difficult to build a UI on top of it.

The graph rule API uses JSON to directly describe the directed acyclic graph (DAG) of operators executed at runtime, so a rule maps one-to-one to a Flow orchestration on the UI. In the new version, the graph rule API is provided as a supplement to SQL.

It is worth noting that SQL rules are still fully supported in the new version, and users can choose either API according to the scenario. SQL is better suited to rules written by hand, while the graph API is better suited to UI generation because its JSON structure is verbose.

Instructions

The graph rule API and SQL share the existing rule REST API endpoint; a rule uses the graph API by specifying the graph attribute when it is created. The graph attribute is a JSON representation of a directed acyclic graph. It consists of nodes and topo, which define the nodes of the graph and the edges between them, respectively. Below is one of the simplest rules defined by a graph. It defines 3 nodes: demo, humidityFilter and mqttout. The graph is linear, i.e. demo->humidityFilter->mqttout. The rule reads data from MQTT (demo), filters it by humidity (humidityFilter), and publishes the result to another MQTT topic (mqttout).

 {
  "id": "rule1",
  "name": "Test Condition",
  "graph": {
    "nodes": {
      "demo": {
        "type": "source",
        "nodeType": "mqtt",
        "props": {
          "datasource": "devices/+/messages"
        }
      },
      "humidityFilter": {
        "type": "operator",
        "nodeType": "filter",
        "props": {
          "expr": "humidity > 30"
        }
      },
      "mqttout": {
        "type": "sink",
        "nodeType": "mqtt",
        "props": {
          "server": "tcp://${mqtt_srv}:1883",
          "topic": "devices/result"
        }
      }
    },
    "topo": {
      "sources": ["demo"],
      "edges": {
        "demo": ["humidityFilter"],
        "humidityFilter": ["mqttout"]
      }
    }
  }
}

Each node in the graph's JSON has at least 3 fields:

type: The type of the node, which can be source, operator and sink.
nodeType: The implementation type of the node, which defines the business logic of the node, including built-in types and extension types defined by plugins.
props: The properties of the node. It is different for each nodeType.
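To make the structure concrete, the following Python sketch checks a graph definition against the rules described above: every node has the three fields, every name used in topo is defined under nodes, and the edges form an acyclic graph. It is only an illustration under those stated assumptions, not eKuiper's actual validation logic; the function name check_graph is hypothetical.

```python
from collections import deque

VALID_TYPES = {"source", "operator", "sink"}
REQUIRED_FIELDS = ("type", "nodeType", "props")


def check_graph(graph):
    """Return the node names in topological order, or raise ValueError."""
    nodes = graph["nodes"]
    topo = graph["topo"]
    edges = topo.get("edges", {})

    # Every node needs the three fields and one of the known types.
    for name, node in nodes.items():
        missing = [f for f in REQUIRED_FIELDS if f not in node]
        if missing:
            raise ValueError(f"node {name!r} is missing {missing}")
        if node["type"] not in VALID_TYPES:
            raise ValueError(f"node {name!r} has unknown type {node['type']!r}")

    # Every name used in topo must be defined under nodes.
    used = set(topo.get("sources", [])) | set(edges)
    for targets in edges.values():
        used.update(targets)
    undefined = sorted(used - set(nodes))
    if undefined:
        raise ValueError(f"topo refers to undefined nodes: {undefined}")

    # Kahn's algorithm: a complete ordering exists only if there is no cycle.
    indegree = {name: 0 for name in nodes}
    for targets in edges.values():
        for target in targets:
            indegree[target] += 1
    queue = deque(name for name, degree in indegree.items() if degree == 0)
    order = []
    while queue:
        current = queue.popleft()
        order.append(current)
        for target in edges.get(current, []):
            indegree[target] -= 1
            if indegree[target] == 0:
                queue.append(target)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    return order


# The graph of the example rule above, with props copied from it.
example = {
    "nodes": {
        "demo": {"type": "source", "nodeType": "mqtt",
                 "props": {"datasource": "devices/+/messages"}},
        "humidityFilter": {"type": "operator", "nodeType": "filter",
                           "props": {"expr": "humidity > 30"}},
        "mqttout": {"type": "sink", "nodeType": "mqtt",
                    "props": {"server": "tcp://${mqtt_srv}:1883",
                              "topic": "devices/result"}},
    },
    "topo": {
        "sources": ["demo"],
        "edges": {"demo": ["humidityFilter"], "humidityFilter": ["mqttout"]},
    },
}

print(check_graph(example))  # ['demo', 'humidityFilter', 'mqttout']
```

A check like this could run in a UI before the rule is submitted, so that a dangling edge is reported next to the canvas instead of coming back as a server error.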

For sources and sinks, the nodeType corresponds one-to-one to the source and sink types built into the system or added by plugins. For operator nodes, we provide a series of built-in nodes that correspond to the SQL syntax and achieve the same expressive power as SQL. User-extended functions can be called through the function node or the aggfunc node. For a complete list of nodes, please refer to https://ekuiper.org/docs/zh/latest/rules/graph_rule.html#built-in-operator-nodetype

Flow Editor

In the core version of eKuiper, only the backend graph rule API is provided; vendors and users can build a drag-and-drop graphical interface on top of it. We will also release a Flow orchestration implementation in the near future for the convenience of users.

In the graphical interface of the reference implementation, the available built-in and extended nodes are listed in a palette on the left. Nodes can be dragged onto the canvas, connected to form a graph, and configured through their properties. The dataflow graph on the canvas can then be represented as JSON and created through the graph rule API.
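As a rough sketch of that last step, the Python function below turns a hypothetical canvas model (a list of node records plus a list of connections) into the JSON body of a graph rule. The canvas data model and the name canvas_to_rule are assumptions made for illustration; only the output shape (id, name, graph.nodes, graph.topo) follows the rule format shown earlier in this article.

```python
import json


def canvas_to_rule(rule_id, name, canvas_nodes, connections):
    """Build a graph rule dict from canvas nodes and (source, target) links."""
    nodes = {
        node["id"]: {
            "type": node["type"],
            "nodeType": node["nodeType"],
            "props": node.get("props", {}),
        }
        for node in canvas_nodes
    }
    # Group outgoing connections by their starting node.
    edges = {}
    for source, target in connections:
        edges.setdefault(source, []).append(target)
    # Nodes of type "source" are the entry points of the graph.
    sources = [node["id"] for node in canvas_nodes if node["type"] == "source"]
    return {
        "id": rule_id,
        "name": name,
        "graph": {"nodes": nodes, "topo": {"sources": sources, "edges": edges}},
    }


# Hypothetical canvas state: three nodes connected in a line.
canvas_nodes = [
    {"id": "demo", "type": "source", "nodeType": "mqtt",
     "props": {"datasource": "devices/+/messages"}},
    {"id": "humidityFilter", "type": "operator", "nodeType": "filter",
     "props": {"expr": "humidity > 30"}},
    {"id": "mqttout", "type": "sink", "nodeType": "mqtt",
     "props": {"server": "tcp://${mqtt_srv}:1883", "topic": "devices/result"}},
]
connections = [("demo", "humidityFilter"), ("humidityFilter", "mqttout")]

rule = canvas_to_rule("rule1", "Test Condition", canvas_nodes, connections)
print(json.dumps(rule, indent=2))
```

The printed JSON has the same content as the rule1 example, which is the body a UI would send to the rule REST API.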

SQL updates, easier to write rules

Several SQL syntax related updates have been added in the new version: the LAG function is provided to get previous values in the data stream; the BETWEEN and LIKE syntax is provided; the time window has been modified to align to natural time.

LAG functions for stateful analysis

The LAG function looks up previous values in the data stream and uses them in calculations with the current value. It is useful for computations that depend on cached state, such as computing the growth rate of a variable, or detecting when a variable crosses a threshold or when a condition starts or stops being true. In previous versions, stateful computing relied on windows or user-written plugins, which was complex. The LAG function greatly lowers the barrier to stateful analysis.

Its syntax is lag(expr, [offset], [default value]). It returns the value that the expression had offset events earlier; if no such value exists, it returns the default value, or nil if no default value is specified. When only the expression is given, the offset defaults to 1 and the default value is nil. In the example below, we calculate the rate of change of the temperature value.

 SELECT lag(temperature) as last, temperature, lag(temperature)/temperature as rate FROM demo
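The semantics described above can be imitated in a few lines of Python. This is a minimal sketch of the behavior (offset, default value, nil when no history exists), not eKuiper's implementation; the class name Lag is hypothetical and None stands in for nil.

```python
from collections import deque


class Lag:
    """Return the value seen `offset` calls earlier, or `default` if none."""

    def __init__(self, offset=1, default=None):
        if offset < 1:
            raise ValueError("offset must be at least 1")
        self.default = default
        # Keep only as many past values as the offset requires.
        self._history = deque(maxlen=offset)

    def __call__(self, value):
        # Once the buffer is full, its oldest entry is `offset` events old.
        if len(self._history) == self._history.maxlen:
            previous = self._history[0]
        else:
            previous = self.default
        self._history.append(value)
        return previous


# Mirror the SQL example: previous value, current value, and their ratio.
lag_temperature = Lag()
for temperature in [20.0, 25.0, 20.0]:
    last = lag_temperature(temperature)
    rate = None if last is None else last / temperature
    print(last, temperature, rate)
```

For the first event there is no previous value, so last is None and the ratio is left undefined in this sketch.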

More ways to filter

The new version adds the BETWEEN and LIKE syntax. BETWEEN filters numeric data by selecting values within a range; LIKE filters strings by selecting those that match a pattern. In the example below, we select data whose temperature is between 15 and 25 and whose deviceName starts with device.

 SELECT * FROM demo WHERE temperature BETWEEN 15 AND 25 AND deviceName LIKE "device%"
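For readers less familiar with these operators, here is a small Python sketch of what the WHERE clause above selects. It assumes standard SQL semantics, namely that BETWEEN includes both bounds and that in LIKE the % wildcard matches any sequence of characters and _ matches a single character; details such as escaping or case sensitivity in eKuiper may differ. The helper names between and like are hypothetical.

```python
import re


def between(value, low, high):
    """SQL-style BETWEEN: both bounds are included."""
    return low <= value <= high


def like(text, pattern):
    """SQL-style LIKE: % matches any sequence, _ matches one character."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None


events = [
    {"deviceName": "device1", "temperature": 20},
    {"deviceName": "device2", "temperature": 30},
    {"deviceName": "sensor1", "temperature": 20},
]

# Equivalent of: temperature BETWEEN 15 AND 25 AND deviceName LIKE "device%"
selected = [
    e for e in events
    if between(e["temperature"], 15, 25) and like(e["deviceName"], "device%")
]
print(selected)
```

Only the first event passes both conditions: the second is outside the temperature range and the third does not start with device.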

Natural time window

Previously, eKuiper time windows were anchored to the moment the window actually started. In practice, however, time-based aggregation usually follows natural (wall-clock) time: for a 1-hour window, the desired result is one aggregate per natural hour, such as 10:00 to 11:00. Most streaming engines also align time windows to natural time. In this release, time windows are therefore aligned to natural time in the system time zone.
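The idea of alignment can be illustrated with a little arithmetic: round the event time down to a multiple of the window length, measured in local time. The Python sketch below does this for millisecond timestamps. It is a simplified model that assumes a fixed time zone offset (no daylight saving changes) and is not taken from eKuiper's source; window_bounds and to_ms are hypothetical names.

```python
from datetime import datetime, timezone

HOUR_MS = 60 * 60 * 1000


def window_bounds(ts_ms, size_ms, tz_offset_ms=0):
    """Return (start_ms, end_ms) of the aligned window containing ts_ms."""
    # Shift into local time, round down to a window boundary, shift back.
    local = ts_ms + tz_offset_ms
    start_local = local - local % size_ms
    start = start_local - tz_offset_ms
    return start, start + size_ms


def to_ms(dt):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


# An event at 10:17:30 UTC falls into the 10:00 to 11:00 window.
event = to_ms(datetime(2022, 7, 1, 10, 17, 30, tzinfo=timezone.utc))
start, end = window_bounds(event, HOUR_MS)
print(datetime.fromtimestamp(start / 1000, tz=timezone.utc))
print(datetime.fromtimestamp(end / 1000, tz=timezone.utc))
```

The tz_offset_ms parameter matters when the offset is not a whole multiple of the window size: with an offset of +05:30, the same event belongs to the local window 15:00 to 16:00, which starts at 09:30 UTC.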

More efficient and stable connection

eKuiper connects to external systems through sources and sinks. This version focuses on the stability and efficiency of those connections, mainly by improving existing sources and sinks.

Database batch write

The SQL sink and the TDengine sink gain a tableDataField attribute, which names a field whose embedded data (a single row or multiple rows) is written to the database. In addition, when either sink receives array data (multiple rows), it writes all rows in one batch.
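The following Python sketch shows the general idea with SQLite from the standard library: pick the embedded data out of a message, normalize it to a list of rows, and write the whole list with a single batched statement. It is an illustration only and does not reproduce how the eKuiper sinks are implemented; the helper names, the table, and the sample message are all hypothetical.

```python
import sqlite3


def rows_from_message(message, table_data_field=None):
    """Return the rows to write: the embedded field if named, else the message."""
    data = message[table_data_field] if table_data_field else message
    # A single row and a list of rows are handled the same way.
    return data if isinstance(data, list) else [data]


def write_batch(conn, table, rows):
    """Insert all rows with one executemany call and return the row count."""
    if not rows:
        return 0
    columns = list(rows[0])
    placeholders = ", ".join("?" for _ in columns)
    # Table and column names are trusted here; values are bound as parameters.
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    conn.executemany(sql, [tuple(row[c] for c in columns) for row in rows])
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (device TEXT, temperature REAL)")

# Hypothetical message whose "telemetry" field carries two embedded rows.
message = {
    "site": "plant-1",
    "telemetry": [
        {"device": "d1", "temperature": 20.5},
        {"device": "d2", "temperature": 22.0},
    ],
}
written = write_batch(conn, "readings", rows_from_message(message, "telemetry"))
print(written, conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0])
```

Writing the rows in one batch avoids a round trip per row, which is the efficiency gain this feature targets.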

Stable EdgeX connection

The EdgeX connection logic has been improved: when the message bus connection is interrupted, the rule no longer exits immediately or floods the log with messages. After the message bus is restored, the connection is re-established automatically. As a result, rules connected to EdgeX are more stable once created and do not exit because of recoverable errors.
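A common way to get this kind of behavior is to retry the connection with an increasing delay, logging one line per failed attempt instead of one per failed message. The Python sketch below shows that general pattern; it is not the logic eKuiper uses for EdgeX, and the function name, parameters, and delays are assumptions chosen for illustration.

```python
import time


def connect_with_retry(connect, max_attempts=None, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep, log=print):
    """Call connect() until it succeeds, backing off between attempts.

    With max_attempts=None the loop retries forever, so a recoverable
    outage never makes the caller give up.
    """
    delay = base_delay
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect()
        except ConnectionError as err:
            if max_attempts is not None and attempt >= max_attempts:
                raise
            # One log line per failed attempt keeps the log readable.
            log(f"attempt {attempt} failed: {err}; retrying in {delay:.1f}s")
            sleep(delay)
            # Double the delay up to a cap so retries stay infrequent.
            delay = min(delay * 2, max_delay)


# Simulated message bus that is unavailable for the first two attempts.
state = {"calls": 0}


def flaky_connect():
    state["calls"] += 1
    if state["calls"] <= 2:
        raise ConnectionError("message bus unavailable")
    return "connected"


waits = []
result = connect_with_retry(flaky_connect, sleep=waits.append)
print(result, waits)
```

Passing sleep and log as parameters keeps the sketch easy to test: here the waits are recorded in a list instead of actually pausing the program.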

Copyright statement: This article is original by EMQ, please indicate the source when reprinting.
Original link: https://www.emqx.com/zh/blog/ekuiper-newsletter-202207
