HStreamDB Newsletter 2022-05 | Decentralized cluster mechanism, new data integration framework, new clients and deployment methods

This month, the HStreamDB team officially released v0.8 and started development of v0.9, which will bring major improvements in clustering, external system integration, partitioning, and more. We mainly completed the design and preliminary development of a new cluster mechanism and of HStream IO, a new data integration framework, and started developing a new Python client. In addition, version 0.1 of the Erlang client was officially released, and deployment support for Helm and Alibaba Cloud was added.

HServer cluster mechanism improvement

In v0.8 and earlier versions, the HServer cluster uses a centralized clustering mechanism based on ZooKeeper: ZooKeeper handles the registration and discovery of HServer nodes as well as coordination between them, and HServer nodes do not communicate with each other directly. This clustering scheme is used by a large number of distributed systems and is relatively mature. Its main disadvantages are the dependence on an external system such as ZooKeeper, which reduces flexibility, and some limitations in scalability.

To support larger clusters and better scalability, and to reduce dependence on external systems, v0.9 will adopt a decentralized clustering mechanism. The new clustering scheme is mainly based on the SWIM[1] paper. Its core consists of an efficient failure detection algorithm and a gossip-style mechanism for propagating cluster messages. Similar solutions have been applied in distributed systems such as Consul and Cassandra. The new cluster functions are still under research and development and will be officially released in v0.9.
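The two ideas named above, failure detection by direct and indirect probing and gossip-style propagation of membership state, can be illustrated with a minimal Python sketch. It is not HServer's implementation: the `Member` class, the `reachable` callback standing in for the network, and the "worst state wins" merge rule are all assumptions made for illustration, and the incarnation numbers that the SWIM paper uses to refute false suspicions are left out.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

ALIVE, SUSPECT, DEAD = "alive", "suspect", "dead"
RANK = {ALIVE: 0, SUSPECT: 1, DEAD: 2}

# Hypothetical stand-in for the network: reachable(src, dst) -> bool.
Reachable = Callable[[str, str], bool]


@dataclass
class Member:
    name: str
    # Local view of the cluster: peer name -> state.
    view: Dict[str, str] = field(default_factory=dict)

    def probe(self, target: str, reachable: Reachable, k: int = 2,
              rng: Optional[random.Random] = None) -> str:
        """Run one SWIM-style probe of `target` and return its new state."""
        rng = rng or random.Random()
        # 1. Direct ping.
        if reachable(self.name, target):
            self.view[target] = ALIVE
            return ALIVE
        # 2. Indirect ping: ask up to k other live peers to ping for us.
        helpers = [p for p, s in self.view.items()
                   if p != target and s == ALIVE]
        for helper in rng.sample(helpers, min(k, len(helpers))):
            if reachable(self.name, helper) and reachable(helper, target):
                self.view[target] = ALIVE
                return ALIVE
        # 3. No ack: suspect first, declare dead on a repeated failure.
        previous = self.view.get(target, ALIVE)
        self.view[target] = SUSPECT if previous == ALIVE else DEAD
        return self.view[target]

    def merge(self, other_view: Dict[str, str]) -> None:
        """Gossip step: adopt a peer's view where it reports a worse state.

        Simplification: real SWIM orders updates by incarnation number.
        """
        for peer, state in other_view.items():
            if peer == self.name:
                continue
            if RANK[state] > RANK[self.view.get(peer, ALIVE)]:
                self.view[peer] = state


# Node "c" is down: nothing can reach it.
def c_is_down(src: str, dst: str) -> bool:
    return "c" not in (src, dst)


a = Member("a", {"b": ALIVE, "c": ALIVE})
b = Member("b", {"a": ALIVE, "c": ALIVE})
print(a.probe("c", c_is_down))  # suspect
print(a.probe("c", c_is_down))  # dead
b.merge(a.view)                 # b learns about c through gossip
print(b.view["c"])              # dead
```

The indirect ping is what keeps a single broken link from being mistaken for a node failure: a target is only suspected when neither the prober nor any of the chosen helpers can reach it.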

New data integration framework HStream IO

To meet a variety of business needs, an enterprise often runs multiple data systems or data platforms, including but not limited to online transaction databases, offline analytical databases, cache systems, search systems, batch processing systems, real-time processing systems, and data lakes. As an emerging streaming database, HStreamDB focuses on streamlining and reshaping the real-time data stack, and it also aims to promote the efficient flow of data throughout the data stack and to help make enterprise data stacks more modern and real-time. The ability to integrate with numerous external systems is therefore very important to HStreamDB.

HStream IO is the data integration framework within HStreamDB. It includes components such as source connectors, sink connectors, and the IO Runtime. Source connectors import data from external systems into HStreamDB, and sink connectors export data from HStreamDB to external systems. It is also worth noting that HStream IO will be implemented based on the Airbyte spec, which means that we will be able to reuse the large number of open source connectors in the Airbyte community and quickly integrate HStreamDB with a wide range of external systems. This month, HStream IO completed its design and preliminary development work, and it will be officially released in v0.9.
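To make the source/sink data flow concrete, here is a minimal Python sketch. It assumes the JSON-lines message shape of the Airbyte protocol (a `RECORD` message carrying `stream`, `data`, and `emitted_at`, plus `LOG` messages), and everything else is hypothetical: `source_connector`, `io_runtime`, `sink_connector`, and the in-memory `streams` dictionary are illustrative names, not HStream IO's actual API, which had not been released at the time of writing.

```python
import json
from typing import Dict, Iterable, Iterator, List


def source_connector(rows: Iterable[dict], stream: str,
                     emitted_at: int = 0) -> Iterator[str]:
    """Hypothetical source: emit Airbyte-style messages as JSON lines."""
    yield json.dumps({"type": "LOG",
                      "log": {"level": "INFO", "message": "source started"}})
    for row in rows:
        yield json.dumps({
            "type": "RECORD",
            "record": {"stream": stream, "data": row,
                       "emitted_at": emitted_at},
        })


def io_runtime(lines: Iterable[str], streams: Dict[str, List[dict]]) -> int:
    """Hypothetical runtime loop: route RECORD messages into streams.

    `streams` is an in-memory stand-in for HStreamDB streams.
    Returns the number of records written.
    """
    written = 0
    for line in lines:
        message = json.loads(line)
        if message.get("type") != "RECORD":
            continue  # LOG and other message types carry no data
        record = message["record"]
        streams.setdefault(record["stream"], []).append(record["data"])
        written += 1
    return written


def sink_connector(streams: Dict[str, List[dict]], stream: str) -> List[dict]:
    """Hypothetical sink: drain a stream, as if exporting it elsewhere."""
    return streams.pop(stream, [])


streams: Dict[str, List[dict]] = {}
messages = source_connector([{"id": 1}, {"id": 2}], stream="orders")
print(io_runtime(messages, streams))      # 2
print(sink_connector(streams, "orders"))  # [{'id': 1}, {'id': 2}]
```

Because connectors only exchange serialized messages with the runtime, a connector written for one system that follows the same spec can in principle be reused unchanged, which is the benefit the paragraph above attributes to building on the Airbyte spec.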

Client update

Add Python client

This month we also started developing hstreamdb-py, HStreamDB's Python client. It supports Python 3.7 and above and will be officially released next month.

hstreamdb-erlang v0.1 released

This month, HStreamDB's Erlang client hstreamdb-erlang officially released v0.1. For details, please refer to https://github.com/hstreamdb/hstreamdb-erlang/blob/main/README.md

Deployment method update

Added Helm-based deployment support

Helm (https://helm.sh/) helps users install and manage Kubernetes (K8s) applications more easily. This month, HStreamDB added Helm-based deployment support. For details, please refer to the documentation: https://hstream.io/docs/en/latest/deployment/deploy-helm.html#building-your-kubernetes-cluster

Added Alibaba Cloud Terraform deployment support

Previously, we provided tutorials on deploying HStreamDB on AWS and HUAWEI CLOUD with Terraform. This month, we added support for deployment on Alibaba Cloud. For details, please refer to the documentation: https://hstream.io/docs/zh/latest/deployment/deploy-terraform-aliyun.html

[1]: Das, A., Gupta, I. and Motivala, A., 2002, June. SWIM: Scalable weakly-consistent infection-style process group membership protocol. In Proceedings International Conference on Dependable Systems and Networks (pp. 303-312). IEEE.

Copyright statement: This is an original article by EMQ. Please credit the source when reprinting.

Original link: https://hstream.io/zh/blog/hstreamdb-newsletter-202205
