At the 2021 Amazon Cloud Technology China Summit on July 21, 2021, Pan Juan, co-founder of SphereEx and Apache ShardingSphere PMC member, gave a talk on the theme "Apache ShardingSphere Distributed Database Middleware Open Source Ecological Construction". The talk covered the spread of open source ideas, community building, and how ShardingSphere practices the Apache Way. This article is a summary of Pan Juan's talk.

01
A new ecology above the database and below the business
One side faces the application, and the other faces the database.

Different industries, different users, different positioning, different needs... Today's databases face more complex data application scenarios and more personalized, customized data processing requirements than ever before. Increasingly demanding production environments are also pushing databases to keep improving performance indicators such as read and write speed, latency, and throughput.

Over time, the clear division of labor among data application scenarios has gradually fragmented the database market, and it is difficult to build one database that fits every scenario perfectly. Choosing different databases for different business scenarios has become common practice for enterprises.

But this "hundred flowers blooming" variety of databases brings its own problems. From a macro point of view, these problems have much in common, and the commonalities can be extracted into a set of de facto standards. If you can build a platform layer on top of these varied databases that applies and manages data in a uniform way, you can develop against fixed standards while shielding the differences among the underlying databases. This standardized approach greatly reduces both the pressure of managing data infrastructure and the cost of learning.

Apache ShardingSphere sits at this layer. By reusing the capabilities of the existing database, it helps technical teams add incremental capabilities such as sharding and encryption/decryption without having to consider the configuration of the underlying database. Toward the application, it hides these details from users, so teams can quickly build business-facing direct database access and easily manage large-scale data clusters.

02
How to practice the Apache Way
Sharding

ShardingSphere can simultaneously use multiple functions to meet the diverse needs of users.

As business volume grows and a single database can no longer carry the load, the database has to be scaled out horizontally, which inevitably raises the problem of distributed management. ShardingSphere builds a hot-pluggable functional layer on top of the database and exposes the familiar database operation model, so users do not perceive changes in the underlying databases and developers can manage a large-scale database cluster as if it were a single database. ShardingSphere mainly covers the following four application scenarios:

Sharding strategy

As business volume grows, the pressure on data sharding grows with it, and sharding strategies become correspondingly more complex. Beyond plain horizontal scaling, ShardingSphere helps users apply a wider range of sharding strategies in a flexible, easily extensible way, and it also supports user-defined extensions.
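To make the idea of a pluggable sharding strategy concrete, here is a minimal sketch in Python. It is not ShardingSphere's API: the names `route`, `ALGORITHMS`, `mod_sharding`, and `range_sharding`, the table name `t_order`, and the block size of 1000 are all hypothetical, chosen only to show how a logical table plus a sharding key can be mapped to a physical table, and how a custom rule can be registered next to a built-in one.

```python
from typing import Callable, Dict

# Hypothetical types and names; this is an illustration, not ShardingSphere code.
ShardingAlgorithm = Callable[[int, int], int]


def mod_sharding(key: int, shard_count: int) -> int:
    """Route by remainder, the classic rule for a horizontal split."""
    return key % shard_count


def range_sharding(key: int, shard_count: int) -> int:
    """A custom rule: each block of 1000 consecutive keys shares a shard.

    Keys beyond the last block are clamped into the final shard.
    """
    return min(key // 1000, shard_count - 1)


# Registry of strategies; adding an entry here is the "custom extension" point.
ALGORITHMS: Dict[str, ShardingAlgorithm] = {
    "mod": mod_sharding,
    "range": range_sharding,
}


def route(logic_table: str, key: int, shard_count: int, algorithm: str = "mod") -> str:
    """Map a logical table and a sharding key to a physical table name."""
    index = ALGORITHMS[algorithm](key, shard_count)
    return f"{logic_table}_{index}"


print(route("t_order", 2503, 4))           # t_order_3
print(route("t_order", 2503, 4, "range"))  # t_order_2
```

The application only ever names the logical table; swapping the strategy changes where rows land without touching the calling code.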

Read-write separation

Under normal circumstances, a master-slave deployment effectively relieves pressure on the database. But if a machine, database, or table in a cluster can no longer serve reads and writes normally, the business is hit hard. To keep the business available, developers usually have to write their own high-availability strategy to switch reads and writes between master and slave. ShardingSphere automatically probes the status of every cluster, detects problems such as unreliable requests and master-slave switches in the underlying database as soon as they occur, and restores the master-slave state automatically without upper-layer users noticing.
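The routing decision behind read-write separation can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not ShardingSphere's implementation: the class `ReadWriteRouter`, its `mark` method, and the data source names are hypothetical, the master is called the primary and the slaves are called replicas, and a statement counts as a read only if it starts with `SELECT`.

```python
from itertools import count
from typing import Dict, List


class ReadWriteRouter:
    """Hypothetical router: writes go to the primary, reads rotate over healthy replicas."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self.primary = primary
        self.replicas = replicas
        # Health flags would be updated by a status probe in a real system.
        self.healthy: Dict[str, bool] = {name: True for name in replicas}
        self._counter = count()

    def mark(self, replica: str, healthy: bool) -> None:
        """Record the result of a health check for one replica."""
        self.healthy[replica] = healthy

    def route(self, sql: str) -> str:
        """Return the name of the data source that should run this statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        available = [r for r in self.replicas if self.healthy[r]]
        if not is_read or not available:
            # Writes always hit the primary; reads fall back to it
            # when no replica is healthy.
            return self.primary
        return available[next(self._counter) % len(available)]


router = ReadWriteRouter("primary_ds", ["replica_0", "replica_1"])
print(router.route("INSERT INTO t_order VALUES (1)"))  # primary_ds
print(router.route("SELECT * FROM t_order"))           # replica_0
router.mark("replica_0", False)
print(router.route("SELECT * FROM t_order"))           # replica_1
```

Because the caller only sees `route`, a failed replica is skipped without any change in application code, which is the "no perception" property the paragraph describes.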

Sharding Scaling

As the business grows, a data cluster that has already been split may need to be split again. ShardingSphere's Scaling component starts a task with a single SQL command and displays its running status in the background in real time. The Scaling "pipeline" connects the old database ecosystem to the new one.
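A small calculation shows why splitting again is real migration work rather than a configuration change. The sketch below assumes simple remainder-based sharding and assumes that shard index `i` refers to the same physical node before and after the change; the function name `moved_keys` is hypothetical and the code is not part of the Scaling component.

```python
from typing import Iterable, List


def moved_keys(keys: Iterable[int], old_count: int, new_count: int) -> List[int]:
    """Return the keys whose shard index changes when the shard count changes."""
    return [k for k in keys if k % old_count != k % new_count]


# Going from 2 shards to 4, a key stays put only when key % 4 is 0 or 1.
moving = moved_keys(range(100), 2, 4)
print(len(moving))  # 50
print(moving[:4])   # [2, 3, 6, 7]
```

Under these assumptions half of the rows have to be copied to a new location while the service keeps running, which is the job a scaling pipeline takes over.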

Data encryption and decryption

In database applications, encrypting and decrypting key data is also very important. If the original system's monitoring capabilities are not up to standard, some sensitive data may have been stored in plaintext and needs to be encrypted later, which is a common problem for many teams. By standardizing these capabilities and integrating them into the middleware ecosystem, ShardingSphere automates masking (desensitization), encryption, and decryption for both new and existing business data, and the whole process is transparent to users. It also ships with a variety of built-in encryption and masking algorithms, and users can customize and extend them to suit their own needs.
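The masking half of this idea can be sketched as a table of per-column rules applied to every row on its way back to the user. Everything here is hypothetical and only illustrative: `keep_first_last`, `mask_all`, `RULES`, `mask_row`, and the column names are not ShardingSphere identifiers, and real encryption (as opposed to masking) would need a proper cryptographic library.

```python
from typing import Callable, Dict

MaskAlgorithm = Callable[[str], str]


def keep_first_last(value: str, first: int = 3, last: int = 4, mask: str = "*") -> str:
    """Keep the first and last few characters and mask everything between them."""
    if len(value) <= first + last:
        # Too short to reveal anything safely, so mask the whole value.
        return mask * len(value)
    middle = mask * (len(value) - first - last)
    return value[:first] + middle + value[len(value) - last:]


def mask_all(value: str) -> str:
    """Hide the value completely while preserving its length."""
    return "*" * len(value)


# Column name -> masking rule; registering a new function here is the
# user-defined extension point.
RULES: Dict[str, MaskAlgorithm] = {
    "phone": keep_first_last,
    "password": mask_all,
}


def mask_row(row: Dict[str, str]) -> Dict[str, str]:
    """Apply the configured rule to each column, leaving other columns untouched."""
    return {col: (RULES[col](val) if col in RULES else val) for col, val in row.items()}


print(mask_row({"name": "alice", "phone": "13812345678", "password": "secret"}))
# {'name': 'alice', 'phone': '138****5678', 'password': '******'}
```

Keeping the rules in one place, outside the business code, is what lets old and new data be handled the same way without the application being aware of it.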

Building the nerve center of data access: the pluggable Database Plus platform

Faced with varied needs and usage scenarios, ShardingSphere provides three access forms for developers in different fields: JDBC for Java, Proxy for heterogeneous languages, and Sidecar for the cloud. Users can choose the form that fits their specific needs and then perform operations such as sharding, read-write separation, and data migration on top of their existing cluster.

JDBC access: Used exactly like JDBC, it can be understood as an enhanced JDBC driver that is fully compatible with JDBC and the various ORM frameworks. It provides distributed management, horizontal scaling, data masking, and a series of other operations without additional deployment or dependencies;

Proxy access: Proxy presents itself as a database service and manages the real underlying database cluster, so the business code needs basically no modification;

Mesh access on the cloud: Provides a deployment form for ShardingSphere on the public cloud. SphereEx has joined the cloud creation plan of Amazon Cloud Technology and will continue to cooperate with Amazon Cloud Technology on the Marketplace in China and overseas, to give users on Amazon Cloud Technology more powerful Proxy image deployment capabilities and to jointly create a more mature cloud environment for enterprise applications.

03
Open source, allowing personal work to connect to the world

ShardingSphere has had considerable influence in the industry since it was open sourced. Today, whenever tools or capabilities for horizontal scaling are discussed in China, ShardingSphere is usually on the candidate list. This is due in part to the contributions of the project's maintainers over the years, which have made ShardingSphere's features more and more complete, and in part to the growing open source culture in China.

In past years, most users in China only downloaded programs and reused code from the open source community and rarely took part in building it. In recent years, as open source ideas have spread in China, more and more developers with a real passion for technology have emerged, and it is their participation that keeps the ShardingSphere community increasingly active. A good open source project is judged not only by advanced ideas and advanced technology, but also by the foundation it has accumulated in technical influence, open source influence, ecosystem building, and its developer community.

This is why ShardingSphere, although already a top-level Apache project, still actively calls on everyone to take part in the open source community. Day to day, most people deal only with the people around them and the work in their own office, and stay confined to that circle. Through open source you can connect your work to the world: put the books aside and truly invest in a project, broaden your horizons, gradually develop a spirit of openness and cooperation, and rediscover the value of what you produce.

This article is based on a talk by Pan Juan of SphereEx at the Amazon Cloud Technology China Summit.

