"With a high base, even if you win, you will lose." In building a digital enterprise, the importance of basic software is self-evident. Across industries, however, basic software designed for traditional business models struggles to support innovation in digital business. Only by learning from the experience of professional teams and shortening the time spent on upgrading and exploring basic software can enterprises set aside their worries, invest in the digitalization of business and management, and devote themselves fully to handling the risks of the global environment and the uncertainty of their industry.

From April 15th to 16th, 2022, the first DIVE Global Basic Software Innovation Conference, with the theme of "deepening basic software and creating a new digital base", was held online. Hosted by InfoQ, the conference aims to be the most content-rich, cutting-edge, and technical industry event in the basic software field and to serve as a bellwether for the field. Two senior architects of NetEase Shufan, Weng Yanghui and Xiang Dong, were invited to speak. Their talks, titled "Unified Governance Practice of NetEase Shufan under the Hybrid Microservice Architecture" and "Future-oriented Distributed Storage Design" respectively, shared the experience in basic software innovation that NetEase Shufan has accumulated while supporting NetEase's own business and serving industry customers.

Unified service governance solves the problem of technical fragmentation

Weng Yanghui introduced the background and current problems of hybrid microservice architectures and the core problems and difficulties that unified governance must solve. He proposed an approach for smoothly upgrading legacy business from framework-based governance to a service mesh, and shared how NetEase Shufan uses product design to make unified microservice governance more elegant.

It has been more than 10 years since microservices were first proposed as an architectural design pattern, and the technology is now widely used in enterprise business architecture. In terms of development frameworks, Dubbo and Spring Cloud are the two mainstream choices for Java microservices, but some companies still build on private internal frameworks, and some have not yet fully adopted microservices.

As technology iterates and business grows rapidly, new technologies have to be introduced to handle complex business scenarios. The result is technical "fragmentation" as the business architecture evolves, which shows up in several ways:

1. Microservice frameworks are difficult to govern uniformly. Java still holds the largest share of enterprise application development. Whether a team uses Spring Cloud, Dubbo, gRPC, or a private development framework, it needs service governance. How services built on different frameworks discover one another, and how to govern them uniformly, are pain points for many enterprise teams.

2. Heterogeneous languages are difficult to govern uniformly. Different business scenarios often benefit from different languages: for example, C++ for high-performance, low-latency services, and Python for artificial intelligence and data analysis applications. These heterogeneous applications also need unified governance, such as traffic management and security control.

3. Middleware is difficult to manage uniformly. Different microservice technology choices bring different registries, configuration centers, and authentication centers, along with a variety of general-purpose data and messaging middleware such as MySQL, Redis, ES, and Kafka. Managing all of these in a unified way and achieving efficient, intelligent cloud-based operation and maintenance is another demand of business teams.

4. Operating environments are difficult to manage uniformly. With the development of cloud-native technologies, moving application runtime environments from physical machines to virtual machines and then to containers is becoming a standard evolution path. Enterprise deployments are likewise moving from private cloud to public cloud and on to hybrid cloud, to meet requirements for elastic resource scaling and business disaster recovery. The differences between these underlying environments need to be hidden from the business layer and governed uniformly.

In addition, some common technical components and business deployment architectures call for more unified, standardized design, a need that appears at different dimensions and levels in different technical architectures. Because business R&D teams usually put their energy into feature development to support business growth, technical debt accumulates as the technology evolves. This is also a pain point for enterprises in the process of digital transformation.

Over years of supporting internal and external customers, the NetEase Shufan Qingzhou microservice team has accumulated extensive experience and best practices, especially in microservices and cloud-native technology, and has distilled them into an enterprise-grade unified microservice governance platform. Through industry-leading non-intrusive microservice governance technology, dual-engine multi-mode unified governance, and PaaS-based middleware management, the platform addresses the technical problems enterprises face during architecture upgrades. Its one-stop microservice platform console helps enterprise users achieve unified governance quickly with minimal migration and usage costs, so that business teams can focus on development in their own domains, improving overall R&D efficiency and reducing costs.

Weng Yanghui also noted that the Qingzhou microservice team has delivered a number of successful cases in the financial industry in recent years and has consolidated its experience there. By providing a site-wide distributed technology base, together with support for business architectures such as two-site three-center and multi-site active-active deployment, the team helps traditional financial enterprises transform their core businesses onto distributed technology, achieve de-IOE, and ultimately reach the goal of full-stack technology localization and independent control.

Future-proof distributed storage design

Xiang Dong introduced Curve, NetEase Shufan's open source, cloud-native, software-defined storage system. Starting from its R&D background and application scenarios, he covered recent developments in distributed storage architecture, how sound design achieves the design goals, the details of storage optimization, and Curve's direction and evolution. Curve is a distributed storage system with two parts: CurveBS, a distributed block storage system, and CurveFS, a distributed file storage system. CurveBS is already widely used within the company, while CurveFS is still under active development.

As the trend toward separating storage from compute continues, more and more cloud applications rely on architectures that separate the two. This separation allows resources to be optimized more deeply: compute and storage can scale elastically and be allocated on demand. Curve is a cloud-native storage system built to meet these needs, and is characterized by high performance, easy operation and maintenance, and cloud-native design.

NetEase Shufan chose to develop the Curve storage system in-house for three reasons:

  1. There was no independently controllable, unified distributed storage system with a small code base. Ceph's code base exceeds one million lines, which makes it very difficult to understand and master fully;
  2. When existing open source storage systems fail, the impact on upper-layer applications is large and the systems are hard to operate and maintain. Ceph adopts a strong consistency protocol, which causes frequent I/O jitter when the system fails;
  3. Existing open source storage systems cannot deliver the higher performance that core application scenarios require on general-purpose hardware.

The core challenge of easy operation and maintenance is improving the availability and reliability of the system: when a failure occurs, data must remain consistent and the impact of the failure must be minimized. To achieve this goal for CurveBS, NetEase Shufan adopts the Raft protocol. Raft maintains data consistency while also reducing write I/O latency, because a write is considered successful as soon as a majority of replicas acknowledge the replication request.
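
The latency benefit of majority acknowledgment can be illustrated with a minimal sketch. The function names and latency figures below are hypothetical and are not taken from Curve; the sketch only compares waiting for a majority of replicas with waiting for every replica.

```python
from typing import List


def quorum_size(replica_count: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replica_count // 2 + 1


def commit_latency_ms(ack_latencies_ms: List[float], wait_for_all: bool = False) -> float:
    """Time at which a write can be reported as successful.

    ack_latencies_ms holds one (hypothetical) acknowledgment latency per
    replica, with the leader's own local write counted as one replica.
    With majority acknowledgment the write completes when the quorum-th
    fastest replica answers; when every replica must answer, the slowest
    one determines the latency.
    """
    ordered = sorted(ack_latencies_ms)
    needed = len(ordered) if wait_for_all else quorum_size(len(ordered))
    return ordered[needed - 1]


if __name__ == "__main__":
    latencies = [2.0, 9.0, 3.0]  # one slow replica out of three
    print(commit_latency_ms(latencies))                     # majority quorum
    print(commit_latency_ms(latencies, wait_for_all=True))  # all replicas
```

In this illustration a single slow or failed replica does not delay the write under majority acknowledgment, which is the property the paragraph above attributes to Raft.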

To improve data reliability, NetEase Shufan applies the concept of fault domains in the cluster topology and uses a copyset algorithm for data distribution, so that the probability of data loss when a fault occurs is minimized. For upgrades, a special client design makes it possible to upgrade the storage system online.
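
As a rough illustration of fault-domain-aware placement, the sketch below puts each replica of a piece of data in a different fault domain, so that losing one domain removes at most one replica. The topology layout (a mapping from zone name to server list) and the function `place_replicas` are assumptions made for this example and do not describe Curve's actual placement algorithm.

```python
import random
from typing import Dict, List, Optional, Tuple


def place_replicas(
    topology: Dict[str, List[str]],
    replica_count: int,
    rng: Optional[random.Random] = None,
) -> List[Tuple[str, str]]:
    """Pick one server in each of `replica_count` distinct fault domains.

    topology maps a fault domain name (for example a rack or zone) to the
    servers inside it. Returns a list of (domain, server) pairs.
    """
    rng = rng or random.Random()
    if replica_count > len(topology):
        raise ValueError("not enough fault domains for the requested replica count")
    # Sample domains without replacement so no two replicas share a domain.
    domains = rng.sample(sorted(topology), replica_count)
    return [(domain, rng.choice(topology[domain])) for domain in domains]


if __name__ == "__main__":
    topo = {
        "zone-a": ["a1", "a2"],
        "zone-b": ["b1", "b2"],
        "zone-c": ["c1", "c2"],
    }
    print(place_replicas(topo, 3, random.Random(0)))
```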

To achieve the high-performance goal of CurveBS, the three main strategies are reducing write amplification in the underlying I/O, increasing I/O throughput, and reducing I/O latency. NetEase Shufan uses ChunkFilePool, a pool of pre-created files, to reduce write amplification; data striping similar to RAID to improve throughput; and zero-copy to cut the overhead of copying I/O data.
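
The RAID-like striping idea can be sketched as a mapping from a logical offset to a stripe target and an offset within that target. The names `stripe_unit`, `stripe_count`, `locate`, and `split_io` are illustrative assumptions, not Curve's API or on-disk layout.

```python
from typing import List, Tuple


def locate(offset: int, stripe_unit: int, stripe_count: int) -> Tuple[int, int]:
    """Map a logical byte offset to (target index, offset inside that target).

    Consecutive stripe units are laid out round-robin across `stripe_count`
    targets, as in RAID 0.
    """
    unit_index, within_unit = divmod(offset, stripe_unit)
    row, target = divmod(unit_index, stripe_count)
    return target, row * stripe_unit + within_unit


def split_io(offset: int, length: int, stripe_unit: int, stripe_count: int) -> List[Tuple[int, int, int]]:
    """Split one logical I/O into per-target pieces (target, target_offset, size)."""
    pieces = []
    end = offset + length
    while offset < end:
        target, target_offset = locate(offset, stripe_unit, stripe_count)
        room_in_unit = stripe_unit - offset % stripe_unit
        size = min(room_in_unit, end - offset)
        pieces.append((target, target_offset, size))
        offset += size
    return pieces


if __name__ == "__main__":
    # An 8-byte I/O starting at offset 2, with 4-byte units over 3 targets.
    print(split_io(2, 8, stripe_unit=4, stripe_count=3))
```

Because the pieces of a large I/O land on different targets, they can be served in parallel, which is how striping raises throughput.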

Compared with CurveBS, CurveFS faces more complex workloads and more diverse application scenarios, such as machine learning scenarios that must balance performance and capacity, business scenarios that need fast, elastic cross-cloud release, low-cost high-capacity business requirements, automatic separation of hot and cold data for middleware, and unified access through S3 and POSIX.

NetEase Shufan's solution is to first ensure that file metadata scales linearly in both performance and capacity at the metadata layer, to use the Raft protocol to ensure data consistency and availability when the system fails, and to use multiple layers of cache to improve the performance of data and metadata services. At present, CurveFS supports S3 object storage as its backend and provides POSIX-compatible file services. The NetEase Shufan storage team is still optimizing CurveFS performance and is developing support for CurveBS block storage as a backend.
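
A multi-layer cache in front of a slower backend can be sketched as a read-through lookup that checks the fastest tier first. The class `TieredCache`, its plain-dictionary tiers, and the backend callable are all assumptions for illustration; the talk does not describe how CurveFS implements its cache layers.

```python
from typing import Callable, Dict, List


class TieredCache:
    """Read-through cache with tiers ordered from fastest to slowest."""

    def __init__(self, backend_read: Callable[[str], bytes], tiers: List[Dict[str, bytes]]):
        self.backend_read = backend_read  # e.g. a read from object storage
        self.tiers = tiers                # e.g. [memory, local_disk]

    def read(self, key: str) -> bytes:
        for level, tier in enumerate(self.tiers):
            if key in tier:
                value = tier[key]
                # Promote the value into every faster tier for later reads.
                for faster in self.tiers[:level]:
                    faster[key] = value
                return value
        # Miss in every tier: fetch from the backend and fill all tiers.
        value = self.backend_read(key)
        for tier in self.tiers:
            tier[key] = value
        return value


if __name__ == "__main__":
    calls = []

    def backend(key: str) -> bytes:
        calls.append(key)
        return key.encode()

    cache = TieredCache(backend, [{}, {}])
    cache.read("chunk-1")
    cache.read("chunk-1")
    print(len(calls))  # number of backend reads performed
```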

