By Bai Yu

At the 2021 Hangzhou Yunqi Conference, Li Guoqiang, product leader of Alibaba Cloud's intelligent cloud native application platform, gave a talk titled "The Road to Enterprise Internet Architecture Transformation: Alibaba Cloud Middleware Upgrade Release", presenting the innovations in Alibaba Cloud's cloud native products. Over the past year, restructuring application architecture has become a general trend as enterprises respond to increasingly fierce industry competition. According to data from authoritative organizations, more than 80% of users have adopted or plan to adopt microservices, more than 68% of organizations use containers in production environments, and more than 85% of users use distributed tracing, monitoring tools, and logs. These changes highlight strong enterprise demand for cloud-native application architecture, cloud-native deployment and operations, and improved stability.


Alibaba Group is itself a beneficiary of cloud native: it has used cloud native to capture the full dividends of cloud computing and has carried out the world's largest cloud native practice. 100% of its business runs on the public cloud, and 100% of its applications are cloud native. With integrated software and hardware optimization for containers, deploying online business on millions of containers has delivered a 30% increase in CPU resource utilization, an 80% reduction in cost per 10,000 transactions, and a 20% increase in R&D and operations efficiency. Building on this, Alibaba shares these best practices and solutions with the wider community, helping taxation, human resources, banking, insurance, petroleum and petrochemicals, retail and fast-moving consumer goods, automobile manufacturing, Internet platforms, and many other industries create more social value. After years of technical accumulation, Alibaba Cloud now provides more than 300 cloud products and nearly 1,000 solutions. Among them, Message Queue (MQ), Application Real-Time Monitoring Service (ARMS), and Enterprise Distributed Application Service (EDAS) have become indispensable components of distributed Internet architecture for many enterprises. The Yunqi Conference also unveiled new features of these products for the first time.


RocketMQ 5.0: a major upgrade

As the communication infrastructure of modern applications, message queues are a core dependency of microservice architectures. With asynchronous decoupling, users can build distributed, high-performance, elastic, and robust applications more efficiently. The value of message queues for data also continues to deepen: the core business data flowing through a message queue feeds many stages and scenarios, such as integration and transmission, analysis, computation, and processing. As this evolution continues, message queues can be expected to generate new value and new "chemical reactions" in scenarios such as data channels, event-driven integration, and analysis and computation.
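The asynchronous decoupling described above can be illustrated with a minimal sketch. It assumes an in-process `queue.Queue` as a stand-in for a message topic and does not use the RocketMQ client API; the names `run_demo`, `topic`, and the `shipped:` prefix are hypothetical and chosen only for illustration.

```python
import queue
import threading


def run_demo(order_ids):
    """Publish order events to a queue and process them on a separate worker."""
    topic = queue.Queue()   # in-process stand-in for a message topic (assumption)
    processed = []          # results recorded by the consumer

    def consumer():
        while True:
            msg = topic.get()
            if msg is None:  # sentinel: the producer has finished
                break
            processed.append(f"shipped:{msg}")

    worker = threading.Thread(target=consumer)
    worker.start()

    # The producer only enqueues events; it never calls the consumer directly
    # and does not wait for each message to be handled.
    for order_id in order_ids:
        topic.put(order_id)
    topic.put(None)

    worker.join()
    return processed


if __name__ == "__main__":
    print(run_demo(["A1", "A2", "A3"]))
```

The producer and consumer share only the queue, so either side can be scaled, replaced, or slowed down without changing the other. A real message queue provides the same contract across processes and machines, with persistence and delivery guarantees that this sketch does not attempt to model.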


This time, Alibaba Cloud released RocketMQ 5.0, a full upgrade to a one-stop platform for integrated "message, event, and stream" processing, with two highlights:

(1) Expanded core messaging scenarios: coverage now extends to event-driven, message streaming, and many other scenarios;
(2) An iterated one-stop integrated processing architecture: a single message store supports stream computing, asynchronous delivery, integration driving, and other scenarios.

Beyond these two highlights, RocketMQ 5.0 brings three new capabilities:

(1) Upgraded RocketMQ infrastructure
  • Open lightweight SDK and an improved full-link observability system
  • Message-level load balancing
  • Support for multiple network access types
  • Massive tiered storage
(2) Lightweight message ETL for streaming scenarios
  • Lightweight, with no dependencies
  • Low development threshold
  • Serverless elasticity
(3) EDA best practice on the cloud: the event center EventBridge
  • A unified, standardized event integration ecosystem
  • A global event interchange network
  • Serverless low-code development

The microservice product family is upgraded again

Microservices are an important representative of today's Internet application architecture, and microservices and containers continue to converge. Enterprises now have clear architectural and business requirements for microservices. In terms of architecture, Java-based microservice systems such as Spring Cloud and Dubbo, along with Service Mesh technology that has gradually risen as multiple trends emerge, have become mainstream. In terms of requirements, the core demands are business development and design for microservices, native containerization of software infrastructure, and a bird's-eye view of application production and operations. Alibaba Cloud supports these two types of microservice systems with the microservice engine MSE and the service mesh ASM.


Under the microservice architecture in the virtualization era, the business usually adopts a two-tier architecture of traffic gateway + microservice gateway. The traffic gateway is responsible for north-south traffic scheduling and security protection, and the microservice gateway is responsible for east-west traffic scheduling and service governance. In the cloud-native era dominated by Kubernetes, Ingress has become the gateway standard for the Kubernetes ecosystem, giving the gateway a new mission, making it possible to combine the traffic gateway + microservice gateway into one.

The cloud native gateway released this time by Alibaba Cloud MSE collapses the two gateway layers into one without sacrificing capability, which not only saves 50% of resource costs but also reduces operations and usage costs. The MSE cloud native gateway is built on Envoy and Istio to provide a unified control plane. It connects directly to back-end services, supports Dubbo 3.0 and Nacos, integrates with Alibaba Cloud Container Service ACK, and automatically synchronizes service registration information.

The MSE cloud native gateway has already been extensively tested within Alibaba. It is currently used in Alipay, DingTalk, Taobao, Tmall, Youku, Fliggy, Koubei, and other Alibaba business systems, and it withstood the massive request volume of the 2020 Double 11. On major promotion days it comfortably handles hundreds of thousands of requests per second, and its daily request volume has reached the tens of billions.

Alibaba Cloud Service Mesh (ASM) is the industry's first fully managed, Istio-compatible service mesh product. As a managed platform for unified management of microservice application traffic, it focuses on providing a fully managed, secure, stable, and easy-to-use service mesh. It supports unified governance of services across regions, multiple clusters, and multi-cloud and hybrid cloud environments, so that application services running anywhere can communicate easily across heterogeneous computing infrastructure. The professional edition, ASM Pro, is now released and covers more application scenarios, including:

  • Support for Dubbo and other microservice frameworks and extended protocols: more scenario-based capabilities meet customer needs for grayscale release, canary release, lossless service offlining, and full-link grayscale.
  • Comprehensive integration of multiple service registries: full integration with the high-availability capabilities of the Nacos service registry, multi-language service interoperability across registries, and support for high-performance, large-scale scenarios.
  • Unified service mesh capability across cloud and edge: unified management of services across regions, multiple clusters, and multi-cloud and hybrid cloud environments, support for ACK Edge clusters, and exploration of service mesh scenarios in edge computing.
  • Modernization of existing applications: unified support for hybrid deployment across heterogeneous computing infrastructure such as containers and virtual machines, which eases the migration of virtual machine applications; enhanced dynamic execution of OPA policies, which enables zero-trust security without code changes; and simpler management of applications on multiple types of computing infrastructure.
  • Full-stack optimization: integrated optimization of the operating system, software, and hardware reduces service communication latency and encryption overhead, and improves TLS encryption and decryption efficiency and data plane performance.
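The grayscale and canary release capabilities in the list above rest on splitting traffic between two versions of a service. The sketch below shows one common way to do this, assuming a stable hash of a user ID; it is not the ASM or Istio API, where such splits are normally declared in routing configuration rather than written in application code. The function name `pick_version` and the version labels are hypothetical.

```python
import hashlib


def pick_version(user_id: str, canary_percent: int) -> str:
    """Route a stable share of users to the canary version.

    Hashing the user ID gives every user a fixed bucket in [0, 100),
    so the same user always lands on the same version for a given percentage.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return "canary" if bucket < canary_percent else "stable"


if __name__ == "__main__":
    # Illustrative rollout: about 10% of users are sent to the canary version.
    users = [f"user-{i}" for i in range(1000)]
    canary_users = sum(pick_version(u, 10) == "canary" for u in users)
    print(f"{canary_users} of {len(users)} users routed to canary")
```

Because the bucket depends only on the user ID, raising the percentage from 10 to 30 keeps the original canary users on the canary version and adds new ones, which avoids users flipping back and forth during a rollout.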

Through capabilities such as traffic control, mesh observability, and secure service-to-service communication, the service mesh ASM simplifies service governance across the board and provides unified management for services running on heterogeneous computing infrastructure. It is suitable for Kubernetes clusters, Serverless Kubernetes clusters, ECS virtual machines, and self-built clusters.

Finally, developing microservice applications requires a one-stop platform that covers the whole process of application architecture design, development, testing, release, and operations. One-stop support for cloud-native application R&D matters greatly for user efficiency. The cloud-native application design and development platform ADD was created for this purpose. It helps enterprises quickly carry out cloud-native development and manage cloud-native applications from an application-centric perspective throughout their life cycle, with the following features:

1. Application development & architecture design: supports drag-and-drop design of application architecture diagrams, and provides preset and enterprise-customized application architecture templates.
2. Cloud-native asset store: provides out-of-the-box middleware services for enterprises and accumulates shared business components and shared technical middleware, so that enterprise software assets can be standardized, productized, shared, and reused.

At the same time, Enterprise Distributed Application Service (EDAS) v4.0 rebuilds the entire process of releasing applications and bringing them online. It provides bird's-eye view operations and dual-mode governance, helps modernize application operations, and accelerates the move of online services to cloud native.

ARMS 3.0: an all-in-one enterprise observability system

Observability is an important part of enterprise technology architecture, and different communities and institutions increasingly agree on the trends in this field:

  • Full-stack integration: when a request enters the business system, it passes from the front end through the application layer to the underlying resources. How an enterprise connects the entire link and combines vertical link data with horizontal data has become a key test of the operations team's capabilities.
  • Cloud-native observability standardization: as the open source projects Grafana, Prometheus, and OpenTelemetry become de facto standards in observability, the cloud-native observability systems that enterprises build will be more efficient and traceable.
  • AIOps: as each enterprise's technology footprint grows, the scale and dimensions of its operations data keep increasing, including massive volumes of metrics, logs, and traces. AI plays a major role here by detecting and resolving anomalies and problems faster and more efficiently.


To meet these trends and needs, Alibaba Cloud released ARMS 3.0 to help enterprises build an all-in-one observability system with unified access, unified metrics, unified traces, unified metering, unified dashboards, and unified alerting.

  • Supports 50+ technical components and connects the full vertical link, from the access experience through business applications to the infrastructure layer;
  • Connects metrics, logs, and traces horizontally to speed up problem diagnosis;
  • Fully supports the three open source standards for cloud native observability: Prometheus, Grafana, and OpenTelemetry;
  • Integrates with 10+ monitoring and alerting systems for unified management of scattered alert messages, and combines algorithms with Alibaba's experience to provide intelligent noise reduction and root cause analysis.

It is worth mentioning that, with ARMS, Alibaba Cloud became the only Chinese cloud vendor selected for the "2021 Gartner APM Magic Quadrant", and its product capabilities and strategic vision were highly recognized by Gartner analysts.

High availability

Application High Availability Service (AHAS), part of the high-availability product family, has also received a major upgrade. AHAS focuses on improving the high availability of applications and businesses, and provides three core capabilities: traffic protection, fault drills, and multi-active disaster tolerance. Every module in this upgrade substantially improves the stability and resilience of the user's business.

First, for traffic protection, a new cluster protection function helps customers solve typical cluster flow control problems, such as uneven traffic across single machines and low overall cluster traffic. For gateway protection scenarios, an Nginx plug-in implemented natively in C/C++ is now supported. It stably supports Sentinel's core flow control and API grouping capabilities while greatly reducing performance overhead: throughput loss is less than 5%, and CPU usage stays within 0.8 cores. In addition, monitoring and alerting capabilities and protection scenarios have been greatly improved in terms of business coverage and ease of use.
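A small sketch can show why uneven single-machine traffic is a problem for per-machine limits and how a shared cluster-wide counter avoids it. This is an illustration under stated assumptions, not the AHAS or Sentinel implementation: `WindowCounter`, `admitted_per_node`, `admitted_cluster`, and the traffic figures are all hypothetical, and a single fixed time window is modeled.

```python
class WindowCounter:
    """Admit at most `limit` requests within one fixed time window."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0

    def try_acquire(self):
        if self.count < self.limit:
            self.count += 1
            return True
        return False


def admitted_per_node(traffic, cluster_limit):
    """Split the cluster limit evenly into independent per-node quotas."""
    quota = cluster_limit // len(traffic)
    admitted = 0
    for requests in traffic.values():
        limiter = WindowCounter(quota)  # each node enforces its own quota
        for _ in range(requests):
            admitted += limiter.try_acquire()
    return admitted


def admitted_cluster(traffic, cluster_limit):
    """Let every node draw from one shared, cluster-wide counter."""
    shared = WindowCounter(cluster_limit)
    admitted = 0
    for requests in traffic.values():
        for _ in range(requests):
            admitted += shared.try_acquire()
    return admitted


if __name__ == "__main__":
    # Hypothetical skewed load: one hot node and two nearly idle nodes.
    traffic = {"node-a": 90, "node-b": 5, "node-c": 5}
    print("per-node quotas admit:", admitted_per_node(traffic, 90))
    print("cluster counter admits:", admitted_cluster(traffic, 90))
```

With a cluster limit of 90 and this skewed load, the per-node quotas admit only 40 requests because the hot node is capped at 30 while the idle nodes leave most of their quota unused, whereas the shared counter admits the full 90. The same effect explains the low-traffic case: when the cluster limit is smaller than the number of machines, an even split rounds every quota down to zero.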

Fault drill (Chaos) is a cloud-native chaos engineering platform that provides fault drill services that are large in scale, low in cost, controllable in impact, and diverse. Chaos offers one-stop architecture analysis, fault inspection, fault injection, and system stability functions to help users strengthen the fault tolerance and recoverability of distributed systems and migrate systems to the cloud smoothly. The fault drill platform has also been comprehensively upgraded in terms of drill scenarios, drill forms, ease of use, and open source compatibility.

  • Drill scenarios: Windows drill nodes are supported; disaster recovery and network disconnection drills are supported as a one-stop flow covering pre-check, disconnection, recovery, and replay; and microservice drills are upgraded to 2.0, which supports automatic verification of strength at the service level.
  • Drill forms: this release introduces visual drills, which can be launched with one click from the business architecture topology.
  • Open source compatibility: the community version can be hosted online into the enterprise version, and one-click upgrades to the enterprise version are supported.
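The idea behind fault injection in the drills above can be sketched in a few lines: deliberately make a dependency fail and check that the caller degrades gracefully. The sketch assumes a simple in-process decorator and is not the Chaos platform's interface; `InjectedFault`, `inject_fault`, `call_with_fallback`, and `query_inventory` are hypothetical names.

```python
import functools
import random


class InjectedFault(RuntimeError):
    """Raised in place of a real call when a fault is injected."""


def inject_fault(failure_rate, rng=None):
    """Wrap a function so that a share of its calls fail on purpose."""
    rng = rng or random.Random()

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # rng.random() is in [0.0, 1.0): rate 0.0 never fails, 1.0 always fails.
            if rng.random() < failure_rate:
                raise InjectedFault(f"injected fault in {func.__name__}")
            return func(*args, **kwargs)

        return wrapper

    return decorator


def call_with_fallback(func, fallback):
    """Return the fallback value when the dependency fails under the drill."""
    try:
        return func()
    except InjectedFault:
        return fallback


if __name__ == "__main__":
    @inject_fault(failure_rate=0.3, rng=random.Random(7))
    def query_inventory():
        return "in stock"

    results = [call_with_fallback(query_inventory, "unknown") for _ in range(10)]
    print(results)
```

A drill passes when the caller keeps returning a usable answer, here the fallback value, while the fault is active. A production platform adds what this sketch leaves out, such as injecting faults at the host or network level, limiting the blast radius, and restoring the system afterwards.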

The Multi-active Disaster Tolerance (MSHA) solution for multi-active business disaster recovery has been fully upgraded, and is now more compatible, more stable, and simpler.

Compatible with richer disaster tolerance architecture and business components.

New architectures have been added: intra-city active/multi-active disaster recovery, remote active-active disaster recovery, and remote application active-active disaster recovery. Multi-active disaster recovery support has also been added for components such as MQTT, ScheduleX, K8S, and PolarDB.

Stronger core disaster tolerance capabilities, with stability improved by more than 50%.

By optimizing and reinforcing the multi-active disaster tolerance architecture across the access, service, message, task scheduling, and data layers, together with top-down traffic penetration optimization, the overall stability of disaster tolerance is improved by more than 50%.

Zero modification for intra-city scenarios, and more than 20% less rework for remote disaster recovery.

In the intra-city scenario, no business modification is required, and the intra-city multi-active disaster recovery service can be brought online in an average of 3 hours. In the remote container business scenario, the agent is quickly integrated via the pilot, which greatly reduces the cost of disaster recovery modification.

This comprehensive upgrade gives business technical teams more choices. Through simple, rich, open, and low-cost PaaS services, it helps enterprise customers innovate on the cloud more simply and efficiently, and build a technical system that better fits their business needs and team circumstances.

