Introduction: For massive view computing scenarios in the 5G era, Alibaba Cloud's edge computing nodes focus on bringing video to the cloud and processing it. In this article, a senior technical expert from Alibaba Cloud explains the technology and architecture capabilities behind massive view computing.

>> Conference portal: https://yqh.aliyun.com/live/edgecloud_visual

Author: Hu Fan

Data carriers and computing power distribution are fundamentally changing

Thanks to their strong information-carrying capacity, video and pictures have become the main carrier of data content and the main means of information dissemination. The high bandwidth, low latency and massive connectivity of 5G have enabled scenarios such as cloud video surveillance, cloud gaming and the Internet of Things. The extension from the consumer Internet to the industrial Internet has further driven the explosion of terminal applications and view data.

These terminals and their data are scattered, massive and relatively low in value density. Take cameras as an example: IHS research points out that there are currently 1 billion surveillance cameras watching the world, which means video and image data is generated continuously. The volume of this data is at the ZB level, but most of it is of low value; what we need to retain are the fragments containing events of interest and their structured information. Such scenarios and requirements bring severe challenges and fundamental changes to how data is computed and stored.

A new distributed system architecture based on edge data access, computing and caching effectively solves such problems. It keeps traffic and computation converged locally, significantly reduces network transmission costs, improves computing efficiency, and meets the low-latency processing requirements of 5G scenarios.

Technical challenges of building business systems based on the edge

Massive, distributed and heterogeneous edge node resources pose huge challenges to the business. Service adaptation, elasticity and high availability all need corresponding handling. If they are not handled well, the business system will feel the impact, and the service may even be degraded.

image.png

When building a business system on the edge, the technical challenges mainly come from the following aspects:

1. Edge nodes are scattered and multi-level. There are many of them and each is small, so management is complex. When interacting with them, the business has to pay attention to specific locations and deal with multiple entrances, such as separate computing and storage locations.

2. Resources are heterogeneous, so the business has to select resources itself, and the resource types available on each node may be unevenly distributed, for example different computing resources such as CPUs, GPUs and ARM arrays.

3. The elasticity of a single node is weak while the overall elasticity is strong, so scaling must take deployment location and business adaptation into account.

4. The maintenance cutover of a single node, together with the complex network environment between edge and edge and between cloud and edge, may cause service jitter or even the unavailability of a single point. The business system therefore has to consider issues such as service drift. When tasks are stateful, the situation is even more complicated.

How can we deal with these challenges and keep the experience simple and consistent when using massive distributed nodes together with the central cloud? Ideally, there should be only one interactive surface.

View computing: a location-insensitive computing, caching and connection platform

View computing solves this problem well. Built on the extensive ENS infrastructure, it provides a location-insensitive computing and caching platform. In addition, to help view data reach the cloud more easily, it provides a platform for connecting views (terminals) to the cloud.

image.png

As shown in the figure above, the infrastructure layer uses resource management, virtualization and resource slicing to form a unified resource pool and to provide security and isolation capabilities. The view computing PaaS platform uses unified network, computing and storage scheduling to shield the heterogeneity and physical location of resources. Based on service characteristics, terminal location and resource status, it matches and co-schedules edge resources and terminals. While ensuring low-latency, highly available responses for the business, it makes the location of computing, storage and connection imperceptible to the business.

For example, in camera-to-cloud scenarios such as security, education and training, and traffic and logistics, device access and streaming media access and processing take into account node status such as available computing power, network bandwidth and storage capacity, and select the best-matching node, located closer to the content producer (the camera). Scenarios such as cloud gaming require specific rendering resources (such as ARM boards) and must also be closer to the content consumer (the mobile phone). When many people need to watch a live broadcast or view it remotely, the stream can be pushed to the CDN network for distribution.
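
The matching described above can be pictured with a small sketch. It is only an illustration under simplifying assumptions, not the scheduler used by the platform: the `EdgeNode` and `Request` types, the planar coordinates and the "nearest feasible node" rule are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class EdgeNode:
    name: str
    x: float                    # hypothetical planar coordinates, for illustration only
    y: float
    resource_types: frozenset   # e.g. {"CPU", "GPU", "ARM"}
    free_compute: float         # free compute units
    free_bandwidth: float       # free bandwidth, Mbps
    free_storage: float         # free storage, GB


@dataclass
class Request:
    x: float                    # location of the terminal (camera or phone)
    y: float
    resource_type: str
    compute: float
    bandwidth: float
    storage: float


def select_node(nodes, req):
    """Return the nearest node that satisfies every resource need, or None."""
    candidates = [
        n for n in nodes
        if req.resource_type in n.resource_types
        and n.free_compute >= req.compute
        and n.free_bandwidth >= req.bandwidth
        and n.free_storage >= req.storage
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda n: hypot(n.x - req.x, n.y - req.y))


nodes = [
    EdgeNode("edge-a", 0, 0, frozenset({"CPU"}), 8, 500, 1000),
    EdgeNode("edge-b", 1, 1, frozenset({"CPU", "GPU"}), 4, 200, 500),
    EdgeNode("edge-c", 5, 5, frozenset({"ARM"}), 16, 800, 200),
]

# A camera stream needs CPU transcoding; a cloud game needs an ARM board.
camera = Request(x=0.2, y=0.1, resource_type="CPU", compute=2, bandwidth=8, storage=50)
game = Request(x=0.2, y=0.1, resource_type="ARM", compute=4, bandwidth=20, storage=10)

print(select_node(nodes, camera).name)  # edge-a
print(select_node(nodes, game).name)    # edge-c
```

The camera lands on the closest node with spare CPU, while the game is sent further away because only one node offers the ARM resources it needs. A real scheduler would weigh many more signals, such as network quality and cost.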

View Computing Cloud-Edge-End Collaborative Architecture

The core of building a view computing platform is the cloud-edge-end collaborative architecture:
1. The terminal device is responsible for collecting and aggregating view and other data, and for decoding and displaying views (that is, acting as a thin terminal). It can also accept command input for control, or perform simple calculations according to the configuration and rules delivered from the cloud.

2. View computing builds a low-latency device access gateway on scattered edge nodes. The gateway implements a variety of terminal-to-cloud protocols (such as GB28181 and RTMP) and can receive real-time streams and fast uploads of video and image files. Computing and periodic storage take place on the access node or adjacent nodes. Calculation results (such as structured AI analysis data) and data that needs long-term persistence are quickly returned to the cloud through the secure acceleration channel between the edge and the cloud.

3. The central cloud handles view node and device management, as well as unified scheduling and metadata aggregation. Each terminal device is mapped to a virtual device in the cloud, similar to a projection of the physical world (the shadow device). Management, configuration and operations applied to the shadow device are quickly delivered, executed and acknowledged through the signaling channel. When the physical device is powered off or goes offline abnormally, its context is preserved and is synchronized promptly once the device comes back online.
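
The shadow device idea in item 3 can be sketched roughly as follows. This is a minimal illustration, not the platform's implementation: the `ShadowDevice` class, its field names and the assumption that the device acknowledges every pushed change are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ShadowDevice:
    """Cloud-side projection of a physical terminal (all names are illustrative)."""
    device_id: str
    desired: dict = field(default_factory=dict)   # configuration the cloud wants
    reported: dict = field(default_factory=dict)  # configuration last confirmed by the device
    online: bool = False

    def set_config(self, key, value):
        """Record the change; return True if it could be pushed immediately."""
        self.desired[key] = value
        if self.online:
            self._push({key: value})
            return True
        return False  # kept in `desired` until the device comes back

    def pending(self):
        """Settings the device has not yet confirmed."""
        return {k: v for k, v in self.desired.items() if self.reported.get(k) != v}

    def on_connect(self):
        """Device came online: replay everything it missed."""
        self.online = True
        delta = self.pending()
        self._push(delta)
        return delta

    def on_disconnect(self):
        self.online = False  # desired and reported state are preserved

    def _push(self, changes):
        # Stand-in for the signaling channel: assume the device applies and acknowledges.
        self.reported.update(changes)


cam = ShadowDevice("camera-001")
cam.set_config("resolution", "1080p")  # device is offline, so the change waits
cam.set_config("fps", 25)
print(cam.pending())     # {'resolution': '1080p', 'fps': 25}
print(cam.on_connect())  # {'resolution': '1080p', 'fps': 25}
print(cam.pending())     # {}
```

Operators only ever talk to the shadow; whether the physical camera is reachable at that moment is hidden behind it.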

image.png

What you see through view computing is one cloud and one interactive surface, rather than N small distributed clouds with scattered entrances. The cloud-edge-end collaborative architecture can find the best balance between cost, latency and reliability. On cost, for example, network bandwidth, computing and storage costs all need to be considered together.

Location-insensitive multi-point collaborative computing

The view computing service provides three kinds of location-insensitive computing for view data:

1. Basic view computing: transcoding, recording, screenshots and so on. Encoding optimization achieves a high compression ratio, saving 20~40% of storage space and transmission bandwidth at the same image quality.

2. View AI computing: relying on DAMO Academy's accumulated computer vision algorithms, view computing provides scenario-based AI capabilities such as structured view analysis and target detection and tracking.

3. Customized computing: operators can be uploaded and hosted on a self-service basis, which lowers the cost of algorithm access and makes it easier for users and algorithm suppliers to integrate algorithms into the view computing service. Besides operators and parameters, the computing mode can also be customized to your business needs.
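
To get a feel for what the 20~40% saving in item 1 means in practice, here is a back-of-the-envelope calculation. The 2 Mbps constant bitrate and the 30-day retention period are assumptions chosen only for illustration; they do not come from the product.

```python
def storage_gb(bitrate_mbps, days):
    """Storage in GB (10^9 bytes) for a constant-bitrate stream."""
    bytes_per_second = bitrate_mbps * 1_000_000 / 8
    return bytes_per_second * 86_400 * days / 1e9


# Assumed example: one camera at 2 Mbps, retained for 30 days.
baseline = storage_gb(bitrate_mbps=2, days=30)
low, high = baseline * 0.20, baseline * 0.40

print(f"baseline: {baseline:.1f} GB")            # baseline: 648.0 GB
print(f"saved:    {low:.1f} to {high:.1f} GB")   # saved:    129.6 to 259.2 GB
```

Under these assumptions a single camera saves roughly 130 to 260 GB per month, and the same proportion applies to transmission bandwidth.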

image.png

The biggest feature of these computations is that "computing moves with the network": computation happens as data flows through the network, which avoids returning all data to the central cloud for processing and allows computing power to sink from the cloud and terminal computing to move up to the edge.

This network is Alibaba Cloud's edge collaboration network. It integrates terminal-edge, edge-edge and edge-center collaboration and shields upper-layer applications from the complex network environment. While providing high-quality end-to-end access and data transmission, this network has also been given computing and caching capabilities.

Besides common local computing scenarios such as cameras going to the cloud, scenarios such as Internet live streaming can also use view computing for edge transcoding and real-time AI analysis to improve the overall user experience. For example, a live stream no longer needs to be backhauled upstream to the center and then distributed to the edge; it is transcoded and compressed directly at the nearest node. The 80% of streams that are cold (watched by nobody or by very few people) can be converged locally, while hot streams are transcoded and distributed nearby, which also reduces latency and stalling and makes client playback smoother. Throughout the process, the terminal only needs to access a unified domain name and does not need to know where the computation takes place. With location-insensitive multi-point collaborative computing, data computation becomes as simple as using CDN acceleration.

Customizable scenario computing

In many scenarios you may already have a self-developed operator or application, or an operator from a third-party algorithm provider. View computing provides an open, customizable scenario computing framework, so you can host the operator or application on view computing and truly run your own computation.

The entire computing platform is divided into a three-tier architecture. From bottom to top, the tiers are the computing environment, computing scheduling and computing services.

image.png

1. The computing environment is the production and control layer for computing resources. It is responsible for producing resources such as containers and VMs, file storage, releasing, installing, deploying and configuring system software and operator applications, and log monitoring. This layer also provides basic application isolation.

2. Computing scheduling implements elastic scaling of resources and multi-dimensional global load balancing. This layer performs global planning on top of the security isolation of underlying resources such as containers, which solves resource contention. It also allows computing tasks of different granularities to run mixed together, which improves resource reuse.

3. Computing services implement operator hosting, evaluation and the actual dispatching of computation. They also profile computing tasks and iteratively improve the accuracy of resource consumption estimates. Take live transcoding as an example: besides output parameters such as encoding format, resolution and frame rate, the input content also affects actual resource consumption to some extent. The computing power consumed by each transcoding channel fluctuates dynamically, which reduces the accuracy of resource allocation during scheduling. Dynamic analysis and calibration are therefore required so that the utilization level assumed by the scheduler finally matches the actual resource utilization level.
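
One simple way to picture the dynamic calibration mentioned in item 3 is to blend measured consumption into the scheduler's estimate. The sketch below uses an exponential moving average, which is a swapped-in, generic technique and not necessarily what the platform does; the `ConsumptionEstimator` class, the profile name and the `alpha` value are all hypothetical.

```python
class ConsumptionEstimator:
    """Tracks per-profile resource estimates and calibrates them from measurements."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha      # weight of the newest measurement (assumed value)
        self.estimates = {}     # profile -> estimated CPU cores per channel

    def seed(self, profile, initial_cores):
        """Initial estimate, e.g. from the evaluation done when an operator is uploaded."""
        self.estimates[profile] = initial_cores

    def observe(self, profile, measured_cores):
        """Blend a new measurement into the estimate (exponential moving average)."""
        old = self.estimates.get(profile, measured_cores)
        new = (1 - self.alpha) * old + self.alpha * measured_cores
        self.estimates[profile] = new
        return new

    def channels_that_fit(self, profile, free_cores):
        """How many more channels of this profile fit into the free cores."""
        return int(free_cores // self.estimates[profile])


est = ConsumptionEstimator(alpha=0.5)
est.seed("h264-720p-25fps", 1.0)
print(est.channels_that_fit("h264-720p-25fps", free_cores=8))  # 8

# High-motion input turns out to cost more than the initial estimate.
for measured in (1.4, 1.6, 1.5):
    est.observe("h264-720p-25fps", measured)

print(round(est.estimates["h264-720p-25fps"], 3))              # 1.45
print(est.channels_that_fit("h264-720p-25fps", free_cores=8))  # 5
```

With the uncalibrated estimate the scheduler would have packed eight channels onto the node; after calibration it places five, which keeps the allocated level close to the real one.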

The whole access process is as follows:
1. Upload and manage operators, and configure computing templates and parameters. The cloud performs compatibility adaptation and resource consumption evaluation.

2. Apply online for computing power and other resources, such as the maximum concurrency for each computing specification. The cloud evaluates and confirms the capacity, then delivers and deploys the operator to each computing node.

3. When content is accessed or a user triggers a task, the cloud dispatches the data and performs the computation, and returns the results to the user in real time.

Taking cloud gaming as an example, a streaming image that can load game packages and render video streams is an operator or application. After the game vendor uploads the game package and configures the rendering specifications, the cloud performs the corresponding adaptation, resource evaluation and dynamic allocation.

Location-insensitive distributed storage

With the computing platform in place, storing data in the cloud is the next problem to solve. Data sources are scattered, and value density and usage scenarios vary. High-value live content such as sports events needs to be recorded and stored persistently for playback. Camera streams in video surveillance are of relatively low value: only the video clips of key events need to be retained, and most data only needs to be cached for a few days or months.

How can we solve the access latency, availability and cost problems of distributed and tiered data storage?

Based on edge distributed file caching and central persistent OSS storage, view computing provides a location-insensitive distributed storage solution. Data sources around the world can access nearby edge nodes through view computing, and the cache location takes the data access and computing locations into account to ensure overall affinity. Periodic data is cached at the edge, while high-value data that needs long-term storage and structured analysis data are returned to central storage.
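
The tiering rule in the paragraph above can be written down as a tiny policy function. This is only a sketch: the `ViewObject` type, the `kind` labels and the 7-day default TTL are illustrative assumptions, not part of the product.

```python
from dataclasses import dataclass


@dataclass
class ViewObject:
    key: str
    kind: str                 # illustrative labels: "raw_stream", "event_clip", "ai_result"
    high_value: bool = False  # marked for long-term retention


def placement(obj, edge_ttl_days=7):
    """Return (tier, ttl_days); ttl_days is None for persistent storage."""
    if obj.kind == "ai_result" or obj.high_value:
        return ("central", None)     # returned to central persistent storage
    return ("edge", edge_ttl_days)   # periodic data stays in the edge cache


print(placement(ViewObject("cam1/2020-06-01.ts", "raw_stream")))
# ('edge', 7)
print(placement(ViewObject("cam1/event-42.mp4", "event_clip", high_value=True)))
# ('central', None)
print(placement(ViewObject("cam1/event-42.json", "ai_result")))
# ('central', None)
```

Raw surveillance streams expire at the edge after their caching period, while event clips flagged as valuable and structured analysis results go back to the center.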

At the same time, view computing uses the edge acceleration network to solve the high latency, slow speed and frequent interruptions of large file uploads across regions and operators, and accelerates the transfer of data back to the cloud.

What users see and use is still one cloud, and they do not need to care about the specific storage location.

image.png

Distributed cache platform

Location-insensitive storage access is implemented by a distributed caching platform, which provides nearby-access, large-capacity and cost-effective periodic file caching. Within the caching period, multi-point collaborative storage scheduling and multi-node, multi-copy redundancy deliver high service availability and high data reliability. View computing provides flexible cache access and scheduling strategies (nationwide, regional, per operator, or a custom node range). It is also compatible with the way the central OSS is used (SDK/API): after downloading the OSS SDK, you only need to change the endpoint access domain name to switch to the distributed cache at almost zero development cost. The difference is that the concept of Region is removed and a unified, centralized domain name is used for access and management, so that there is truly only one cloud.

How is this location insensitivity achieved? The key points are:
1. Physical files are cached at edge nodes, while control data, file metadata and so on are consolidated at the center for centralized management and retrieval.

2. File writes and reads use 302 scheduling: the client accesses a unified domain name and, after storage scheduling, is redirected to the real physical location for reading and writing.

3. Node status and capacity are monitored in real time. When a single node cannot be written to, writes are automatically migrated to other nodes, so the service drifts and switches over without the user noticing; after the node recovers, data is quickly replicated and synchronized.

4. Multi-node, multi-copy redundant storage allows traffic to be moved quickly when a single node is unavailable, and also allows load balancing when access volume is high.
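
The four points above can be tied together in a short sketch: metadata kept centrally, replicas on several nodes, and a 302-style answer that points the client at a healthy copy. Everything here is hypothetical (the `StorageScheduler` class, the `example.com` host names, the "first writable nodes" rule); it shows the redirect idea only and is not the platform's scheduling algorithm.

```python
from dataclasses import dataclass


@dataclass
class CacheNode:
    host: str
    healthy: bool = True
    free_gb: float = 100.0


class StorageScheduler:
    """Central metadata plus 302-style redirection (a sketch; all names are hypothetical)."""

    def __init__(self, nodes, replicas=2):
        self.nodes = {n.host: n for n in nodes}
        self.replicas = replicas
        self.meta = {}  # object key -> hosts holding a copy (kept centrally)

    def _writable(self):
        return [n for n in self.nodes.values() if n.healthy and n.free_gb > 0]

    def write(self, key):
        """Pick replica nodes, record them centrally, redirect to the first one."""
        targets = self._writable()[: self.replicas]
        if not targets:
            return (503, None)
        self.meta[key] = [n.host for n in targets]
        return (302, f"https://{targets[0].host}/{key}")

    def read(self, key):
        """Redirect to the first healthy node that holds a copy."""
        if key not in self.meta:
            return (404, None)
        for host in self.meta[key]:
            if self.nodes[host].healthy:
                return (302, f"https://{host}/{key}")
        return (503, None)


sched = StorageScheduler([
    CacheNode("edge-1.example.com"),
    CacheNode("edge-2.example.com"),
    CacheNode("edge-3.example.com"),
])

print(sched.write("cam1/clip.mp4"))  # (302, 'https://edge-1.example.com/cam1/clip.mp4')

sched.nodes["edge-1.example.com"].healthy = False  # a single node becomes unavailable
print(sched.read("cam1/clip.mp4"))   # (302, 'https://edge-2.example.com/cam1/clip.mp4')
print(sched.write("cam1/clip2.mp4")) # (302, 'https://edge-2.example.com/cam1/clip2.mp4')
```

The client always talks to the same entry point; which node actually serves the bytes can change between requests without the client having to know.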

View connection platform and full-cycle PaaS service

To help view data reach the cloud more easily, view computing provides a terminal-to-cloud connection platform and PaaS services covering the full life cycle of views, including collection, computing and processing, and content consumption. The connection capabilities mainly consist of:

1. Access and control of devices

2. Access and management of view content

3. View processing and view storage, built respectively on the location-insensitive computing platform and caching platform introduced earlier. View processing provides both basic and complex capabilities such as view transcoding, AI analysis, encryption and streaming rendering. View storage provides view access and retrieval, as well as life-cycle cleanup policies and policies for cloud storage and archiving.

image.png

Secure, easy-to-use one-click cloud access for views (terminals)

There are currently three mainstream solutions for bringing view terminals to the cloud:

1. Going to the cloud with GB/T 28181, the national standard in the security field, suffers from complicated access, weak security and missing functions. For example, signaling is transmitted in plain text and data streams are essentially unauthenticated: they can only be checked with a simple SSRC, which cannot effectively prevent collisions or forgery. The 2011 and 2016 versions of the standard also bring multiple adaptation and transition issues, so the overall cloud experience is poor.

2. Since ONVIF was proposed in 2008, it has been supported by a large number of equipment manufacturers worldwide. However, its multicast discovery mechanism cannot be implemented in public clouds, so it is not cloud-friendly. In addition, its interaction is based on HTTP with signaling content defined in the SOAP format, so communication latency is relatively high.

3. Many equipment manufacturers have launched their own private protocols for going to the cloud. There are many of them, each closed and a black box, so cloud access work cannot be reused and a great deal of it is duplicated.

image.png

View computing's one-click cloud access solution provides an open, easy-to-use, secure and flexible way for terminals to go to the cloud with one click. Its main features are:

  1. Compatibility with the national standard, ONVIF and others, adapting to all kinds of terminals, while solving the complexity and security problems of national standard access and the public cloud access problem of ONVIF.
  2. A device access gateway built on the edge nodes covered by Alibaba Cloud, which ensures nearby access, reuses the CDN's low-latency transmission and acceleration network, and supports multi-protocol access, ensuring low-latency device communication, signaling control and data stream access.
  3. A core signaling channel that provides transparent two-way communication, on which manufacturers and developers can also define custom control signaling.

Alibaba Cloud Open Device Cloud Access Protocol (ODCAP)

The one-click cloud access solution of the view computing connection platform is built on ODCAP (Open Device Cloud Access Protocol). We will fully open the protocol content and support independent access for diverse devices from any manufacturer.

The parties involved in terminal-to-cloud access are interconnected over the network, and the ODCAP protocol supports a variety of network topologies:

1. The device is on an internal network and reaches the public network through firewall NAT, or is relayed through a device gateway;
2. The device sits directly in a public network environment, such as a device with 4G/5G networking capability, and connects directly;
3. ODCAP also supports a cascading mode: sub-devices connect to an upper-level device through other protocols, and the directly connected device shields the differences in how its sub-devices are accessed and connects them to the cloud platform uniformly over ODCAP.

The ODCAP protocol supports multiple types of devices, enabling diverse terminals to go to the cloud. Different devices have different functions. To describe them uniformly, we define each device with a device model that covers four aspects:

  1. Resources: the data generated by the device, such as real-time video streams, video and picture files, and structured data produced by on-terminal AI analysis.
  2. Configuration: the configuration information of the device.
  3. Events: the events triggered by the device.
  4. Services: the functional services provided by the device.
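
The four aspects of the device model map naturally onto a small data structure. The sketch below only illustrates that grouping; it is not the ODCAP wire format or any published schema, and the `DeviceModel` class, its field names and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DeviceModel:
    """Illustrative grouping of the four aspects; not the ODCAP schema."""
    device_id: str
    resources: Dict[str, str] = field(default_factory=dict)         # name -> description of the data
    configuration: Dict[str, object] = field(default_factory=dict)  # current settings
    events: List[dict] = field(default_factory=list)                # events the device has raised
    services: Dict[str, Callable[..., object]] = field(default_factory=dict)

    def emit(self, name, **payload):
        """Record an event triggered by the device."""
        event = {"event": name, **payload}
        self.events.append(event)
        return event

    def invoke(self, service, **args):
        """Call a functional service exposed by the device."""
        if service not in self.services:
            raise KeyError(f"unknown service: {service}")
        return self.services[service](**args)


camera = DeviceModel("camera-001")
camera.resources["live"] = "real-time video stream"
camera.configuration["resolution"] = "1080p"
camera.services["snapshot"] = lambda quality=80: f"snapshot(quality={quality})"

print(camera.emit("motion_detected", region="door"))
# {'event': 'motion_detected', 'region': 'door'}
print(camera.invoke("snapshot", quality=90))
# snapshot(quality=90)
```

A camera, a game console and an AI box would fill in different resources and services, but the cloud can treat them all through the same four-part description.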

Going forward, Alibaba Cloud View Computing will share more of its latest product capabilities, solutions and technical practices on the "Alibaba Cloud Edge Plus" public account. Everyone is welcome to join the discussion.

