KubeEdge sub-project Sedna 0.3.0 released with support for edge-cloud collaborative lifelong learning

Abstract: As the number of edge devices grows exponentially and device performance improves, edge-cloud collaborative machine learning has emerged to cover the last mile of machine learning.

This article is shared from the Huawei Cloud Community post "Support edge-cloud collaborative lifelong learning features, KubeEdge sub-project Sedna 0.3.0 version released!", original author: technical torchbearer.

1. Current challenges in deploying machine learning

What problems does machine learning face in deployment today?

Over the past two decades, machine learning has been widely used in data mining, computer vision, natural language processing, biometric recognition, search engines, medical diagnosis, credit card fraud detection, stock market analysis, DNA sequencing, speech and handwriting recognition, strategy games, and robotics.

In real deployments, most large cloud platform providers already offer machine learning compute and other resource services, and support multiple machine learning frameworks to provide an open and flexible deployment environment. However, the data that machine learning models need is often generated not on cloud platforms but on edge devices such as sensors, mobile phones, and gateways. Because data is generated at the edge, the cloud has to collect it from the edge in order to train and continuously improve models. In practice, machine learning currently faces the following problems:

1) Massive device data causes latency and cost problems

Even with a 100 Mbps private network connection, shipping 10 TB of data to the cloud takes roughly 10 days.
With large numbers of connected edge devices generating hundreds of megabytes or even terabytes of data every day, the resulting delays and costs are often unacceptable to customers and service providers.
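
The transfer-time figure above can be checked with a quick back-of-the-envelope calculation. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and a fully utilized link with no protocol overhead; the function name is illustrative only.

```python
def transfer_days(data_terabytes: float, link_mbps: float) -> float:
    """Days needed to move the data over the link, ignoring protocol overhead."""
    bits = data_terabytes * 1e12 * 8       # decimal terabytes -> bits
    seconds = bits / (link_mbps * 1e6)     # link rate in bits per second
    return seconds / 86_400                # 86,400 seconds per day


days = transfer_days(10, 100)
print(f"{days:.1f} days")
```

Under these idealized assumptions the result is a little over 9 days, which is consistent with the rounded figure of 10 days once real-world overhead is taken into account.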

2) Data compression causes latency and accuracy problems

Because migrating all the data is usually impractical, the data often has to be "compressed" (for example through feature engineering or identification of hard examples) before it is sent to the cloud, and the compression step itself easily introduces new delays.
The compressed data may not fully represent the information in the complete data set, which can easily cost accuracy.

3) Edge data privacy and real-time computing problems

The root cause of the problems above is that data is generated at the edge while computing power is more plentiful in the cloud. In other words, when a machine learning service turns edge-generated data into knowledge, it must on the one hand respond quickly and process locally generated data at the edge, and on the other hand rely on the computing power and development environment of the cloud. As the number of edge devices grows exponentially and device performance improves, edge-cloud collaborative machine learning has emerged to cover the last mile of machine learning.

What challenges does edge-cloud collaborative machine learning currently face?

The classic mode of edge-cloud collaborative machine learning today is to run a machine learning algorithm on a given data set in the cloud to build a model, and then apply that model unchanged to multiple inference tasks on multiple edges. This learning paradigm is called closed learning (also known as isolated learning [1]), because it takes into account neither knowledge learned in other situations nor knowledge learned in the past. Although research on and application of edge-cloud collaborative machine learning have made significant progress, many challenges remain in cost, performance, and security: data islands, small samples, data heterogeneity, and limited resources [2].

In an edge-cloud setting, 1) the data distribution at different edges keeps changing, and 2) labeled samples at the edge are often scarce because labeling is expensive. Closed learning therefore has to keep labeling samples and retraining, which makes services much harder to deploy. These challenges of data distribution and data volume are called data heterogeneity and small samples, and they are two of the four major challenges of edge-cloud collaborative machine learning.

Figure 1 Schematic diagram of the machine learning model in the thermal comfort prediction service changing with the edge environment

This article illustrates these challenges with a thermal comfort prediction service, shown in Figure 1. The service takes environmental features such as outdoor temperature as input and predicts the thermal comfort level (hot, comfortable, cold) of different people. When the edge node is moved from outdoors to indoors, the actual thermal comfort label for the same outdoor temperature feature value x=30 changes significantly. The online predictions of the original outdoor model are generally too low, and the training samples have to be readjusted to match the indoor model. In other words, when edge data changes dynamically, closed learning needs frequent retraining because it neither remembers history nor holds task knowledge from different contexts.

How can the current challenges of edge-cloud collaborative machine learning be addressed?

The discussion above shows that the closed learning paradigm can serve scenarios with homogeneous data and large data volumes, but it struggles with data heterogeneity and small samples, so it is not suitable for building a general machine learning system. Professor Bing Liu of the University of Illinois at Chicago also concluded in Frontiers of Computer Science [1] that the closed learning paradigm has no memory and therefore usually requires a large number of training samples.

An improved paradigm can take inspiration from the way humans learn. People are able to learn and become smarter because nobody learns in isolation: everyone keeps accumulating knowledge learned in the past and draws on the knowledge of others to learn more [1]. Borrowing this mechanism, lifelong learning can be combined with edge-cloud collaboration to form edge-cloud collaborative lifelong learning. Edge-cloud collaborative lifelong learning 1) combines multi-task learning and incremental learning to handle data heterogeneity and small samples in new situations, and 2) uses a cloud-side knowledge base to memorize knowledge about new situations, which addresses the above challenges of edge-cloud collaborative machine learning at the root.

2. The concept of edge-cloud collaborative lifelong learning

Building on the concept of lifelong learning proposed in 1995 [3], Sedna defines edge-cloud collaborative lifelong learning as the continuous learning of multiple machine learning tasks through edge-cloud collaboration. Here, a machine learning task refers to a model used in a specific situation, such as Chinese-English translation (translating given Chinese text into English) or Asian plant classification. The formal definition is as follows:

Edge-cloud collaborative lifelong learning: given N historical training tasks in the cloud-side knowledge base, perform inference on the current task and on M future edge tasks that keep arriving, and continuously update the cloud-side knowledge base. Here M tends to infinity, and the M edge inference tasks are not necessarily among the N historical training tasks in the cloud-side knowledge base.

Figure 2 Schematic diagram of the edge-cloud collaborative lifelong learning process

Specifically, the general process of edge-cloud collaborative lifelong learning is shown in Figure 2:

1) Initialize the knowledge base: the cloud-side knowledge base stores and maintains the knowledge trained and accumulated from the past N tasks (denoted as tasks T-N to T-1).

2) Learn the current task: when the edge device faces the current task (denoted as task T), task T is trained using the prior knowledge in the cloud-side knowledge base. Note that task T is not necessarily among the N historical tasks.

3) Update the knowledge base: the knowledge learned from task T is fed back to the cloud-side knowledge base, which is updated accordingly.

4) Learn future tasks: learning continues over M future tasks (denoted as tasks T+1 to T+M). Just as task T uses the knowledge of the past N tasks (T-N to T-1), task T+1 on the edge uses the knowledge of the past N+1 tasks in the cloud (T-N to T), and so on, until task T+M is completed and the whole process ends.
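
The four steps above can be sketched as a simple loop. This is a toy illustration, not Sedna's implementation: the `KnowledgeBase` class, the `learn_task` function, and the idea of representing each task's "model" as a single number that is shrunk toward the average of earlier tasks are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class KnowledgeBase:
    """Hypothetical cloud-side store: task name -> learned model (here, one number)."""
    models: dict = field(default_factory=dict)

    def prior(self) -> float:
        # Prior knowledge: average of all models learned so far (0.0 if empty).
        return mean(list(self.models.values())) if self.models else 0.0

    def update(self, task: str, model: float) -> None:
        self.models[task] = model


def learn_task(kb: KnowledgeBase, samples: list, prior_weight: float = 1.0) -> float:
    """Fit a task from a few edge samples, shrinking the estimate toward the prior."""
    return (sum(samples) + prior_weight * kb.prior()) / (len(samples) + prior_weight)


# Step 1: initialize the knowledge base with N historical tasks (T-N .. T-1).
kb = KnowledgeBase({"task_T-2": 20.0, "task_T-1": 24.0})

# Steps 2-4: learn task T, update the knowledge base, then continue with T+1.
incoming = [("task_T", [26.0]), ("task_T+1", [30.0, 32.0])]
for name, samples in incoming:
    model = learn_task(kb, samples)   # uses knowledge from all earlier tasks
    kb.update(name, model)            # the knowledge base grows by one task

print(kb.models)
```

Each new task sees one more stored task than its predecessor, which mirrors the N, N+1, ... progression described in step 4.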

Edge-cloud collaborative lifelong learning has the following three characteristics:

1. Edge-cloud collaborative continuous learning: inference and training run continuously, combining cloud computing power with edge data, so that model training keeps improving while inference is running.
2. Cross-edge knowledge sharing through the cloud-side knowledge base: the cloud-side knowledge base acts as the hub for sharing knowledge across edges and handling edge tasks, while persisting and maintaining knowledge in the cloud.
3. Edge-cloud handling of unknown tasks: the edge must be able to discover and handle tasks that are unknown to the cloud knowledge base. Unknown tasks are new tasks discovered during operation or testing, for example tasks whose application scenario or model lies outside the current knowledge of the knowledge base.

3. Sedna edge-cloud collaborative lifelong learning features

KubeEdge is an open source edge computing platform. It extends Kubernetes' native container orchestration and scheduling capabilities to provide edge-cloud collaboration, offloading of computation to the edge, management of massive numbers of edge devices, and edge autonomy. KubeEdge will also support scenarios such as 5G MEC and AI edge-cloud collaboration through plug-ins, and it has been applied in many fields [3].

KubeEdge SIG AI released Sedna, an open source platform and KubeEdge sub-project, in December 2020. Its architecture is shown in Figure 3. Building on the edge-cloud collaboration capabilities provided by KubeEdge, Sedna implements collaborative AI training and collaborative inference across edge and cloud. It lets existing AI applications move to the edge seamlessly and quickly gain capabilities such as cross-edge-cloud incremental learning, federated learning, and collaborative inference, with the ultimate goals of reducing the cost of building and deploying edge-cloud collaborative machine learning services, improving model performance, and protecting data privacy [2].

Figure 3 Sedna overall architecture

In the 0.3 release, Sedna adds support for edge-cloud collaborative lifelong learning. Building on edge data and cloud computing power, Sedna's lifelong learning feature aims to gradually deliver high-confidence automated AI that adapts to heterogeneous edge services and models.

A Sedna edge-cloud collaborative lifelong learning task is divided into three phases: training, evaluation, and deployment. Sedna also maintains a globally available knowledge base (KB) that serves every lifelong learning task. The architecture is shown in Figure 4:

1) A training worker is started to perform multi-task transfer learning on the developer's base AI model and training data set, in order to distill task knowledge, including sample attributes, AI models, and model hyperparameters.

2) After training has updated the knowledge base, an evaluation worker is started on the evaluation data set. Using the evaluation strategy defined by the deployer, it decides which task models qualify for deployment.

3) Once the GlobalManager (GM) detects that the evaluation task has completed, it notifies the edge to initialize and start the Inference Service. The application calls the model inference interface to run inference, and unknown tasks are sent to the cloud for a decision.

4) By connecting to a third-party labeling system and using knowledge-base-driven transfer learning, the LocalController (LC) monitors new data against pre-configured rules, triggers training workers for incremental learning according to the configured strategy, and redeploys the result to the edge once retraining completes.
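
The train, evaluate, deploy cycle and the retraining trigger described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Sedna code: the majority-label "model", the accuracy threshold as evaluation strategy, and the sample-count trigger rule are all hypothetical stand-ins.

```python
from collections import Counter


def train(labels):
    """Training phase: a stand-in 'model' that always predicts the majority label."""
    return Counter(labels).most_common(1)[0][0]


def evaluate(model, labels):
    """Evaluation phase: accuracy of the stand-in model on an evaluation set."""
    return sum(1 for y in labels if y == model) / len(labels)


def lifecycle_round(train_labels, eval_labels, min_accuracy):
    """One train -> evaluate -> deploy round; returns (model, deployed?)."""
    model = train(train_labels)
    accuracy = evaluate(model, eval_labels)
    # Assumed evaluation strategy: deploy only if accuracy meets the threshold.
    return model, accuracy >= min_accuracy


def should_retrain(new_sample_count, trigger_threshold):
    """Assumed trigger rule: retrain once enough new samples have accumulated."""
    return new_sample_count >= trigger_threshold


model, deployed = lifecycle_round(
    ["hot", "hot", "cold"], ["hot", "hot", "hot", "cold"], min_accuracy=0.7
)
print(model, deployed)
print(should_retrain(120, 100))
```

Separating the deployment gate from the retraining trigger reflects the split in the description: one decision is made after evaluation, the other while monitoring new data at the edge.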

Figure 4 Sedna edge-cloud collaborative lifelong learning architecture

The modular solution and sample transfer solution that Sedna currently uses make the open source edge-cloud collaborative lifelong learning feature model-agnostic: 1) the same feature supports different models for structured and unstructured data at the same time, and models are pluggable; 2) the same feature supports classification, regression, object detection, anomaly detection, and other tasks at the same time.

4. Predictive control of building thermal comfort based on Sedna lifelong learning

4.1 Background

Smart buildings are an important part of smart cities

Buildings are the "users" of a large number of advanced industrial products. They drive the manufacturing, operation, and maintenance of those products and occupy an important position in the current wave of energy and industrial revolution.

Buildings today have automatic control systems, and these usually sit at the edge, so many building-related applications tend to be deployed on the edge side. Thermal comfort prediction is one such application. Since people spend 80% of their working and living time in buildings, improving work efficiency and living comfort (for example, through intelligent buildings) is particularly important.

Thermal comfort prediction serves smart buildings

Thermal comfort is defined as the degree to which people in a building are satisfied with the thermal environment. It provides a quantitative evaluation that links indoor thermal environment parameter settings to subjective human assessment. Improving the thermal comfort of office workers and other occupants is an important consideration in designing buildings and their systems. While the air conditioning system is running, a thermal comfort prediction can be used to adjust the building's air conditioning control strategy. For example, a comfort-based control strategy takes candidate air conditioning parameter settings together with environmental features such as temperature and humidity, predicts the expected thermal comfort for each candidate, and then searches for the most comfortable setting. Achieving the most comfortable air conditioning control therefore depends on accurate comfort prediction.

Traditionally, predicting thermal comfort requires either installing additional equipment in the room or collecting manual feedback. The complex deployment environment and frequent manual operations make thermal comfort data collected this way very inaccurate. Machine-learning-based thermal comfort prediction has therefore been proposed: it reduces deployment requirements and needs no manual feedback, so it has more practical value.
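
The comfort-based control strategy described above amounts to scoring candidate settings with a predictor and picking the best one. The sketch below is illustrative only: `predict_comfort` is a made-up toy formula standing in for a trained model, and the candidate range, feature names, and coefficients are assumptions.

```python
def predict_comfort(setpoint_c: float, outdoor_c: float, humidity: float) -> float:
    """Hypothetical comfort score in [0, 1]; stands in for a trained model."""
    # Assumed toy relation: the preferred indoor temperature drifts slightly with
    # outdoor temperature, and humidity above 50% lowers the preferred setpoint.
    preferred = 24.0 + 0.05 * (outdoor_c - 25.0) - 2.0 * (humidity - 0.5)
    return max(0.0, 1.0 - abs(setpoint_c - preferred) / 10.0)


def best_setpoint(candidates, outdoor_c, humidity):
    """Search the candidate settings for the one with the highest predicted comfort."""
    return max(candidates, key=lambda s: predict_comfort(s, outdoor_c, humidity))


candidates = [20 + 0.5 * i for i in range(17)]   # 20.0 .. 28.0 in 0.5 degree steps
choice = best_setpoint(candidates, outdoor_c=35.0, humidity=0.6)
print(choice)
```

Because the controller simply trusts the predictor's ranking, any bias in the comfort model translates directly into a worse setpoint, which is why prediction accuracy matters so much here.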

Data heterogeneity and small samples become more prominent when the thermal comfort prediction service is actually deployed

Because of differences between individuals, rooms, and cities, different people in different locations perceive thermal comfort differently. Under the same ambient temperature and air conditioning settings, the thermal comfort labels of different people therefore differ, which makes the data heterogeneity problem more prominent.

Thermal comfort prediction mainly targets the people in individual rooms of a building and is specific to each person. When many environmental factors change, the thermal comfort samples available for an individual in a room at the edge are usually limited and often insufficient to train a personalized model for that person, which makes the small-sample problem more prominent.

Besides alleviating the small-sample problem, incremental learning can to some extent also resolve data heterogeneity between historical and current scenarios (heterogeneity over time). However, edge-cloud collaborative incremental learning usually has no knowledge base to serve as memory, which makes non-temporal data heterogeneity hard to handle. For example, in a room with several people, the data for different people at the same moment is heterogeneous. Because this goes beyond heterogeneity of one person's data over time, incremental learning is no longer sufficient, and edge-cloud collaborative lifelong learning is needed.

4.2 Scheme

Figure 5 Architecture of the edge-cloud collaborative lifelong learning thermal comfort prediction scheme

As shown in Figure 5, the edge-cloud collaborative lifelong learning thermal comfort prediction scheme consists of two main steps:

Create the comfort prediction lifelong learning task

After the comfort prediction lifelong learning task is created, the Sedna knowledge base generates a knowledge base instance for comfort prediction. The knowledge base is initialized with historical data sets from multiple locations and multiple people, and it exposes inference and update interfaces to edge applications.

Deploy the edge-cloud collaborative comfort prediction application

After the comfort prediction application is deployed, it obtains the setting parameters of the multi-split air conditioning system, along with the current temperature, humidity, and other environmental features, through the device data collection interface at the edge. The application looks up the corresponding task information in the knowledge base by calling the lifelong learning interface of the Sedna Lib library:

If the task is judged to be known, for example a person who has already been seen under known temperature and humidity conditions, the corresponding model is retrieved directly and used for inference;

If the task is judged to be unknown, for example a newcomer, the knowledge base is used to obtain a model for the unknown task, which is then used for inference. These models and the relationships between them are written back to the knowledge base to complete its update, so that knowledge accumulates in the knowledge base.
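
The known/unknown branching above can be illustrated with a small lookup structure. This is a sketch under assumptions, not the Sedna Lib interface: the `ComfortKnowledgeBase` class, the use of the nearest known task (by unnormalized Euclidean distance over temperature and humidity) as the model for an unknown task, and the rule-based lambda "models" are all hypothetical.

```python
from math import dist


class ComfortKnowledgeBase:
    """Hypothetical knowledge base: task key -> (feature centroid, model)."""

    def __init__(self):
        self.tasks = {}

    def add(self, key, centroid, model):
        self.tasks[key] = (centroid, model)

    def model_for(self, key, features):
        """Return (model, known?) for a person; assumes the base is not empty."""
        if key in self.tasks:                      # known task: reuse its model
            return self.tasks[key][1], True
        # Unknown task: borrow the model of the most similar known task,
        # then record the new task so that the knowledge base accumulates.
        nearest = min(self.tasks, key=lambda k: dist(self.tasks[k][0], features))
        model = self.tasks[nearest][1]
        self.add(key, features, model)
        return model, False


kb = ComfortKnowledgeBase()
kb.add("alice", (22.0, 0.40), lambda temp: "hot" if temp > 26 else "comfortable")
kb.add("bob", (28.0, 0.60), lambda temp: "cold" if temp < 24 else "comfortable")

model, known = kb.model_for("carol", (27.0, 0.55))   # newcomer: unknown task
print(known, model(23.0))
```

After the first lookup the newcomer is stored in the knowledge base, so a second call with the same key takes the known-task branch.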

4.3 Effect

The solution in this case achieves very good results on the open source ASHRAE Thermal Comfort II dataset. This dataset contains real thermal comfort data from people in buildings in 99 cities across 28 countries, collected from 1995 to 2015. The goal is to build a machine learning classification model that, given environmental features, predicts people's thermal preference. Thermal preference falls into three classes: wanting it cooler (feeling hot), wanting no change (feeling comfortable), and wanting it warmer (feeling cold).

The results are shown in Figures 6 and 7. Compared with single-task incremental learning, overall classification accuracy improves by 5.12% in relative terms (of which 1.16% comes from multi-task learning). For the Kota Kinabalu and Athens tasks, lifelong learning raises prediction accuracy by a relative 24.04% on the Kota Kinabalu data and by a relative 13.73% on the Athens data.
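
The improvements above are relative, not percentage-point differences. As a reminder of the distinction, the sketch below computes a relative improvement from two accuracy values. The numbers are made up for illustration and are not taken from the experiment, whose baseline accuracies are not given here.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Relative change of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline


# Hypothetical accuracies, chosen only to illustrate the formula.
baseline_acc, lifelong_acc = 0.60, 0.63
gain = relative_improvement(baseline_acc, lifelong_acc)
print(f"{gain:.1%} relative, {(lifelong_acc - baseline_acc) * 100:.1f} points absolute")
```

In this made-up case a 3-point absolute gain corresponds to a 5% relative gain, so relative figures always need the baseline to be interpreted.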

Figure 6 Comparison of Sedna lifelong learning prediction accuracy in ATCII cities

Figure 7 Relative improvement in Sedna lifelong learning prediction accuracy in ATCII cities

5. Next steps for Sedna's lifelong learning features

Lifelong learning algorithm enhancement:
Multi-task transfer learning algorithm
Unknown task recognition algorithm
Unknown task processing algorithm
Distributed knowledge base
Security and privacy enhancement

6. Appendix: KubeEdge SIG AI community technical exchange

Anyone interested in edge computing is welcome to join the KubeEdge community, take part in SIG AI, and help build a cloud-native edge computing ecosystem.

Related links

Project address: https://github.com/kubeedge/sedna

Regular meeting time and address:

Time: Every Thursday at 10:00 am

Address: https://zoom.us/my/kubeedge

SIG AI work goals and operation methods: https://github.com/kubeedge/community/tree/master/sig-ai

7. References

[1] B. Liu, "Lifelong machine learning: a paradigm for continuous learning," Frontiers of Computer Science, vol. 11, no. 3, pp. 359-361, 2017.
[2] "Accelerate AI Edge-Cloud Collaborative Innovation! KubeEdge Community Builds Sedna Subproject," 29 Jan. 2021. [Online]. Available: https://mp.weixin.qq.com/s/FX2DOsctS_Z7CKHndFByRw.
[3] "KubeEdge architecture interpretation: cloud-native edge computing platform," 20 Oct. 2020. [Online]. Available: https://mp.weixin.qq.com/s/8AvkgupCQpI_JCL2P7x8jw.
[4] "kubeedge/Sedna," 30 Mar. 2021. [Online]. Available: https://github.com/kubeedge/sedna.
