1. Description
FATE is an industrial-grade federated learning framework. Federated learning makes it possible to combine data from multiple parties to jointly build a model. Unlike traditional approaches to using data, it does not require pooling every party's data into a central data warehouse: during joint modeling, the data held by each organization is never shared with the others, so the data is "available but invisible". This article covers the basic concepts behind the privacy computing platform FATE and a single-machine deployment based on Docker.
2. Privacy Computing
Privacy computing refers to a collection of technologies that analyze and compute on data while protecting the data itself from being leaked, so that the data is "available but invisible". With data and privacy security fully protected, the value of the data can still be transformed and released.
The millionaires' problem was proposed in 1982 by Academician Andrew Chi-Chih Yao (Yao Qizhi), a Turing Award winner:
Suppose two millionaires want to find out which of them is richer, but each wants to protect their privacy and is reluctant to let the other, or any third party, know how much money they really own. How can they determine who has more money while protecting the privacy of both?
This problem gave rise to the field of secure multi-party computation. In today's blockchain-led family of trusted architectures, multi-party computation is one of the key technologies for establishing machine trust.
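Yao's protocol for the comparison itself is fairly involved. As a simpler illustration of the same idea (computing a joint result without any party revealing its input), here is a minimal sketch of additive secret sharing, where the parties learn only the sum of their values. This is a different and much simpler technique than Yao's, all parties are simulated in one process, and the function names and modulus are assumptions made for this example.

```python
import secrets

MODULUS = 2**61 - 1  # illustrative modulus; all arithmetic is done mod this value


def make_shares(value, n_parties):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values):
    """Simulate every party in one process.

    Each party splits its value into shares and sends one share to each
    party. Each party then adds up the shares it received, and only these
    partial sums are combined into the final total.
    """
    n = len(private_values)
    # all_shares[i][j] is the share that party i sends to party j
    all_shares = [make_shares(v, n) for v in private_values]
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    return sum(partial_sums) % MODULUS


if __name__ == "__main__":
    print(secure_sum([5, 7, 30]))  # prints 42
```

A single share is a uniformly random number, so on its own it says nothing about the value it came from; only the combination of all partial sums reveals the total.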
At present, the mainstream technologies for privacy computing fall into three major directions: the first is cryptography-based privacy computing, represented by secure multi-party computation (MPC); the second is federated learning, which grew out of the fusion of artificial intelligence and privacy protection technology; the third is privacy computing based on trusted hardware, represented by the trusted execution environment (TEE).
Different technologies can often be used in combination to complete data computing and analysis tasks while ensuring the security and privacy of the original data.
3. Federated Learning
There are two main modes in federated learning:
Horizontal federation
The parties in the federation have the same features but different users, so federating them increases the number of samples available for training the model.
For example, two banks in different regions (Beijing and Guangzhou) run similar businesses, so their data features (fields) are likely to be the same. Their users, however, come from the resident populations of Beijing and Guangzhou respectively, so the overlap between their users is relatively small. This scenario suits horizontal federation, which increases the amount of user data for model training.
Vertical federation
The parties in the federation have largely overlapping users but different features, so federating them expands the feature dimension available for training the model.
For example, a shopping mall and a bank in the same area are both likely to serve most of the residents in that area, so the overlap between their users may be large. The bank records users' income, spending behavior, and credit ratings, while the mall keeps users' purchase histories, so the overlap between their features is small. This scenario suits vertical federation, which increases the number of features for model training and expands what the model can do.
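The difference between the two modes comes down to how the joint training set is shaped. The sketch below shows that shape with toy records; every user ID, field, and value is made up for illustration. In real federated learning the data is never physically merged like this, so the functions only describe the logical view that the federation trains on.

```python
# Toy records keyed by user ID; all IDs, fields, and values are invented.
bank_beijing = {
    "u1": {"income": 50, "credit": 700},
    "u2": {"income": 80, "credit": 650},
}
bank_guangzhou = {
    "u3": {"income": 60, "credit": 720},
}
mall = {
    "u1": {"purchases": 12},
    "u2": {"purchases": 3},
    "u9": {"purchases": 7},
}


def horizontal_view(*parties):
    """Same feature columns, different users: the sample set is the union of rows."""
    rows = {}
    for party in parties:
        rows.update(party)
    return rows


def vertical_view(party_a, party_b):
    """Overlapping users, different feature columns: join features on shared IDs."""
    shared_ids = sorted(party_a.keys() & party_b.keys())
    return {uid: {**party_a[uid], **party_b[uid]} for uid in shared_ids}


if __name__ == "__main__":
    print(len(horizontal_view(bank_beijing, bank_guangzhou)))  # more samples
    print(vertical_view(bank_beijing, mall))  # more features per shared user
```

The horizontal view has three users with two features each; the vertical view keeps only the two users known to both parties, but each now has three features.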
4. FATE
FATE (Federated AI Technology Enabler) is the world's first industrial-grade open source framework for federated learning, developed by the AI team of WeBank. It provides a secure computing framework based on data privacy protection and gives machine learning, deep learning, and transfer learning algorithms strong secure computing support. It also ships with privacy-preserving implementations of a variety of machine learning algorithms, including linear models, tree models, and neural networks.
GitHub address: https://github.com/FederatedAI/FATE
There are three roles in FATE:
Guest
The party that applies the data, that is, the party with a business need to use the data in the actual modeling scenario. In vertical algorithms, the Guest is usually the party that holds the label y.
Host
The data provider. It is usually a partner organization that only supplies data to help the Guest complete the modeling and improve the training result.
Arbiter
A third-party collaborator that helps the parties complete joint modeling. It provides no data itself and is mainly responsible for issuing public keys, encryption and decryption, and aggregating models.
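The Arbiter's model aggregation duty can be illustrated with one common aggregation rule, sample-weighted averaging of the parties' model parameters (often called federated averaging). This is only a sketch under that assumption: it is not FATE's API, the function name is hypothetical, and it leaves out the encryption that the Arbiter also handles.

```python
def aggregate(updates):
    """Return the sample-weighted average of the parties' weight vectors.

    updates is a list of (weights, n_samples) pairs, one pair per party,
    where weights is a list of floats of the same length for every party.
    """
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(updates[0][0])
    if any(len(w) != dim for w, _ in updates):
        raise ValueError("all weight vectors must have the same length")
    return [sum(w[k] * n for w, n in updates) / total for k in range(dim)]


if __name__ == "__main__":
    party_a = ([1.0, 2.0], 100)  # local weights trained on 100 samples
    party_b = ([3.0, 4.0], 300)  # local weights trained on 300 samples
    print(aggregate([party_a, party_b]))  # prints [2.5, 3.5]
```

Weighting by sample count lets a party with more data pull the shared model further toward its local result, while the aggregator never sees any party's raw records.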
5. Deployment
5.1. Installing the image
First set the environment variable version, which specifies the FATE version used in the later commands. Execute the following command:
export version=1.8.0
This example uses 1.8.0, the latest version when this was written; change it as needed.
There are two ways to install the image; choose either one.
Method 1: If the server can access the public network, pull the image directly from the Tencent Cloud container registry:
docker pull ccr.ccs.tencentyun.com/federatedai/standalone_fate:${version}
docker tag ccr.ccs.tencentyun.com/federatedai/standalone_fate:${version} federatedai/standalone_fate:${version}
Method 2: If the server has no public network access, download the image package elsewhere and then import it.
On a machine with network access, download the image package:
wget https://webank-ai-1251170195.cos.ap-guangzhou.myqcloud.com/fate/${version}/release/standalone_fate_docker_image_${version}_release.tar.gz
Import the image on the target machine:
docker load -i standalone_fate_docker_image_${version}_release.tar.gz
List the installed images:
docker images | grep federatedai/standalone_fate
5.2. Starting the container
Execute the following command to start:
docker run -d --name standalone_fate -p 8080:8080 federatedai/standalone_fate:${version}
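The command above publishes port 8080 of the container on the host. A quick way to check that something is listening there is to attempt a TCP connection. The helper below is a minimal sketch; its name is hypothetical and the host 127.0.0.1 assumes the container runs on the same machine as the script.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumes the container runs on this machine; change the host otherwise.
    print(port_is_open("127.0.0.1", 8080))
```

A True result only means the port accepts connections. The services inside the container may still need a short time to finish starting.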
6. Test
FATE comes with built-in test tasks.
First execute the following command to enter the FATE container:
docker exec -it $(docker ps -aqf "name=standalone_fate") bash
Execute the following command to start the toy test:
flow test toy -gid 10000 -hid 10000
On success, output like the following is displayed:
success to calculate secure_sum, it is 2000.0
7. Graphical interface
FATE Board is the service component responsible for visualization in FATE. It is already integrated into the standalone container and can be accessed on port 8080.
Both the account and the password are admin.
Click the JOBS button in the upper right corner to view the tasks run by the toy test.
Because this is federated learning, the list shows tasks for both the guest and host roles.