Authors: Sun Jingwen, Wu Di

Background

Network download

In the field of network download, the first thing that comes to mind is the client/server (C/S) model built on the TCP/IP protocol suite. In this model, each client establishes a TCP connection with the server, and the server listens on these connections and responds to them in turn, as shown below:

[Figure: C/S download model]

At the end of the last century, application-layer protocols such as HTTP and FTP were developed on the basis of the C/S model. The model has obvious drawbacks, however: the server bears too much load and downloads are too slow. As the Internet has grown and users have come to demand larger downloads at higher speeds, these drawbacks have been steadily magnified.

P2P download principle

Against this background, the ideas of P2P networking and load balancing were combined into a P2P download model. This model no longer places all of the download pressure on the server: the server is responsible only for transferring file metadata, while the actual file download connections are established between clients. In addition, a file can be divided into multiple blocks, and different blocks of the same file can be downloaded from different clients, so that downloaded files circulate dynamically through the P2P network. This greatly improves download efficiency, as shown in the following figure:

[Figure: P2P download model]
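The block-based idea above can be sketched as follows. This is a minimal illustration, assuming a fixed block size and a simple round-robin assignment of blocks to clients; the function names and peer names are hypothetical and are not taken from any real P2P implementation.

```python
from typing import Dict, List, Tuple

Piece = Tuple[int, int]  # (offset, length) in bytes


def split_into_pieces(file_size: int, piece_size: int) -> List[Piece]:
    """Return (offset, length) ranges that cover a file of file_size bytes."""
    if piece_size <= 0:
        raise ValueError("piece_size must be positive")
    return [
        (offset, min(piece_size, file_size - offset))
        for offset in range(0, file_size, piece_size)
    ]


def assign_round_robin(pieces: List[Piece], peers: List[str]) -> Dict[str, List[Piece]]:
    """Spread pieces over peers in turn (an assumed, deliberately naive policy)."""
    if not peers:
        raise ValueError("at least one peer is required")
    plan: Dict[str, List[Piece]] = {peer: [] for peer in peers}
    for index, piece in enumerate(pieces):
        plan[peers[index % len(peers)]].append(piece)
    return plan


# A 10-byte file in 4-byte blocks, fetched from two hypothetical clients.
pieces = split_into_pieces(file_size=10, piece_size=4)
plan = assign_round_robin(pieces, ["peer-a", "peer-b"])
```

Because each block is an independent byte range, the blocks can be fetched in parallel and from whichever client currently holds them.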

Decentralized P2P download is based on distributed hash table (DHT) technology, which stores and retrieves information in a distributed manner across the whole network. All information is stored as hash table entries scattered across the nodes, which together form one large distributed hash table spanning the network. This removes the dependence on a single central server: the hash table shares the load, and the load of the whole network is spread evenly across many machines.
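To make the DHT idea concrete, here is a toy sketch. The text does not say which DHT design is meant, so this example assumes a Kademlia-style rule in which nodes and keys share one ID space and each key is stored on the node whose ID is closest by XOR distance. `ToyDHT` and its methods are hypothetical names, and real DHTs add routing tables, replication, and node churn handling that are omitted here.

```python
import hashlib
from typing import Dict, List, Optional


def to_id(name: str) -> int:
    """Hash a node name or key into a shared 160-bit identifier space."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def responsible_node(key: str, nodes: List[str]) -> str:
    """Pick the node whose ID is closest to the key's ID by XOR distance."""
    key_id = to_id(key)
    return min(nodes, key=lambda node: to_id(node) ^ key_id)


class ToyDHT:
    """Each node keeps only its own slice of the overall hash table."""

    def __init__(self, nodes: List[str]) -> None:
        self.tables: Dict[str, Dict[str, str]] = {node: {} for node in nodes}

    def put(self, key: str, value: str) -> str:
        owner = responsible_node(key, list(self.tables))
        self.tables[owner][key] = value
        return owner  # the node that now stores this entry

    def get(self, key: str) -> Optional[str]:
        owner = responsible_node(key, list(self.tables))
        return self.tables[owner].get(key)


dht = ToyDHT(["node-1", "node-2", "node-3"])
owner = dht.put("file-abc", "held by peer-x")
value = dht.get("file-abc")
```

Any participant can compute `responsible_node` locally, so lookups need no central index server.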

Dragonfly introduction and architecture overview

Dragonfly is a P2P-based intelligent image and file distribution tool. It is designed to improve the efficiency and speed of large-scale file transfers and to make maximum use of network bandwidth. It is widely used for application distribution, cache distribution, log distribution, and image distribution.

Principle

Dragonfly combines the advantages of the C/S and P2P architectures. To clients, it provides a C/S-style download mode; to server clusters, it provides a P2P back-to-source mode. Unlike traditional P2P, the peer-to-peer network is built inside the Scheduler, with the goal of maximizing download efficiency within the P2P network, as shown in the following figure:

[Figure: Dragonfly principle]

Architecture Introduction

Dragonfly targets image distribution and file distribution, combining the ideas of P2P networking and server clusters to provide users with a stable and efficient download service. It builds a P2P network inside the server cluster and assigns the cluster's host nodes to one of four roles, Manager, Scheduler, Seed Peer, and Peer, each providing different functions.

The Manager provides overall configuration management, handling the configuration of the other roles and communicating with them. The Scheduler provides download scheduling, and its scheduling results directly affect the download rate. The Seed Peer is responsible for back-to-source downloads, pulling the required images or files from the external network. The Peer acts as the server in the C/S architecture and provides download functions to clients over a variety of protocols. The architecture diagram is as follows:

[Figure: Dragonfly architecture diagram]

The Seed Peer supports back-to-source downloads from the external network over multiple protocols and can also act as a Peer within the cluster. The Peer provides download services over a variety of protocols and also provides proxy services for image registries and other download tasks.

Detailed explanation of components

Manager

The Manager acts as the administrator when multiple P2P clusters are deployed, and provides a front-end console so that users can operate P2P clusters visually. Its main functions are dynamic configuration management, maintaining cluster stability, and maintaining the relationships among multiple P2P clusters. To keep the cluster stable as a whole, the Manager maintains a Keepalive with each service so that abnormal instances can be removed when they occur. Dynamic configuration management lets users adjust the control parameters of each component from the Manager, such as the load of Peers and Seed Peers and the number of Parents the Scheduler schedules. The Manager can also maintain the associations among multiple P2P clusters: one Scheduler Cluster, one Seed Peer Cluster, and several Peers form a complete P2P cluster, and different P2P clusters can be isolated from one another at the network level. Typically, each data center runs one P2P cluster, and a single Manager manages multiple P2P clusters.
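The Keepalive mechanism described above can be sketched as a registry that remembers when each instance last reported in and treats silent instances as abnormal. This is an assumption-laden illustration, not Dragonfly's code: `KeepaliveRegistry`, the timeout value, and the instance names are all hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
from typing import Dict, List


class KeepaliveRegistry:
    """Tracks the last keepalive time of each instance (hypothetical helper)."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, instance: str, now: float) -> None:
        """Record that an instance reported in at time `now`."""
        self.last_seen[instance] = now

    def active_instances(self, now: float) -> List[str]:
        """Return only instances heard from within the timeout window."""
        return sorted(
            name
            for name, seen in self.last_seen.items()
            if now - seen <= self.timeout_seconds
        )


registry = KeepaliveRegistry(timeout_seconds=5.0)
registry.heartbeat("scheduler-1", now=0.0)
registry.heartbeat("scheduler-2", now=0.0)
registry.heartbeat("scheduler-1", now=4.0)  # scheduler-2 has gone silent
available = registry.active_instances(now=8.0)  # ["scheduler-1"]
```

Only the addresses in `available` would be handed out to other components, which is how a silent instance drops out of circulation.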

Scheduler

The Scheduler's main job is to find the optimal parent nodes for the currently downloading node, to trigger the Seed Peer to perform back-to-source downloads, and to let a Peer perform a back-to-source download when appropriate. On startup, the Scheduler first registers with the Manager. Once registration succeeds, it initializes the dynamic configuration client, pulls its dynamic configuration from the Manager, and then starts the services it needs.

The core of the Scheduler is selecting a set of optimal Parent nodes for the currently downloading Peer. The Scheduler is Task-oriented, where a Task is one complete download task, and it stores the Task information together with the DAG of the corresponding P2P download network. Scheduling first filters out abnormal Parent nodes along several dimensions, for example by judging whether a Peer is a BadNode. The judgment assumes that each node's response time follows a normal distribution: if a node's current response time falls within the 6σ range, the node is considered normal; otherwise it is considered a BadNode and removed. The Scheduler then scores the remaining candidate Parent nodes according to their historical download feature values and returns the highest-scoring set of Parents for the current Peer to download from.
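The filter-then-score flow can be sketched as below. Several points are assumptions rather than statements about Dragonfly's implementation: the 6σ range is read as a band six standard deviations wide centred on the mean (mean ± 3σ), only the slow side is treated as abnormal, and each candidate carries a precomputed `score` field standing in for the historical download feature value.

```python
import statistics
from typing import Dict, List


def is_bad_node(history: List[float], current: float) -> bool:
    """Flag a node whose current response time exceeds mean + 3 sigma.

    Assumption: the "6 sigma range" means mean +/- 3 sigma, and only
    unusually slow responses count as abnormal.
    """
    if len(history) < 2:
        return False  # too little history to estimate a distribution
    mean = statistics.mean(history)
    sigma = statistics.pstdev(history)
    # Note: with zero variance, any slower response is flagged.
    return current > mean + 3 * sigma


def select_parents(candidates: Dict[str, dict], limit: int) -> List[str]:
    """Drop BadNodes, then return the `limit` highest-scoring candidates."""
    healthy = [
        (name, info["score"])
        for name, info in candidates.items()
        if not is_bad_node(info["history"], info["current"])
    ]
    healthy.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in healthy[:limit]]


# Response times in milliseconds; scores are made-up illustrative values.
candidates = {
    "peer-a": {"history": [10, 12, 11, 9, 10], "current": 11, "score": 0.9},
    "peer-b": {"history": [10, 12, 11, 9, 10], "current": 50, "score": 0.95},
    "peer-c": {"history": [20, 22, 21, 19, 20], "current": 21, "score": 0.6},
}
parents = select_parents(candidates, limit=2)  # ["peer-a", "peer-c"]
```

Here `peer-b` has the best score but is excluded first because its current response time is far outside its own history, which shows why filtering runs before scoring.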


Seed Peers and Peers

Seed Peers and Peers have much in common. Both are based on Dfdaemon. The difference is that a Seed Peer runs in Seed Peer mode, which supports actively triggered back-to-source downloads, while a Peer runs in Peer mode: it acts as the server in the C/S architecture, provides download functions to users, and supports back-to-source downloads triggered passively by the Scheduler. The relationship between Peer and Seed Peer is therefore not fixed. A Peer can become a Seed Peer by going back to the source, and a Seed Peer can change its running state to become a Peer; the Scheduler updates the corresponding DAG dynamically. In addition, both Seed Peers and Peers take part in scheduled downloads: the Scheduler may select either a Seed Peer or a Peer as a parent node that serves downloads to other Peers.
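Since any node can serve as a parent of another, the download network has to stay acyclic for it to remain a DAG. The sketch below shows one way such a structure could be kept consistent, by rejecting a parent-to-child edge that would close a cycle. `PeerDAG` and its methods are hypothetical and only illustrate the data structure; they are not the Scheduler's actual code.

```python
from typing import Dict, Set


class PeerDAG:
    """Directed acyclic graph of parent -> child download relationships."""

    def __init__(self) -> None:
        self.children: Dict[str, Set[str]] = {}

    def add_peer(self, peer: str) -> None:
        self.children.setdefault(peer, set())

    def _reaches(self, start: str, target: str) -> bool:
        """Depth-first search: can `target` be reached from `start`?"""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.children.get(node, ()))
        return False

    def add_edge(self, parent: str, child: str) -> bool:
        """Add parent -> child unless it would create a cycle."""
        self.add_peer(parent)
        self.add_peer(child)
        if self._reaches(child, parent):
            return False  # child already feeds parent; edge would loop
        self.children[parent].add(child)
        return True


dag = PeerDAG()
ok_seed = dag.add_edge("seed-peer", "peer-a")   # True
ok_chain = dag.add_edge("peer-a", "peer-b")     # True
rejected = dag.add_edge("peer-b", "seed-peer")  # False, would form a cycle
```

When a node changes role, only its edges need to be updated; the acyclicity check stays the same.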

Dfstore and Dfcache

Dfcache is Dragonfly's caching client. It communicates with Dfdaemon and operates on files in the P2P network, which acts as a caching system. The corresponding Task and DAG can be stored in the Scheduler.

Dfstore is Dragonfly's storage client. It can use different types of object storage services as its Backend to provide a stable storage solution, and currently supports S3 and OSS. By combining the Backend object storage service with P2P acceleration, Dfstore achieves fast writes and fast reads, saves back-to-source and cross-data-center traffic, and reduces pressure on the origin site.

Advantages

Stability

Dragonfly automatically isolates abnormal nodes to improve download stability. Each component communicates with the Manager through Keepalive, so the Manager can ensure that the Scheduler addresses it returns to Peers and the Seed Peer addresses it returns to Schedulers are available. The Manager does not push unavailable Schedulers or Seed Peers to the Peers or Schedulers that need to perform download tasks. This isolates abnormal nodes at the instance level, as shown in the following figure:

[Figure: abnormal instance isolation by the Manager]

In addition, Dragonfly schedules in units of Tasks, which also keeps the whole scheduling process stable. When the Scheduler receives a scheduling request for a new Task, it triggers the Seed Peer to perform a back-to-source download; when it receives a scheduling request for an existing Task, it schedules the optimal set of Parent Peers and returns it to the Peer. This logic ensures that Dragonfly can handle a Task whether or not it has been downloaded before. Furthermore, during scheduling, Peers whose responses are too slow are treated as abnormal nodes and are not returned as Parent Peers. This is exception isolation at the Task level.
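The two branches of Task handling can be sketched as follows. This is a simplified illustration under stated assumptions: `ToyScheduler` is a hypothetical class, a single node named `seed-peer` stands in for the Seed Peer cluster, every requesting Peer is immediately recorded as a possible future Parent, and the filtering and scoring of Parents are left out.

```python
from typing import Dict, List


class ToyScheduler:
    """Illustrates new-Task versus existing-Task handling only."""

    def __init__(self) -> None:
        self.tasks: Dict[str, List[str]] = {}  # task id -> nodes holding it
        self.seed_triggers: List[str] = []     # tasks sent back to source

    def schedule(self, task_id: str, peer: str) -> List[str]:
        if task_id not in self.tasks:
            # New Task: ask the Seed Peer to download from the origin.
            self.seed_triggers.append(task_id)
            self.tasks[task_id] = ["seed-peer"]
        # Existing (or just created) Task: return the other known holders.
        parents = [node for node in self.tasks[task_id] if node != peer]
        if peer not in self.tasks[task_id]:
            # Simplification: the requester becomes a candidate Parent at once.
            self.tasks[task_id].append(peer)
        return parents


scheduler = ToyScheduler()
first = scheduler.schedule("task-1", "peer-a")   # ["seed-peer"]
second = scheduler.schedule("task-1", "peer-b")  # ["seed-peer", "peer-a"]
```

The origin is contacted once per Task, however many Peers later request it.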

Efficiency

Dragonfly uses P2P for internal back-to-source downloads on the server side. P2P download itself distributes the load and minimizes the load on each server node. The following details keep Dragonfly downloads efficient:

  • By scoring each candidate Parent, the Scheduler returns the current locally optimal set of Parents to the Peer, and the Peer downloads from this set.
  • Downloads are based on Tasks. Each Task divides the file to be downloaded into multiple Pieces. After the Peer receives the optimal Parent set, it broadcasts a download request for each Piece to the set. Each Parent in the set responds with the metadata of the requested Piece, and the Peer takes the Parent whose Piece metadata arrives first as the actual download source for that Piece. This approach accounts for changes that may occur between the time the Scheduler returns the available Parents and the time the download is triggered, and it allows the Peer to fetch different Pieces from different download sources.
  • Dfdaemon has a Seed Peer mode and a Peer mode, so a node can switch between the two roles. The number of machines serving as Seed Peers and Peers can be changed as needed, and this dynamic adjustment fits real workloads better.
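The per-Piece source selection described in the list above can be sketched with a deterministic simulation. Instead of real network calls, each Parent is given an assumed response latency, and the "first to respond" Parent is the one with the lowest latency among those that hold the Piece. All names and numbers are hypothetical.

```python
from typing import Dict, Optional, Set


def first_responder(
    piece: int,
    parent_pieces: Dict[str, Set[int]],
    latency_ms: Dict[str, float],
) -> Optional[str]:
    """Return the Parent whose metadata reply for `piece` would arrive first."""
    holders = [
        parent for parent, pieces in parent_pieces.items() if piece in pieces
    ]
    if not holders:
        return None  # no Parent in the set has this Piece yet
    return min(holders, key=lambda parent: latency_ms[parent])


def plan_download(
    piece_count: int,
    parent_pieces: Dict[str, Set[int]],
    latency_ms: Dict[str, float],
) -> Dict[int, Optional[str]]:
    """Choose a download source for every Piece independently."""
    return {
        piece: first_responder(piece, parent_pieces, latency_ms)
        for piece in range(piece_count)
    }


# parent-b is faster, but only parent-a holds Piece 0.
parent_pieces = {"parent-a": {0, 1}, "parent-b": {1, 2}}
latency_ms = {"parent-a": 30.0, "parent-b": 10.0}
sources = plan_download(3, parent_pieces, latency_ms)
# sources == {0: "parent-a", 1: "parent-b", 2: "parent-b"}
```

Because the choice is made per Piece at download time, a Parent that has become slow or unavailable since scheduling simply loses the race for later Pieces.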

Simple to use

Dragonfly can be deployed in several ways: Helm Charts, Docker Compose, Docker images, and binaries. Users can quickly deploy a simple POC with one click, or deploy at production scale with Helm Charts. Each Dragonfly service also exposes comprehensive metrics, and ready-made Grafana templates are provided so that users can observe P2P traffic trends.

Other

As the CNCF's standard solution in the field of image acceleration, Dragonfly can be combined with its sub-project Nydus, which loads images on demand, to maximize image download speed. We will continue working to build out the ecosystem around image acceleration. Thank you to everyone who has taken part in building the community. We hope that more people interested in image acceleration or P2P will join us.

Related Links:

Project address:

https://github.com/dragonflyoss/Dragonfly2

Official website:

https://d7y.io/

Slack:

https://cloud-native.slack.com/messages/dragonfly/

Twitter:

https://twitter.com/dragonfly_oss

Developer Group Email:

dragonfly-developers@googlegroups.com

DingTalk: scan the QR code to join the group, or search for group number 23304666
