
Author | Vineyard Team
Source | Alibaba Cloud Native official account

Vineyard is a distributed engine that provides in-memory data sharing for end-to-end workflows in big data analysis scenarios in cloud native environments. We are pleased to announce that on April 27, 2021, the Cloud Native Computing Foundation (CNCF) TOC accepted Vineyard as a sandbox project.

Vineyard open source repository:
https://github.com/alibaba/v6d

Project Introduction

In existing big data analysis scenarios, end-to-end workflows usually rely on distributed file systems or object storage systems such as HDFS, S3, and OSS to share intermediate data between subtasks. This approach hurts both runtime efficiency and development efficiency. Take the risk control workflow shown in the figure below as an example:

1.jpg

  1. To share intermediate data between tasks in the workflow, the previous task writes its results to the file system, and once it has finished, the next task reads the files as input. This process incurs additional serialization and deserialization as well as memory copy, network, and IO overhead. From historical tasks, we observed that more than 60% of tasks spend more than 40% of their execution time on this.
  2. In production, a new system (such as a distributed graph computing engine) is often introduced to solve one particular class of problems efficiently, but such a system is usually hard to connect seamlessly with the other systems in the workflow, which requires a lot of repeated development work on IO, data format conversion, and adaptation.
  3. Sharing data through an external file system interrupts the workflow, because the next task can usually start reading and computing only after the previous task has completely written all of its results. This makes cross-task pipeline parallelism impossible.
  4. When existing distributed file systems are used to share intermediate data, especially in a cloud native environment, they do not handle the locality of distributed data well. This wastes network bandwidth and reduces end-to-end execution efficiency.

To solve these problems in existing big data analysis workflows, we designed and implemented Vineyard, a distributed in-memory data sharing engine.

2.jpg

Vineyard addresses the above issues from the following four perspectives:

  1. To make data sharing between tasks in an end-to-end workflow more efficient, Vineyard supports zero-copy data sharing between systems through memory mapping, eliminating the additional IO overhead.
  2. To simplify the adaptation and development work needed to connect a new computing engine to existing systems, Vineyard provides out-of-the-box abstractions for common data types, such as Tensor, DataFrame, and Graph, so that sharing intermediate results between different computing engines no longer requires additional serialization and deserialization. Vineyard also implements reusable components such as IO, data migration, and snapshots as plug-ins that can be registered in a computing engine on demand, reducing development costs that have nothing to do with the computing engine itself.
  3. Vineyard provides a series of operators for more efficient and flexible data sharing. For example, the Pipeline operator implements pipeline parallelism across tasks, so that a subsequent task can start computing while the output of the previous task is still being generated, which improves overall end-to-end efficiency.
  4. Vineyard is integrated with Kubernetes. Through a Scheduler Plugin, task scheduling becomes aware of the locality of the required data. In Kubernetes, the Pod of a task is scheduled, as far as possible, onto the machine that holds the input data the Pod needs, which reduces the network overhead of data migration and improves end-to-end performance.
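
The zero-copy idea in point 1 can be illustrated with a minimal Python sketch. It is not Vineyard's implementation: it uses the standard library's `multiprocessing.shared_memory` as a stand-in for memory mapping, and the function names `publish`, `attach`, and `demo` are hypothetical. The producer writes its result once into a shared segment, and the consumer maps the same segment and reads it in place instead of going through a file.

```python
import struct
from multiprocessing import shared_memory

ITEM = struct.calcsize("d")  # bytes per float64 element


def publish(values):
    """Producer side: write the values once into a fresh shared-memory segment."""
    shm = shared_memory.SharedMemory(create=True, size=len(values) * ITEM)
    struct.pack_into(f"{len(values)}d", shm.buf, 0, *values)
    return shm


def attach(name, count):
    """Consumer side: map the same segment and view it as float64 without copying."""
    shm = shared_memory.SharedMemory(name=name)
    # The segment may be rounded up to a page size, so slice to the exact length.
    view = shm.buf[: count * ITEM].cast("d")
    return shm, view


def demo():
    producer = publish([1.0, 2.0, 3.0])
    consumer, view = attach(producer.name, 3)
    try:
        before = list(view)
        # A write through the producer's mapping is visible through the
        # consumer's view, which shows both refer to the same memory.
        struct.pack_into("d", producer.buf, 0, 42.0)
        after = list(view)
    finally:
        view.release()     # views must be released before closing the mapping
        consumer.close()
        producer.close()
        producer.unlink()  # free the segment
    return before, after
```

In a real workflow the producer and consumer would be separate processes (or Pods on the same node), and only the segment name and the element count would need to be passed between them.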

In a preliminary comparison experiment against using HDFS to share intermediate data, Vineyard greatly reduced the extra overhead of exchanging intermediate results on the evaluation tasks, and the end-to-end time of the entire workflow improved by 1.34 times.

Core functions

Next, we introduce the core functions of Vineyard from two aspects: the design and implementation of Vineyard's core, and how Vineyard supports big data analysis tasks in a cloud native environment.

1. Distributed in-memory data sharing

Vineyard represents data in memory as Objects. An Object can be Local or Global. Take the distributed execution engines Mars and Dask as examples. A DataFrame is often split into many Chunks to take advantage of the computing power of multiple machines, and each machine holds multiple Chunks. These Chunks are LocalObjects in Vineyard, and together they form a global view, namely a GlobalDataFrame. This GlobalDataFrame can be shared directly with other computing engines, such as GraphScope, as the input of graph data. With these data type abstractions, different computing engines on Vineyard can seamlessly share intermediate results and use the output of one task directly as the input of the next task.

More specifically, how is a specific type of Object represented in Vineyard so that it can be easily adapted to different computing engines? This relies on the flexibility of Vineyard's Object representation. In Vineyard, an Object consists of two parts: Metadata and a set of Blobs. The Blobs store the actual data, and the Metadata describes the semantics of these Blobs. For example, for a Tensor, a Blob is a contiguous region of memory that stores all the elements of the Tensor, and the Metadata records the Tensor's element type, shape, and whether it is row-major or column-major. In Python, this Object can be interpreted as a NumPy ndarray, and in C++ it can be interpreted as a tensor in xtensor. Across the SDKs of these two programming languages, sharing this Tensor incurs no additional IO, copy, serialization/deserialization, or type conversion overhead.
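
The split between Metadata and Blob can be sketched in a few lines of Python. This is only an illustration under assumptions: the class names `TensorMeta` and `TensorObject`, their fields, and the helper `element` are hypothetical and are not Vineyard's SDK. The point is that the Blob is just bytes, and the Metadata alone decides how those bytes are read.

```python
from array import array
from dataclasses import dataclass


@dataclass
class TensorMeta:
    dtype: str    # struct-style format code, e.g. "d" for float64
    shape: tuple  # (rows, cols) for the 2-D case shown here
    order: str    # "C" for row-major, "F" for column-major


@dataclass
class TensorObject:
    meta: TensorMeta
    blob: bytearray  # one contiguous buffer holding every element


def element(obj, i, j):
    """Read element (i, j) of a 2-D tensor directly from the blob, without copying it."""
    rows, cols = obj.meta.shape
    flat = memoryview(obj.blob).cast(obj.meta.dtype)
    index = i * cols + j if obj.meta.order == "C" else j * rows + i
    return flat[index]


# The same six float64 values, interpreted under two different layouts.
blob = bytearray(array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).tobytes())
row_major = TensorObject(TensorMeta("d", (2, 3), "C"), blob)
col_major = TensorObject(TensorMeta("d", (2, 3), "F"), blob)
```

Here `element(row_major, 1, 0)` reads the fourth stored value, while `element(col_major, 1, 0)` reads the second, even though both objects share one and the same buffer.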

At the same time, Metadata in Vineyard can be nested, which makes it easy to describe arbitrarily complex data types as Vineyard Objects without restricting the expressive power of the computing engine. Take GlobalDataFrame as an example: the figure below shows the structure of its Metadata.

3.png
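
As a rough text counterpart to the figure, nested Metadata can be modeled as a tree of dictionaries. The keys (`typename`, `members`, `host`, `id`, `column`), the node names, and the blob identifiers in this Python sketch are all made up for illustration and do not reproduce Vineyard's actual metadata schema.

```python
def node(typename, members=(), **fields):
    """Build one metadata node; members are nested metadata nodes."""
    return {"typename": typename, "members": list(members), **fields}


# A GlobalDataFrame made of two local DataFrame chunks, each with two column blobs.
global_df = node("GlobalDataFrame", [
    node("DataFrame", host="node-1", members=[
        node("Blob", id="blob-a", column="x"),
        node("Blob", id="blob-b", column="y"),
    ]),
    node("DataFrame", host="node-2", members=[
        node("Blob", id="blob-c", column="x"),
        node("Blob", id="blob-d", column="y"),
    ]),
])


def blobs_by_host(meta, host=None):
    """Walk the nested metadata and group blob ids by the host that stores them."""
    host = meta.get("host", host)  # children inherit the host of their chunk
    if meta["typename"] == "Blob":
        return {host: [meta["id"]]}
    found = {}
    for member in meta["members"]:
        for member_host, ids in blobs_by_host(member, host).items():
            found.setdefault(member_host, []).extend(ids)
    return found
```

Because every level is described the same way, a consumer can find out where each piece of a global object lives by walking the Metadata alone, without touching any Blob.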

2. Collaborative scheduling of data and tasks in a cloud native environment

For a big data analysis pipeline deployed in production, data sharing between tasks alone is not enough. In a cloud environment, Kubernetes considers only the required resource constraints when it schedules the subtasks of an end-to-end pipeline. Co-location of two consecutive tasks therefore cannot be guaranteed, and sharing intermediate results between them still incurs network overhead from data migration. As shown in the figure below, when Task B runs, the Pods of the two tasks are not co-located, so data shards A3 and A4 need to be migrated to the Vineyard instance where Task B's Pod is located.

4.png

To address this, Vineyard exposes the data in the cluster (Vineyard Objects) as observable resources through CRDs, and designs and implements a scheduler plug-in, based on the Kubernetes Scheduler Framework, that takes data locality into account. After Task A completes, the scheduler plug-in can learn the location of all shards from the Metadata of the result Object. When the next task starts, the scheduler gives higher priority to the nodes where the data is located (Node 1 and Node 2 in the figure), so Task B is scheduled onto those nodes as far as possible. This eliminates the additional overhead of data migration and improves end-to-end performance.
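
The scoring idea behind such a plug-in can be sketched as follows. Real Kubernetes scheduler plug-ins are written in Go against the Scheduler Framework, so this Python snippet is only a simplified model. The function names, the shard placement, and the rule "score equals the number of local shards" are assumptions for illustration, not Vineyard's actual policy.

```python
from collections import Counter


def locality_scores(shard_locations, nodes):
    """Score each candidate node by how many input shards it already holds."""
    held = Counter(shard_locations.values())
    return {node: held[node] for node in nodes}


def rank_nodes(shard_locations, nodes):
    """Order candidate nodes from most to fewest local shards (ties keep input order)."""
    scores = locality_scores(shard_locations, nodes)
    return sorted(nodes, key=lambda node: -scores[node])


# Hypothetical placement of Task A's output shards, read from the result metadata.
task_a_output = {"A1": "node-1", "A2": "node-1", "A3": "node-2", "A4": "node-2"}
candidates = ["node-3", "node-1", "node-2"]
```

With this placement, `rank_nodes(task_a_output, candidates)` puts `node-1` and `node-2` ahead of `node-3`, so the Pods of Task B would preferentially land where their input shards already are.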

Get started quickly

Vineyard integrates with Helm for easy installation and deployment:

helm repo add vineyard https://vineyard.oss-ap-southeast-1.aliyuncs.com/charts/
helm install vineyard vineyard/vineyard

After installation, a Vineyard DaemonSet is deployed in the cluster, and it exposes a UNIX domain socket for shared memory and IPC communication with the application's task Pods.

You can also watch Vineyard's demo video:
https://www.youtube.com/watch?v=vPbF1l5nwwQ&list=PLj6h78yzYM2NoiNaLVZxr-ERc1ifKP7n6&t=585

Future outlook

Vineyard has been used as the storage engine of the distributed scientific computing engine Mars and the one-stop graph computing system GraphScope. Vineyard's support for big data analysis tasks depends on close interaction with the cloud native community. In the future, Vineyard will further improve its integration with other community projects, such as Kubeflow and Fluid, to help bring more big data analysis tasks to the cloud.

Vineyard will continue to work alongside the community, listen to and act on community feedback, and remains committed to promoting the ecosystem and adoption of cloud native technology in the field of big data analysis. Everyone is welcome to follow the Vineyard project, join the Vineyard community, and take part in building the project together!
