Background

qGPU is a GPU sharing technology launched by Tencent Cloud. It allows multiple containers to share a single GPU card while isolating each container's GPU memory and computing power, so that workloads can safely use GPU cards at a finer granularity, increasing GPU utilization and reducing customer costs.

qGPU on TKE is built on the Nano GPU scheduling framework, open sourced by Tencent Cloud TKE, which enables fine-grained scheduling of GPU computing power and GPU memory, and supports both sharing a GPU among multiple containers and allocating resources to a container across multiple GPUs. At the same time, the underlying qGPU isolation technology provides strong isolation of GPU memory and computing power, so that workloads sharing a GPU are disturbed as little as possible in performance and resources.

Functional Advantages

The qGPU solution schedules tasks on NVIDIA GPU cards more effectively so that multiple containers can share a single card. It provides the following capabilities:

Flexibility: users can freely configure the GPU memory size and computing power ratio of each container

Cloud native: supports standard Kubernetes and is compatible with the NVIDIA Docker solution

Compatibility: no image modification, no CUDA library replacement, no application recompilation; easy to deploy and transparent to the workload

High performance: operates on the GPU device at the bottom layer for high efficiency, with near-zero throughput loss

Strong isolation: supports strict isolation of GPU memory and computing power, so workloads sharing a card do not affect each other

Technology Architecture

qGPU on TKE uses the Nano GPU scheduling framework, which extends the Kubernetes scheduling mechanism to schedule GPU computing power and GPU memory as resources. Relying on Nano GPU's container-locating mechanism, it supports fine-grained GPU card scheduling, including sharing a single GPU card among multiple containers and allocating GPU resources to a single container across multiple cards.

qGPU schedules directly on top of the underlying hardware features of NVIDIA GPUs to achieve fine-grained computing power isolation. This overcomes the limitation of traditional CUDA API hijacking schemes, which can only isolate computing power at the granularity of a CUDA kernel, and thus provides better QoS guarantees.
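To illustrate how such GPU sharing is typically consumed in Kubernetes, the sketch below shows a Pod requesting a slice of a card's memory and computing power through extended resources. The resource names `tke.cloud.tencent.com/qgpu-memory` and `tke.cloud.tencent.com/qgpu-core` and their units are assumptions for illustration only, not the confirmed qGPU API:

```yaml
# Hypothetical sketch: a Pod requesting a GPU slice via extended resources.
# Resource names and units below are assumed for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: qgpu-demo
spec:
  containers:
    - name: cuda-app
      image: nvidia/cuda:11.4.3-base-ubuntu20.04
      command: ["sleep", "infinity"]
      resources:
        limits:
          tke.cloud.tencent.com/qgpu-memory: "4"   # assumed: 4 GiB of GPU memory
          tke.cloud.tencent.com/qgpu-core: "30"    # assumed: 30% of one card's compute
```

Because the extended scheduler accounts for these resources per card, several such Pods can land on the same physical GPU as long as their combined memory and compute requests fit.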

Customer Benefits

  1. Multiple tasks flexibly share a GPU, improving utilization
  2. Strong isolation of GPU resources, so shared workloads are unaffected by each other
  3. Fully Kubernetes-native, with zero adoption cost for workloads

Future Plans

Fine-grained resource monitoring: qGPU on TKE will support collecting GPU usage at the Pod and container level, enabling finer-grained resource monitoring and integration with GPU elasticity capabilities

Online/offline colocation: qGPU on TKE will support colocating high-priority online services with low-priority offline services, maximizing GPU utilization

qGPU computing power pooling: GPU computing power will be pooled based on qGPU, decoupling CPU and memory resources from heterogeneous computing resources

Closed Beta Application

qGPU is now in free closed beta. To apply for a trial, add the Tencent Cloud Native assistant (WeChat ID: TKEplatform) with the note "qGPU closed beta application".

About Us

For more cloud native cases and knowledge, follow the WeChat public account [Tencent Cloud Native].

Benefit: reply [Manual] in the public account backstage to get the "Tencent Cloud Native Roadmap Manual" and "Tencent Cloud Native Best Practices".

[Tencent Cloud Native]: new product briefings, new technology insights, new activities, and cloud news. Scan the QR code to follow the public account and get more first-hand content!
