Reporter: Hello, readers and friends of Alibaba Cloud Native, we meet again. Today is the last time our old friend "Alibaba Cloud Container Service ACK Distro" joins this column series to explore its life story. It gave us wonderful explanations in the previous interviews, and interested readers are welcome to review them. We have learned that since its launch in December last year, ACK Distro has received a lot of attention and support and achieved a good number of downloads. What do you think about this?
Alibaba Cloud Container Service ACK Distro (ACK Distro for short): Yes, I have been fortunate to receive 400+ downloads in the three months since going online, and I have also exchanged technical ideas with many of you through different channels. Thank you for your attention, and I hope I bring you a better container service experience.
Reporter: Okay, let's get to the point~ I learned earlier that sealer can help you build and deploy quickly, and hybridnet can help you build a unified hybrid-cloud network plane. So who is the versatile friend you are introducing today?
ACK Distro: We all know that stateful applications in a cloud-native context need a storage solution for data persistence. Compared with distributed storage, local storage is better in cost, ease of use, maintainability, and IO performance. So today I will introduce open-local, Alibaba's open source local storage management system, and explain how I use it to get the most out of container local storage. Let me first explain why open-local was born. Despite the advantages of local storage over distributed storage just mentioned, local storage, as the current low-cost way to deliver Kubernetes clusters, still faces many problems:
• Kubernetes lacks awareness of local storage resources: as a "non-standard" resource, local storage receives far less support in Kubernetes than standard resources such as CPU and memory. Using it requires considerable manual effort, such as restricting Pod scheduling by labeling nodes, manually managing disks of different models, and manually mounting specified disks into containers via hostPath. There are also on-site delivery issues for privatized software, such as binding a wrong host path so that faults cannot be found in time, which seriously affects both Kubernetes delivery efficiency and application runtime stability;
• Lack of local storage space isolation: an application may mount an inappropriate host directory (for example, the host root path), causing host failures such as Pod eviction and IO interference between Pods;
• Kubernetes has insufficient support for stateful applications that use local storage: hostPath cannot pin a Pod to a node, so application data is lost after the Pod drifts; semi-automatic static Local PVs can pin Pods to nodes, but the process cannot be fully automated and still requires human involvement (creating folder paths, labeling nodes, and so on); and some advanced storage capabilities, such as snapshots, are unavailable.
open-local avoids these problems to the greatest extent and gives everyone a better experience: using local storage on Kubernetes becomes as simple as using centralized storage.
Architecture composition of open-local
Reporter: Can you further explain the architectural components of open-local to us?
ACK Distro: Of course, open-local contains a total of four components:
1. scheduler-extender: an extension of kube-scheduler implemented through the Extender mechanism. It adds awareness of local storage resources to the native scheduler, so that scheduling decisions can take disk capacity, multiple disks per node, disk media (SSD or HDD), and other information into account, achieving mixed scheduling of storage resources;
2. csi-plugin: local disk management capabilities compliant with the CSI (Container Storage Interface) standard, including creating/deleting/expanding storage volumes, creating/deleting snapshots, exposing storage volume metrics, and so on;
3. agent: runs on every node in the cluster, initializes storage devices according to the configuration list, and reports local storage device information for the scheduler-extender to use in scheduling decisions;
4. controller: obtains the cluster's initial storage configuration and delivers a detailed resource configuration list to the agent running on each node.
At the same time open-local contains two CRDs:
- NodeLocalStorage: open-local reports the storage device information of each node through the NodeLocalStorage resource, which is created by the controller and updated by the agent on each node. This CRD is cluster-scoped.
- NodeLocalStorageInitConfig: the open-local controller creates each NodeLocalStorage resource based on the NodeLocalStorageInitConfig resource, which contains a global default node configuration and per-node configurations. If a node's labels satisfy a per-node expression, that specific configuration is used; otherwise the default configuration is used.
Its architecture is shown in the following diagram:
Open-local usage scenarios
Reporter: So in what kinds of demand scenarios would you use open-local?
ACK Distro: I have summarized the following scenarios; you can check which ones apply to your own situation.
- The application expects the data volume to have capacity isolation capability to avoid situations such as the log filling up the system disk;
- The application needs a large amount of local storage and relies on being pinned to a node, such as HBase, etcd, ZooKeeper, ElasticSearch, etc.;
- The cluster has a large number of local disks, and you want the scheduler to automate the deployment of stateful applications;
- Database applications back up transient data through the storage snapshot capability.
How to use open-local with ACK Distro
Reporter: Next comes the old question: how do the advantages of open-local manifest in you? In other words, how do you achieve best practice with open-local?
ACK Distro: I will explain to you by category~
1. Initialize settings
First, make sure that the lvm tools are installed in the environment. open-local is installed by default when I am deployed. Edit the NodeLocalStorageInitConfig resource to configure storage initialization.
# kubectl edit nlsc open-local
Using open-local requires a VG (VolumeGroup) in the environment. If a VG with free space already exists in your environment, you can add it to the whitelist; if there is no VG, you need to provide a block device name for open-local to create one.
apiVersion: csi.aliyun.com/v1alpha1
kind: NodeLocalStorageInitConfig
metadata:
  name: open-local
spec:
  globalConfig: # global default node configuration, filled into the Spec of each NodeLocalStorage when it is created
    listConfig:
      vgs:
        include: # VolumeGroup whitelist; regular expressions are supported
        - open-local-pool-[0-9]+
        - your-vg-name # if a VG already exists in the environment, whitelist it so open-local can manage it
    resourceToBeInited:
      vgs:
      - devices:
        - /dev/vdc # if there is no VG in the environment, the user must provide a block device
        name: open-local-pool-0 # initialize the block device /dev/vdc as a VG named open-local-pool-0
After the NodeLocalStorageInitConfig resource is edited, the controller and agent will update the NodeLocalStorage resources of all nodes.
2. Storage volume dynamic provisioning
open-local deploys some storage class templates in the cluster by default. I will take open-local-lvm, open-local-lvm-xfs, and open-local-lvm-io-throttling as examples:
# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
open-local-lvm local.csi.aliyun.com Delete WaitForFirstConsumer true 8d
open-local-lvm-xfs local.csi.aliyun.com Delete WaitForFirstConsumer true 6h56m
open-local-lvm-io-throttling local.csi.aliyun.com Delete WaitForFirstConsumer true
Create a StatefulSet that uses the open-local-lvm storage class. The storage volume created this way uses the ext4 file system; if the open-local-lvm-xfs storage class is specified instead, the volume uses xfs.
# kubectl apply -f https://raw.githubusercontent.com/alibaba/open-local/main/example/lvm/sts-nginx.yaml
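For reference, a StatefulSet of this shape only needs to name the storage class in its volumeClaimTemplates. The following is a minimal sketch under the names seen in the output below, not necessarily identical to the manifest at the URL above:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx-lvm
spec:
  serviceName: nginx-lvm
  replicas: 1
  selector:
    matchLabels:
      app: nginx-lvm
  template:
    metadata:
      labels:
        app: nginx-lvm
    spec:
      containers:
      - name: nginx
        image: nginx
        volumeMounts:
        - name: html
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: html
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: open-local-lvm # the key line: use the open-local storage class
      resources:
        requests:
          storage: 5Gi
```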
Check the Pod/PVC/PV status; you can see that the storage volume was created successfully:
# kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-lvm-0 1/1 Running 0 3m5s
# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
html-nginx-lvm-0 Bound local-52f1bab4-d39b-4cde-abad-6c5963b47761 5Gi RWO open-local-lvm 104s
# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS AGE
local-52f1bab4-d39b-4cde-abad-6c5963b47761 5Gi RWO Delete Bound default/html-nginx-lvm-0 open-local-lvm 2m4s
# kubectl describe pvc html-nginx-lvm-0
3. Storage volume expansion
Edit the spec.resources.requests.storage field of the corresponding PVC to expand the storage size declared by the PVC from 5Gi to 20Gi.
# kubectl patch pvc html-nginx-lvm-0 -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'
Check PVC/PV status:
# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
html-nginx-lvm-0 Bound local-52f1bab4-d39b-4cde-abad-6c5963b47761 20Gi RWO open-local-lvm 7h4m
# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
local-52f1bab4-d39b-4cde-abad-6c5963b47761 20Gi RWO Delete Bound default/html-nginx-lvm-0 open-local-lvm 7h4m
4. Storage volume snapshots
open-local has the following snapshot classes:
# kubectl get volumesnapshotclass
NAME DRIVER DELETIONPOLICY AGE
open-local-lvm local.csi.aliyun.com Delete 20m
Create the VolumeSnapshot resource:
# kubectl apply -f https://raw.githubusercontent.com/alibaba/open-local/main/example/lvm/snapshot.yaml
volumesnapshot.snapshot.storage.k8s.io/new-snapshot-test created
# kubectl get volumesnapshot
NAME READYTOUSE SOURCEPVC SOURCESNAPSHOTCONTENT RESTORESIZE SNAPSHOTCLASS SNAPSHOTCONTENT CREATIONTIME AGE
new-snapshot-test true html-nginx-lvm-0 1863 open-local-lvm snapcontent-815def28-8979-408e-86de-1e408033de65 19s 19s
# kubectl get volumesnapshotcontent
NAME READYTOUSE RESTORESIZE DELETIONPOLICY DRIVER VOLUMESNAPSHOTCLASS VOLUMESNAPSHOT AGE
snapcontent-815def28-8979-408e-86de-1e408033de65 true 1863 Delete local.csi.aliyun.com open-local-lvm new-snapshot-test 48s
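The applied VolumeSnapshot manifest is essentially just a snapshot class plus a source PVC. A minimal sketch, with names taken from the output above (the API version may be v1beta1 on older clusters):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: new-snapshot-test
spec:
  volumeSnapshotClassName: open-local-lvm
  source:
    persistentVolumeClaimName: html-nginx-lvm-0 # snapshot the PVC created earlier
```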
Create a new Pod whose storage volume contains the same data as the previous application had at the snapshot point:
# kubectl apply -f https://raw.githubusercontent.com/alibaba/open-local/main/example/lvm/sts-nginx-snap.yaml
service/nginx-lvm-snap created
statefulset.apps/nginx-lvm-snap created
# kubectl get po -l app=nginx-lvm-snap
NAME READY STATUS RESTARTS AGE
nginx-lvm-snap-0 1/1 Running 0 46s
# kubectl get pvc -l app=nginx-lvm-snap
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
html-nginx-lvm-snap-0 Bound local-1c69455d-c50b-422d-a5c0-2eb5c7d0d21b 4Gi RWO open-local-lvm 2m11s
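Restoring from a snapshot works through the standard PVC dataSource field. A sketch of the claim behind the output above (the actual manifest declares this inside the StatefulSet's volumeClaimTemplates and may differ in detail):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: html-nginx-lvm-snap-0
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: open-local-lvm
  dataSource: # restore from the snapshot created above
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: new-snapshot-test
  resources:
    requests:
      storage: 4Gi
```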
5. Native block device
open-local supports mounting the created storage volume into the container as a raw block device (in this example the block device appears at /dev/sdd inside the container):
# kubectl apply -f https://raw.githubusercontent.com/alibaba/open-local/main/example/lvm/sts-block.yaml
Check Pod/PVC/PV status:
# kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-lvm-block-0 1/1 Running 0 25s
# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
html-nginx-lvm-block-0 Bound local-b048c19a-fe0b-455d-9f25-b23fdef03d8c 5Gi RWO open-local-lvm 36s
# kubectl describe pvc html-nginx-lvm-block-0
Name: html-nginx-lvm-block-0
Namespace: default
StorageClass: open-local-lvm
...
Access Modes: RWO
VolumeMode: Block # mounted into the container as a block device
Mounted By: nginx-lvm-block-0
...
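The key difference from the earlier examples is the volumeMode field. A minimal PVC sketch for such a raw block volume (assuming the same storage class; in the Pod template the volume is then consumed via volumeDevices rather than volumeMounts):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: html-nginx-lvm-block-0
spec:
  accessModes: ["ReadWriteOnce"]
  volumeMode: Block # request a raw block device instead of a filesystem
  storageClassName: open-local-lvm
  resources:
    requests:
      storage: 5Gi
# In the consuming Pod, mount it as a device, e.g.:
#   volumeDevices:
#   - name: html
#     devicePath: /dev/sdd
```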
6. IO throttling
open-local supports setting IO throttling for PVs. A storage class template that supports IO throttling looks like this:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: open-local-lvm-io-throttling
provisioner: local.csi.aliyun.com
parameters:
  csi.storage.k8s.io/fstype: ext4
  volumeType: "LVM"
  bps: "1048576" # read/write throughput limited to around 1024 KiB/s
  iops: "1024"   # IOPS limited to around 1024
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
Create a Statefulset that uses the open-local-lvm-io-throttling storage class template.
# kubectl apply -f https://raw.githubusercontent.com/alibaba/open-local/main/example/lvm/sts-io-throttling.yaml
After the Pod is in the Running state, enter the Pod container:
# kubectl exec -it test-io-throttling-0 sh
At this point the storage volume is mounted at /dev/sdd as a native block device. Execute the fio command:
# fio -name=test -filename=/dev/sdd -ioengine=psync -direct=1 -iodepth=1 -thread -bs=16k -rw=readwrite -numjobs=32 -size=1G -runtime=60 -time_based -group_reporting
The results are shown below; you can see that the read/write throughput is limited to around 1024 KiB/s:
......
Run status group 0 (all jobs):
READ: bw=1024KiB/s (1049kB/s), 1024KiB/s-1024KiB/s (1049kB/s-1049kB/s), io=60.4MiB (63.3MB), run=60406-60406msec
WRITE: bw=993KiB/s (1017kB/s), 993KiB/s-993KiB/s (1017kB/s-1017kB/s), io=58.6MiB (61.4MB), run=60406-60406msec
Disk stats (read/write):
dm-1: ios=3869/3749, merge=0/0, ticks=4848/17833, in_queue=22681, util=6.68%, aggrios=3112/3221, aggrmerge=774/631, aggrticks=3921/13598, aggrin_queue=17396, aggrutil=6.75%
vdb: ios=3112/3221, merge=774/631, ticks=3921/13598, in_queue=17396, util=6.75%
7. Ephemeral volumes
open-local supports creating CSI ephemeral volumes for Pods. The lifecycle of an ephemeral volume is tied to that of its Pod: when the Pod is deleted, the ephemeral volume is deleted too. It can be understood as an open-local version of emptyDir.
# kubectl apply -f ./example/lvm/ephemeral.yaml
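A CSI ephemeral volume is declared inline in the Pod spec. The following sketch matches the describe output below; the container image is a placeholder, and the vgName assumes the VG created during initialization:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: file-server
spec:
  containers:
  - name: file-server
    image: nginx # placeholder image; the example manifest may use a different one
    volumeMounts:
    - name: webroot
      mountPath: /srv
  volumes:
  - name: webroot
    csi: # inline CSI ephemeral volume; its lifecycle is tied to the Pod
      driver: local.csi.aliyun.com
      volumeAttributes:
        size: 2Gi
        vgName: open-local-pool-0
```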
The result is as follows:
# kubectl describe po file-server
Name: file-server
Namespace: default
......
Containers:
file-server:
......
Mounts:
/srv from webroot (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-dns4c (ro)
Volumes:
webroot: # this is a CSI ephemeral volume
Type: CSI (a Container Storage Interface (CSI) volume source)
Driver: local.csi.aliyun.com
FSType:
ReadOnly: false
VolumeAttributes: size=2Gi
vgName=open-local-pool-0
default-token-dns4c:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-dns4c
Optional: false
8. Monitoring dashboard
open-local ships with a monitoring dashboard. Users can view the cluster's local storage information through Grafana, including information about storage devices and storage volumes, as shown below:
ACK Distro: All in all, with the help of open-local, labor costs are reduced in operations and maintenance and the stability of the cluster runtime is improved. Functionally, the advantages of local storage are maximized: users not only experience the high performance of local disks, but various advanced storage features also enrich the application scenarios, letting developers enjoy the dividends of cloud native and take the key step of moving applications, especially stateful applications, to cloud-native deployment.
Reporter: Thanks to ACK Distro for the wonderful explanation. These three visits have given us a deeper understanding of it and its friends, and I hope the interviews can provide some help to you who are reading this article.
ACK Distro: Yes! The project team members and I welcome everyone to come "bother" us in the GitHub community and our other communities!
Related Links
[1] open-local open source repository:
https://github.com/alibaba/open-local
[2]ACK Distro official website:
https://www.aliyun.com/product/aliware/ackdistro
[3]ACK Distro official GitHub:
https://github.com/AliyunContainerService/ackdistro
[4] Put innovation within reach: Alibaba Cloud Container Service ACK Distro is open for free download: https://mp.weixin.qq.com/s/Lc2afj91sykHMDLUKA_0bw
[5] The first in-depth interview:
https://mp.weixin.qq.com/s/wB7AS52vEA_VeRegUyrVkA
[6] The second in-depth interview:
https://mp.weixin.qq.com/s/O095yS5xPtawkh55rvitTg