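Introduction
This setup implements CI/CD on top of Kubernetes and Jenkins. Every Jenkins slave that runs a job is a pod created dynamically from a template, and it is deleted automatically once the job finishes.
Overall system architecture
Job flow
Environment
Kubernetes
Jenkins configuration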
jenkins-deployment.yaml
apiVersion: "apps/v1beta1"
kind: "Deployment"
metadata:
  name: "jenkins"
  labels:
    name: "jenkins"
spec:
  replicas: 1
  template:
    metadata:
      name: "jenkins"
      labels:
        name: "jenkins"
    spec:
      containers:
      - name: jenkins
        image: jenkinsci/jenkins:2.154
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: jenkins-home
          mountPath: /var/jenkins_home
        env:
        - name: TZ
          value: Asia/Shanghai
        ports:
        - containerPort: 8080
          name: web
        - containerPort: 50000
          name: agent
      volumes:
      - name: jenkins-home
        nfs:
          path: "/nfs/jenkins/data"
          server: "cpu029.hogpu.cc"
      terminationGracePeriodSeconds: 10
      serviceAccountName: jenkins
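The apps/v1beta1 API used by this Deployment was removed in Kubernetes 1.16. On a newer cluster, the same Deployment could be written against apps/v1, which also requires an explicit spec.selector matching the pod labels. The following is a sketch under that assumption; all other values are carried over unchanged from jenkins-deployment.yaml.

```yaml
# Sketch: jenkins-deployment.yaml ported to apps/v1 (assumes Kubernetes >= 1.16).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins
  labels:
    name: jenkins
spec:
  replicas: 1
  # apps/v1 requires a selector; it must match the template labels below.
  selector:
    matchLabels:
      name: jenkins
  template:
    metadata:
      labels:
        name: jenkins
    spec:
      serviceAccountName: jenkins
      terminationGracePeriodSeconds: 10
      containers:
        - name: jenkins
          image: jenkinsci/jenkins:2.154
          imagePullPolicy: IfNotPresent
          env:
            - name: TZ
              value: Asia/Shanghai
          ports:
            - containerPort: 8080
              name: web
            - containerPort: 50000
              name: agent
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-home
          nfs:
            path: /nfs/jenkins/data
            server: cpu029.hogpu.cc
```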
jenkins-account.yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins
---
# rbac.authorization.k8s.io/v1beta1 was removed in Kubernetes 1.22;
# newer clusters need rbac.authorization.k8s.io/v1 here and in the RoleBinding.
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: jenkins
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["create","delete","get","list","patch","update","watch"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create","delete","get","list","patch","update","watch"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get","list","watch"]
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: jenkins
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: jenkins
subjects:
- kind: ServiceAccount
  name: jenkins
jenkins-service.yaml
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: jenkins
  name: jenkins
spec:
  type: NodePort
  ports:
  - port: 8080
    name: web
    targetPort: 8080
  - port: 50000
    name: agent
    targetPort: 50000
  selector:
    name: jenkins
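Notes
The Service exposes ports 8080 and 50000. Port 8080 serves the Jenkins web UI. Port 50000 is the default port over which the dynamically created Jenkins slaves connect to and communicate with the master; if it is not exposed, the slaves cannot connect to the master. The ports are exposed via NodePort without fixed port numbers, so Kubernetes assigns them. You can also specify unused port numbers yourself (within the range 30000-32767).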
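To pin the port numbers instead of letting Kubernetes assign them, a nodePort field can be added to each port entry. The sketch below repeats jenkins-service.yaml with two example values; 30080 and 30050 are assumptions chosen for illustration and must be free on the cluster.

```yaml
# Sketch: jenkins-service.yaml with fixed NodePorts (port values are illustrative).
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: jenkins
  name: jenkins
spec:
  type: NodePort
  ports:
    - port: 8080
      name: web
      targetPort: 8080
      nodePort: 30080   # assumed value; any unused port in 30000-32767
    - port: 50000
      name: agent
      targetPort: 50000
      nodePort: 30050   # assumed value; any unused port in 30000-32767
  selector:
    name: jenkins
```

With fixed values, the URL of the Jenkins page stays the same if the Service is deleted and recreated.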
Creating Jenkins
Next, create the Jenkins resources from the command line with kubectl.
$ kubectl create namespace kubernetes-plugin
$ kubectl config set-context $(kubectl config current-context) --namespace=kubernetes-plugin
$ kubectl create -f jenkins-deployment.yaml
$ kubectl create -f jenkins-account.yaml
$ kubectl create -f jenkins-service.yaml
Note:
The first two commands create a new namespace named kubernetes-plugin and set it as the namespace of the current context, so all subsequent kubectl commands run in that namespace automatically.
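As an alternative to switching the context, the namespace can be declared in the manifests themselves, so that the result does not depend on the caller's current context. A minimal sketch, assuming the same namespace name as above; the same metadata.namespace field would be added to the Deployment, Role, RoleBinding and Service as well.

```yaml
# Sketch: declare the namespace and pin a resource to it explicitly.
apiVersion: v1
kind: Namespace
metadata:
  name: kubernetes-plugin
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins
  namespace: kubernetes-plugin   # pinned here instead of via the kubectl context
```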
Checking the status
[jianyu.tian@yz-gpu-k8s004 ~]$ kubectl get deployment,svc,pods
NAME             DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/jenkins   1         1         1            1           1h

NAME          TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)                          AGE
svc/jenkins   NodePort   10.106.235.91   <none>        8080:31051/TCP,50000:30545/TCP   2h

NAME                          READY   STATUS    RESTARTS   AGE
po/jenkins-64564fc5c9-pzlpb   1/1     Running   0          1h
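Note:
The Jenkins master is now running, with its ports exposed as 8080:31051 and 50000:30545. The Jenkins page can be opened in a browser at http://<Node_IP>:31051, where 31051 is the NodePort mapped to port 8080 and <Node_IP> is the address of any cluster node.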
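Initializing the Jenkins web UI
1. This section mainly covers the Jenkins Kubernetes plugin.
After the plugin is installed, click "Manage Jenkins" —> "Configure System" —> "Add a new cloud" —> select "Kubernetes", then fill in the Kubernetes and Jenkins settings.
Notes:
Name defaults to kubernetes and can be changed. If you change it, the cloud parameter of podTemplate() must be set to the same name when a job runs, otherwise the cloud will not be found. The default value of cloud is kubernetes.
For Kubernetes URL I entered https://kubernetes.default.sv... This is the DNS record of the Kubernetes Service, which resolves to that Service's Cluster IP. Alternatively, enter the address of an external Kubernetes cluster directly: https://<ClusterIP>:<Ports>.
For Jenkins URL I entered http://jenkins.kubernetes-plugin:8080. As above, this is the DNS record of the Jenkins Service, with port 8080 because that is the port the Service exposes. http://<Node_IP>:<Node_Port> can be used as well.
When the configuration is complete, click the "Test Connection" button to check whether Jenkins can reach Kubernetes. If it shows Connection test successful, the connection works and the configuration is correct.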
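The same cloud settings can also be kept in a file rather than entered by hand. The sketch below assumes the Jenkins Configuration as Code (JCasC) plugin is installed, which the setup above does not use; it mirrors the fields filled in through the web UI. The value <KUBERNETES_URL> is a placeholder for the Kubernetes URL entered above.

```yaml
# Sketch: Kubernetes cloud definition for the Configuration as Code plugin
# (assumes JCasC is installed; not part of the original setup).
jenkins:
  clouds:
    - kubernetes:
        name: "kubernetes"                      # must match the cloud name used by jobs
        serverUrl: "<KUBERNETES_URL>"           # placeholder: the Kubernetes URL from the UI
        namespace: "kubernetes-plugin"          # namespace where agent pods are created
        jenkinsUrl: "http://jenkins.kubernetes-plugin:8080"
```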
Testing
Create a job of type Pipeline:
pipeline {
  agent any
  // Run the two builds in parallel
  stages {
    stage("test_all") {
      parallel {
        stage("python3-cuda9.2") {
          agent {
            kubernetes {
              label 'mxnet-python3-cuda9'
              yaml """
                apiVersion: "v1"
                kind: "Pod"
                metadata:
                  labels:
                    name: "mxnet-python3-cuda9"
                spec:
                  affinity:
                    nodeAffinity:
                      requiredDuringSchedulingIgnoredDuringExecution:
                        nodeSelectorTerms:
                        - matchExpressions:
                          - key: hobot.workas
                            operator: In
                            values:
                            - gpu
                          - key: kubernetes.io/nvidia-gpu-name
                            operator: In
                            values:
                            - TITAN_V
                  containers:
                  - name: mxnetone
                    image: docker.hobot.cc/dlp/mxnetci:runtime-py3.6-cudnn7.3-cuda9.2-centos7
                    imagePullPolicy: Always
                    resources:
                      limits:
                        nvidia.com/gpu: 1
              """
            }
          }
          stages {
            stage("Checkout") {
              steps {
                container("mxnetone") {
                  checkout(
                    [
                      $class: 'GitSCM',
                      branches: [[name: 'nnvm']],
                      browser: [$class: 'Phabricator', repo: 'rMXNET', repoUrl: ''],
                      doGenerateSubmoduleConfigurations: false,
                      extensions: [[$class: 'SubmoduleOption', disableSubmodules: false, parentCredentials: true, recursiveSubmodules: true, reference: '', trackingSubmodules: false]],
                      submoduleCfg: [],
                      userRemoteConfigs: [[credentialsId: 'zhaoming_private', url: '']]
                    ]
                  )
                }
              }
            }
            stage("Build") {
              steps {
                container("mxnetone") {
                  sh """
                    nvidia-smi
                    source /root/.bashrc
                    make deps
                    echo -e "USE_PROFILER=1\nUSE_GLOG=0\nUSE_HDFS=0" >> ./make/config.mk
                    sed -i "s#USE_CUDA_PATH = /usr/local/cuda-8.0#USE_CUDA_PATH = /usr/local/cuda-9.2#g" ./make/config.mk
                    make lint
                    make -j 12
                    ln -s /home/data ./
                    make test | tee unittest.log
                  """
                }
              }
            }
            stage("Unit tests") {
              steps {
                container("mxnetone") {
                  sh """
                    cp -rf python/mxnet ./
                    cp -f lib/libmxnet.so mxnet/
                    echo "-------Running tests under Python3-------"
                    python3 -V
                    python3 `which nosetests` tests/python/train
                    python3 `which nosetests` -v -d tests/python/unittest
                  """
                }
              }
            }
          }
        }
        stage("python2-cuda9.2") {
          agent {
            kubernetes {
              label 'mxnet-python2-cuda9'
              yaml """
                apiVersion: "v1"
                kind: "Pod"
                metadata:
                  labels:
                    name: "mxnet-python2-cuda9"
                spec:
                  affinity:
                    nodeAffinity:
                      requiredDuringSchedulingIgnoredDuringExecution:
                        nodeSelectorTerms:
                        - matchExpressions:
                          - key: hobot.workas
                            operator: In
                            values:
                            - gpu
                          - key: kubernetes.io/nvidia-gpu-name
                            operator: In
                            values:
                            - TITAN_V
                  containers:
                  - name: mxnettwo
                    image: docker.hobot.cc/dlp/mxnetci:runtime-cudnn7.3-cuda9.2-centos7
                    imagePullPolicy: Always
                    resources:
                      limits:
                        nvidia.com/gpu: 1
              """
            }
          }
          stages {
            stage("Checkout") {
              steps {
                container("mxnettwo") {
                  checkout(
                    [
                      $class: 'GitSCM',
                      branches: [[name: 'nnvm']],
                      browser: [$class: 'Phabricator', repo: 'rMXNET', repoUrl: ''],
                      doGenerateSubmoduleConfigurations: false,
                      extensions: [[$class: 'SubmoduleOption', disableSubmodules: false, parentCredentials: true, recursiveSubmodules: true, reference: '', trackingSubmodules: false]],
                      submoduleCfg: [],
                      userRemoteConfigs: [[credentialsId: 'zhaoming_private', url: '']]
                    ]
                  )
                }
              }
            }
            stage("Build") {
              steps {
                container("mxnettwo") {
                  sh """
                    nvidia-smi
                    pip2 install numpy==1.14.3 -i https://mirrors.aliyun.com/pypi/simple/
                    source /root/.bashrc
                    make deps
                    echo -e "USE_PROFILER=1\nUSE_GLOG=0\nUSE_HDFS=0" >> ./make/config.mk
                    sed -i "s#USE_CUDA_PATH = /usr/local/cuda-8.0#USE_CUDA_PATH = /usr/local/cuda-9.2#g" ./make/config.mk
                    make lint
                    make -j 12
                    ln -s /home/data ./
                    make test | tee unittest.log
                  """
                }
              }
            }
            stage("Unit tests") {
              steps {
                container("mxnettwo") {
                  sh """
                    cp -rf python/mxnet ./
                    cp -f lib/libmxnet.so mxnet/
                    echo "-------Running tests under Python2-------"
                    python2 -V
                    python2 `which nosetests` tests/python/train
                    python2 `which nosetests` -v -d tests/python/unittest
                  """
                }
              }
            }
          }
        }
      }
    }
  }
}
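The pod definitions in this pipeline can only be scheduled on nodes carrying the GPU labels shown. To check the dynamic agent mechanism on its own first, the yaml block of an agent could be replaced by a pod without GPU requirements. The following is a minimal sketch; the label, the container name shell and the image centos:7 are assumptions for illustration, not part of the original job.

```yaml
# Sketch: minimal agent pod without GPU affinity (all names are illustrative).
apiVersion: v1
kind: Pod
metadata:
  labels:
    name: smoke-test        # hypothetical label
spec:
  containers:
    - name: shell           # hypothetical name; steps would use container("shell")
      image: centos:7       # assumed public image
      command: ["cat"]      # keeps the container alive so steps can run inside it
      tty: true
```

The command and tty settings matter for images whose default command exits immediately; without them the container would stop before any step runs in it.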