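Overview
This test verifies the kubeProxyReplacement capability of Cilium when it is deployed in chaining mode (Cilium Chain) on a Kubernetes cluster that uses Contiv Netplugin.
Test Environment
Several nodes in the test cluster run kernel 5.10 or later, which satisfies the requirements for installing Cilium.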
# kubectl get node --show-labels|grep -i open
10.189.212.124 Ready 209d v1.20.3-vip.2 kubernetes.io/hostname=10.189.212.124,kubernetes.io/os=linux,nansha=true,os-type=openeuler,status=online
10.189.212.125 Ready 195d v1.20.3-vip.2 kubernetes.io/hostname=10.189.212.125,kubernetes.io/os=linux,os-type=openeuler,osp-proxy=true,production=true,status=online
10.189.222.60 Ready 191d v1.20.3-vip.2 kubernetes.io/hostname=10.189.222.60,kubernetes.io/os=linux,os-type=openeuler,usetype=jenkins-openeuler
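The kernel requirement can also be checked with a small script before a node is picked. The following is a minimal sketch and not part of the original test: `kernel_at_least` is a hypothetical helper, the sample release strings are made up for illustration, and the 5.10 threshold is simply the one quoted above.

```shell
#!/bin/sh
# kernel_at_least RELEASE MAJOR MINOR
# Succeeds when a `uname -r`-style RELEASE (e.g. "5.10.0-60.x86_64") is at
# least MAJOR.MINOR. Hypothetical helper; assumes RELEASE starts with "N.N".
kernel_at_least() {
    release=$1 want_major=$2 want_minor=$3
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of the remainder
    [ "$major" -gt "$want_major" ] && return 0
    [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]
}

# Illustrative release strings (not taken from the test cluster).
for r in 5.10.0-60.18.0.50.oe2203.x86_64 4.19.90-2107.6.0.0100.oe1.x86_64; do
    if kernel_at_least "$r" 5 10; then echo "$r: ok"; else echo "$r: too old"; fi
done

# On a live cluster the release strings could be listed with:
#   kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.nodeInfo.kernelVersion}{"\n"}{end}'
```

Run directly on a node, the check would be `kernel_at_least "$(uname -r)" 5 10`.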
Node 10.189.212.125 was chosen to host the cilium-agent. Because it was running many stateless test Pods, a taint was added to the node to evict them.
kubectl taint nodes 10.189.212.125 key=cilium:NoExecute
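A NoExecute taint evicts running Pods that do not tolerate it and keeps new ones off the node, so every workload meant to run on 10.189.212.125 needs a matching toleration. The sketch below is a simplified model of the Kubernetes matching rule written in plain shell; `tolerates` is a hypothetical helper for reasoning about manifests, not something the scheduler exposes.

```shell
#!/bin/sh
# tolerates T_KEY T_VALUE T_EFFECT TOL_KEY TOL_OPERATOR TOL_VALUE TOL_EFFECT
# Succeeds when the toleration matches the taint:
#   - an empty toleration effect matches every taint effect;
#   - operator Exists ignores the value, and an empty key matches every key;
#   - operator Equal (the default) requires identical key and value.
tolerates() {
    t_key=$1 t_val=$2 t_eff=$3 o_key=$4 o_op=$5 o_val=$6 o_eff=$7
    # The effects must agree unless the toleration leaves its effect empty.
    if [ -n "$o_eff" ] && [ "$o_eff" != "$t_eff" ]; then return 1; fi
    case "$o_op" in
        Exists)   [ -z "$o_key" ] || [ "$o_key" = "$t_key" ] ;;
        Equal|"") [ "$o_key" = "$t_key" ] && [ "$o_val" = "$t_val" ] ;;
        *)        return 1 ;;
    esac
}

# The taint applied above is key=cilium:NoExecute.
tolerates key cilium NoExecute key Equal cilium NoExecute && echo "Equal + NoExecute: tolerated"
tolerates key cilium NoExecute "" Exists "" "" && echo "bare Exists: tolerated"
tolerates key cilium NoExecute key Equal cilium NoSchedule || echo "Equal + NoSchedule: not tolerated"
```

The last case is the one to watch for: a toleration with the right key and value but a different effect does not match the taint.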
Cilium Deployment
As in the earlier Cilium network policy test, Cilium is deployed with Helm. Many values were changed to fit the Staging cluster, mainly the taint tolerations, the NodeSelector used for scheduling, and the enabling of kube-proxy-replacement. The diff below shows the details.
diff vip-proxy-template.yaml vip-template.yaml
96c96
< # so cilium eBPF host routing shoud not work, and let it fall back to the legacy routing mode
---
> # so cilium eBPF host routing should not work, and let it fall back to the legacy routing mode
129c129
< # Unique ID of the cluster. Must be unique across all conneted clusters and
---
> # Unique ID of the cluster. Must be unique across all connected clusters and
168c168
< kube-proxy-replacement: "true"
---
> kube-proxy-replacement: "false"
170a171,173
> enable-host-port: "false"
> enable-external-ips: "false"
> enable-node-port: "false"
218c221
< unmanaged-pod-watcher-interval: "0"
---
> unmanaged-pod-watcher-interval: "15"
392a396,398
> # to automatically delete [core|kube]dns pods so that are starting to being
> # managed by Cilium
> - delete
1151c1157
< kubernetes.io/hostname: 10.189.212.125
---
> kubernetes.io/hostname: 10.189.212.124
1154,1157c1160
< - effect: NoExecute
< key: key
< operator: Equal
< value: cilium
---
> - operator: Exists
1341c1344
< kubernetes.io/hostname: 10.189.212.125
---
> kubernetes.io/hostname: 10.189.212.124
1344,1347c1347
< - effect: NoExecute
< key: key
< operator: Equal
< value: cilium
---
> - operator: Exists
1429c1429
< kubernetes.io/hostname: 10.189.212.125
---
> kubernetes.io/hostname: 10.189.212.124
1431,1435d1430
< tolerations:
< - effect: NoExecute
< key: key
< operator: Equal
< value: cilium
1508c1503
< kubernetes.io/hostname: 10.189.212.125
---
> kubernetes.io/hostname: 10.189.212.124
1510,1514d1504
< tolerations:
< - effect: NoExecute
< key: key
< operator: Equal
< value: cilium
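To confirm which of these values a running agent actually received, the relevant keys can be read back from the rendered configuration. This is a minimal sketch, assuming the settings appear as `key: "value"` lines as they do in the diff; `cfg_value` is a hypothetical helper, and the ConfigMap name `cilium-config` in the usage comment is an assumption about this deployment.

```shell
#!/bin/sh
# cfg_value KEY
# Print the value of the first `KEY: "value"` line read from stdin.
# Leading indentation and surrounding double quotes are stripped.
cfg_value() {
    awk -v key="$1" '
        { line = $0; sub(/^[ \t]+/, "", line) }
        index(line, key ":") == 1 {
            val = substr(line, length(key) + 2)
            gsub(/^[ \t"]+|[ \t"]+$/, "", val)
            print val
            exit
        }'
}

# Sample lines copied from the proxy-enabled side of the diff.
sample='  kube-proxy-replacement: "true"
  unmanaged-pod-watcher-interval: "0"'
printf '%s\n' "$sample" | cfg_value kube-proxy-replacement

# Against a live cluster (ConfigMap name assumed):
#   kubectl -n kube-system get configmap cilium-config -o yaml | cfg_value kube-proxy-replacement
```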
In addition, a Pod that bundles a large set of network tools is deployed to verify networking under Cilium Chain.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  namespace: kube-system
  name: nm
spec:
  selector:
    matchLabels:
      app: network-multitool
  template:
    metadata:
      labels:
        app: network-multitool
    spec:
      tolerations:
        - effect: NoExecute
          key: key
          operator: Equal
          value: cilium
      nodeSelector:
        kubernetes.io/os: linux
        os-type: openeuler
      containers:
        - name: network-multitool
          image: runzhliu/network-multitool:latest
          command: ["/bin/bash", "-c", "sleep infinity"]
          securityContext:
            privileged: true
An Nginx instance also has to be deployed on 10.189.212.125, the node that runs the cilium agent. It is used below to test kubeProxyReplacement.
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: kube-system
  name: nm-nginx
spec:
  selector:
    matchLabels:
      app: nm-nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nm-nginx
    spec:
      tolerations:
        # The effect must match the key=cilium:NoExecute taint on the node.
        - effect: NoExecute
          key: key
          operator: Equal
          value: cilium
      nodeSelector:
        kubernetes.io/os: linux
        os-type: openeuler
        kubernetes.io/hostname: 10.189.212.125
      containers:
        - name: nm-nginx
          image: docker.xxx.com/llm/network-multitool:latest
          command: ["/bin/bash", "-c", "sleep infinity"]
          securityContext:
            privileged: true
          ports:
            - containerPort: 80
Then create a NodePort Service for it with the following command.
kubectl -n kube-system expose deploy nm-nginx --type=NodePort --port=80
Also create an Nginx instance on a node that does not run the cilium agent.
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: kube-system
  name: nm-nginx-ovs
spec:
  selector:
    matchLabels:
      app: nm-nginx-ovs
  replicas: 1
  template:
    metadata:
      labels:
        app: nm-nginx-ovs
    spec:
      tolerations:
        - effect: NoExecute
          key: key
          operator: Equal
          value: cilium
      nodeSelector:
        kubernetes.io/os: linux
        os-type: openeuler
        kubernetes.io/hostname: 10.189.212.124
      containers:
        - name: nm-nginx
          image: docker.xxx.com/llm/network-multitool:latest
          command: ["/bin/bash", "-c", "sleep infinity"]
          securityContext:
            privileged: true
          ports:
            - containerPort: 80
Create a NodePort Service for it with the following command.
kubectl -n kube-system expose deploy nm-nginx-ovs --type=NodePort --port=80
The final deployment result is shown below. Both the cilium operator and the cilium agent were deployed successfully, and hubble, the traffic visualization tool, is deployed as well.
NAME                               READY   STATUS    RESTARTS   AGE    IP               NODE
cilium-c987r                       1/1     Running   0          166m   10.189.212.125   10.189.212.125
cilium-operator-7df8cb69b8-2h4gm   1/1     Running   0          166m   10.189.212.125   10.189.212.125
hh-coredns-79d96cc7b9-ktqkm        1/1     Running   0          22h    10.189.92.239    10.189.212.71
hh-coredns-79d96cc7b9-lmrpd        1/1     Running   0          22h    10.189.92.240    10.189.212.47
hubble-ui-7b4bcf6bcf-d4fpb         2/2     Running   0          166m   10.189.82.106    10.189.212.125
nm-7vvd4                           1/1     Running   0          20h    10.189.55.117    10.189.212.124
nm-f4zbl                           1/1     Running   0          20h    10.189.54.157    10.189.212.125
nm-nginx-57f8c69966-dg4l2          1/1     Running   0          20h    10.189.55.252    10.189.212.125
nm-nginx-ovs-786f85fc5d-9j48h      1/1     Running   0          21m    10.189.87.224    10.189.212.124
nm-vxbbb                           1/1     Running   0          20h    10.189.54.156    10.189.222.60
# k -n kube-system get svc
NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
cilium-agent           ClusterIP   None             <none>        9964/TCP       146m
hubble-metrics         ClusterIP   None             <none>        9965/TCP       146m
hubble-peer            ClusterIP   10.200.158.121   <none>        80/TCP         21h
hubble-relay           ClusterIP   10.200.159.195   <none>        80/TCP         21h
hubble-relay-metrics   ClusterIP   None             <none>        9966/TCP       146m
hubble-ui              ClusterIP   10.200.158.35    <none>        80/TCP         21h
nm-nginx               NodePort    10.200.157.90    <none>        80:31314/TCP   19h
nm-nginx-ovs           NodePort    10.200.159.14    <none>        80:31418/TCP   9s
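The node ports in the PORT(S) column (31314 and 31418) are the ones used by the curl checks later on. As a small convenience, they can be pulled out of the listing with awk. This is a sketch that assumes the default six-column `kubectl get svc` layout shown above; `node_ports` is a hypothetical helper name.

```shell
#!/bin/sh
# node_ports
# Read `kubectl get svc` rows from stdin and print "NAME NODEPORT" for every
# port of every NodePort Service. Assumes PORT(S) is the fifth column.
node_ports() {
    awk '$2 == "NodePort" {
        n = split($5, ports, ",")
        for (i = 1; i <= n; i++) {
            split(ports[i], p, "[:/]")   # "80:31314/TCP" -> 80, 31314, TCP
            print $1, p[2]
        }
    }'
}

# Rows copied from the listing above.
printf '%s\n' \
    'hubble-ui ClusterIP 10.200.158.35 <none> 80/TCP 21h' \
    'nm-nginx NodePort 10.200.157.90 <none> 80:31314/TCP 19h' \
    'nm-nginx-ovs NodePort 10.200.159.14 <none> 80:31418/TCP 9s' | node_ports

# Against a live cluster:
#   kubectl -n kube-system get svc | node_ports
```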
kubeProxyReplacement Test
Verify with the command below. The output shows that kube-proxy replacement is in effect in the cluster.
# kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement
KubeProxyReplacement: True [bond0.212 10.189.212.125 (Direct Routing)]
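For a scripted check, the same status line can be parsed instead of read by eye. This is a minimal sketch: `kpr_mode` is a hypothetical helper, and the value `True` is what this Cilium version prints. Other versions may report different values (for example `Strict` or `Disabled`), so the string to compare against is an assumption.

```shell
#!/bin/sh
# kpr_mode
# Read `cilium status` output from stdin and print the KubeProxyReplacement
# mode. Exits non-zero when the line is missing.
kpr_mode() {
    awk '$1 == "KubeProxyReplacement:" { print $2; found = 1; exit }
         END { exit !found }'
}

# Line copied from the output above.
printf '%s\n' 'KubeProxyReplacement: True [bond0.212 10.189.212.125 (Direct Routing)]' | kpr_mode

# Against a live cluster:
#   kubectl -n kube-system exec ds/cilium -- cilium status | kpr_mode
```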
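Check whether any Kubernetes Service (KUBE-SVC) iptables rules exist on the cilium agent node. The empty result is expected: Cilium's eBPF-based kube-proxy implementation has already created the NodePort Service, so the iptables rules that kube-proxy would otherwise create are not needed.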
[root@ns-k8s-noah-staging001-node-s0093 runzhliu]# iptables-save | grep KUBE-SVC
[root@ns-k8s-noah-staging001-node-s0093 runzhliu]#
Check the services created by the cilium agent.
# kubectl -n kube-system exec ds/cilium -- cilium service list|grep -i 10.189.55.252
918 10.200.157.90:80 ClusterIP 1 => 10.189.55.252:80 (active)
921 10.189.212.125:31314 NodePort 1 => 10.189.55.252:80 (active)
922 0.0.0.0:31314 NodePort 1 => 10.189.55.252:80 (active)
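The three entries all point at the same backend, the nm-nginx Pod at 10.189.55.252: one ClusterIP frontend and two NodePort frontends (the node address and the wildcard address). The sketch below groups frontends by backend in that way. It assumes single-line IPv4 rows in the format shown above, and `frontends_for` is a hypothetical helper name.

```shell
#!/bin/sh
# frontends_for BACKEND_IP
# Read `cilium service list` rows from stdin and print "TYPE FRONTEND" for
# every row whose backend address equals BACKEND_IP.
frontends_for() {
    awk -v ip="$1" '
        $5 == "=>" {
            split($6, be, ":")      # "10.189.55.252:80" -> address, port
            if (be[1] == ip) print $3, $2
        }'
}

# Rows copied from the output above.
printf '%s\n' \
    '918 10.200.157.90:80 ClusterIP 1 => 10.189.55.252:80 (active)' \
    '921 10.189.212.125:31314 NodePort 1 => 10.189.55.252:80 (active)' \
    '922 0.0.0.0:31314 NodePort 1 => 10.189.55.252:80 (active)' | frontends_for 10.189.55.252
```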
Test NodePort and ClusterIP connectivity at this point.
[root@ns-k8s-noah-staging001-node-s0093 ~]# curl 10.200.157.90:80
Praqma Network MultiTool based on nginx:alpine
[root@ns-k8s-noah-staging001-node-s0093 ~]# curl 0.0.0.0:31314
Praqma Network MultiTool based on nginx:alpine
[root@hh-k8s-noah-staging001-master-s1001 ~]# curl 10.189.212.125:31314
Praqma Network MultiTool based on nginx:alpine
Next, test NodePort and ClusterIP connectivity on the plain Contiv Netplugin network. The result is also as expected: the Services are all unavailable.
# switched to a different node
[root@ns-k8s-noah-staging001-node-s0092 ~]# curl 10.200.159.14
^C
[root@ns-k8s-noah-staging001-node-s0092 ~]# curl 10.200.159.14:31418
^C
[root@ns-k8s-noah-staging001-node-s0092 ~]# curl 0.0.0.0:31418
curl: (7) Failed to connect to 0.0.0.0 port 31418 after 0 ms: Connection refused
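The first two probes had to be interrupted with Ctrl-C because curl kept waiting. When such checks are repeated, a time limit makes the outcome explicit. The sketch below is one way to do it, not what was run in this test: `probe` and `classify_curl_exit` are hypothetical helpers, and the 3-second limit is an arbitrary choice. It relies on curl's documented exit codes 7 (failed to connect) and 28 (operation timed out).

```shell
#!/bin/sh
# classify_curl_exit CODE
# Map a curl exit status to a short label.
classify_curl_exit() {
    case "$1" in
        0)  echo reachable ;;
        7)  echo refused ;;      # failed to connect, as in the last probe above
        28) echo timeout ;;      # the --max-time limit expired
        *)  echo "error($1)" ;;
    esac
}

# probe URL
# Run curl with a 3-second cap so an unreachable Service ends the probe
# instead of hanging, then print the label for the exit status.
probe() {
    rc=0
    curl --silent --output /dev/null --max-time 3 "$1" || rc=$?
    classify_curl_exit "$rc"
}

# Labels for the two failure modes seen above (no network access needed).
classify_curl_exit 28
classify_curl_exit 7
```

On the node, the checks would then read `probe 10.200.159.14` and `probe 0.0.0.0:31418`.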
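ClusterIP connectivity can also be seen in the Hubble UI.
Summary
The test results show that Cilium, deployed in chaining mode, can add the kubeProxyReplacement capability to a Kubernetes cluster that uses Contiv Netplugin, which means it can provide in-cluster Services.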
Warning
This article was last updated on November 12, 2023. Its content may be out of date, so use it with caution.