cilium-chain and kubeProxyReplacement

Overview

This post tests whether, in a Kubernetes cluster running Contiv Netplugin, Cilium deployed in CNI chaining mode can provide the kubeProxyReplacement capability.

Test Environment

In the test cluster, several nodes run a kernel at 5.10 or above, which meets Cilium's installation requirements.

# kubectl get node --show-labels|grep -i open
10.189.212.124   Ready   209d   v1.20.3-vip.2   kubernetes.io/hostname=10.189.212.124,kubernetes.io/os=linux,nansha=true,os-type=openeuler,status=online
10.189.212.125   Ready   195d   v1.20.3-vip.2   kubernetes.io/hostname=10.189.212.125,kubernetes.io/os=linux,os-type=openeuler,osp-proxy=true,production=true,status=online
10.189.222.60    Ready   191d   v1.20.3-vip.2   kubernetes.io/hostname=10.189.222.60,kubernetes.io/os=linux,os-type=openeuler,usetype=jenkins-openeuler
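
If needed, the kernel version of each node can be confirmed directly from the node status, for example:

kubectl get nodes -o custom-columns=NAME:.metadata.name,KERNEL:.status.nodeInfo.kernelVersion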

10.189.212.125 is selected as the node where the cilium-agent will be deployed. Because the node runs many stateless test Pods, a taint is added to evict the Pods currently running on it.

kubectl taint nodes 10.189.212.125 key=cilium:NoExecute
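
The taint can be checked on the node object, and removed again after the test using the trailing-dash form:

kubectl describe node 10.189.212.125 | grep Taints
# remove the taint once the test is finished
kubectl taint nodes 10.189.212.125 key=cilium:NoExecute-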

Cilium Deployment

As in the earlier Cilium network policy test, Cilium is deployed through Helm, with many values changed to fit the Staging cluster environment: mainly the taint toleration, the scheduling nodeSelector, and enabling kube-proxy-replacement. The diff below shows the exact differences, and a sketch of a corresponding Helm invocation follows it.

diff vip-proxy-template.yaml vip-template.yaml
96c96
<   # so cilium eBPF host routing shoud not work, and let it fall back to the legacy routing mode
---
>   # so cilium eBPF host routing should not work, and let it fall back to the legacy routing mode
129c129
<   # Unique ID of the cluster. Must be unique across all conneted clusters and
---
>   # Unique ID of the cluster. Must be unique across all connected clusters and
168c168
<   kube-proxy-replacement: "true"
---
>   kube-proxy-replacement: "false"
170a171,173
>   enable-host-port: "false"
>   enable-external-ips: "false"
>   enable-node-port: "false"
218c221
<   unmanaged-pod-watcher-interval: "0"
---
>   unmanaged-pod-watcher-interval: "15"
392a396,398
>   # to automatically delete [core|kube]dns pods so that are starting to being
>   # managed by Cilium
>   - delete
1151c1157
<         kubernetes.io/hostname: 10.189.212.125
---
>         kubernetes.io/hostname: 10.189.212.124
1154,1157c1160
<         - effect: NoExecute
<           key: key
<           operator: Equal
<           value: cilium
---
>         - operator: Exists
1341c1344
<         kubernetes.io/hostname: 10.189.212.125
---
>         kubernetes.io/hostname: 10.189.212.124
1344,1347c1347
<         - effect: NoExecute
<           key: key
<           operator: Equal
<           value: cilium
---
>         - operator: Exists
1429c1429
<         kubernetes.io/hostname: 10.189.212.125
---
>         kubernetes.io/hostname: 10.189.212.124
1431,1435d1430
<       tolerations:
<         - effect: NoExecute
<           key: key
<           operator: Equal
<           value: cilium
1508c1503
<         kubernetes.io/hostname: 10.189.212.125
---
>         kubernetes.io/hostname: 10.189.212.124
1510,1514d1504
<       tolerations:
<         - effect: NoExecute
<           key: key
<           operator: Equal
<           value: cilium
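
For context, here is a hedged sketch of how manifests along these lines could be rendered with Helm. The chaining-mode values, the API server address placeholder, and the chart repository are illustrative assumptions, not taken from the original templates:

# Hedged sketch: render Cilium manifests for CNI chaining with
# kube-proxy replacement enabled (values below are assumptions).
helm repo add cilium https://helm.cilium.io/
helm template cilium cilium/cilium \
  --namespace kube-system \
  --set cni.chainingMode=generic-veth \
  --set cni.customConf=true \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=<api-server-ip> \
  --set k8sServicePort=6443 \
  > vip-proxy-template.yaml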

In addition, a Pod packed with network tools is deployed to verify networking under Cilium chaining.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  namespace: kube-system
  name: nm
spec:
  selector:
    matchLabels:
      app: network-multitool
  template:
    metadata:
      labels:
        app: network-multitool
    spec:
      tolerations:
        - effect: NoExecute
          key: key
          operator: Equal
          value: cilium      
      nodeSelector:
        kubernetes.io/os: linux
        os-type: openeuler
      containers:
        - name: network-multitool
          image: runzhliu/network-multitool:latest
          command: ["/bin/bash", "-c", "sleep infinity"]
          securityContext:
            privileged: true
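
Once the DaemonSet is up, the multitool Pods can be listed with:

kubectl -n kube-system get pod -l app=network-multitool -o wide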

An Nginx also needs to be deployed on 10.189.212.125, the node running the cilium agent, for the kubeProxyReplacement tests below.

apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: kube-system
  name: nm-nginx
spec:
  selector:
    matchLabels:
      app: nm-nginx 
  replicas: 1
  template:
    metadata:
      labels:
        app: nm-nginx
    spec:
      tolerations:
      - effect: NoExecute
        key: key
        operator: Equal
        value: cilium
      nodeSelector:
        kubernetes.io/os: linux
        os-type: openeuler
        kubernetes.io/hostname: 10.189.212.125
      containers:
      - name: nm-nginx
        image: docker.xxx.com/llm/network-multitool:latest
        command: ["/bin/bash", "-c", "sleep infinity"]
        securityContext:
          privileged: true
        ports:
        - containerPort: 80

A NodePort Service is then created for it with the following command.

kubectl -n kube-system expose deploy nm-nginx --type=NodePort --port=80
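
For reference, the assigned NodePort can be read back from the Service (it also appears in the full service listing further below):

kubectl -n kube-system get svc nm-nginx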

Likewise, another Nginx is deployed on a node that is not running the cilium agent.

apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: kube-system
  name: nm-nginx-ovs
spec:
  selector:
    matchLabels:
      app: nm-nginx-ovs
  replicas: 1
  template:
    metadata:
      labels:
        app: nm-nginx-ovs
    spec:
      tolerations:
      - effect: NoExecute
        key: key
        operator: Equal
        value: cilium
      nodeSelector:
        kubernetes.io/os: linux
        os-type: openeuler
        kubernetes.io/hostname: 10.189.212.124
      containers:
      - name: nm-nginx
        image: docker.xxx.com/llm/network-multitool:latest
        command: ["/bin/bash", "-c", "sleep infinity"]
        securityContext:
          privileged: true
        ports:
        - containerPort: 80

Create a NodePort Service for it with the following command.

kubectl -n kube-system expose deploy nm-nginx-ovs --type=NodePort --port=80

The final deployment looks like this: both the cilium operator and the cilium agent come up successfully, and Hubble, the traffic visualization tool, is deployed as well.

NAME                               READY   STATUS    RESTARTS   AGE    IP               NODE             
cilium-c987r                       1/1     Running   0          166m   10.189.212.125   10.189.212.125   
cilium-operator-7df8cb69b8-2h4gm   1/1     Running   0          166m   10.189.212.125   10.189.212.125   
hh-coredns-79d96cc7b9-ktqkm        1/1     Running   0          22h    10.189.92.239    10.189.212.71    
hh-coredns-79d96cc7b9-lmrpd        1/1     Running   0          22h    10.189.92.240    10.189.212.47    
hubble-ui-7b4bcf6bcf-d4fpb         2/2     Running   0          166m   10.189.82.106    10.189.212.125   
nm-7vvd4                           1/1     Running   0          20h    10.189.55.117    10.189.212.124   
nm-f4zbl                           1/1     Running   0          20h    10.189.54.157    10.189.212.125   
nm-nginx-57f8c69966-dg4l2          1/1     Running   0          20h    10.189.55.252    10.189.212.125   
nm-nginx-ovs-786f85fc5d-9j48h      1/1     Running   0          21m    10.189.87.224    10.189.212.124   
nm-vxbbb                           1/1     Running   0          20h    10.189.54.156    10.189.222.60    
# k -n kube-system get svc
NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
cilium-agent           ClusterIP   None             <none>        9964/TCP       146m
hubble-metrics         ClusterIP   None             <none>        9965/TCP       146m
hubble-peer            ClusterIP   10.200.158.121   <none>        80/TCP         21h
hubble-relay           ClusterIP   10.200.159.195   <none>        80/TCP         21h
hubble-relay-metrics   ClusterIP   None             <none>        9966/TCP       146m
hubble-ui              ClusterIP   10.200.158.35    <none>        80/TCP         21h
nm-nginx               NodePort    10.200.157.90    <none>        80:31314/TCP   19h
nm-nginx-ovs           NodePort    10.200.159.14    <none>        80:31418/TCP   9s

kubeProxyReplacement Test

Verify it as follows; the output below indicates that kube-proxy replacement is in effect on the node.

# kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement
KubeProxyReplacement:    True   [bond0.212 10.189.212.125 (Direct Routing)]
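
More detail on which parts of the replacement are active (socket LB, NodePort devices, modes) can be pulled from the verbose status; for example:

kubectl -n kube-system exec ds/cilium -- cilium status --verbose | grep -A 12 KubeProxyReplacement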

Check whether the cilium agent node still has Kubernetes Service rules in iptables. An empty result is expected: Cilium's eBPF-based kube-proxy replacement now programs the NodePort Service itself, so the iptables rules that kube-proxy would normally create are no longer needed.

[root@ns-k8s-noah-staging001-node-s0093 runzhliu]# iptables-save | grep KUBE-SVC
[root@ns-k8s-noah-staging001-node-s0093 runzhliu]#
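
The service translations now live in eBPF maps rather than iptables; they can be dumped from the agent with:

kubectl -n kube-system exec ds/cilium -- cilium bpf lb list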

Inspect the services created by the cilium agent.

# kubectl -n kube-system exec ds/cilium -- cilium service list|grep -i 10.189.55.252
918   10.200.157.90:80       ClusterIP      1 => 10.189.55.252:80 (active)
921   10.189.212.125:31314   NodePort       1 => 10.189.55.252:80 (active)
922   0.0.0.0:31314          NodePort       1 => 10.189.55.252:80 (active)

Now test NodePort and ClusterIP connectivity.

[root@ns-k8s-noah-staging001-node-s0093 ~]# curl 10.200.157.90:80
Praqma Network MultiTool based on nginx:alpine
[root@ns-k8s-noah-staging001-node-s0093 ~]# curl 0.0.0.0:31314
Praqma Network MultiTool based on nginx:alpine
[root@hh-k8s-noah-staging001-master-s1001 ~]# curl 10.189.212.125:31314
Praqma Network MultiTool based on nginx:alpine

Also test NodePort and ClusterIP connectivity for the Service backed by the plain Contiv Netplugin network. The result again matches expectations: the Service is unreachable.

# switched to a different node
[root@ns-k8s-noah-staging001-node-s0092 ~]# curl 10.200.159.14
^C
[root@ns-k8s-noah-staging001-node-s0092 ~]# curl 10.200.159.14:31418
^C
[root@ns-k8s-noah-staging001-node-s0092 ~]# curl 0.0.0.0:31418
curl: (7) Failed to connect to 0.0.0.0 port 31418 after 0 ms: Connection refused

ClusterIP connectivity can also be observed in the Hubble UI.

[Figure: Hubble UI view of the ClusterIP connectivity]
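
The same flows can also be inspected on the command line, assuming the hubble CLI is available in the agent image; for example, filtering on the ClusterIP from above:

kubectl -n kube-system exec ds/cilium -- hubble observe --to-ip 10.200.157.90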

Summary

Judging from the test results, Cilium in chaining mode can indeed add the kubeProxyReplacement capability to a Contiv Netplugin Kubernetes cluster; in other words, it can serve in-cluster Services.

References

  1. Kubernetes Without kube-proxy
  2. Monitor Cilium and Kubernetes performance with Hubble