K8S Network Test Daemonset

Github Project: https://github.com/atkaper/k8s-network-test-daemonset


An on-premise K8S (Kubernetes) cluster needs a properly working virtual network to connect all masters and nodes to each other. In our situation, the host machines (VMware, Red Hat) are not all 100% the same, and cannot easily be wiped clean on new K8S and OS upgrades. Therefore we sometimes experienced issues in which the nodes or masters could not always reach each other. We used the flannel network at first, which often caused weird issues. We recently switched to calico, which seems much more stable.

To detect whether the network is functioning correctly, we created a piece of test software. Nothing fancy, just a simple shell script. This script runs as a daemonset in the cluster, on both masters and nodes. It tries to ping all members of the daemonset, pings the host machines, and tests whether the k8s nameservers are reachable. If all this works, it is a good indication of the health of the cluster's network.
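To give an idea of what such a test round can look like, here is a minimal sketch. It is not the actual script from the project: the function names, the "label address" input format, the "- ERROR" marker, and the overridable PING_CMD variable are all assumptions made for illustration, and the nameserver check is left out.

```shell
#!/bin/sh
# Hypothetical sketch of one test round; not the project's real script.
# PING_CMD can be overridden, so the loop can be dry-run without a network.
# The default assumes a Linux-style ping (-c count, -W timeout in seconds).
PING_CMD="${PING_CMD:-ping -c 1 -W 2}"

# check_target LABEL ADDRESS
# Prints "LABEL ADDRESS - OK" or "LABEL ADDRESS - ERROR" (assumed format)
# and returns non-zero when the target did not answer.
check_target() {
    if $PING_CMD "$2" >/dev/null 2>&1 </dev/null; then
        echo "$1 $2 - OK"
    else
        echo "$1 $2 - ERROR"
        return 1
    fi
}

# run_round
# Reads "label address" pairs from stdin, checks each one, and ends with
# a line holding the number of failed checks.
run_round() {
    errors=0
    while read -r label address; do
        check_target "$label" "$address" || errors=$((errors + 1))
    done
    echo "errors: $errors"
}
```

A round over one host and one pod address (both made up) would be started with: printf 'host 10.0.0.1\npod 10.1.0.1\n' | run_round. In the real daemonset the list of targets comes from the cluster itself.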

You can look at the test results in the logs of each pod. The test runs every minute.

Another way to look at it is to let prometheus poll the data. The data is available in prometheus format on URL /prometheus.report (port 8123). We have added prometheus.io annotations in the k8s.yml file, which trigger prometheus to poll all daemonset pods for this data. This way you can create a graph or counter on your dashboard with the network health. Note: this graph/counter is on our TODO list, so there's no example here. You should probably create an expression that calculates the expected number of OK results (the number of kube-dns instances + 1, and the number of nodes and masters squared) and subtracts the dns/node-OK counters from that, so a healthy cluster shows zero errors on your dashboard. For now we use a query which adds all error counts: “sum(networktest_total_error_count)”. The disadvantage of this: if a test pod is down, it does not report errors itself. Of course all other test pods will mark it as an error, so it shows up as non-zero errors anyway 😉
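If you do not have prometheus at hand, the same sum can be approximated by hand. The sketch below adds up every networktest_total_error_count sample found in one or more reports. The function name is made up, and it assumes each sample sits on its own line as "name value" (optionally with labels that contain no spaces), as is usual for the prometheus text format.

```shell
#!/bin/sh
# Hypothetical helper: mimics sum(networktest_total_error_count) by hand.
# Reads one or more reports on stdin and prints the total of all
# networktest_total_error_count samples (0 when none are found).
sum_error_counts() {
    awk '$1 ~ /^networktest_total_error_count([{].*[}])?$/ { total += $NF }
         END { print total + 0 }'
}
```

With the pod IPs in a variable (POD_IPS is an assumed name), it could be used like this: for ip in $POD_IPS; do curl -s "http://$ip:8123/prometheus.report"; done | sum_error_counts. As with the prometheus query, a pod that is down simply contributes nothing to the sum.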


docker build -t repository-url/k8s-network-test-daemonset:0.1 .
docker push repository-url/k8s-network-test-daemonset:0.1


In k8s.yml, replace “##DOCKER_REGISTRY##/k8s-network-test-daemonset:##VERSION##” with the image name/version. In our environment, the jenkins build pipeline takes care of that. The daemonset will run in the kube-system namespace. The file also adds the needed RBAC (security) information. If you do not have RBAC enabled, you might need to strip down the k8s.yml file a bit.

kubectl apply -f k8s.yml
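If you have no build pipeline to fill in the placeholders, a small sed wrapper can do it. This is only a sketch: the function name render_manifest is made up, and it assumes the registry and version values contain no "|" or "&" characters, since those have a special meaning in the sed expressions.

```shell
#!/bin/sh
# Hypothetical helper: fills in the two placeholders used in k8s.yml.
# render_manifest REGISTRY VERSION
# Reads the manifest on stdin and writes the result to stdout.
render_manifest() {
    sed -e "s|##DOCKER_REGISTRY##|$1|g" \
        -e "s|##VERSION##|$2|g"
}
```

For the image built above, the rendered file can be piped straight into kubectl, without changing k8s.yml itself: render_manifest repository-url 0.1 < k8s.yml | kubectl apply -f -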

Example output

[master1]$ kubectl get pods -n kube-system -o wide | grep k8s-network-test-daemonset

k8s-network-test-daemonset-2c7mk                           1/1       Running   0          39d       ahlt1828
k8s-network-test-daemonset-brxh6                           1/1       Running   0          39d       ahlt1827
k8s-network-test-daemonset-k6s9b                           1/1       Running   0          39d       ahlt1825
k8s-network-test-daemonset-kwsjp                           1/1       Running   0          38d       ahlt1625
k8s-network-test-daemonset-l47w7                           1/1       Running   1          39d       ahlt1826
k8s-network-test-daemonset-lsgn5                           1/1       Running   1          39d       ahlt1627
k8s-network-test-daemonset-mzw2z                           1/1       Running   0          39d       ahlt1799
k8s-network-test-daemonset-rwncl                           1/1       Running   1          4d        ahlt1626
k8s-network-test-daemonset-tmbbt                           1/1       Running   0          39d       ahlt1628
k8s-network-test-daemonset-tqxmx                           1/1       Running   0          39d       ahlt1569
k8s-network-test-daemonset-tvh57                           1/1       Running   0          39d       ahlt1632
k8s-network-test-daemonset-vzd4f                           1/1       Running   0          39d       ahlt1798
k8s-network-test-daemonset-wgn9j                           1/1       Running   0          39d       ahlt1630
k8s-network-test-daemonset-zvbfb                           1/1       Running   0          39d       ahlt1629

[master1]$ kubectl logs --tail 30 k8s-network-test-daemonset-2c7mk -n kube-system

... chopped some lines ...
Tue Jan 16 11:57:01 CET 2018 Tests running on node: ahlt1828, host: k8s-network-test-daemonset-2c7mk
DNS: kube-dns.kube-system.svc.cluster.tst.local.
DNS: kube-dns-5977b8689-2qbmq.
DNS: kube-dns-5977b8689-xmh6h.
Testing 14 nodes
Checking: ahlt1828 Running - k8s-network-test-daemonset-2c7mk;  host-ping: 0.00 pod-ping: 0.00 - OK
Checking: ahlt1827 Running - k8s-network-test-daemonset-brxh6;  host-ping: 0.00 pod-ping: 0.00 - OK
Checking: ahlt1825 Running - k8s-network-test-daemonset-k6s9b;  host-ping: 0.00 pod-ping: 0.00 - OK
Checking: ahlt1625 Running - k8s-network-test-daemonset-kwsjp;  host-ping: 0.00 pod-ping: 0.00 - OK
Checking: ahlt1826 Running - k8s-network-test-daemonset-l47w7;  host-ping: 0.00 pod-ping: 0.00 - OK
Checking: ahlt1627 Running - k8s-network-test-daemonset-lsgn5;  host-ping: 0.00 pod-ping: 0.00 - OK
Checking: ahlt1799 Running - k8s-network-test-daemonset-mzw2z;  host-ping: 0.00 pod-ping: 0.00 - OK
Checking: ahlt1626 Running - k8s-network-test-daemonset-rwncl;  host-ping: 0.00 pod-ping: 0.00 - OK
Checking: ahlt1628 Running - k8s-network-test-daemonset-tmbbt;  host-ping: 0.00 pod-ping: 0.00 - OK
Checking: ahlt1569 Running - k8s-network-test-daemonset-tqxmx;  host-ping: 0.00 pod-ping: 0.00 - OK
Checking: ahlt1632 Running - k8s-network-test-daemonset-tvh57;  host-ping: 0.00 pod-ping: 0.00 - OK
Checking: ahlt1798 Running - k8s-network-test-daemonset-vzd4f;  host-ping: 0.00 pod-ping: 0.00 - OK
Checking: ahlt1630 Running - k8s-network-test-daemonset-wgn9j;  host-ping: 0.00 pod-ping: 0.00 - OK
Checking: ahlt1629 Running - k8s-network-test-daemonset-zvbfb;  host-ping: 0.00 pod-ping: 0.00 - OK
No status changes since previous test run.
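When you have many pods, reading every log by eye gets tedious. The sketch below counts the "Checking:" lines that do not end in "- OK". Note the assumption: the output above only shows successful checks, so what a failing line looks like is a guess here, and the function name is made up.

```shell
#!/bin/sh
# Hypothetical helper: counts failed checks in the log text of one pod.
# Assumption: a failed check is a "Checking:" line not ending in "- OK".
# Reads the log on stdin and prints the number of such lines.
count_failed_checks() {
    grep '^Checking:' | grep -vc -e '- OK$'
}
```

It could be fed from a pod log like this: kubectl logs --tail 30 k8s-network-test-daemonset-2c7mk -n kube-system | count_failed_checks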


This daemonset has been tested on kubernetes 1.6.x (using flannel) and 1.8.4 (using calico). The latter was much more stable than the former 😉


