...
vSphere CSI Driver Integration doc - https://docs.pivotal.io/tkgi/1-12/vsphere-cns.html#uninstall-csi

After enabling the vSphere CSI Driver Integration on the TKGI tile and applying the change successfully, upgrading a cluster fails on a worker node because csi-node-registrar is down. The csi-node-registrar stderr log shows the following error:

    I0721 08:29:18.240065 10505 main.go:113] Version: v2.1.0-0-g80d42f2
    I0721 08:29:18.240719 10505 main.go:137] Attempting to open a gRPC connection with: "/var/vcap/data/kubelet/plugins/csi.vsphere.vmware.com/csi.sock"
    I0721 08:29:18.240737 10505 connection.go:153] Connecting to unix:///var/vcap/data/kubelet/plugins/csi.vsphere.vmware.com/csi.sock
    I0721 08:29:18.241191 10505 main.go:144] Calling CSI driver to discover driver name
    I0721 08:29:18.241213 10505 connection.go:182] GRPC call: /csi.v1.Identity/GetPluginInfo
    I0721 08:29:18.241219 10505 connection.go:183] GRPC request: {}
    I0721 08:29:18.243326 10505 connection.go:185] GRPC response: {"name":"csi.vsphere.vmware.com","vendor_version":"v2.3.0"}
    I0721 08:29:18.243381 10505 connection.go:186] GRPC error: <nil>
    I0721 08:29:18.243389 10505 main.go:154] CSI driver name: "csi.vsphere.vmware.com"
    I0721 08:29:18.243462 10505 node_register.go:52] Starting Registration Server at: /var/vcap/data/kubelet/plugins_registry/csi.vsphere.vmware.com-reg.sock
    I0721 08:29:18.243607 10505 node_register.go:61] Registration Server started at: /var/vcap/data/kubelet/plugins_registry/csi.vsphere.vmware.com-reg.sock
    I0721 08:29:18.243660 10505 node_register.go:86] Starting healthz server at HTTP endpoint: :9809
    F0721 08:29:18.248887 10505 node_register.go:105] listen tcp :9809: bind: address already in use
    goroutine 4 [running]:
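To view this log yourself, you can SSH to the failing worker through BOSH and read the job's stderr log. A minimal sketch, assuming the standard TKGI deployment naming (service-instance_<cluster-UUID>) and the usual /var/vcap/sys/log/<job>/ log path convention; the exact log filename may differ in your environment:

    # SSH to the failing worker (deployment name and instance index are placeholders)
    bosh -d service-instance_<cluster-uuid> ssh worker/<index>

    # Read the registrar's stderr log on the node
    sudo tail -n 50 /var/vcap/sys/log/csi-node-registrar/csi-node-registrar.stderr.log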
If the vSphere CSI driver was previously installed manually, it occupies port 9809, which is the default healthz port for the TKGI-managed csi-node-registrar. The error "listen tcp :9809: bind: address already in use" is therefore expected: the manual CSI installation conflicts with the automatic CSI installation.
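To confirm the conflict, you can check which process is already bound to port 9809 on the affected worker; a quick check, assuming root access on the node:

    # List the process currently listening on TCP port 9809
    sudo ss -ltnp | grep ':9809'
    # If the manual CSI driver is the culprit, the listener belongs to its node
    # DaemonSet container rather than the TKGI-managed csi-node-registrar job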
Because the port is taken, the csi-node-registrar process cannot start, the worker node reports a "Failing" status, and the cluster upgrade cannot complete.
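The failing process is visible from the BOSH CLI. A sketch, again assuming the standard TKGI deployment naming:

    # Show per-process health for the cluster's instances;
    # csi-node-registrar should appear as failing on the affected worker
    bosh -d service-instance_<cluster-uuid> instances --ps

    # Alternatively, check directly on the worker itself
    sudo /var/vcap/bosh/bin/monit summary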
Follow the steps below to work around this issue:

1. Change port 9809 to a different value (such as 9909) in the manual CSI YAML file - https://github.com/kubernetes-sigs/vsphere-csi-driver/blob/v2.3.1/manifests/vanilla/vsphere-csi-driver.yaml. Please note that there are two places you need to update.
2. Remove the livenessProbe section from the manual CSI manifest (the section below is to be removed):

    - name: liveness-probe
      image: quay.io/k8scsi/livenessprobe:v2.2.0
      args:
        - "--v=4"
        - "--csi-address=/csi/csi.sock"
      volumeMounts:
        - name: plugin-dir
          mountPath: /csi

3. Apply the manifest after making the above changes (a command sketch follows these steps).
4. Switch the manual CSI installation to the automatic CSI installation per the guide https://docs.pivotal.io/tkgi/1-12/vsphere-cns.html#uninstall-csi
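A hedged sketch of steps 1 and 3, assuming the manifest has been downloaded locally; the namespace below matches the v2.3.x vanilla manifest but should be verified against your copy:

    # Locate the two occurrences of the default port before editing them to 9909
    grep -n '9809' vsphere-csi-driver.yaml

    # After changing the port in both places and removing the liveness-probe
    # container block shown above, re-apply the manifest
    kubectl apply -f vsphere-csi-driver.yaml

    # Confirm the node plugin pods come back up cleanly
    kubectl get pods -n vmware-system-csi -o wide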