...
You are using Tanzu Kubernetes Grid version 1.2.
You have LDAP authentication enabled using the Dex and Gangway extensions.
You were previously able to fetch the kubeconfig for your clusters, but this suddenly stopped working, and you now see the error below when trying to fetch the kubeconfig via the UI:
Post https://<some-lb>.us-east-1.elb.amazonaws.com/token: x509: certificate signed by unknown authority (possibly because of "x509: invalid signature: parent certificate cannot sign this kind of certificate" while trying to verify candidate authority certificate "tkg-dex")
This issue is caused by the automatic rotation of the Dex certificates. After the rotation, Gangway still refers to the old Dex CA value in the gangway-data-values secret on the workload cluster. In addition, all control plane nodes of the workload cluster still have the stale Dex CA under /etc/tkg/pki/dex-ca.crt. The Dex CA is stored in the secret dex-cert-tls, and to avoid this issue it must be identical in the three locations listed below.
Management cluster:
    kubectl get secret dex-cert-tls -n tanzu-system-auth -o 'go-template={{ index .data "ca.crt" }}'
Workload cluster (value of .dex.ca in the output below):
    kubectl get secret gangway-data-values -n tanzu-system-auth -o 'go-template={{ index .data "values.yaml" }}' | base64 -d
Control plane nodes of the workload cluster, under /etc/tkg/pki/dex-ca.crt:
    ls -lrth /etc/tkg/pki/
    -rw-r--r-- 1 root root 1.2K May 21 19:00 dex-ca.crt
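To check whether two of these locations hold the same CA, the certificates can be compared after stripping YAML indentation and line-ending differences. The following is a minimal sketch, not a supported tool: the function names extract_pem and same_ca are hypothetical, and it assumes the Dex CA is the first PEM certificate in each file (for the gangway values file, that .dex.ca is stored as an indented multi-line block).

```shell
#!/usr/bin/env bash
# Sketch only: extract_pem and same_ca are illustrative helper names.

# Print the first PEM certificate found on stdin, with leading
# indentation and carriage returns removed.
extract_pem() {
  tr -d '\r' | sed 's/^[[:space:]]*//' |
    awk '/-----BEGIN CERTIFICATE-----/ {p=1}
         p {print}
         /-----END CERTIFICATE-----/ {if (p) exit}'
}

# Succeed (exit 0) when the first certificate in file $1 equals the
# first certificate in file $2; fail otherwise.
same_ca() {
  [ "$(extract_pem < "$1")" = "$(extract_pem < "$2")" ]
}
```

For example, after saving the decoded management cluster CA as dex_ca.crt and the decoded gangway values as gangway-values.yaml, `same_ca dex_ca.crt gangway-values.yaml && echo match` reports whether Gangway holds the current CA. The same check works against a copy of /etc/tkg/pki/dex-ca.crt taken from a control plane node.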
There is no resolution for this issue; however, a workaround is available. In addition, as per our documentation: if you manually deployed the Dex and Gangway extensions on clusters in a release before 1.3 and you upgrade the clusters to Tanzu Kubernetes Grid 1.3, it is strongly recommended to migrate your identity management implementation from Dex and Gangway to Pinniped and Dex. If you did not implement Dex and Gangway on clusters from a previous version of Tanzu Kubernetes Grid and you upgrade them to 1.3, it is also strongly recommended to implement Pinniped and Dex on those clusters.
You can follow the steps below to work around this issue.
1. Switch to the management cluster config and get the Dex CA:
    kubectl get secret dex-cert-tls -n tanzu-system-auth -o 'go-template={{ index .data "ca.crt" }}' | base64 -d > dex_ca.crt
2. Switch to the workload cluster config and get the Gangway runtime configuration:
    kubectl get secret gangway-data-values -n tanzu-system-auth -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > gangway-data-values-updated-dex-ca.yaml
3. Replace the value of .dex.ca in gangway-data-values-updated-dex-ca.yaml with the contents of dex_ca.crt obtained in step 1.
4. Recreate the Gangway data values, then delete the Gangway pod to restart it.
   Recreate the secret:
    kubectl create secret generic gangway-data-values --from-file=values.yaml=gangway-data-values-updated-dex-ca.yaml -n tanzu-system-auth -o yaml --dry-run | kubectl replace -f-
   Verify that the change in the CA is reflected:
    kubectl get secret gangway-data-values -n tanzu-system-auth -o 'go-template={{ index .data "values.yaml" }}' | base64 -d
   Wait for reconciliation to finish. Then switch to the workload cluster context and run the commands below. The output of the first command should match the contents of dex_ca.crt obtained in step 1.
    kubectl describe cm dex-ca -n tanzu-system-auth
    kubectl delete pod -n tanzu-system-auth gangway-<uuid>
5. Change the CA on each control plane node of the workload cluster.
   Copy the contents of dex_ca.crt to /etc/tkg/pki/dex-ca.crt on each control plane node.
   Restart kube-apiserver on one node at a time. Moving the static pod manifest out of /etc/kubernetes/manifests stops kube-apiserver, and moving it back starts it again:
    mv /etc/kubernetes/manifests/kube-apiserver.yaml /root/kube-apiserver.yaml
    mv /root/kube-apiserver.yaml /etc/kubernetes/manifests/kube-apiserver.yaml
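The manual replacement of .dex.ca in gangway-data-values-updated-dex-ca.yaml can also be scripted. The sketch below is an illustration under stated assumptions, not part of the documented procedure: the function name replace_first_pem is hypothetical, and it assumes that .dex.ca is stored as an indented multi-line block and is the first PEM certificate in the file. Review the result by hand before recreating the secret.

```shell
#!/usr/bin/env bash
# Sketch only: replace_first_pem is an illustrative helper name.
# Usage: replace_first_pem <values.yaml> <new-ca.crt> > <updated-values.yaml>
# Replaces the first PEM certificate in the values file with the contents
# of the new CA file, reusing the indentation of the original block.
replace_first_pem() {
  awk -v cafile="$2" '
    !replaced && /-----BEGIN CERTIFICATE-----/ {
      # Keep whatever precedes the BEGIN marker (assumed to be whitespace).
      indent = $0
      sub(/-----BEGIN CERTIFICATE-----.*/, "", indent)
      # Emit the new CA, line by line, at the same indentation.
      while ((getline line < cafile) > 0) {
        sub(/\r$/, "", line)
        if (line != "") print indent line
      }
      close(cafile)
      skipping = 1
      replaced = 1
      next
    }
    # Drop the remaining lines of the old certificate.
    skipping {
      if ($0 ~ /-----END CERTIFICATE-----/) skipping = 0
      next
    }
    { print }
  ' "$1"
}
```

The function writes to standard output, so redirect it to a new file rather than to the input file itself, then compare the two files with diff to confirm that only the certificate lines changed.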