...
When you enable OIDC on a Tanzu Kubernetes Grid (TKG) management cluster, the pinniped app is in a Reconcile failed state, and dexsvc, which is a service of type LoadBalancer, stays in the Pending state because LoadBalancer provisioning fails. When you describe the service, you will see output similar to the following:

    kubectl -n tanzu-system-auth describe svc dexsvc
    Name:                     dexsvc
    Namespace:                tanzu-system-auth
    Labels:                   app=dex
                              kapp.k14s.io/app=1618977512920384658
                              kapp.k14s.io/association=v1.07cd6e14046aafeb4d8a3195cb353a1d
    Annotations:              kapp.k14s.io/identity: v1;tanzu-system-auth//Service/dexsvc;v1
                              kapp.k14s.io/original: {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app":"dex","kapp.k14s.io/app":"1618977512920384658","kapp.k14s...
                              kapp.k14s.io/original-diff-md5: 3ba1829de15e3013270fed06cad5b893
    Selector:                 app=dex,kapp.k14s.io/app=1618977512920384658
    Type:                     LoadBalancer
    IP Families:              <none>
    IP:                       100.69.4.55
    IPs:                      100.69.4.55
    Port:                     dex  443/TCP
    TargetPort:               https/TCP
    NodePort:                 dex  30113/TCP
    Endpoints:                100.116.40.18:5556
    Session Affinity:         None
    External Traffic Policy:  Cluster
    Events:
      Type     Reason                  Age                 From                Message
      ----     ------                  ----                ----                -------
      Normal   EnsuringLoadBalancer    2m1s (x8 over 12m)  service-controller  Ensuring load balancer
      Warning  SyncLoadBalancerFailed  2m (x8 over 12m)    service-controller  Error syncing load balancer: failed to ensure load balancer: not a vmss instance

When you create a service of type LoadBalancer on a TKG cluster on the existing VNET, the service is stuck in a Pending state and does not get an ExternalIP. When you describe the service, you will see output similar to the following:

    kubectl describe svc nginx-svc
    Name:                     nginx-svc
    Namespace:                default
    Labels:                   run=nginx
    Annotations:              <none>
    Selector:                 run=nginx
    Type:                     LoadBalancer
    IP Families:              <none>
    IP:                       100.69.12.81
    IPs:                      100.69.12.81
    Port:                     <unset>  80/TCP
    TargetPort:               80/TCP
    NodePort:                 <unset>  32075/TCP
    Endpoints:                100.96.1.3:80
    Session Affinity:         None
    External Traffic Policy:  Cluster
    Events:
      Type     Reason                  Age                   From                Message
      ----     ------                  ----                  ----                -------
      Normal   EnsuringLoadBalancer    2m34s (x10 over 23m)  service-controller  Ensuring load balancer
      Warning  SyncLoadBalancerFailed  2m34s (x10 over 22m)  service-controller  Error syncing load balancer: failed to ensure load balancer: nsg "azure-wlkd-prod-node-nsg" not found

This issue may also occur when utilizing an existing network with provisioned NSGs.
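The two symptoms are told apart by the text of the SyncLoadBalancerFailed event. As a minimal sketch, the following shell function maps that text to the matching workaround. The function name and the labels it prints are hypothetical and chosen only for illustration; the matched strings are the two error messages quoted in the symptoms.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify the "describe svc" output (or a single event
# message) by which of the two documented errors it contains.
classify_lb_error() {
  case "$1" in
    *"not a vmss instance"*)
      echo "vmtype-overlay" ;;      # first issue: needs the ytt overlay
    *nsg*"not found"*)
      echo "missing-node-nsg" ;;    # second issue: needs the node NSG
    *)
      echo "unknown" ;;
  esac
}

# Example against a live cluster (requires kubectl access):
#   classify_lb_error "$(kubectl -n tanzu-system-auth describe svc dexsvc)"
```

The function only inspects text, so it can also be run against saved output from a support bundle.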
This issue is resolved in Tanzu Kubernetes Grid 1.3.1.
To work around the first issue, where the LoadBalancer service is not created and reports the error "Error syncing load balancer: failed to ensure load balancer: not a vmss instance", you must use a ytt overlay on TKG 1.3.0 that forces "vmType": "standard" in /etc/kubernetes/azure.json. Create the overlay file at ~/.tanzu/tkg/providers/infrastructure-azure/ytt/azure-overlay.yaml.

Sample overlay file:

    #@ load("@ytt:overlay", "overlay")

    #@overlay/match by=overlay.subset({"kind":"KubeadmConfigTemplate"}),expects="1+"
    ---
    spec:
      #@overlay/match missing_ok=True
      template:
        #@overlay/match missing_ok=True
        spec:
          #@overlay/match missing_ok=True
          preKubeadmCommands:
          #@overlay/append
          - "if [ -f /etc/kubernetes/azure.json ]; then sed -i 's/\"vmType\": \"vmss\"/\"vmType\": \"standard\"/' /etc/kubernetes/azure.json; fi"

    #@overlay/match by=overlay.subset({"kind":"KubeadmControlPlane"})
    ---
    spec:
      #@overlay/match missing_ok=True
      kubeadmConfigSpec:
        #@overlay/match missing_ok=True
        preKubeadmCommands:
        #@overlay/append
        - "if [ -f /etc/kubernetes/azure.json ]; then sed -i 's/\"vmType\": \"vmss\"/\"vmType\": \"standard\"/' /etc/kubernetes/azure.json; fi"

To work around the second issue, a Network Security Group (NSG) must be created on the existing VNET. As stated in the cause section, when a cluster is created on the existing VNET it looks for the NSG for the nodes, so creating that NSG fixes the LoadBalancer service not getting an external IP. This will be added as a requirement in the official docs soon.

The name of the NSG has to be in the form <clustername>-node-nsg. For example, if your cluster name is azure-wlkd-dev, you need to create a Network Security Group in your Resource Group, in the same region, with the name azure-wlkd-dev-node-nsg. There is no need to modify the Inbound and Outbound rules when you create the NSG.
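To see what the sed command in the overlay's preKubeadmCommands entry does, it can be tried against a scratch file instead of a real node. The sketch below assumes GNU sed (for the -i flag), and the scratch file contents are made up for illustration; a real /etc/kubernetes/azure.json contains many more fields. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Same substitution as in the overlay, without the YAML escaping of the
# double quotes, applied to whichever file is passed in.
force_standard_vmtype() {
  local file="$1"
  if [ -f "$file" ]; then
    sed -i 's/"vmType": "vmss"/"vmType": "standard"/' "$file"
  fi
}

# Scratch file standing in for azure.json (illustrative contents only).
scratch="$(mktemp)"
printf '{\n  "vmType": "vmss"\n}\n' > "$scratch"

force_standard_vmtype "$scratch"
cat "$scratch"
```

The substitution depends on the exact spacing of "vmType": "vmss" in the file, so a file formatted differently would be left unchanged.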
Once the NSG is created, the LoadBalancer service will get an external IP, and any new service of type LoadBalancer that you create will automatically be added to the NSG.
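The NSG step can be sketched with the Azure CLI. The example below only derives the NSG name from the cluster name and prints the az network nsg create command for review instead of running it. The function names, the resource group my-resource-group, and the region eastus are placeholders, not values from this article.

```shell
#!/usr/bin/env bash
# Build the NSG name in the required form: <clustername>-node-nsg.
node_nsg_name() {
  printf '%s-node-nsg\n' "$1"
}

# Print the Azure CLI command that would create the NSG.
# Arguments: cluster name, resource group, region (all supplied by the caller).
print_nsg_create_cmd() {
  local cluster="$1" resource_group="$2" location="$3"
  printf 'az network nsg create --resource-group %s --name %s --location %s\n' \
    "$resource_group" "$(node_nsg_name "$cluster")" "$location"
}

# Placeholder resource group and region; substitute your own values.
print_nsg_create_cmd azure-wlkd-dev my-resource-group eastus
```

After running the printed command, kubectl get svc nginx-svc can be used to check whether the EXTERNAL-IP column has moved from pending to an address.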