Install VCF, Workload Management and Tanzu Kubernetes Cluster in an afternoon

Yes! It’s possible. I’ve done it twice, just to make sure :). At the risk of being redundant: @Kyle Gleed was instrumental in driving K8s in VCF consolidated architecture and has a blog post and paper here, and @Tom Stephens posted a great blog on Minimalistic VCF 4.0 deployments with Kubernetes. But I’m lazy and like to make things even easier, and that is what I aim to cover here. There are a lot of steps and tons of screenshots, making this blog post quite lengthy. My apologies, I’ll learn to split these up!

To prep your physical systems and get your jumphost up and running for VLC, follow the VLC Install Guide using Automated mode. We’ve found, through much trial and error, that networking is the MOST important part of getting this all up and running. To detail it out: on the physical side you will need the following on the ports connected to your physical ESXi hosts (if you have a single host, don’t worry about these):

  • 4 VLANs
    • With VLC Automated these are VLANs 10, 11, 12, and 13
  • MTU of 9000

You will also need Internet access for the nested environment (this applies to all physical configurations). This means you will need to populate the External Gateway field:

  • There are several ways this can be accomplished; the most common is to use a virtual router like VyOS or pfSense with NAT.
    • Routes or an interface to the following subnets in your nested environment:
      • Management – (default in VLC Automated)
      • Ingress and Egress for K8s &

*TIP* – If you do not populate the External Gateway field, you can add a gateway later by logging into CloudBuilder (admin/VMware123!), entering “sudo su -” to become root, and then running the command:
ip route add default via <IP of external gateway>
The external gateway IP must be accessible from CloudBuilder!

Once your physical side is ready to go you’ll add your licenses to the AUTOMATED_AVN_VCF_VLAN_10-13_NOLIC_v4.json file.

***The ESXi licenses need to be vSphere with Kubernetes*** While you are in the file, ensure the Edge size is set to LARGE. This will allow you to use the automatically installed and configured AVN edges as a load balancer for Workload Management.

Then use VLC Automated to complete a bringup with AVNs enabled. This will deploy everything we need to reach our goal. Details are in the VLC Install Guide.

Once everything is up and running, log in to NSX Manager, as we need to prep the Edge Cluster for Workload Management (yes, it’s already there because we deployed with AVNs!)

  • Navigate to System -> Fabric -> Compute Managers.
  • Select the vCenter Server instance and click EDIT.
  • Toggle Enable Trust to Yes.
  • Navigate to System -> Fabric -> Nodes -> Edge Clusters.
  • Click the Edge Cluster name.
  • Next to Tags, click MANAGE.
  • Add the tag WCPReady with a scope of Created for.
  • Log in to the vSphere Web Client instance.
  • Navigate to Menu -> Workload Management.

Select the “Compatible” cluster listed and click Next.

Select Tiny for your control plane size. It may work with other sizes but I have not tested it.

Remember that networking stuff we spoke about in the beginning? This is where it becomes important! Enter the following values; the (i) icons have some good info when you hover over them.
Under the Management Network section:

  • Network: SDDC-VDS-Mgmt
    • You’ll need to select the management network portgroup.
  • Starting IP Address: (there is already a DNS entry for this!)
    • It will use 5 IP addresses starting at the above
  • Subnet Mask:
  • Gateway: (This is CloudBuilder’s IP)
  • DNS Server: (This is CloudBuilder’s IP)
  • NTP Server: (This is CloudBuilder’s IP)
  • DNS Search Domains: vcf.sddc.local (Even though this says Optional it is not)
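Since the Management Network uses 5 consecutive IP addresses beginning at the Starting IP Address, it can help to list them before clicking Next and make sure nothing else in your lab is using them. Here is a minimal sketch using Python's standard library. The function name and the 192.0.2.50 address are purely illustrative assumptions (the address is a documentation-only placeholder, not the VLC default), so substitute your own starting IP.

```python
import ipaddress
from typing import List


def management_ip_block(starting_ip: str, count: int = 5) -> List[str]:
    """Return `count` consecutive addresses beginning at `starting_ip`."""
    first = ipaddress.ip_address(starting_ip)
    # Adding an integer to an address object yields the next address.
    return [str(first + offset) for offset in range(count)]


# 192.0.2.50 is a placeholder from the documentation range, not a real lab value.
for address in management_ip_block("192.0.2.50"):
    print(address)
```

Anything in that list that already answers a ping, or already has a different DNS entry, is worth sorting out before you continue.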

Under the Workload Network section:

  • Select the SDDC-VDS01 (It’s the only one)
  • Select the Edge-Cluster (Also the only one)
  • API Server: kubeapi.vcf.sddc.local (this corresponds to the starting IP above)
  • DNS Server: (This is CloudBuilder’s IP)
  • Pool CIDRs and Service CIDRs: Leave the default
  • Ingress CIDRs:
  • Egress CIDRs:
  • Click Next!

For all the storage options, select the vSAN Default Storage Policy and click Next.

Do a quick review of the information you entered (although it’s hard to determine the VDS and Edge Cluster names 🙂) and click Finish.

Keep an eye on the Task console in vCenter.. you’ll see some errors as indicated below, but they are normal and won’t affect operations.

For reference, it takes about 1.5 hours on my system to finish the Workload Management enablement. While we’re waiting for all that to configure, we can set up the content library that we’ll need for our Tanzu image. This is also the first place you’ll be able to see if your internet access is working properly.

In vCenter, click Menu -> Content Libraries.

Then click the + to create a new Content Library, give it a name like TKC-Content and click Next.

Select the Subscribed content library radio button and enter the following URL in the subscription URL field:
Then, click Next

If the nested environment has a good internet connection you will see the box below; if not, you will see an error message such as “Unable to locate depot”. If you see the error, you will need to troubleshoot by logging into CloudBuilder and attempting to resolve external names and/or ping external IPs.
If you see a box similar to the one below, click Yes.
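If you do end up troubleshooting, the name-resolution half of that check can be scripted. The sketch below is only an illustration and assumes Python 3 is available wherever you run it; `can_resolve` is a hypothetical helper name, and the `resolver` parameter exists only so the function can be exercised without touching the network.

```python
import socket


def can_resolve(hostname, resolver=socket.gethostbyname):
    """Return True if `hostname` resolves to an address, False if the lookup fails."""
    try:
        resolver(hostname)
        return True
    except OSError:
        # socket.gaierror (the usual lookup failure) is a subclass of OSError.
        return False
```

Calling `can_resolve("example.com")` (a placeholder name, use one you expect to reach) and getting False points at DNS or the External Gateway rather than at the content library itself.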

Select the VCF-VSAN storage for the content library and click Next.

Do a quick review and click Finish. This will create the content library and start downloading the Photon/Tanzu image, which is about 2.5 GB in size.

At this point Workload Management is likely still installing and configuring things. You can check on it by navigating back to Workload Management and clicking on the Clusters tab. If there is a number next to the Config Status message, you can click on it for more information.

When it has completed you will see a green check mark next to the config status, and at this point you can click on the Namespaces tab. If it says it’s still being configured, click the refresh icon in the upper right of the vCenter UI.

Once you click on the Namespaces tab you should see the message below. Click on the Create Namespace button.

In the pop-up window, drill down to select SDDC-Cluster-01 and then enter a name for the namespace. You can see that I’ve entered ns1. *Note*: the name must be lowercase a-z and 0-9; it can contain a “-”, but not as the first or last character. I’ve also found that longer names tend to cause problems later on when using Kubernetes, so keep it short.
Then click the Create button.
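The naming rule for namespaces (lowercase a-z, 0-9, and “-” anywhere except the first or last character) is easy to check up front. The following is a minimal sketch of that rule as a regular expression; the function name is made up for illustration and it only encodes the rule as stated here, not any other limits vCenter may enforce.

```python
import re

# First and last characters must be a-z or 0-9; hyphens are allowed in between.
_NAMESPACE_RE = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")


def is_valid_namespace_name(name: str) -> bool:
    """Return True if `name` follows the lowercase/digit/hyphen rule."""
    return _NAMESPACE_RE.fullmatch(name) is not None


for candidate in ("ns1", "my-ns", "-ns1", "ns1-", "NS1"):
    print(candidate, is_valid_namespace_name(candidate))
```

It does not enforce “keep it short”; that part is still up to you.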

Once it is created you’ll see a message indicating this. Click the Got It button.

In the vCenter UI select Menu -> Workload Management and then click on the name of the namespace.

Now we’ll need to configure a few things for the namespace we created, namely permissions, storage and connecting that Content Library we created.

Click the Add Permissions button and select vsphere.local for the source, start typing administrator and select it when it pops up, and finally select Can Edit for the role.
Then click OK.

Click the Add Storage button and select the vSAN Default Storage Policy.
Then click OK.

Last, click Add Library and then Add Library again (<- no, that’s not a typo) and select the Content Library that we created before.
Then click OK.

Next you’ll need to add routes for the Ingress and Egress networks to your Jumphost. Open a command prompt as administrator and add the routes as you see below with the gateway being CloudBuilder.
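If you would rather generate the route commands than type them, here is a small sketch that turns a CIDR and a gateway into the Windows "route add <network> mask <netmask> <gateway>" form. The function name is hypothetical, and the CIDRs and gateway in the example are documentation-range placeholders, not the real Ingress/Egress networks or CloudBuilder's address, so swap in the values from your own deployment.

```python
import ipaddress


def windows_route_add(cidr: str, gateway: str) -> str:
    """Build a Windows `route add` command line for `cidr` via `gateway`."""
    network = ipaddress.ip_network(cidr)   # raises ValueError on a malformed CIDR
    ipaddress.ip_address(gateway)          # raises ValueError on a malformed gateway
    return f"route add {network.network_address} mask {network.netmask} {gateway}"


# Placeholder values only: replace with your Ingress/Egress CIDRs and CloudBuilder's IP.
for cidr in ("198.51.100.0/24", "203.0.113.0/24"):
    print(windows_route_add(cidr, "192.0.2.1"))
```

Paste the printed lines into the administrator command prompt on the jumphost.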

Then open a browser and enter the URL:
You should see the screen below. Download the CLI Plugin and extract the file to a directory.

Open a command prompt and navigate to the extracted directory and then to the /bin directory inside it. I have created a yaml directory inside this directory to keep my config files in.

The first thing we’ll need to do is log in to the supervisor cluster. For this we’ll use kubectl-vsphere.exe.

To log in, enter the following on the command line:
kubectl-vsphere.exe login --insecure-skip-tls-verify --server <API server address> -u administrator@vsphere.local
Then, when prompted, enter the password.
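One thing that bites people when copying login commands from a web page is that the long options need two plain hyphens (--), not a typographic dash. Building the argument list in a script sidesteps that. The sketch below is only an illustration: the function name is made up, and using kubeapi.vcf.sddc.local as the server is my assumption based on the API Server name entered during Workload Management setup. The optional cluster arguments produce the form used later to log in to a Tanzu Kubernetes Cluster directly.

```python
from typing import List, Optional


def vsphere_login_args(
    server: str,
    username: str,
    cluster_name: Optional[str] = None,
    cluster_namespace: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a kubectl-vsphere login."""
    args = [
        "kubectl-vsphere.exe", "login",
        "--insecure-skip-tls-verify",
        "--server", server,
        "-u", username,
    ]
    if cluster_name:
        args += ["--tanzu-kubernetes-cluster-name", cluster_name]
    if cluster_namespace:
        args += ["--tanzu-kubernetes-cluster-namespace", cluster_namespace]
    return args


# Assumed server name; substitute the address of your own API server.
print(" ".join(vsphere_login_args("kubeapi.vcf.sddc.local", "administrator@vsphere.local")))
```

The list can be handed to `subprocess.run` as-is if you want to launch the login from a script; the password prompt still appears in the console.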

Once successfully logged in you’ll see the contexts that we can use to deploy applications or a Tanzu Kubernetes Cluster. We are going to use the namespace “ns1” for our exercise. To switch to that context enter:
kubectl config use-context ns1

Next we should check that our image is there and our content library is connected. You should see the image listed below; if not, go to the content library and make sure it’s connected to the vSphere cluster and that the OVA image has finished downloading.
Enter kubectl get vmimage

To create our Tanzu Kubernetes Cluster (tkc) we’ll need to use a YAML file that describes the cluster. kubectl then converts the information to JSON when making the API request. Below is the yaml file we’ll be using to create our tkc. As you can see the name is “gc1”, and it will have a single control plane node and a single worker node, both of the class “best-effort-xsmall”. Something I learned about yaml files is that tabs and spaces are *very* important.

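apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: gc1
spec:
  topology:
    controlPlane:
      count: 1
      class: best-effort-xsmall
      storageClass: vsan-default-storage-policy
    workers:
      count: 1
      class: best-effort-xsmall
      storageClass: vsan-default-storage-policy
  distribution:
    version: v1.16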
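On the tabs-versus-spaces point: YAML does not allow tab characters for indentation, and a stray tab from an editor is a common reason a manifest refuses to apply. Here is a minimal sketch that reports which lines of a manifest are indented with a tab. The function name is hypothetical and the sample text is just a toy fragment, not the full cluster definition.

```python
from typing import List


def lines_with_tab_indentation(text: str) -> List[int]:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(number)
    return bad


# Toy fragment: the second line is indented with a tab, the third with spaces.
sample = "spec:\n\ttopology:\n  distribution:\n"
print(lines_with_tab_indentation(sample))
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the text in; an empty list means no tab indentation was found.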

To apply this yaml file and have kubernetes create the tkc it’s a simple command and takes about 40 minutes to complete:
kubectl apply -f <filename of yaml>
Then you can check on the creation in a couple of ways.
kubectl get tkc

There is also a much more comprehensive set of information we can get on the cluster as it’s being created. In the screenshot below we are ~20 minutes into the creation of the tkc, it has created the control plane VM, and we can see the worker is still pending.
kubectl describe tkc

If we look in vCenter we can see that there has been another cluster created inside our namespace, which contains the control plane VM and is deploying the worker at this point.

As things continue to build, another helpful resource is the events in Kubernetes. These can contain some errors, which are automatically retried and, in my experience, resolve themselves much of the time.
kubectl get events

The tkc will be complete when you see that the Node status for both the control plane and the worker is “ready” in the kubectl describe tkc output.

Before you get too excited, there are a few more steps to go before you can start deploying workloads. By default the security policies are set to “deny, deny, deny” in my layman’s terms. You need to create a rolebinding to allow the creation of applications and services. To do this we’ll use a yaml file that can be located here along with additional information:

The first thing we’ll need to do is log in to the tkc directly with kubectl-vsphere. This can be done with the following command:
kubectl-vsphere.exe login --insecure-skip-tls-verify --server <API server address> -u administrator@vsphere.local --tanzu-kubernetes-cluster-name gc1 --tanzu-kubernetes-cluster-namespace ns1

After logging in we’ll need to switch to the new tkc as our context!
kubectl config use-context <tkc context name>

This yaml file is applied the same way the tkc was created:
kubectl apply -f <rolebinding yaml file>

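kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rolebinding-default-privileged-sa-ns_default
  namespace: default
roleRef:
  kind: ClusterRole
  name: psp:vmware-system-privileged
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: system:serviceaccounts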
Congratulations, you’ve gotten VCF, Workload Management and a Tanzu Kubernetes cluster up and running! You’ve even made it ready to deploy applications! Head on over to and take a look at some of the Stateless and Stateful applications there!