This will be a multi-part post focused on the VMware Bitfusion product. I will give an introduction to the technology, how to set up a Bitfusion server and how to use its services from Kubernetes pods.
- Part 1 : A primer to Bitfusion
- Part 2 : Bitfusion server setup
- Part 3 : Using Bitfusion from Kubernetes pods and TKGS (this article)
We saw in parts 1 and 2 what Bitfusion is and how to set up a Bitfusion Server cluster. The challenging part is to make this Bitfusion cluster usable from Kubernetes pods.
In order for containers to access Bitfusion GPU resources, a few general conditions must be met.
I assume in this tutorial that we have a configured vSphere-Tanzu cluster available, as well as a namespace, a user, a storage class and the Kubernetes CLI tools. The network can be organized with either NSX-T or distributed vSwitches and a load balancer such as the AVI load balancer.
In the PoC described here, Tanzu on vSphere was used without NSX-T for simplicity. The AVI load balancer, now officially called NSX Advanced Load Balancer, was used.
We also need a Linux system with access to GitHub or a mirror to prepare the cluster.
The procedure in a nutshell:
- Create a TKGS cluster
- Download the Bitfusion bare-metal token and create a K8s secret
- Clone the Git project and modify the Makefile
- Deploy the device plugin to the TKGS cluster
- Deploy a pod
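To make the secret step of the outline more concrete, here is a minimal sketch that assembles the `kubectl create secret generic` command from a directory of token files. It only builds and prints the command and does not run it. The secret name `bitfusion-secret`, the directory name `tokens` and the namespace `kube-system` are assumptions for illustration; use whatever names your device plugin deployment expects.

```python
import shlex


def secret_command(name, token_dir, namespace):
    """Build the kubectl argv that creates a generic secret from a directory of files."""
    return [
        "kubectl", "create", "secret", "generic", name,
        f"--from-file={token_dir}",   # each file in the directory becomes one key in the secret
        "--namespace", namespace,
    ]


# Assumed names, not taken from the article: adjust to your environment.
cmd = secret_command("bitfusion-secret", "tokens", "kube-system")

# Print the command for review instead of executing it.
print(shlex.join(cmd))
```

Printing the command first makes it easy to check the namespace and the token path before anything is created in the cluster.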