A deeper look into vSphere roles and privileges

Originally, this article was supposed to be called “The role paradox”. On further reflection, I concluded that it is not a paradox in the true sense of the word: vCenter is just doing its job.

Authorization in vSphere is basically simple (as long as we do not want to restrict anything). If we are members of the administrators group and have unrestricted access to all objects in the datacenter, privileges and roles are quickly explained.

Definition of terms

A privilege is the smallest unit. It allows the execution of a very specific action.

A role is a collection of privileges. The administrator role contains all available privileges. The no-access role, on the other hand, does not contain any privileges. “No access” is not to be understood as an explicit denial, but as a lack of privileges. What may at first seem like a semantic quibble is an important difference from other authorization concepts such as Active Directory.

Missing privilege != denial
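
To get a feel for these two building blocks, the available privileges can be listed and combined into a custom role with PowerCLI. A minimal sketch, assuming an existing Connect-VIServer session; the role name and the two privilege IDs are merely examples:

# List all privileges known to the connected vCenter
Get-VIPrivilege | Sort-Object Id | Select-Object Id, Name

# Build a custom role from a small subset of privileges (IDs are examples)
New-VIRole -Name 'VM PowerOps' -Privilege (Get-VIPrivilege -Id 'VirtualMachine.Interact.PowerOn','VirtualMachine.Interact.PowerOff')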

A permission is always made up of three components: a vSphere object, a role, and a user or user group. A user (or a group) can have different roles on different objects. Permissions on objects can be propagated to child objects.
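
In PowerCLI terms, such a permission is created from exactly these three components. A sketch with placeholder names for folder, principal and role:

# Assign the ReadOnly role to a group on a VM folder and propagate it to children
New-VIPermission -Entity (Get-Folder 'Production-VMs') -Principal 'LAB\Operators' -Role (Get-VIRole 'ReadOnly') -Propagate:$true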

The challenge

Things get interesting when I assign rights globally but then want to restrict them for certain objects.

Example: The administrators group should have access to all objects, with the exception of some VMs in a defined VM folder. Sounds simple – but it’s not.
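
A sketch of the obvious first attempt, keeping the global Administrator assignment and putting the No access role on the folder in question (folder and group names are placeholders):

# Override the inherited admin permission on a single VM folder
New-VIPermission -Entity (Get-Folder 'Restricted-VMs') -Principal 'LAB\Administrators' -Role (Get-VIRole 'NoAccess') -Propagate:$true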

I became aware of the problem described here through my colleague Alexei Prozorov, who came across this phenomenon in a customer project. The topic was so interesting that I had to recreate it in the laboratory.

Continue reading “A deeper look into vSphere roles and privileges”

Cluster Retreat Mode gone wrong – vSphere Client lockout

With the release of vSphere 7.0 Update 1, vSphere Cluster Services (vCLS) VMs appeared in vSphere clusters for the first time. This made cluster functions such as the Distributed Resource Scheduler (DRS) independent of the availability of the vCenter Server Appliance (VCSA), which still represents a single point of failure in the cluster. By outsourcing the DRS function to the redundant vCLS machines, a higher degree of resilience was achieved.

Retreat Mode

The vSphere administrator has little influence on the provisioning of these VMs. Occasionally, however, it is necessary to remove them from a datastore, for example when it is to be put into maintenance mode. There is a procedure for this: setting the cluster to Retreat Mode. It involves a temporary advanced setting that causes the cluster to delete the vCLS VMs.

According to the VMware procedure, the domain ID of the cluster must be determined in order to activate Retreat Mode. The domain ID is the numerical value between ‘domain-c’ and the following colon in the vSphere Client URL. In the example from my lab it has the value 8, but the number can also have four digits or more.
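
Alternatively, the ID can be read from the cluster’s managed object reference, for example with a short PowerCLI line (the cluster name is a placeholder):

# The Id property contains the MoRef, e.g. 'ClusterComputeResource-domain-c8'
(Get-Cluster -Name 'MyCluster').Id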

The domain ID then has to be entered in the advanced settings of the vCenter.

config.vcls.clusters.domain-c8.enabled = false
Correct Retreat Mode settings.
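
The same setting can also be created with PowerCLI instead of the vSphere Client. A hedged sketch, assuming an existing Connect-VIServer session and the domain ID 8 from the example above; New-AdvancedSetting accepts the vCenter Server object as entity:

# Create the temporary Retreat Mode setting on the vCenter object
New-AdvancedSetting -Entity $global:DefaultVIServer -Name 'config.vcls.clusters.domain-c8.enabled' -Value 'false' -Confirm:$false

# To leave Retreat Mode later, set the value back to 'true'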

Admin error occurred during activation of Retreat Mode.

After activating Retreat Mode on a vSAN cluster, the administrators lost all privileges on all objects in the vSphere Client.

A review of the services showed that the vCenter Server Daemon (vpxd) was not running.

Continue reading “Cluster Retreat Mode gone wrong – vSphere Client lockout”

ESXi Boot Media – New Requirements for v8

The requirements for ESXi boot media changed fundamentally with ESXi v7 Update 3. The partitioning was changed, and the endurance requirements for the medium increased as well. I covered this in my blog article “ESXi Bootmedia – New features in v7 und legacy issues from the past v6.x”.

USB boot media turned out not to be robust enough and are therefore no longer supported from v7 U3 onwards. It is still possible to install ESXi on USB media, but the ESX-OSData partition then needs to be redirected to persistent storage.
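
A related knob that can be set with PowerCLI is the persistent scratch location, which covers part of what ESX-OSData is used for. A sketch only; host name and datastore path are placeholders, and the change requires a host reboot:

# Point the persistent scratch location at a VMFS datastore (takes effect after reboot)
$esx = Get-VMHost -Name 'esx01.lab.local'
Get-AdvancedSetting -Entity $esx -Name 'ScratchConfig.ConfiguredScratchLocation' |
    Set-AdvancedSetting -Value '/vmfs/volumes/datastore1/.locker-esx01' -Confirm:$false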

Warning! USB media and SD cards should not be used for production ESXi installations!

Valid setup targets for ESXi deployments

SD cards and USB media are unsuitable as installation targets due to their poor write endurance. Magnetic disks, SSDs and SATA DOMs (disk-on-module) are still permitted and recommended.

SATA-DOM on a Supermicro E300-9D

New requirements from version ESXi v8 onwards

My Homelab previously used vSAN 7 and thus the classic OSA architecture. To run the cluster under the new vSAN ESA architecture, it was necessary to use vSphere 8 and new storage devices.

I tested the installation and hardware compatibility on a 64 GB USB medium (not recommended and not supported!). As expected, the installer issued warnings about the USB medium. Nevertheless, I was able to successfully test the detection of the NVMe devices and the vCenter deployment.

Setup warning when trying to use a USB flash medium.

Having successfully completed the test phase, I installed ESXi 8U2 on the SATA DOM of my Supermicro E300 server. To my surprise, the setup failed at a very early stage with the message: “disk device does not support OSDATA“.

RTFM

The explanation is simple: “Read the fine manual!”

My 16 GB SATA DOM from Supermicro was simply too small.

The setup guide for ESXi 8 clearly states the requirements under “Storage Requirements for ESXi 8.0 Installation or Upgrade“:

For best performance of an ESXi 8.0 installation, use a persistent storage device that is a minimum of 32 GB for boot devices. Upgrading to ESXi 8.0 requires a boot device that is a minimum of 8 GB. When booting from a local disk, SAN or iSCSI LUN, at least a 32 GB disk is required to allow for the creation of system storage volumes, which include a boot partition, boot banks, and a VMFS-L based ESX-OSData volume. The ESX-OSData volume takes on the role of the legacy /scratch partition, locker partition for VMware Tools, and core dump destination.

VMware vSphere product documentation

In other words: new installations require a boot medium of at least 32 GB (128 GB recommended). An upgrade from an existing ESXi v7 installation requires at least 8 GB, but the OSData partition of that installation must already have been redirected to an alternate storage device.

Dirty Trick?

Needless to say, I tried a dirty trick: I first successfully installed ESXi 7 U3 on the 16 GB SATA DOM and then attempted an upgrade installation to v8 U2. This attempt also failed, because the OSData area of the fresh v7 installation had not been redirected.

I don’t want to install on USB media as I have seen too many cases where these devices have failed. The only option is to invest in a larger SATA DOM.

I opted for the 64 GB model because it is a good compromise between minimum requirements and cost-effectiveness.

ESXi Config-Backup with PowerCLI requires HTTP

There is a really useful and convenient PowerCLI one-liner for backing up the host configuration. I have been using it for years and have also explained it in detail in an old blog post.

Get-Cluster -Name myCluster | Get-VMHost | Get-VMHostFirmware -BackupConfiguration -DestinationPath 'C:\myPath'

This is a command I always teach my students in my VMware courses. Backing up the host configuration is downright mandatory before making changes to a host, installing patches or drivers, or performing host updates. It takes just a few seconds of additional effort, but these configuration backups have saved me more than once from major trouble and many hours of extra work.
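
The counterpart for restoring such a backup exists as well. A hedged sketch; the host must be in maintenance mode, and host name, path and credentials are placeholders:

# Restore a previously saved configuration bundle directly to the host
Set-VMHostFirmware -VMHost (Get-VMHost 'esx01.lab.local') -Restore -SourcePath 'C:\myPath\configBundle-esx01.lab.local.tgz' -HostUser root -HostPassword 'VMware1!'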

Recently, I was backing up host configurations in a major datacenter. Surprisingly, the command did not work on some of the vCenter instances and aborted with an error message.

Get-VMHostFirmware : 18.08.2023 12:05:49 Get-VMHostFirmware An error occurred while sending the request.
At line:1 char:28
+… et-VMHost | Get-VMHostFirmware -BackupConfiguration -DestinationPath …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [Get-VMHostFirmware], ViError
+ FullyQualifiedErrorId : Client20_SystemManagementServiceImpl_BackupVmHostFirmware_DownloadError,VMware.VimAutomation.ViCore.Cmdlets.Commands.Host.GetVMHostFirmware

To understand the error, we must first understand how the PowerCLI command works. First, a backup of the host configuration is triggered on the host via vCenter. The host stores this locally as a zipped TAR archive (.tgz). The name is configBundle-HostFQDN.tgz (example: configBundle-esx01.lab.local.tgz). The archive is then downloaded from the host in a second step. The URL for this is:

http://[HostFQDN]/downloads/[Host-UUID]/configBundle-HostFQDN.tgz
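
For illustration, roughly the same two steps can be reproduced manually through the HostFirmwareSystem API view. A sketch, assuming a connected session; host name and target path are placeholders:

$esx = Get-VMHost -Name 'esx01.lab.local'
$fw  = Get-View -Id $esx.ExtensionData.ConfigManager.FirmwareSystem

# Step 1: trigger the backup on the host; the returned URL contains a '*'
# placeholder that has to be replaced with the host name
$url = $fw.BackupFirmwareConfiguration() -replace '\*', $esx.Name

# Step 2: download the bundle from the URL returned in step 1
Invoke-WebRequest -Uri $url -OutFile "C:\myPath\configBundle-$($esx.Name).tgz"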

The error message above shows that there was a problem with the download of the TGZ file. With the help of the network admins, it quickly became clear what had happened: my workstation, from which I had issued the PowerCLI command, tried unsuccessfully to establish an HTTP connection to the ESXi host, because that connection was blocked by a firewall rule.

I wondered why the transfer is handled over unencrypted HTTP. The firewall log shows connection attempts to the ESXi host over both HTTP and HTTPS.

Is there a way to force the download using HTTPS?

My first thought was that there might be a parameter for the command that enforces HTTPS. A query in the VMTN forum unfortunately brought disillusionment: there is no such parameter.

It is a bit surprising that VMware uses an unencrypted protocol for such sensitive data, all the more so since the PowerCLI session to vCenter already runs over HTTPS anyway. The most plausible explanation is that securing the transfer via SSL was simply ‘forgotten’ for this rather old command.

So currently there is no choice but to create a firewall rule that allows the download via HTTP.