Originally, this article was supposed to be called “The role paradox”. On further reflection, I came to the conclusion that this is not a paradox in the true sense of the word. vCenter is just doing its job.
Authorizations under vSphere are basically simple (as long as we do not want to use restricted authorizations). If we are a member of the administrator group and have unrestricted access to all objects in the data center, privileges and roles are quickly explained.
Definition of terms
A privilege is the smallest unit. It allows the execution of a very specific action.
A role is a collection of privileges. The administrator role contains all available privileges. The no-access role, on the other hand, does not contain any privileges. “No access” is not to be understood here as an explicit denial, but as a lack of privileges. What may initially seem like a semantic quibble is an important difference from other authorization concepts such as Active Directory.
Missing privilege != denial
A permission is always made up of three components: A vSphere object, a role and a user or user group. A user (or a group) can have different roles on different objects. Permissions on objects can be propagated to child objects.
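The three terms can be written down as a small data model. The following Python sketch is only an illustration of the definitions above, not a representation of vCenter's internal data structures. The class names, the field names, the privilege strings and the object and group names are assumptions chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """A role is a named collection of privileges."""
    name: str
    privileges: frozenset


@dataclass(frozen=True)
class Permission:
    """A permission ties together an object, a principal and a role."""
    entity: str       # the vSphere object the permission is set on
    principal: str    # a user or a user group
    role: Role
    propagate: bool = True  # pass the permission on to child objects


# Placeholder privilege identifiers, for illustration only.
ALL_PRIVILEGES = frozenset({
    "VirtualMachine.Interact.PowerOn",
    "Datastore.Browse",
    "Network.Assign",
})

ADMINISTRATOR = Role("Administrator", ALL_PRIVILEGES)
NO_ACCESS = Role("No-Access", frozenset())  # empty set: no denial, just nothing

# Hypothetical object and group names.
permission = Permission("some-vm-folder", "some-group", ADMINISTRATOR)

print(permission)
print("No-Access contains", len(NO_ACCESS.privileges), "privileges")
```

Note that the no-access role is modeled as an empty set. There is no "deny" flag anywhere in this model, which is exactly the point made above.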
The challenge
Things get interesting when I assign rights globally, but then want to restrict them to certain objects.
Example: The administrators group should have access to all objects, with the exception of some VMs in a defined VM folder. Sounds simple – but it’s not.
I became aware of the problem described here through my colleague Alexei Prozorov, who came across this phenomenon in a customer project. The topic was so interesting that I had to recreate it in the laboratory.
Preparations
We first need a test user in the SSO domain of the vCenter. To do this, we go to Administration > Single Sign On > Users and Groups.
With ‘ADD’ we add a new user.
We create it in the SSO domain vsphere.local and name it TestUser1. This test user should also be part of a user group ‘TestGroup1’. Under the Groups tab, we click on ‘ADD’.
We give this user group global permissions. To do this, we select Administration > Access Control > Global Permissions > Add.
It is important that we select the SSO domain (here: vsphere.local) and let the role be inherited by child objects (Propagate to children).
The user TestUser1 from the group TestGroup1 therefore has the administrator role on all objects. We can check this on any object.
So far, so good.
New requirement: restrict access to certain VMs
In our fictitious scenario, the management now requests that TestUser1 should not have access to certain VMs. For this purpose, the VMs are moved into a VM folder, and on this folder the TestGroup1 group is given the no-access role instead of its inherited privileges.
The highly secret VM LinuxFTTest1 is moved to the folder ‘NoAccess4UserGroup1’.
We set the “No-Access” role for the TestGroup1 group directly on the folder. This overwrites the inherited Administrator role, as the No-Access role was set closer to the object.
We check the result as administrator@vsphere.local. The rights to the “NoAccess4UserGroup1” folder are set correctly. However, the permissions on the VM LinuxFTTest1 are interesting. There, the TestGroup1 group has both the inherited Administrator role and the No-Access role from the folder above. We will see the consequences of this in a moment.
We check the result by logging on to the vSphere client as testuser1.
When we look at the VMs and Folders view, we cannot see the “NoAccess4UserGroup1” folder. This is exactly what we wanted to achieve.
However, if we change the view to “Hosts & Clusters”, the secret VM from the hidden folder is visible there. And that’s not all. The VM can even be started, although we should actually have the no-access role there.
Repeat experiment without a group
Invisible on the one hand, administrator on the other. Is it the same for individual users instead of groups?
We repeat the experiment described above. This time, however, without groups, but directly with TestUser1. This user receives the administrator role globally and the no-access role on the folder “NoAccess4UserGroup1”.
The result is similar to the group test. TestUser1 has both the “No-Access” and “Administrator” role on the VM.
Explanation
What we see here is not a bug or a malfunction in vCenter. Rather, vCenter is doing exactly what it was told to do.
As I explained in the definition of terms at the beginning, the “no-access” role is by no means a denial of access, but merely the absence of privileges. I like to explain this in my courses using a bunch of keys. A privilege is a key and a role is a key ring. If I give a person or group the administrator role, this is comparable to a key ring that contains all the keys. In contrast, the “no-access” role is an empty key ring without any keys. If I give a person the full key ring and the empty key ring, they still have the full key ring as a result. In other words, all privileges.
vCenter combines such roles as a set union (UNION) of their privileges.
Everything + nothing = everything.
Roles that are set closer to the object overwrite inherited roles.
If we look at the hierarchical inheritance of permissions in vCenter, it becomes clear why the first approach could not meet our requirements. The administrator role at global level (top level) is inherited by all objects below it. The “No-Access” role at VM folder level had the desired effect by overriding the administrator role there. However, via the “Host Folder” branch, the administrator role was transferred back to the VM via the root resource pool and we therefore had both roles on the object, which were combined via UNION. The administrator role remained.
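The inheritance described above can be reproduced in a few lines of Python. This is a toy model of the behavior observed in the lab, not vCenter's actual implementation. The `PARENTS` and `ASSIGNED` tables, the `effective` function, the two privilege strings and all object names other than NoAccess4UserGroup1 and LinuxFTTest1 are assumptions made for illustration, and the global permission is simplified to a permission on a single root node.

```python
# Toy inventory: every object lists its parents.
# A VM has two parents: its VM folder and its resource pool.
PARENTS = {
    "root": [],
    "datacenter": ["root"],
    "vm-folder-root": ["datacenter"],
    "NoAccess4UserGroup1": ["vm-folder-root"],
    "host-folder": ["datacenter"],
    "cluster": ["host-folder"],
    "root-resource-pool": ["cluster"],
    "LinuxFTTest1": ["NoAccess4UserGroup1", "root-resource-pool"],
}

# Placeholder privilege sets for the two roles.
ADMIN = frozenset({"System.View", "VirtualMachine.Interact.PowerOn"})
NO_ACCESS = frozenset()

# Propagating permissions of TestGroup1: object -> privileges of the role.
ASSIGNED = {
    "root": ADMIN,                     # stands in for the global permission
    "NoAccess4UserGroup1": NO_ACCESS,  # set directly on the folder
}


def effective(obj):
    """Return the privileges TestGroup1 ends up with on obj.

    A permission set directly on obj replaces anything inherited.
    Otherwise the results of all parent branches are combined as a union.
    """
    if obj in ASSIGNED:
        return ASSIGNED[obj]
    result = frozenset()
    for parent in PARENTS[obj]:
        result |= effective(parent)
    return result


print(sorted(effective("NoAccess4UserGroup1")))
print(sorted(effective("LinuxFTTest1")))
```

In this model the folder ends up with an empty privilege set, while the VM receives the full administrator set again through the resource pool branch, which matches what TestUser1 sees in the “Hosts & Clusters” view.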
Solution approach
How could the requirement be implemented that a group is given the administrator role on almost all VMs, but no access to a specific group of VMs?
One possible solution would be the following:
We do not assign a global administrator role for the TestGroup1 group. We create a VM folder “All-VMs” under which we organize all other VM folders. In addition, we create a VM folder “Restricted”, as shown in the figure below.
TestGroup1 is assigned the administrator role on the “All-VMs” folder, which is inherited downwards. On the “Restricted” subfolder, we assign the “No-Access” role to TestGroup1. This overwrites the administrator role, as it was set closer to the object.
TestGroup1 has no access to the VM and the “Restricted” folder.
The administrator role is retained for all other folders and VMs.
As we had not set any global authorizations in this experiment, the authorization for the VM is not overwritten in the Hosts & Clusters view and remains at “No-Access”.
Not quite there yet
Our strategy still has one flaw, which we’ll recognize as soon as we log in with TestUser1 (from the TestGroup1 group).
As we have not set the permissions globally, TestUser1 has no permissions at all on the data center and the infrastructure objects in it. This is not sufficient for a user who is actually supposed to be an administrator, with the exception of the VMs under “Restricted”.
The explanation is shown in the diagram below. At global level, no permissions are set for the TestGroup1 group, which is equivalent to a no-access role. The administrator role only takes effect from the VM folder “All-VMs” downwards. Objects such as Network, Storage, Hosts or Data Center are hidden.
We remedy this by cloning the Administrator role and revoking all VM privileges. We call this “Infra-Admin”, for example. A second role “VM-Admin” contains only VM privileges. If we combine both new roles, the result is the default administrator role.
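The split can be checked with simple set arithmetic. The sketch below assumes that VM privileges can be recognized by an identifier prefix `VirtualMachine.` and uses a handful of placeholder privilege strings. In a real environment the full privilege list of the cloned Administrator role would take their place.

```python
# Placeholder stand-in for the privileges of the default Administrator role.
ADMINISTRATOR = frozenset({
    "VirtualMachine.Interact.PowerOn",
    "VirtualMachine.Config.Settings",
    "Datastore.Browse",
    "Network.Assign",
    "Host.Config.Maintenance",
})


def is_vm_privilege(privilege):
    """Assumption: VM privileges share the 'VirtualMachine.' prefix."""
    return privilege.startswith("VirtualMachine.")


# VM-Admin keeps only the VM privileges, Infra-Admin keeps the rest.
VM_ADMIN = frozenset(p for p in ADMINISTRATOR if is_vm_privilege(p))
INFRA_ADMIN = ADMINISTRATOR - VM_ADMIN

# Both roles together give back the Administrator role, with no overlap.
print("union equals Administrator:", (INFRA_ADMIN | VM_ADMIN) == ADMINISTRATOR)
print("roles are disjoint:", INFRA_ADMIN.isdisjoint(VM_ADMIN))
```

Keeping the two roles disjoint matters here: as soon as Infra-Admin contains a single VM privilege, that privilege reaches every VM again through the resource pool branch.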
We give TestGroup1 the Infra-Admin role globally and propagate it to children. TestGroup1 thus has all infrastructure privileges on all objects with the exception of VM privileges.
We assign the “VM-Admin” role on the “All-VMs” folder and propagate it to children. We set the “No-Access” role on the “Restricted” folder, which overwrites the “VM-Admin” role there.
The root resource pool transfers the Infra-Admin role to the VMs. Access to the VMs below the “Restricted” folder is still denied, as the Infra-Admin role does not contain any VM privileges.
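As a final check, the finished design can be walked through branch by branch. The sketch below is again a simplified model under the same assumptions as the explanation above (the closest assignment wins within a branch, branches are combined as a union). The function `effective`, the privilege strings and the level comments are hypothetical. Each branch is written as a list of role assignments from top to bottom, with `None` where nothing is set.

```python
# Placeholder privilege sets for the three roles.
INFRA_ADMIN = frozenset({"Datastore.Browse", "Network.Assign"})
VM_ADMIN = frozenset({"VirtualMachine.Interact.PowerOn"})
NO_ACCESS = frozenset()


def effective(branches):
    """Combine the closest assignment of every branch as a union."""
    total = frozenset()
    for branch in branches:
        nearest = frozenset()
        for role in branch:          # ordered from top level down to the VM
            if role is not None:
                nearest = role       # a closer assignment replaces the inherited one
        total |= nearest
    return total


# VM below "Restricted"
restricted_vm = effective([
    [INFRA_ADMIN, None, VM_ADMIN, NO_ACCESS],  # global, data center, All-VMs, Restricted
    [INFRA_ADMIN, None, None, None],           # global, data center, cluster, root resource pool
])

# VM in any other folder below "All-VMs"
regular_vm = effective([
    [INFRA_ADMIN, None, VM_ADMIN, None],       # global, data center, All-VMs, subfolder
    [INFRA_ADMIN, None, None, None],           # global, data center, cluster, root resource pool
])

print("restricted VM:", sorted(restricted_vm))
print("regular VM:   ", sorted(regular_vm))
```

In this model the restricted VM is left with infrastructure privileges only, while every other VM receives the union of both roles, which corresponds to the original administrator role.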