ESXi hosts that boot from SAN or USB flash media do not have a persistent scratch location. Instead, the path /tmp/scratch is mapped into their RAM disk. This has a big disadvantage: all logs are lost after a reboot, which is a real problem for troubleshooting.
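As a rough illustration of the remedy, the sketch below only assembles the ESXi shell commands that would point the scratch location at a directory on a persistent datastore. The advanced option `ScratchConfig.ConfiguredScratchLocation` is the setting VMware documents for this purpose; the datastore name `datastore1`, the host name `esx01` and the `.locker-` directory naming are placeholder assumptions, and the host has to be rebooted before the new location becomes active.

```python
from typing import List


def scratch_commands(datastore: str, hostname: str) -> List[str]:
    """Return ESXi shell commands that configure a persistent scratch directory."""
    # Assumed naming scheme: one hidden directory per host on a shared or local datastore.
    scratch_dir = f"/vmfs/volumes/{datastore}/.locker-{hostname}"
    return [
        # Create the directory first; the option must point to an existing path.
        f"mkdir {scratch_dir}",
        # Set the advanced option; it takes effect after the next reboot.
        "vim-cmd hostsvc/advopt/update "
        f"ScratchConfig.ConfiguredScratchLocation string {scratch_dir}",
    ]


# Placeholder names, to be replaced with real ones.
for command in scratch_commands("datastore1", "esx01"):
    print(command)
```

The script prints the commands instead of executing them, so they can be reviewed before being run in an SSH session on the host.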
Continue reading “Move ESXi scratch location”
Runecast (beta) with Hardware Compatibility Checks
Fully automated VMware HCL checks in Runecast Analyzer
Runecast has opened a beta testing program for early adopters to try their latest feature. In a future release of Runecast Analyzer, users will not only be able to scan their environment against VMware KB issues, but also to check their hardware against the VMware Hardware Compatibility List (HCL).
I’ve been talking a lot with the Runecast team about this ‘missing’ feature. Now I’m lucky to be one of the beta testers and can get a glimpse of the future. 🙂
The challenge
Getting information about software and configuration issues in your vSphere cluster is priceless. But what about hardware?
See how a future release of Runecast Analyzer can help: it will check your current hardware configuration against the VMware HCL. Continue reading “Runecast (beta) with Hardware Compatibility Checks”
ESXi host restore with obstacles
Unable to re-join EVC cluster after restore of ESXi system
Changing the boot media of ESXi hosts has (unfortunately) become a routine job, because many flash media have a limited lifespan. To be fair, I need to point out that many customers use (cheap and dirty) USB flash sticks as boot media. But what works well in a homelab turns out to be a bad idea in enterprise environments.
The usual procedure for media replacement is fairly simple:
- export host configuration
- evacuate and shut down host
- prepare a fresh boot medium with an installation ISO that has the same or a lower patch level than the old installation
- boot freshly installed host
- apply an (intermediate) IP address if no DHCP is available
- restore host configuration
- re-connect to cluster
- apply patches if necessary
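The steps above can be sketched as a small checklist that pairs each step with the ESXi shell command usually used for it. This is only a sketch under assumptions: the `vim-cmd hostsvc/firmware/...` calls are the ones VMware documents for configuration backup and restore, while the bundle path `/tmp/configBundle.tgz` and the split into manual and scripted steps are illustrative choices, not part of the procedure as described.

```python
from typing import List, Optional, Tuple

# A step is a description plus an optional ESXi shell command (None = manual step).
Step = Tuple[str, Optional[str]]


def media_replacement_plan(bundle_path: str = "/tmp/configBundle.tgz") -> List[Step]:
    """Return the boot media replacement steps in order."""
    return [
        ("flush pending configuration changes", "vim-cmd hostsvc/firmware/sync_config"),
        # backup_config prints a URL from which the bundle can be downloaded.
        ("export host configuration", "vim-cmd hostsvc/firmware/backup_config"),
        ("evacuate and shut down host", None),
        ("install ESXi on the fresh boot medium and boot it", None),
        ("apply (intermediate) IP address if no DHCP is available", None),
        ("enter maintenance mode before the restore", "vim-cmd hostsvc/maintenance_mode_enter"),
        # The host reboots once the restore has finished.
        ("restore host configuration", f"vim-cmd hostsvc/firmware/restore_config {bundle_path}"),
        ("re-connect to cluster and apply patches if necessary", None),
    ]


for number, (step, command) in enumerate(media_replacement_plan(), start=1):
    print(f"{number}. {step}: {command or 'manual step'}")
```

The configuration bundle has to be copied to the assumed path on the freshly installed host before the restore command is run.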
So far so good. But last week I had a nasty experience with a recovered ESXi host. Continue reading “ESXi host restore with obstacles”
Troubleshoot vmnic malfunction
Malfunction is worse than failure
Redundancy is key in virtual environments. If one component fails, another will jump in and take over. But what happens if a component does not really fail, but isn’t working properly any more? In this case the fault isn’t easy to detect.
I recently got a call from a friend who had suddenly lost all file shares on his (virtual) file server. I opened a connection to a service machine and started troubleshooting. These were the first diagnostic results:
- The file server did not respond to ping.
- Ping to gateway was successful.
- Name resolution against virtual DC was successful.
- A browser session to vCenter failed and vCenter did not respond to ping.
It is a small two-node cluster running vSphere 6.5 U2. Maybe one ESXi host had failed? But then HA should have restarted all affected VMs, and that was not the case. So I pinged both hosts and got an instant reply. No, it did not look like a host crash.
Next I opened the host client to have a look at the VMs. All VMs were running.
I opened a console session to the file server and could not log in with domain credentials, but I could with a local account. The file server looked healthy from the inside.
Now it became obvious that there was a problem with networking. But all vmnics were active and link status was “up”. The virtual standard switch on which the VM-Network portgroup resided had 3 redundant uplinks with status “up”. So where’s the problem?
I found another VM on the same host as vCenter and the file server that responded to ping and had internet connectivity.
I opened an RDP session to it, and from there I was able to ping every VM on the same host. Even vCenter could be reached by browser. Now the picture became clearer. One of the uplinks must have a problem, although it hadn’t failed. But which one? Continue reading “Troubleshoot vmnic malfunction”
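The reasoning so far can be written down as a small sketch. It assumes the default teaming policy, where each virtual NIC is pinned to one uplink, and it assumes the VM-to-uplink mapping is already known (for example from the network view in esxtop). The VM names, the vmnic numbers and the mapping itself are made-up sample data, not values from this environment.

```python
from collections import defaultdict
from typing import Dict, List


def suspect_uplinks(vm_uplink: Dict[str, str], reachable: Dict[str, bool]) -> List[str]:
    """Return uplinks on which no pinned VM is reachable from outside the host."""
    results = defaultdict(list)
    for vm, uplink in vm_uplink.items():
        results[uplink].append(reachable[vm])
    # An uplink is suspicious if every VM pinned to it fails the external ping.
    return sorted(uplink for uplink, pings in results.items() if not any(pings))


# Hypothetical sample data: which uplink each VM currently uses ...
vm_uplink = {
    "fileserver": "vmnic2",
    "vcenter": "vmnic2",
    "service-vm": "vmnic0",
    "dc": "vmnic1",
}
# ... and whether the VM answers a ping from outside the host.
reachable = {"fileserver": False, "vcenter": False, "service-vm": True, "dc": True}

print(suspect_uplinks(vm_uplink, reachable))
```

With this sample data the only suspect is `vmnic2`. An uplink with a mix of reachable and unreachable VMs is not reported, since that pattern points to a problem inside a VM rather than on the uplink.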