
OPERATIONAL DEFECT DATABASE
...

...
When using ESXi on hosts with AMD Zen3 (7XX3) based CPUs, independent of load and even without running virtual machines, you might notice:
- 100% CPU Usage spikes on any PCPU at random times when observed in the vSphere Client
- Multi-thousand % or more PCPU spikes when looking at esxtop / esxtop batch data
- Host CPU Usage averages might over-report significantly with more spikes
- No correlating high CPU usage from virtual machines or other worlds at the time of the spikes
- For running VMs, large amounts of "CPU Latency", consisting of mostly "Overlap"
    - In esxtop, those metrics are called %LAT_C and %OVRLP
    - In vROps, CPU Contention and Overlap respectively
- CPU Usage for VMs might under-report and even drop below utilization due to "CPU Latency"
- Cluster level CPU Usage is derived from VM CPU usage and might also under-report
- When running many highly utilized virtual machines, some performance impact might be seen
ESXi uses the performance monitoring counter "Non Halted Core Cycles" (NHCC) for frequency-scaling-aware CPU usage accounting. This counter is read via the RDPMC instruction, which is not guaranteed to return a strictly increasing value when executed twice in short succession. When two readings come back in an unexpected order, the difference between them is negative, so the calculated value "wraps around" and leads to excessive CPU usage accounting. While this issue might be seen on other CPUs, it is more noticeable on AMD Milan due to architectural differences.
Note that, apart from capacity planning or alerts triggered by the inflated CPU usage, the issue is mostly cosmetic and should not impact operation or performance. However, when running many VMs, especially with vCPU overcommitment, fairness might not be ensured: some VMs that are usually more entitled might see more contention than other, usually less entitled, VMs.
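The wrap-around described above can be illustrated with a small sketch. This is not ESXi code. The 48-bit counter width, the sample values, and the names `naive_delta`, `guarded_delta` and `usage_percent` are assumptions made for illustration. It only shows why a second reading that is slightly lower than the first turns into an enormous cycle count under fixed-width unsigned subtraction, and how a sanity check could tell a small backwards step from a genuine counter wrap. Whether the actual fix works this way is not stated in this article.

```python
# Illustrative only: assumed counter width, not taken from ESXi source.
COUNTER_BITS = 48
MASK = (1 << COUNTER_BITS) - 1


def naive_delta(prev: int, curr: int) -> int:
    """Fixed-width unsigned subtraction, as a counter delta is commonly computed."""
    return (curr - prev) & MASK


def guarded_delta(prev: int, curr: int) -> int:
    """Hypothetical guard: treat a small backwards step as zero elapsed cycles.

    A modular difference in the upper half of the counter range is far more
    likely to be an out-of-order reading than a real, nearly full-range
    advance, so it is clamped to 0. Genuine wraps (small positive modular
    difference) pass through unchanged.
    """
    delta = (curr - prev) & MASK
    return 0 if delta > MASK // 2 else delta


def usage_percent(delta_cycles: int, interval_cycles: int) -> float:
    """Usage as a percentage of the cycles available in the sampling interval."""
    return 100.0 * delta_cycles / interval_cycles


if __name__ == "__main__":
    interval = 2_000_000                  # hypothetical cycles per sample interval
    prev, curr = 1_000_500, 1_000_000     # second reading is 500 cycles "earlier"

    print("naive delta:  ", naive_delta(prev, curr))
    print("naive usage %:", usage_percent(naive_delta(prev, curr), interval))
    print("guarded delta:", guarded_delta(prev, curr))
```

With these sample values the naive difference is just under 2^48 cycles, which is the kind of figure that would surface as a multi-thousand-percent spike, while the guarded version reports no elapsed cycles for that sample.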
This issue is resolved in ESXi 6.7 P06 and ESXi 7.0 U3.
In the unlikely event that this is impacting performance, or you need stable metrics for capacity planning, you can disable NHCC-based CPU usage accounting by setting the advanced kernel (boot) setting "useNHCC" to false. As a boot setting, the change takes effect at the next host reboot. Note that this makes ESXi unaware of run-time frequency scaling and might result in different scheduling behavior for some workloads. There is no scheduling impact on ESXi hosts that run at a set frequency or at their maximum frequency at all times. On the ESXi CLI this can be done via:
    esxcli system settings kernel set -s useNHCC -v FALSE