...
Typically, Windows creates a default scheduled task that runs disk defragmentation 7 days after each "Last Run" date [ Windows Task Scheduler --> \Microsoft\Windows\Defrag\ScheduledDefrag ]. If this scheduled task runs on multiple VDI desktops at the same time, it can cause vSCSI resets and degrade storage performance, because instant clones run on snapshots. On large-scale deployments, this can slow down the operating system and interrupt the Horizon Agent's internal threads, causing a JMS exception. As a result, the instant-clone desktops may go into an Agent Unreachable state.

Log entries similar to the following appear in the Horizon Agent debug logs (see Location of Horizon (VDM) log files (1027744)):

YYYY-MM-DDTHH:MM:SS.XXX+Timezone ERROR (038C-03D0) <TimerService> [lsass] Timer callback missed by 1.4x, name=class LsaSessionCache::RemoveExpiredPending, expected=504136828, now=504245859, interval=60000, intervalElapsed=109031
YYYY-MM-DDTHH:MM:SS.XXX+Timezone DEBUG (1970-0F9C) <pool-3-thread-3> [AbstractTopicPublishingManager] (DesktopControlPublishingManager-agent) Requeuing message after JMS failure
YYYY-MM-DDTHH:MM:SS.XXX+Timezone ERROR (1970-0F9C) <pool-3-thread-3> [AbstractJmsConnectionUser] (DesktopControlPublishingManager-agent) Hit an unexpected exception, requesting clean reconnect
YYYY-MM-DDTHH:MM:SS.XXX+Timezone DEBUG (1970-1F3C) <Thread-13> [EventLoggerService] Info_Event:[AGENT_SHUTDOWN] "The agent running on machine <vdimachinename> has shut down, this machine will be unavailable": MachineId=xxxxxx-xxxxx-xxxxx-xxxx-xxxxxx, PoolId=<poolname>, MachineName=<machinename>, Node=<machinefqdn>, Severity=INFO, Time=Day Date HH:MM:SS <Timezone> YYYY, Module=Agent, Source=com.vmware.vdi.events.client.EventLogger, Acknowledged=true

ESXi VMkernel logs (see ESXi Log File Locations - VMware Docs):

YYYY-MM-DDTHH:MM:SS.Secs cpu1:111148975)WARNING: VSCSI: 3502: handle 83505(vscsi0:0):WaitForCIF: Issuing reset; number of CIF:2
YYYY-MM-DDTHH:MM:SS.Secs cpu1:111148975)WARNING: VSCSI: 2650: handle 83505(vscsi0:0):Ignoring double reset
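The log signatures shown above can be searched for in collected log files. The following is a minimal Python sketch, not a VMware-provided tool: the pattern names and the count_signatures helper are hypothetical, and the regular expressions are derived only from the sample lines in this article, so the exact wording may differ between Horizon Agent and ESXi versions.

```python
import re

# Patterns derived from the sample log lines in this article.
# The dictionary keys are illustrative names, not VMware terminology.
SIGNATURES = {
    "timer_callback_missed": re.compile(r"Timer callback missed by"),
    "jms_requeue": re.compile(r"Requeuing message after JMS failure"),
    "jms_reconnect": re.compile(r"requesting clean reconnect"),
    "agent_shutdown": re.compile(r"\[AGENT_SHUTDOWN\]"),
    "vscsi_reset": re.compile(
        r"VSCSI: \d+: handle \d+\(vscsi\d+:\d+\):WaitForCIF: Issuing reset"
    ),
    "vscsi_double_reset": re.compile(
        r"VSCSI: \d+: handle \d+\(vscsi\d+:\d+\):Ignoring double reset"
    ),
}


def count_signatures(lines, signatures=SIGNATURES):
    """Count how many lines match each signature pattern."""
    counts = {name: 0 for name in signatures}
    for line in lines:
        for name, pattern in signatures.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def count_signatures_in_file(path, signatures=SIGNATURES):
    """Scan a log file line by line; undecodable bytes are replaced."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return count_signatures(handle, signatures)
```

A high count of vscsi_reset entries on a host, together with timer_callback_missed and agent_shutdown entries on many desktops in the same time window, would be consistent with the behavior described here, but it is not proof of the cause on its own.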
This article provides more details about the impact of running disk defragmentation on large-scale deployments of Instant Clone virtual machines, which run on snapshots.
When the operating system runs disk defragmentation, it reorders a large amount of data, which adds I/O load on the virtual SCSI controllers. Windows 1809 and later operating systems run the getLbaStatus command on the entire disk. If multiple virtual machines execute Space Reclamation / Disk Defrag at the same time, this can cause repeated SCSI resets, and the VMs may sometimes hang.
This issue occurs only when a large number of virtual machines run disk defragmentation at the same time.
It is always best practice to disable "Disk Defragmentation" in Linked Clone and Instant Clone deployments. Please see Benefits of Disabling Windows Services and Tasks.

Defragment the disk on the Parent Image/Master Image, then disable the scheduled task at [ Windows Task Scheduler --> \Microsoft\Windows\Defrag\ScheduledDefrag ].

Alternatively, download and run the VMware OS Optimization Tool, which disables the disk defragmentation scheduled task. The tool also prepares the image so that it is best suited to the different types of Horizon deployments. For more information, see the VMware OS Optimization Tool documentation.

Once the operating system is optimized with this tool or the defragmentation task is disabled, take a new snapshot of the Master Image and push the changes to the target Horizon desktop pools.
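Disabling the scheduled task on the master image can also be scripted instead of done through the Task Scheduler console. The following is a minimal Python sketch that wraps the Windows schtasks command; the function names and the dry_run parameter are hypothetical, and the script assumes it runs from an elevated (administrator) session on the master image. It defaults to a dry run so that nothing changes unless requested.

```python
import platform
import subprocess

# Task path as given in this article.
DEFRAG_TASK = r"\Microsoft\Windows\Defrag\ScheduledDefrag"


def build_disable_command(task_path=DEFRAG_TASK):
    """Return the schtasks argument list that disables the given task."""
    return ["schtasks", "/Change", "/TN", task_path, "/Disable"]


def disable_defrag_task(task_path=DEFRAG_TASK, dry_run=True):
    """Disable the task, or only return the command when dry_run is set.

    The command is executed only on Windows and only when dry_run is False.
    """
    command = build_disable_command(task_path)
    if dry_run or platform.system() != "Windows":
        return command  # nothing is executed
    # check=True raises CalledProcessError if schtasks reports a failure,
    # for example when the session is not elevated.
    subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # Print the command that would be run, without changing the system.
    print(" ".join(disable_defrag_task(dry_run=True)))
```

After running it with dry_run=False on the master image, the task state can be confirmed in Task Scheduler before the new snapshot is taken.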
For additional details on best practices with Horizon images, please see Horizon View Best Practices: Parent Image Creation and Maintenance (90152).