
This article applies to all Avamar client versions running on Windows. Perfmon (Performance Monitor) can be a powerful troubleshooting tool. It can collect Windows performance metrics over time, at specified intervals, and generate logs that can be graphically analyzed to identify system performance issues. This article discusses which metrics to collect and how to configure the tool correctly to collect them. For more information about investigating Avamar client performance, see: "Avamar client slow backup performance: How to identify bottlenecks (RESOLUTION PATH)".
Slow backup performance.
How to access Perfmon: Press Windows+R to open the Run window, then type Perfmon.

What and When to Measure

Bottlenecks occur when a resource reaches its capacity, and they can cause slow performance. Bottlenecks are caused by insufficient or misconfigured resources, malfunctioning components, and incorrect requests for resources by a program. There are five major resource areas that can cause bottlenecks and affect server performance:

- Physical disk
- Memory
- Process
- CPU
- Network

If any of these resources is overutilized, the server or application can become noticeably slow or crash. This article discusses these areas and advises which counters and thresholds can help measure the performance of a server.

The sampling interval has a significant impact on the size of the log file and on the server load. Set the sample interval based on the average elapsed time for the issue to occur, so that a baseline is established before the issue occurs again. This helps spot any trend that leads to an issue.

- Fifteen minutes provides a good window for establishing a baseline during normal operations.
- If the average elapsed time for the issue to occur is about four hours, set the sample interval to 15 seconds.
- If the time for the issue to occur is eight hours or more, set the sample interval to no less than five minutes.

These guidelines help to avoid creating a large log file, which would make the data more difficult to analyze.

Performance Objects and Counters

- Objects: the components that manage the performance data.
- Counters: performance statistics that describe the performance characteristics of a particular object. For example, \PhysicalDisk\% Idle Time gives performance data about the idle time observed by a spindle.
- Instances: multiple replicas of an object, each representing a unique resource. Observing \PhysicalDisk\% Idle Time may show the different spindles available on the system and their corresponding % Idle Time values.

Sample Interval

Keep in mind the purpose and duration of the monitoring.
A 15-minute logging interval is fine for routine monitoring. When investigating a problem, reduce the sample interval to one that captures the problem. For problems that build gradually over a period of time, longer sample intervals can be used. For transient issues, use a short interval of a few seconds. A short interval is also helpful for disk subsystem issues. Keep the duration of monitoring in mind when setting the sample interval. If monitoring runs for more than 8 hours, a sample interval of less than 300 seconds can result in a large file. The overhead of running the collection process itself can also affect results.

How to enable Perfmon Logging

Open the command prompt as the admin user. Copy the following commands to start or stop log capture.

The command below creates a Performance Monitor data set:

    Logman.exe create counter Avamar -o "c:\perflogs\Emc-avamar.blg" -f bincirc -v mmddhhmm -max 250 -c "\LogicalDisk(*)\*" "\Memory\*" "\Network Interface(*)\*" "\Paging File(*)\*" "\PhysicalDisk(*)\*" "\Processor(*)\*" "\Process(*)\*" "\Redirector\*" "\Server\*" "\System\*" -si 00:00:05

Start the logs with:

    Logman.exe start Avamar

Stop the logs with:

    Logman.exe stop Avamar

The commands above can be modified to collect SQL Server Performance Monitor data during backups. First, create a folder for log collection: C:\SQL_Performance_Logs\

For the default SQL instance, run:

    Logman create counter Avamar_SQL_perf_log -f bin -c "\Network Interface(*)\*" "\Redirector\*" "\Paging File(*)\*" "\Memory\*" "\PhysicalDisk(*)\*" "\LogicalDisk(*)\*" "\Server\*" "\System\*" "\Process(*)\*" "\Processor(*)\*" "\SQLServer:Databases(*)\*" "\SQLServer:Buffer Manager\*" "\SQLServer:Memory Manager\*" "\SQLServer:SQL Statistics\*" -si 00:00:05 -max 800 -cnf 0 -o C:\SQL_Performance_Logs\AvamarSQL_perf_log.blg

For a named instance, replace the SQLServer prefix of the SQL counter objects with MSSQL$InstanceName, where InstanceName is the name of the instance:

    Logman create counter Avamar_SQL_perf_log -f bin -c "\Network Interface(*)\*" "\Redirector\*" "\Paging File(*)\*" "\Memory\*" "\PhysicalDisk(*)\*" "\LogicalDisk(*)\*" "\Server\*" "\System\*" "\Process(*)\*" "\Processor(*)\*" "\MSSQL$InstanceName:Databases(*)\*" "\MSSQL$InstanceName:Buffer Manager\*" "\MSSQL$InstanceName:Memory Manager\*" "\MSSQL$InstanceName:SQL Statistics\*" -si 00:00:05 -max 800 -cnf 0 -o C:\SQL_Performance_Logs\AvamarSQL_perf_log.blg

Start collecting logs:

    Logman start Avamar_SQL_perf_log

Stop log collection:

    Logman stop Avamar_SQL_perf_log

Counters and their Threshold values

Memory

% Committed Bytes In Use: The ratio of Memory\Committed Bytes to Memory\Commit Limit. Committed memory is the physical memory in use for which space has been reserved in the paging file should it need to be written to disk. The size of the paging file determines the commit limit. If the paging file is enlarged, the commit limit increases and the ratio is reduced. This counter displays the current percentage value only. It is not an average. If this value is consistently over 80%, the page file may be too small.

Available Bytes: The amount of physical memory, in bytes, immediately available for allocation to a process or for system use. This is rarely a constraint on x64 systems. If this value falls below 5% of installed RAM on a consistent basis, investigate. If the value drops below 1% of installed RAM on a consistent basis, there is a definite problem.

Committed Bytes: Committed memory is the physical memory which has space reserved on one or more disk paging files. There can be one or more paging files on each hard drive. This counter should ideally never change. Changes indicate page file expansion and should be investigated immediately.

Free System Page Table Entries: This used to be a concern on older x86 versions. On a Windows Server 2003 SP2 server booted without the /3GB switch, the value is approximately 200,000 PTEs. When booting with the /3GB switch, this drops to approximately 25,000 PTEs.

Pool Nonpaged Bytes: The size, in bytes, of the nonpaged pool.
This is an area of system memory (physical memory used by the operating system) for objects that cannot be written to disk, but must remain in physical memory as long as they are allocated. If the nonpaged pool is running at greater than 80% on a consistent basis, you may be headed for a Nonpaged Pool Depletion issue (Event ID 2019).

Pool Paged Bytes: The size, in bytes, of the paged pool, an area of system memory (physical memory used by the operating system) for objects that can be written to disk when they are not being used. The paged pool is a larger resource than the nonpaged pool. If this value is consistently greater than 70% of the maximum configured pool size, you may be at risk of a Paged Pool depletion (Event ID 2020).

Processor (check for each processor and overall)

% Interrupt Time: The time the processor spends receiving and servicing hardware interrupts during sample intervals. This value is an indirect indicator of the activity of devices that generate interrupts, for example the system clock, the mouse, disk drivers, data communication lines, network interface cards, and other peripheral devices. These devices interrupt the processor when they have completed a task or require attention.

% DPC Time: Indicates the time required to complete an I/O operation. Similar to the above, any value greater than 25% should be investigated.

% Privileged Time: The time the operating system kernel spends doing work. Usually the threshold is less than 30% for application or web servers.

% Processor Time: Sustained values greater than 90% on a single-processor machine, or greater than 80% on a multiprocessor machine, should be investigated.

Network Interface

Packets Received Discarded: Used to check for potential hardware issues. Threshold value: > 1. A possible remedy is to adjust network buffers.

Packets Received Errors: Used to check for potential hardware issues.
Threshold value: > 2.

Disk (for each disk)

% Idle Time: This counter provides a precise measurement of the time that the disk was idle, meaning all the requests from the operating system to the disk have been completed and there are zero pending requests. The system timestamps an event when the disk goes idle, then timestamps another event when the disk receives a new request. At the end of the capture interval, it calculates the percentage of the time spent idle. This counter ranges from 100 (always idle) to 0 (always busy). This counter accurately determines the saturation of the disk subsystem.

Avg. Disk Queue Length: Equal to (Disk Transfers/sec) * (Avg. Disk sec/Transfer). This is based on Little's Law from the mathematical theory of queues. Note that this is a derived value and not a direct measurement. Any value less than double the number of spindles is a good value.

Avg. Disk sec/Transfer: Displays the average time that the disk transfers took to complete, in seconds. Although the scale is seconds, the counter has millisecond precision, meaning a value of 0.004 indicates the average time for disk transfers to complete was 4 milliseconds. This is the counter in Perfmon used to measure I/O latency. Here are sample values. These may vary with the quality of the disks being used:

    Rating      Reads                   Writes
    Excellent   < 8 ms  (0.008 s)       < 1 ms (0.001 s)
    Good        < 12 ms (0.012 s)       < 2 ms (0.002 s)
    Fair        < 20 ms (0.020 s)       < 4 ms (0.004 s)
    Poor        > 20 ms (0.020 s)       > 4 ms (0.004 s)

Split IO/Sec: Measures the rate of I/Os that are split due to file fragmentation. This happens if the I/O request touches data on noncontiguous file segments. The value should be close to zero. It might be different if the RAID stripe size or the NTFS block size is too small.

% Free Space: Displays the percentage of the total usable space on the selected logical disk that was free.
There should always be more than 15% free space; 25% or more is recommended.

Process

Handle Count: Correlate with pool leaks.

Virtual Bytes: Virtual memory reserved to be used by an application.

Working Set: Private bytes resident in physical memory that are owned by an application.

What is the difference between the Physical Disk vs. Logical Disk performance objects in Perfmon?

Perfmon has two objects directly related to disk performance, Physical Disk and Logical Disk. Their counters are calculated in the same way but their scope is different.

The Physical Disk performance object monitors disk drives on the computer. It identifies the instances representing the physical hardware. The counters are the sum of the access to all partitions on the physical instance.

The Logical Disk performance object monitors logical partitions. Performance Monitor identifies logical disks by their drive letter or mount point. If a hard drive contains multiple partitions, this counter reports the values for the selected partition and not for the entire disk. When using Dynamic Disks, the logical volumes may span more than one hard drive. In this scenario, the counter values include access to the logical disk on all the hard drives it spans.

Which counters in Windows Performance Monitor show the hard drive latency?

- Physical Disk performance object -> Avg. Disk sec/Read counter: shows the average read latency.
- Physical Disk performance object -> Avg. Disk sec/Write counter: shows the average write latency.
- Physical Disk performance object -> Avg. Disk sec/Transfer counter: shows the combined average for both reads and writes.

The _Total instance is an average of the latencies for all hard drives in the computer. Each other instance represents an individual physical disk.
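
The sample latency values given for Avg. Disk sec/Transfer can be applied to values read from the Avg. Disk sec/Read and Avg. Disk sec/Write counters. The following is a minimal Python sketch of that lookup, not part of the Avamar or Perfmon tooling. The function names are hypothetical, and treating a value that sits exactly on the last boundary (20 ms for reads, 4 ms for writes) as Poor is an assumption, since the sample values do not say which band it belongs to.

```python
# Latency bands in seconds, taken from the sample values in this article.
# Each entry is (upper bound, rating); a value below the bound gets that rating.
READ_BANDS = [(0.008, "Excellent"), (0.012, "Good"), (0.020, "Fair")]
WRITE_BANDS = [(0.001, "Excellent"), (0.002, "Good"), (0.004, "Fair")]


def rate_latency(seconds, bands):
    """Return the rating for one latency value, given a list of bands."""
    if seconds < 0:
        raise ValueError("latency cannot be negative")
    for upper_bound, rating in bands:
        if seconds < upper_bound:
            return rating
    # Assumption: a value exactly on the last boundary counts as Poor.
    return "Poor"


def rate_read(seconds):
    """Rate an Avg. Disk sec/Read value (hypothetical helper name)."""
    return rate_latency(seconds, READ_BANDS)


def rate_write(seconds):
    """Rate an Avg. Disk sec/Write value (hypothetical helper name)."""
    return rate_latency(seconds, WRITE_BANDS)
```

For example, a value of 0.004 (4 milliseconds) rates as Excellent for reads but Poor for writes, which is why read and write latency are best examined separately rather than through the combined Avg. Disk sec/Transfer counter alone.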
Counters to watch while monitoring in different situations

The counters below are grouped by component and by the performance aspect being monitored.

Disk: Usage
- PhysicalDisk\Avg. Disk sec/Read
- PhysicalDisk\Avg. Disk sec/Write
- PhysicalDisk\Disk Reads/sec
- PhysicalDisk\Disk Writes/sec
- PhysicalDisk\Avg. Disk Read Queue Length
- PhysicalDisk\Avg. Disk Write Queue Length
- PhysicalDisk\% Idle Time
- LogicalDisk\% Free Space

Interpret the % Disk Time counter carefully. Because the _Total instance of this counter may not accurately reflect utilization on multiple-disk systems, it is important to use the % Idle Time counter as well. The % Disk Time counter reflects the amount of work done by the system but not the capacity of the disk subsystem. The % Idle Time counter accurately reflects the capacity of the disk subsystem.

Disk: Bottlenecks
- PhysicalDisk\(all counters)
- LogicalDisk\% Free Space
- System\File Control Operations/sec
- System\File Data Operations/sec

Note: The last two counters are located under the System object. They are not volume-specific but are useful if you have only one active volume.

Memory: Usage
- Memory\Available Bytes
- Memory\Cache Bytes
- Memory\% Committed Bytes In Use
- Memory\Pool Nonpaged Bytes
- Memory\Pool Paged Bytes
- Memory\Pages Input/sec or Memory\Page Reads/sec
- Memory\Free System Page Table Entries

Memory: Bottlenecks or leaks
- Memory\Available Bytes
- Memory\Cache Bytes
- Memory\Pages/sec
- Memory\Pages Input/sec or Memory\Page Reads/sec
- Memory\Pages Output/sec or Memory\Page Writes/sec
- Memory\Pool Paged Bytes
- Memory\Pool Nonpaged Bytes
- Memory\Free System Page Table Entries

Although not specifically Memory object counters, the following are also useful for memory analysis:
- Paging File\% Usage (all instances)
- Cache\Data Map Hits %

Processor: Usage
- Processor\% Processor Time (all instances)
- Processor\% Privileged Time
- Processor\% User Time

Processor: Bottlenecks
- Processor\% Processor Time (all instances)
- Processor\% DPC Time
- Processor\% Interrupt Time
- Processor\% Privileged Time
- Processor\% User Time
- Processor\Interrupts/sec
- Processor\DPCs Queued/sec
- System\Context Switches/sec
- System\System Calls/sec
- System\Processor Queue Length (all instances)
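
Several of the thresholds described in this article can be checked mechanically once counter values have been collected. The following Python sketch illustrates one way to do that under stated assumptions: `check_sample` and the dictionary layout are hypothetical, the values passed in are assumed to be sustained averages rather than single samples (the thresholds in this article apply to consistent values), and the `_Total` and `C:` instances are chosen only for illustration.

```python
def check_sample(sample, installed_ram_bytes, processor_count):
    """Return a list of warnings for one dict of sustained counter values.

    `sample` maps counter paths to numbers; counters that are missing
    from the dict are skipped. All names here are illustrative.
    """
    # The article uses 90% for single-processor and 80% for multiprocessor machines.
    cpu_limit = 90 if processor_count == 1 else 80

    # Each rule: (counter path, test returning True when the value is a concern, message)
    rules = [
        (r"\Memory\% Committed Bytes In Use", lambda v: v > 80,
         "page file may be too small"),
        (r"\Memory\Available Bytes", lambda v: v < 0.05 * installed_ram_bytes,
         "available memory is below 5% of installed RAM"),
        (r"\Processor(_Total)\% Processor Time", lambda v: v > cpu_limit,
         "processor time is above the threshold"),
        (r"\Processor(_Total)\% DPC Time", lambda v: v > 25,
         "DPC time is above 25%"),
        (r"\LogicalDisk(C:)\% Free Space", lambda v: v <= 15,
         "free space is not above 15%"),
    ]

    warnings = []
    for path, is_concern, message in rules:
        value = sample.get(path)
        if value is not None and is_concern(value):
            warnings.append(f"{path}: {message} (value {value})")
    return warnings
```

Keeping the rules in a single list makes it straightforward to add further thresholds from this article, such as the network interface discard and error counts, without changing the checking loop.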