
Below is a VPLEX Call Home message that is reported when utilization of a director's '/var/log' file system exceeds the threshold.

Example output of the call home message:

    SymptomCode: 0x8a4b9001
    Category: Status
    Severity: Critical
    Status: Failed
    Component: DIRECTOR
    ComponentID: director-1-2-B   <- Note the component the call home was reported from
    SubComponent: fsmon
    SubComponentID:
    CallHome: Yes
    FirstTime: 2017-12-07T15:13:43.029Z
    LastTime: 2017-12-13T05:00:13.093Z
    Count: 8
    CDATA: The filesystem mounted at '/var/log' is 94% full. [Versions:MS{x.x.x.x.x, x.x.x.x, x.x.x.x}, Director{x.x.x.x.x}, ClusterWitnessServer{x.x.x.x}]
    RCA: A filesystem has exceeded its usage threshold and is getting low on free space.

In the sample Call Home message, the device that reported the issue (listed as ComponentID) is director-1-2-B, whose IP address is 128.221.252.38 (the IP address is not listed in the call home message).

The sample below shows the affected director being accessed from the management server prompt using the secure shell (ssh) command.

Sample output:

    login as: service
    Keyboard-interactive authentication prompts from server:
    Password:
    End of keyboard-interactive prompts from server
    Last login: Wed Dec 13 11:39:14 2017 from x.x.x.x
    service@managementserver:~> ssh root@128.221.252.38   << Denotes director-1-2-B (Cluster-1, engine-1-2, Director-B)

Once on the director, the disk free (df) command is run to check the usage of the file system reported in the Call Home message, in this case '/var/log'. The '-h' option displays the sizes in human-readable units.

Sample output:

    director-1-2-b:~ # df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda2       2.0G  945M  970M  50% /
    udev             18G  200K   18G   1% /dev
    tmpfs            18G  4.0K   18G   1% /dev/shm
    /dev/sda1       124M   29M   89M  25% /boot
    /dev/sda5       124M  7.6M  111M   7% /var/opt/zephyr/flashDir
    /dev/sda6      1007M  892M   64M  94% /var/log   <- Use% has exceeded the threshold
    /dev/sda7       6.0G  1.5G  4.2G  26% /cores
    tmpfs           1.0G  132K  1.0G   1% /tmp

Change directory (cd) to the '/var/log' directory as follows:

Sample output:

    director-1-2-b:~ # cd /var/log

The disk usage (du) command is then used to check the disk usage of the files in the '/var/log' partition.

NOTE: The sample output below uses the following options:
The '-s' option reports only a summary total for each argument.
The '-k' option reports usage in 1024-byte (1 KB) blocks.
The 'sort -n' command sorts the output numerically, smallest to largest.
The 'tail -20' command shows only the last 20 lines, which after sorting are the 20 largest entries.
The asterisk ' * ' is a wildcard that matches every file and directory in the current directory.

The steps further down were then used to check each file that showed a large size, roughly 10 MB to 20 MB and above.

Sample output:

    director-1-2-b:/var/log # du -sk * |sort -n |tail -20
    712     vpsplitter.log.27.gz
    716     vpsplitter.log.17.gz
    716     vpsplitter.log.25.gz
    716     vpsplitter.log.29.gz
    1028    consolebufferDump
    2212    syslog-ng.log
    3360    cimomlog.txt
    3780    cimomlog.txt.0
    3780    cimomlog.txt.1
    3780    cimomlog.txt.2
    3780    securitylog.txt.0
    3780    securitylog.txt.1
    3780    securitylog.txt.2
    4620    fll_blip_server.log
    4856    vpsplitter.log.00
    5136    fll_blip_server.log.1
    10988   cron
    13788   messages
    113784  ecomofl.log   <- logrotate job failed, file size is about 113 MB
    384436  nsfw.log      <- logrotate job failed, file size is about 384 MB

Next, the rotation size defined for each of the unusually large files is checked.
In the above example, the rotation settings for the 'ecomofl' and 'nsfw' logs are defined in the following files:

    director-1-2-b:~ # cat /etc/logrotate.d/vplex
    ...
    }
    /var/log/ecomofl.log {
        missingok
        rotate 3          <- number of rotated copies that are kept
        size=5M           <- defined file size before the log is rotated
        compress
        copytruncate
    }

    director-1-2-b:~ # cat /etc/logrotate.d/nsfw
    /var/log/nsfw.log {
        missingok
        rotate 20         <- number of rotated copies that are kept
        size=10M          <- defined file size before the log is rotated
        create 640 root root
        compress
        nocopy
        sharedscripts
        postrotate
            /etc/init.d/syslog reload >/dev/null
        endscript
    }

Note: The "rotate" value listed above is the number of times a log is rotated before the oldest rotated copies start to be overwritten.
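The comparison described above (current file size against the 'size' value in the logrotate stanza) can be sketched as a small shell helper. This is a minimal sketch, not part of the VPLEX procedure: the function names size_to_bytes and exceeds_limit are hypothetical, and it assumes a bash shell and a size value written as a plain number or with a k, M, or G suffix.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not VPLEX tools): compare a log file's size
# with a logrotate-style size value such as 5M or 10M.

# Convert a size value (plain bytes, or with a k/M/G suffix) to bytes.
size_to_bytes() {
    local value=$1
    local number=${value%[kMG]}   # strip one trailing k, M or G if present
    case $value in
        *k) echo $(( number * 1024 )) ;;
        *M) echo $(( number * 1024 * 1024 )) ;;
        *G) echo $(( number * 1024 * 1024 * 1024 )) ;;
        *)  echo "$number" ;;
    esac
}

# Return success (exit status 0) when the file is larger than the limit.
exceeds_limit() {
    local file=$1
    local limit_bytes actual_bytes
    limit_bytes=$(size_to_bytes "$2")
    # The arithmetic expansion strips any padding that wc may print.
    actual_bytes=$(( $(wc -c < "$file") ))
    [ "$actual_bytes" -gt "$limit_bytes" ]
}
```

On the director, a call such as `exceeds_limit /var/log/nsfw.log 10M && echo "larger than the rotation size"` would flag a log that logrotate should already have rotated.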
About the logrotate feature:
Logrotate is designed to ease administration of systems that generate large numbers of log files. It allows automatic rotation, compression, removal, and mailing of log files. Each log file may be handled daily, weekly, monthly, or when it grows too large. Logrotate is run as a daily cron job. It does not rotate a log multiple times in one day unless the criterion for that log, based on the log's size, is met.

Cause:
If the size of a large log file exceeds the file size defined for it in the logrotate configuration, the logrotate process may have failed. In the sample output below, the line beginning with "error:" reports a corrupt 'logrotate.status' file, which is the cause of this failure.

Sample output:

    director-1-2-b:~ # logrotate -v /etc/logrotate.conf
    ...
    reading config info for /var/log/zypp/history
    reading config file zypp-refresh.lr
    reading config info for /var/log/zypp-refresh.log
    error: bad line 4 in state file /var/lib/logrotate.status   <- why logrotate failed

The 'cat' command below was used to inspect the 'logrotate.status' file and find the reason for the error.

    director-1-2-b:~ # cat /var/lib/logrotate.status
    logrotate state -- version 2
    ...
    "/var/log/nsfw_alarms.log" 2017-10-23
    "/var/log/nsfw.log" 2017-10-23
    ar/log/wtmp" 2017-10-22   <- abnormal line, should begin with "/var/log/wtmp"
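A malformed entry like the one above can also be located with a short filter instead of reading the file by eye. The following is a minimal sketch under stated assumptions: the function name find_bad_state_lines is hypothetical, and the expected line shape (a quoted absolute path followed by a date) is inferred from the sample output in this article, not taken from a logrotate specification.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a VPLEX or logrotate tool): print the line
# number and content of every entry in a logrotate state file that does
# not look like:  "/absolute/path" YYYY-MM-DD
# The expected format is an assumption based on the sample in this article.
find_bad_state_lines() {
    local state_file=$1
    # NR > 1 skips the "logrotate state -- version 2" header line.
    awk 'NR > 1 && $0 !~ /^"\/[^"]*" [0-9]+-[0-9]+-[0-9]+/ { print NR ": " $0 }' "$state_file"
}
```

Running `find_bad_state_lines /var/lib/logrotate.status` on the affected director prints nothing when every entry matches the assumed format. The helper only reads the file; it does not repair it.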
Permanent Fix:
The logrotate issue is permanently fixed in GeoSynchrony 5.5 SP2 P5, and in 6.1 and later.

NOTE: For VS2s, GeoSynchrony code versions through 6.1.x have reached End of Service Support (EOSS) and are no longer supported. If you are running your VPLEX on one of these EOSS code versions, it is recommended that you contact your local field representative to plan an upgrade of the VPLEX to a supported code version, which includes many fixes, enhancements, and vulnerability fixes that are not available in the EOSS code versions.

Workaround:
For code versions where the fix is not available, follow the steps below to fix the corrupted 'logrotate.status' file on the affected director (in the example above, the affected director is director-1-2-B).

Log in to the management server with the service account credentials.

NOTE: If this is a VPLEX Metro, be sure you access the management server of the cluster that contains the director where the issue was reported. In the sample output above, the director reporting the issue is director-1-2-B of cluster-1:

    login as: service
    Using keyboard-interactive authentication.
    Password:
    Last login: Mon Dec 13 11:39:14 2021 from xx.xx.xx.xx
    service@managementserver:~>

Next, ssh as root to the affected director using that director's internal primary IP address. In this case, the primary IP address of director-1-2-B is 128.221.252.38.

    service@managementserver:~> ssh root@128.221.252.38
    Last login: Mon Dec 13 11:52:37 2021 from 128.221.252.33
    VPLEX
    director-1-2-b:~ #

Follow the steps below to back up and remove the corrupted 'logrotate.status' file.
From the affected director context, change directory (cd) to the '/var/lib' folder as follows:

    director-1-2-b:~ # cd /var/lib/

To confirm that the file is there, run the 'ls -l' command on the 'logrotate.status' file as follows:

    director-1-2-b:/var/lib # ls -l logrotate.status
    -rw-r--r-- 1 root root 1531 Dec 28 05:15 logrotate.status

Make a backup copy of the original 'logrotate.status' file using the copy (cp) command.

NOTE: The -p option preserves the file's mode, ownership, and timestamps. The -r option copies directories recursively and has no additional effect on a single file.

    director-1-2-b:/var/lib # cp -pr logrotate.status logrotate.status.ori

Run the 'ls -l' command again with the asterisk wildcard, '*', to confirm that both the original and the backup files are present:

    director-1-2-b:/var/lib # ls -l logrotate.status*
    -rw-r--r-- 1 root root 1531 Dec 28 05:16 logrotate.status
    -rw-r--r-- 1 root root 1531 Dec 28 05:16 logrotate.status.ori

Remove the current (original) 'logrotate.status' file as follows:

NOTE: The -r option removes directories recursively and has no additional effect on a single file. The -f option forces the removal without prompting.

    director-1-2-b:/var/lib # rm -rf logrotate.status

Follow the steps below to re-create the 'logrotate.status' file. In these steps, you re-create the file by running logrotate manually and then reloading syslog.

From the affected director context, change directory (cd) to the '/usr/sbin' folder:

    director-1-2-b:/var/lib # cd /usr/sbin
    director-1-2-b:/usr/sbin #

Manually run the logrotate command, which re-creates the 'logrotate.status' file:
    director-1-2-b:/usr/sbin # logrotate -f /etc/logrotate.d/*
    error: /etc/logrotate.d/syslog.rpmnew:12 duplicate log entry for /var/log/warn
    error: found error in /var/log/warn /var/log/messages /var/log/allmessages /var/log/localmessages /var/log/firewall /var/log/acpid /var/log/NetworkManager , skipping
    error: /etc/logrotate.d/syslog.rpmnew:28 duplicate log entry for /var/log/mail
    error: found error in /var/log/mail /var/log/mail.info /var/log/mail.warn /var/log/mail.err , skipping
    error: /etc/logrotate.d/syslog.rpmnew:44 duplicate log entry for /var/log/news/news.crit
    error: found error in /var/log/news/news.crit /var/log/news/news.err /var/log/news/news.notice , skipping
    error: "/var/log/rabbitmq" has insecure permissions. It must be owned and be writable by root only to avoid security problems. Set the "su" directive in the config file to tell logrotate which user/group should be used for rotation.
    error: "/var/log/rabbitmq" has insecure permissions. It must be owned and be writable by root only to avoid security problems. Set the "su" directive in the config file to tell logrotate which user/group should be used for rotation.

Reload the syslog service.

Verify that the 'logrotate.status' file is present under the '/var/lib/' folder and shows the current date and time, when it was re-created:

    director-1-2-b:/var/lib # ls -l logrotate.status
    -rw-r--r-- 1 root root 1531 Mon 13:15:15 logrotate.status

Manually verify that log rotation is working by running the logrotate command with the verbose option, '-v', against the '/etc/logrotate.conf' file. If no errors are reported, logrotate is working again.

    director-1-2-b:/var/lib # logrotate -v /etc/logrotate.conf
    reading config file /etc/logrotate.conf
    compress_prog is now /usr/bin/bzip2
    compress_ext was changed to .bz2
    uncompress_prog is now /usr/bin/bunzip2
    including /etc/logrotate.d
    reading config file net-snmp
    ...
    rotating pattern: /var/log/zypp/history 10485760 bytes (99 rotations)
    empty log files are not rotated, old logs are removed
    considering log /var/log/zypp/history
    log /var/log/zypp/history does not exist -- skipping
    rotating pattern: /var/log/zypp-refresh.log 10485760 bytes (99 rotations)
    empty log files are not rotated, old logs are removed
    considering log /var/log/zypp-refresh.log
    log /var/log/zypp-refresh.log does not exist -- skipping

Run the 'df -h' command again to confirm that the use percentage of the '/var/log' partition on the affected director has dropped.

Sample output:

    director-1-2-b:~ # df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda2       2.0G  945M  970M  50% /
    udev             18G  200K   18G   1% /dev
    tmpfs            18G  4.0K   18G   1% /dev/shm
    /dev/sda1       124M   29M   89M  25% /boot
    /dev/sda5       124M  7.6M  111M   7% /var/opt/zephyr/flashDir
    /dev/sda6      1008M  114M  843M  12% /var/log   <- usage is below the threshold
    /dev/sda7       6.0G  1.5G  4.2G  26% /cores
    tmpfs           1.0G  132K  1.0G   1% /tmp

NOTE: If the issue persists after following the above resolution steps, contact Dell VPLEX Customer Support and mention this article.
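The before-and-after 'df' checks in this article can be sketched as a filter that lists only the file systems at or above a chosen usage percentage. This is a minimal sketch: the function name over_threshold is hypothetical, the value 90 used in the usage note is an illustrative assumption (the article does not state the actual fsmon threshold), and the parsing assumes mount points without spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a VPLEX tool): read df output on standard input
# and print "mountpoint use%" for every file system whose Use% is at or
# above the limit passed as the first argument.
# Assumes the six-column layout shown in this article and mount points
# that contain no spaces.
over_threshold() {
    local threshold=$1
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the header row
            use = $5
            sub(/%/, "", use)         # "94%" -> "94"
            if (use + 0 >= limit + 0)
                print $6 " " $5
        }'
}
```

For example, `df -P | over_threshold 90` (the -P option keeps each file system on a single line) would print `/var/log 94%` for the first sample output in this article and nothing for the last one.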