...
Document Version   Release Date       Details
2                  December 9, 2022   Updated the Resolution with the permanent fix, HPE Serviceguard for Linux version 12.40.00 (or later).
1                  November 7, 2018   Original Document Release.

When running Serviceguard Extension for SAP (SGeSAP) on Red Hat Enterprise Linux 7 with SAP HANA, SGeSAP safesync blocks may not be set correctly if files in the /var/tmp/.sgesap directory are deleted. This may prevent proper takeover of the HANA primary by the secondary if the primary fails.

Several symptoms appear in the log files. When HANA attempts to update the HA DR Provider to inform Serviceguard of events, the database may block, and the nameserver trace file will show many retries as follows (only a single retry is shown here):

[21652]{-1}[-1/-1] 2018-07-19 16:32:37.187785 i ha_dr_SGeSAPcl SGeSAPcl.py(00485) : replication of HANA service for database BOB on port 31043 of host node1 reports a replication error (11)
[21652]{-1}[-1/-1] 2018-07-19 16:32:37.187827 i ha_dr_SGeSAPcl SGeSAPcl.py(00356) : acquiring safesync hook lock...
[21652]{-1}[-1/-1] 2018-07-19 16:32:37.187862 w ha_dr_SGeSAPcl SGeSAPcl.py(00358) : failed to acquire safesync lock - another safesync operation is in progress - retry later

Additionally, /var/log/messages on the secondary may report messages similar to the following:

Jul 20 11:03:53 node1 rootsh[05239]: sap1: 051: [21652]{-1}[-1/-1] 2018-07-19 16:32:37.187785 i ha_dr_SGeSAPcl SGeSAPcl.py(00485) : replication of HANA service for database BOB on port 31043 of host node1 reports a replication error (11)
Jul 20 11:03:53 node1 rootsh[05239]: sap1: 052: [21652]{-1}[-1/-1] 2018-07-19 16:32:37.187827 i ha_dr_SGeSAPcl SGeSAPcl.py(00356) : acquiring safesync hook lock...
Jul 20 11:03:53 node1 rootsh[05239]: sap1: 053: [21652]{-1}[-1/-1] 2018-07-19 16:32:37.187862 w ha_dr_SGeSAPcl SGeSAPcl.py(00358) : failed to acquire safesync lock - another safesync operation is in progress - retry later
Jul 20 11:03:53 node1 rootsh[05239]: sap1: 054: [21652]{-1}[-1/-1] 2018-07-19 16:32:37.187869 e ha_dr_provider PythonProxyImpl.cpp(01107) : SGeSAPcl/SGeSAPcl:srConnectionChanged() failed with return code 1 != 0

The underlying problem may have occurred well before these symptoms appear. It is triggered when the named pipe in the /var/tmp/.sgesap directory is deleted by the routine systemd cleanup utilities, which remove any file not accessed in more than 30 days. When this occurs, the SGeSAP monitor recreates the named pipe, but the reader process is not restarted. Messages similar to the following will be displayed in the package log:

Jun 17 08:40:04 rootnode1 saphdbsys.mon[23317]: (s1phdbsys_monitor): Forked ha_dr_provider listener (PID: 23980)
...
Jul 18 01:19:44 rootnode1 saphdbsys.mon[23317]: (saphdbsys_monitor): Created ha_dr_provider pipe

Because the named pipe is recreated but the reader is not restarted, any write to the named pipe blocks until it is read. The read never occurs: the reader still holds the now-deleted named pipe open for reading, while the writer writes to the newly created pipe. This causes HANA updates to SGeSAP to block, which results in the takeover problem described above.
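The blocking behavior described above can be reproduced in isolation. The following is a minimal Python sketch, not SGeSAP code: the function name and the FIFO file name are hypothetical, and it assumes a Linux system. It opens a FIFO for reading, deletes and recreates the path, and then checks that a writer on the new path finds no reader. A non-blocking open is used so the sketch reports the condition instead of hanging the way a blocking writer would.

```python
import errno
import os
import tempfile


def writer_sees_no_reader_after_recreate(directory):
    """Return True if a writer on a recreated FIFO cannot reach the old reader."""
    # Hypothetical file name, used only for this demonstration.
    path = os.path.join(directory, "demo.hadrpipe.tmp")

    os.mkfifo(path)
    # Stand-in for the listener: open the FIFO for reading and keep it open.
    reader_fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    old_inode = os.fstat(reader_fd).st_ino

    # Stand-in for the cleanup: the path is removed, but the reader's
    # descriptor still refers to the deleted FIFO.
    os.unlink(path)
    # Stand-in for the monitor: a new FIFO is created under the same name.
    os.mkfifo(path)
    new_inode = os.stat(path).st_ino

    try:
        # A non-blocking open for writing fails with ENXIO when no process
        # has this FIFO open for reading. A blocking open would wait instead.
        writer_fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        no_reader = exc.errno == errno.ENXIO
    else:
        os.close(writer_fd)
        no_reader = False
    finally:
        os.close(reader_fd)
        os.unlink(path)

    # The reader and the new path refer to two different FIFO objects.
    return no_reader and old_inode != new_inode


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(writer_sees_no_reader_after_recreate(tmp))
```

The inode comparison makes the point explicit: the recreated path is a different pipe from the one the reader still holds, so restarting the reader is the only way to reconnect the two ends.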
Any server running Red Hat Enterprise Linux 7 that is also running SAP HANA and Serviceguard Extension for SAP 12.30.00 (or earlier) with safesync configured. SUSE Linux Enterprise Server is not affected due to differences in the shipped /usr/lib/tmpfiles.d/tmp.conf file. Red Hat Enterprise Linux 6 is not affected because it does not use systemd for this task.
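Whether a given system ages out files under /var/tmp depends on the /var/tmp entry in its tmpfiles.d configuration. As a rough illustration, the Python sketch below extracts the Age field from tmp.conf text. The function names are hypothetical, the column layout (Type, Path, Mode, User, Group, Age) is the standard tmpfiles.d format, and the sketch treats only "-" as "no age-based cleaning".

```python
def var_tmp_age(conf_text):
    """Return the Age field of the /var/tmp entry, or None if there is none."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # tmpfiles.d columns: Type Path Mode User Group Age [Argument]
        if len(fields) >= 2 and fields[1] == "/var/tmp":
            return fields[5] if len(fields) > 5 else "-"
    return None


def cleaning_enabled(age):
    """Simplification: any Age other than '-' is treated as cleaning enabled."""
    return age is not None and age != "-"


if __name__ == "__main__":
    # Reading the file is optional here; the path is the one named in this advisory.
    try:
        with open("/usr/lib/tmpfiles.d/tmp.conf") as handle:
            age = var_tmp_age(handle.read())
    except OSError:
        age = None
    print("Age field for /var/tmp:", age, "cleaning enabled:", cleaning_enabled(age))
```

Note that a file of the same name under /etc/tmpfiles.d takes precedence over the one under /usr/lib/tmpfiles.d, so a complete check would inspect the /etc copy first when it exists.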
To prevent this issue, upgrade to HPE Serviceguard for Linux version 12.40.00 (or later).

To download HPE Serviceguard for Linux version 12.40.00 (or later), perform the following steps:

1. Click the following link: Hewlett Packard Enterprise Support Center
2. Enter a product name (e.g., "HPE Serviceguard for Linux") in the text search field and wait for a list of Suggested Products to display.
3. From the Suggested Products list displayed, identify the desired product and select it. The page should refresh to display the "DRIVERS AND SOFTWARE" tab and the components that support the selected product.
4. From the "DRIVERS AND SOFTWARE" expandable filter menus at the top of the page, locate and select the appropriate HPE Serviceguard for Linux edition (Base, Advanced, or Enterprise) and version (12.40.00 or later).
   Note: To ensure that you have selected the latest version of the firmware/driver, click the Revision History tab to check if a new version of the firmware/driver is available. For more important information, review the Release Notes tab.
5. Click the Download button.

Refer to the appropriate Cumulative Update Release Changes document and search for the following change requests for additional information:

QXCR1001665498 SGeSAP needs to preserve or move /var/tmp/.sgesap directory to prevent cleaning
QXCR1001638781 SGeSAP creates regular file instead of pipe in HANA safesync config

If upgrading to HPE Serviceguard for Linux version 12.40.00 (or later) is not an option, refer to the workaround below. However, HPE recommends upgrading to HPE Serviceguard for Linux version 12.40.00 (or later).

As a workaround, disable /var/tmp/ cleaning on any Red Hat Enterprise Linux 7 system that is running SGeSAP with SAP HANA. On Red Hat Enterprise Linux 7, to disable cleaning of files from /var/tmp and its subdirectories, perform the following.
# cp -p /usr/lib/tmpfiles.d/tmp.conf /etc/tmpfiles.d/
# vi /etc/tmpfiles.d/tmp.conf

Replace line:

v /var/tmp 1777 root root 30d

With this one:

v /var/tmp 1777 root root -

There is no need to restart the system. systemd reads the new configuration at the next cleaning interval and takes the assigned action, which no longer removes files not modified in more than 30 days. To undo the change and revert to the standard Red Hat Enterprise Linux 7 cleaning policy, remove the /etc/tmpfiles.d/tmp.conf file.

If analysis of the package logs shows that this issue has already occurred (a "Created ha_dr_provider pipe" message without a subsequent "Forked ha_dr_provider listener" message), disable /var/tmp/ cleaning first. Then either restart the SGeSAP HANA packages to restart the listeners and recreate the named pipe, or kill the currently running listener (the PID listed in the latest "Forked" message); the monitor should then restart the listener and associate it with the more recently created named pipe. If the named pipe /var/tmp/.sgesap/<$PKG_NAME>.sgesap.hadrpipe.tmp exists as a regular file instead of a FIFO, it should be removed prior to restarting the package.

If there are any questions about this procedure, please log a call with the HPE Solution Center and reference this Customer Advisory for assistance. In North America, contact HPE Customer Support at 1-844-806-3425.

OR

For HPE Customer Support outside of North America, click on the following URL: http://www8.hp.com/us/en/hpe/contact/support.html
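The two checks named in the workaround (a "Created ha_dr_provider pipe" message with no later "Forked ha_dr_provider listener" message, and a pipe path that exists as a regular file) can be sketched in Python as follows. This is an illustrative helper, not an HPE tool: the function names are hypothetical, and the only log strings it relies on are the two messages quoted in this advisory.

```python
import os
import stat
import tempfile

CREATED = "Created ha_dr_provider pipe"
FORKED = "Forked ha_dr_provider listener"


def listener_is_stale(log_lines):
    """Return True if the last pipe creation has no later listener fork."""
    stale = False
    for line in log_lines:
        if CREATED in line:
            stale = True   # pipe recreated; listener may hold the old one
        elif FORKED in line:
            stale = False  # a listener started after the last creation
    return stale


def is_regular_file(path):
    """Return True if path exists as a regular file rather than a FIFO."""
    # lstat is used so that a symbolic link is not followed.
    return stat.S_ISREG(os.lstat(path).st_mode)


if __name__ == "__main__":
    # Demonstration on throwaway files, not on a real SGeSAP pipe.
    with tempfile.TemporaryDirectory() as tmp:
        fifo = os.path.join(tmp, "example.fifo")
        os.mkfifo(fifo)
        print("FIFO reported as regular file:", is_regular_file(fifo))
```

In practice the log lines would come from the package log and the path from the package's pipe under /var/tmp/.sgesap; both are left to the caller here because their exact locations depend on the package configuration.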