Symptoms
No specific errors appear in the daemon logs.
Error messages in libnsrifmx-xxxxx.log:
XBSA-1.0.1 #.#.#.#.Build.### 16384588 DAY MONTH ## HH:mm:SS YYYY _nwbsa_is_retryable_error: received a retryable network error (Severity 0 Number -13): busy
Error messages during backup in onbar (bar_act.log) log:
YYYY-MM-DD HH:mm:SS 20120112 7930210 XBSA Error: (BSACreateObject) A system error occurred. Aborting XBSA session.
YYYY-MM-DD HH:mm:SS 10814124 7930210 XBSA Error: (BSACreateObject) A system error occurred. Aborting XBSA session.
YYYY-MM-DD HH:mm:SS 16843426 7930210 XBSA Error: (BSACreateObject) A system error occurred. Aborting XBSA session.
YYYY-MM-DD HH:mm:SS 28508590 7930210 XBSA Error: (BSACreateObject) A system error occurred. Aborting XBSA session.
Error messages during restore:
YYYY-MM-DD HH:mm:SS 11207038 18088436 XBSA Error: (BSAGetData) A system error occurred. Aborting XBSA session.
YYYY-MM-DD HH:mm:SS 11207038 18088436 (-43391) Skipped backup/restore of space ''.
YYYY-MM-DD HH:mm:SS 18088436 28442904 (-43246) The ON-Bar process 11207038 exited with a problem (exit code 3 (0x3),signal 0).
YYYY-MM-DD HH:mm:SS 23396846 18088436 XBSA Error: (BSAGetData) A system error occurred. Aborting XBSA session.
YYYY-MM-DD HH:mm:SS 23396846 18088436 (-43391) Skipped backup/restore of space ''.
Error messages in Informix online.log:
MM/DD/YY HH:mm:SS Archive on Completed.
MM/DD/YY HH:mm:SS Level 0 Archive started on
MM/DD/YY HH:mm:SS Archive on ABORTED.
MM/DD/YY HH:mm:SS Aborted by client.
MM/DD/YY HH:mm:SS Archive on ABORTED.
MM/DD/YY HH:mm:SS Aborted by client.
Cause
The customer set BAR_MAX_BACKUP to 0 in the onconfig file, which places no limit on the number of parallel backup streams. This Informix database has 47 dbspaces, so when a backup (database or logical log) or a restore starts, the database server creates a large number of onbar processes at once. The sessions then stall (libnsrifmx logs a retryable "busy" error), and some of them fail once they have used up the retries allowed by BAR_RETRY.
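As a rough illustration of why 0 is a problem with 47 dbspaces, the sketch below models how many onbar streams could run at once for a given BAR_MAX_BACKUP value. It is a simplified model, not Informix code: the function name effective_parallel_streams is hypothetical, and it assumes the only other limit is the number of storage spaces (memory and other server limits are ignored).

```python
def effective_parallel_streams(bar_max_backup: int, storage_spaces: int) -> int:
    """Estimate concurrent onbar streams (simplified, hypothetical model).

    Assumes 0 means "no limit", so parallelism is bounded only by the
    number of storage spaces; any positive value acts as a cap.
    """
    if bar_max_backup < 0 or storage_spaces < 0:
        raise ValueError("values must be non-negative")
    if bar_max_backup == 0:
        # Unlimited: every storage space may get its own stream.
        return storage_spaces
    # A positive setting caps the number of streams.
    return min(bar_max_backup, storage_spaces)


if __name__ == "__main__":
    # 47 dbspaces, as in the case described above.
    for setting in (0, 10, 4):
        print(setting, effective_parallel_streams(setting, 47))
```

Under these assumptions, a setting of 0 allows up to 47 simultaneous streams, while 10 or 4 caps the load at that number.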
Resolution
Setting BAR_MAX_BACKUP to 10 in the onconfig file (location: /infx/inst//informixdir/etc/onconfig.xxx) resolved the problem. A lower value, such as 4 or 6, can be used if the situation calls for it.
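To check the current setting before editing the file, a small script along these lines may help. It is a minimal sketch, not a vendor tool: the helper names read_onconfig_param and check_bar_max_backup are hypothetical, the sample text stands in for the real onconfig file, and the parser assumes the usual "PARAMETER value" layout with # comments and reports the first occurrence of the parameter.

```python
from typing import Optional


def read_onconfig_param(text: str, name: str) -> Optional[str]:
    """Return the value of the first 'NAME value' line, or None if absent."""
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if parts[0] == name:
            return parts[1].strip() if len(parts) > 1 else ""
    return None


def check_bar_max_backup(text: str) -> str:
    """Describe the BAR_MAX_BACKUP setting found in onconfig text."""
    value = read_onconfig_param(text, "BAR_MAX_BACKUP")
    if value is None:
        return "BAR_MAX_BACKUP is not set"
    if value == "0":
        return "BAR_MAX_BACKUP is 0 (unlimited); consider a fixed value such as 10"
    return f"BAR_MAX_BACKUP is {value}"


if __name__ == "__main__":
    # Sample content only; read the real onconfig file in practice.
    sample = """
    # ON-Bar settings
    BAR_MAX_BACKUP 0
    BAR_RETRY 1
    """
    print(check_bar_max_backup(sample))
```

In practice, the text would come from reading the onconfig file at the location given above instead of the inline sample.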