...
What were you trying to do that didn't work?
fapolicyd crashes while reloading. The crash indicates the SQLite database may be the culprit, and it leaves the system in an unresponsive state.

Please provide the package NVR for which bug is seen:
sqlite-3.34.1-7.el9_3.x86_64
rpm-4.16.1.3-27.el9_3.x86_64
fapolicyd-1.3.2-100.el9.x86_64

How reproducible:
Very often on the customer's system (we gave the customer a fapolicyd hotfix that handles signals better, in order to gather the coredump)

Steps to reproduce
Coredump available at appcore.usersys (crashing thread #3)
ID: 5e18ee0f949d969f22d0678e5e8f383e2b667adbbb2e965078e66c2340fabdc8
$ appcore-cli gdb --id 5e18ee0f949d969f22d0678e5e8f383e2b667adbbb2e965078e66c2340fabdc8

Expected results
SQLite thread not to crash

Actual results
RMetrich's analysis:
The coredump shows that Thread 3 is the one crashing, even though Thread 1 got the signal. This is because Thread 3 sent the signal to Thread 1 and then paused itself (this is the patch I made to avoid the deadlock in fapolicyd).
(gdb) info threads
  Id   Target Id                            Frame
* 1    Thread 0x7f9b7641b780 (LWP 217901) 0x00007f9b763426ff in __GI___poll (fds=0x7ffc0b4c0b10, nfds=2, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
  2    Thread 0x7f9b719ff640 (LWP 217903) __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55af20de24cc <do_decision+44>) at futex-internal.c:57
  3    Thread 0x7f9b729ff640 (LWP 217902) 0x00007f9b763184c2 in __libc_pause () at ../sysdeps/unix/sysv/linux/pause.c:29
  4    Thread 0x7f9b711fe640 (LWP 217904) 0x00007f9b76313975 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7f9b711fdd40, rem=rem@entry=0x7f9b711fdd40) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f9b729ff640 (LWP 217902))]
#0  0x00007f9b763184c2 in __libc_pause () at ../sysdeps/unix/sysv/linux/pause.c:29
29        return SYSCALL_CANCEL (pause);
(gdb) bt
#0  0x00007f9b763184c2 in __libc_pause () at ../sysdeps/unix/sysv/linux/pause.c:29
#1  0x000055af20dcaee5 in coredump_handler (sig=7) at daemon/fapolicyd.c:226
#2  <signal handler called>
#3  0x00007f9b7617bbc9 in sqlite3WalFindFrame.constprop.0 (pWal=0x7f9b6c0ae298, pgno=2945, piRead=piRead@entry=0x7f9b729faf34) at /usr/src/debug/sqlite-3.34.1-7.el9_3.x86_64/sqlite3.c:62715
#4  0x00007f9b760a24e2 in readDbPage (pPg=pPg@entry=0x7f9b6c096a30) at /usr/src/debug/sqlite-3.34.1-7.el9_3.x86_64/sqlite3.c:54887
#5  0x00007f9b760a49f7 in getPageNormal (pPager=0x7f9b6c115848, pgno=2945, ppPage=0x7f9b729fafa0, flags=<optimized out>) at /usr/src/debug/sqlite-3.34.1-7.el9_3.x86_64/sqlite3.c:57461
[...]
The code dies in frame 3:
(gdb) f 3
#3  0x00007f9b7617bbc9 in sqlite3WalFindFrame.constprop.0 (pWal=0x7f9b6c0ae298, pgno=2945, piRead=piRead@entry=0x7f9b729faf34) at /usr/src/debug/sqlite-3.34.1-7.el9_3.x86_64/sqlite3.c:62715
62715       while( (iH = AtomicLoad(&sLoc.aHash[iKey]))!=0 ){
(gdb) p &sLoc.aHash[iKey]
$1 = (volatile ht_slot *) 0x7f9b76419bfe
(gdb) p sLoc.aHash[iKey]
$2 = 0

The code looks valid to me. Indeed, we have an AtomicLoad() call on the pointer, which does this (line 206 or 209):
205 #if GCC_VERSION>=4007000 || __has_extension(c_atomic)
206 # define AtomicLoad(PTR) __atomic_load_n((PTR),__ATOMIC_RELAXED)
207 # define AtomicStore(PTR,VAL) __atomic_store_n((PTR),(VAL),__ATOMIC_RELAXED)
208 #else
209 # define AtomicLoad(PTR) (*(PTR))
210 # define AtomicStore(PTR,VAL) (*(PTR) = (VAL))
211 #endif

Assuming line 209 is the one in effect (the semantics of line 206 are similar), we would have (*(&sLoc.aHash[iKey])), which evaluates to (sLoc.aHash[iKey]), which here is 0.
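
To illustrate the reasoning above, here is a minimal sketch (not taken from SQLite or fapolicyd) that checks that the two AtomicLoad() definitions read the same value from a slot. The names AtomicLoadBuiltin, AtomicLoadPlain and load_variants_agree are hypothetical, the array contents are made up, ht_slot is assumed to be a 16-bit unsigned type, and a GCC- or Clang-compatible compiler is assumed for __atomic_load_n. It does not reproduce the crash; it only shows that, on ordinary memory, both forms are a plain read of the slot.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the WAL hash slot is a 16-bit unsigned integer. */
typedef uint16_t ht_slot;

/* Hypothetical names for the two variants of the AtomicLoad() macro
 * quoted in the report (lines 206 and 209 respectively). */
#define AtomicLoadBuiltin(PTR) __atomic_load_n((PTR), __ATOMIC_RELAXED)
#define AtomicLoadPlain(PTR)   (*(PTR))

/* Returns 1 if both variants read the same value from every slot of a
 * small, made-up hash array, and 0 otherwise. */
int load_variants_agree(void)
{
    volatile ht_slot aHash[4] = {0, 7, 0, 42};
    int iKey;

    for (iKey = 0; iKey < 4; iKey++) {
        ht_slot viaBuiltin = AtomicLoadBuiltin(&aHash[iKey]);
        ht_slot viaPlain = AtomicLoadPlain(&aHash[iKey]);
        if (viaBuiltin != viaPlain) {
            return 0;
        }
    }
    return 1;
}
```

Calling load_variants_agree() from a small main() and checking that it returns 1 is enough to exercise the sketch.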
Duplicate