...
Description of problem:

The setup consists of an IPA server with 10K users:

$ ldapsearch -xLLL -D"cn=Directory Manager" -W -b "cn=accounts,dc=example,dc=com" uid=* 1.1 | grep -c "^dn: "
Enter LDAP Password:
10006
$

Each user has its own private group, e.g.:

$ ldapsearch -xLLL -D"cn=Directory Manager" -W -b "cn=groups,cn=accounts,dc=example,dc=com" cn=user_1234 1.1 description
Enter LDAP Password:
dn: cn=user_1234,cn=groups,cn=accounts,dc=example,dc=com
description: User private group for user_1234
$

All users belong to the ipausers group:

$ ldapsearch -xLLL -D"cn=Directory Manager" -W -b "cn=ipausers,cn=groups,cn=accounts,dc=example,dc=com" member | grep -c "^member: "
Enter LDAP Password:
10005
$

Almost every user has its own home directory:

$ ls -1 /home | grep -c user_
9903
$

When enumeration is enabled, paged searches are sent to the LDAP server constantly:

$ grep ^enum /etc/sssd/sssd.conf
enumerate = true
$

LDAP access log excerpt:
================================================
[16/Oct/2022:16:48:32.117068689 +0100] conn=4504 op=12 SRCH base="cn=accounts,dc=example,dc=com" scope=2 filter="(&(objectClass=posixAccount)(uid=)(uidNumber=)(gidNumber=*))" attrs="objectClass uid userPassword uidNumber gidNumber gecos homeDirectory loginShell krbPrincipalName cn memberOf ipaUniqueID ipaNTSecurityIdentifier modifyTimestamp entryusn shadowLastChange shadowMin shadowMax shadowWarning shadowInactive shadowExpire shadowFlag krbLastPwdChange krbPasswordExpiration pwdattribute authorizedService accountexpires useraccountcontrol nsAccountLock host logindisabled loginexpirationtime loginallowedtimemap ipaSshPubKey ipaUserAuthType usercertificate;binary mail"
[16/Oct/2022:16:48:32.733454360 +0100] conn=4504 op=12 RESULT err=0 tag=101 nentries=1000 wtime=0.000415409 optime=0.616401060 etime=0.616812792 notes=U,P details="Partially Unindexed Filter,Paged Search" pr_idx=0 pr_cookie=0
[16/Oct/2022:16:48:32.868938487 +0100] conn=4504 op=13 SRCH base="cn=accounts,dc=example,dc=com" scope=2 filter="(&(objectClass=posixAccount)(uid=)(uidNumber=)(gidNumber=*))" attrs="objectClass uid userPassword uidNumber gidNumber gecos homeDirectory loginShell krbPrincipalName cn memberOf ipaUniqueID ipaNTSecurityIdentifier modifyTimestamp entryusn shadowLastChange shadowMin shadowMax shadowWarning shadowInactive shadowExpire shadowFlag krbLastPwdChange krbPasswordExpiration pwdattribute authorizedService accountexpires useraccountcontrol nsAccountLock host logindisabled loginexpirationtime loginallowedtimemap ipaSshPubKey ipaUserAuthType usercertificate;binary mail"
[16/Oct/2022:16:48:33.445037639 +0100] conn=4504 op=13 RESULT err=0 tag=101 nentries=1000 wtime=0.000337243 optime=0.576117464 etime=0.576449937 notes=U,P details="Partially Unindexed Filter,Paged Search" pr_idx=0 pr_cookie=0
================================================

These searches never stop. For instance, more than 30 minutes later:

$ grep "16/Oct/2022:17:27:" access | grep -c "notes=U,P"
24
$

The SSSD domain log keeps growing with the following messages:
================================================
...
(2022-10-16 17:32:25): [be[example.com]] [sysdb_create_ts_entry] (0x0040): Error: 17 (File exists)
... skipping repetitive backtrace ...
(2022-10-16 17:32:25): [be[example.com]] [sysdb_create_ts_entry] (0x0040): ldb_add failed: [Entry already exists](68)[Entry name=user_<XXX>@example.com,cn=users,cn=example.com,cn=sysdb already exists]
... skipping repetitive backtrace ...
(2022-10-16 17:32:25): [be[example.com]] [sysdb_create_ts_entry] (0x0040): Error: 17 (File exists)
(2022-10-16 17:32:25): [be[example.com]] [server_setup] (0x1f7c0): Starting with debug level = 0x0070
...
================================================

This will eventually fill all available disk space:

$ df -lk /
Filesystem                   1K-blocks     Used Available Use% Mounted on
/dev/mapper/rhel_unused-root  22529284 22508108     21176 100% /
$
$ du -sh /var/log/sssd/
11G     /var/log/sssd/
$
$ date; service sssd stop ; rm -f /var/lib/sss/db/* /var/log/sssd/* ; service sssd start
Sun Oct 16 16:39:38 IST 2022
Redirecting to /bin/systemctl stop sssd.service
Redirecting to /bin/systemctl start sssd.service
$
$ df -lk /
Filesystem                   1K-blocks     Used Available Use% Mounted on
/dev/mapper/rhel_unused-root  22529284 11422644  11106640  51% /
$

There is high CPU usage from the SSSD backend and LDAP processes:
================================================
   PID USER    PR NI    VIRT    RES   SHR S %CPU %MEM     TIME+ COMMAND
257035 root    20  0  667688 130424 21576 R 98.3  1.6   0:09.08 sssd_be
197719 dirsrv  20  0 1492560 295052 62544 S  1.3  3.7 461:33.43 ns-slapd
...
   PID USER    PR NI    VIRT    RES   SHR S %CPU %MEM     TIME+ COMMAND
197719 dirsrv  20  0 1492560 295064 62544 S 84.4  3.7 461:42.90 ns-slapd
257124 root    20  0  596752  60432 19620 S 20.9  0.8   0:00.94 sssd_be
================================================

The SSSD cache is mounted on tmpfs:

$ grep sss /etc/fstab
tmpfs /var/lib/sss/db/ tmpfs size=300M,mode=0700,uid=sssd,gid=sssd,rootcontext=system_u:object_r:sssd_var_lib_t:s0 0 0
$
$ df -lk /var/lib/sss/db
Filesystem     1K-blocks  Used Available Use% Mounted on
tmpfs             307200 30608    276592  10% /var/lib/sss/db
$

Version-Release number of selected component (if applicable):

$ cat /etc/redhat-release
Red Hat Enterprise Linux release 8.6 (Ootpa)
$
$ rpm -qa | grep sssd
sssd-2.6.2-4.el8_6.1.x86_64
sssd-client-debuginfo-2.6.2-4.el8_6.1.x86_64
sssd-common-2.6.2-4.el8_6.1.x86_64
sssd-ipa-2.6.2-4.el8_6.1.x86_64
sssd-krb5-2.6.2-4.el8_6.1.x86_64
sssd-debugsource-2.6.2-4.el8_6.1.x86_64
sssd-client-2.6.2-4.el8_6.1.x86_64
sssd-dbus-2.6.2-4.el8_6.1.x86_64
sssd-krb5-common-2.6.2-4.el8_6.1.x86_64
python3-sssdconfig-2.6.2-4.el8_6.1.noarch
sssd-nfs-idmap-2.6.2-4.el8_6.1.x86_64
sssd-tools-2.6.2-4.el8_6.1.x86_64
sssd-kcm-2.6.2-4.el8_6.1.x86_64
sssd-common-pac-2.6.2-4.el8_6.1.x86_64
sssd-ad-2.6.2-4.el8_6.1.x86_64
sssd-ldap-2.6.2-4.el8_6.1.x86_64
sssd-proxy-2.6.2-4.el8_6.1.x86_64
sssd-debuginfo-2.6.2-4.el8_6.1.x86_64
$

How reproducible:

Always.

Steps to Reproduce:

1. Create 10K IPA users
2. Enable SSSD enumeration
3. Restart SSSD
4. Check SSSD and LDAP logs

Actual results:

Constant high CPU usage and LDAP requests.

Expected results:

After collecting the initial data from LDAP and warming its caches, SSSD should perform fewer LDAP requests.

Additional info:

The LDAP IDL scan limit is set to 100K. Paged searches will use the same limit:

$ grep idlistscanlimit /etc/dirsrv/slapd-EXAMPLE-COM/dse.ldif
nsslapd-idlistscanlimit: 100000
nsslapd-pagedidlistscanlimit: 0
$
$ ldapsearch -xLLL -D"cn=Directory Manager" -W -b "fqdn=XXX,cn=computers,cn=accounts,dc=example,dc=com" nsPagedIDListScanLimit
Enter LDAP Password:
dn: fqdn=XXX,cn=computers,cn=accounts,dc=example,dc=com
$
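
To see whether the enumeration searches ever tail off, the one-minute grep in the description can be generalised to a per-minute count across the whole access log. The following is only a sketch. It assumes the access log line format shown in the excerpt (bracketed timestamp as the first field, RESULT lines carrying notes=U,P), and the function name count_unindexed_paged_per_minute is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count RESULT lines flagged notes=U,P per minute.
# Reads a 389-ds access log on stdin and prints "<dd/Mon/yyyy:HH:MM> <count>".
count_unindexed_paged_per_minute() {
    awk '
        # Only completed operations flagged as unindexed paged searches.
        / RESULT / && /notes=U,P/ {
            # $1 looks like "[16/Oct/2022:16:48:32.733454360";
            # skip the "[" and keep the 17 characters up to the minute.
            minute = substr($1, 2, 17)
            count[minute]++
        }
        END {
            for (m in count) print m, count[m]
        }
    ' | LC_ALL=C sort   # plain text sort: chronological only within one day
}
```

It could be run as `count_unindexed_paged_per_minute < access` from the directory holding the access log. A count that stays roughly flat long after SSSD was restarted would match the behaviour reported here.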
Won't Do