Symptoms
NSX Edge 6.3.5 crashes with the following messages:
vserrdd[1729]: [daemon.warning] Memory overloaded: 96.29 used, threshhold: 90.00
Crash dump:
[DR Back End]: [kern.warning] log_cleanup.sh invoked oom-killer: gfp_mask=0x26000c0, order=2, oom_score_adj=0
[DR Back End]: [kern.info] log_cleanup.sh cpuset=/ mems_allowed=0
[kern.warning] CPU: 1 PID: 29614 Comm: log_cleanup.sh Tainted: G O 4.4.57 #1
[DR Back End]: [kern.warning] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/21/2015
[DR Back End]: [kern.warning] 0000000000000000 ffff880013923b58 ffffffff81310844 0000000000000007
[DR Back End]: [kern.warning] 0000000000000000 ffff88003afd8000 ffff880013923bb8 ffffffff8117c8e6
[DR Back End]: [kern.warning] ffff88002888b180 0000000000000206 000000000005e154 ffff880013923b90
[DR Back End]: [kern.warning] Call Trace:
[DR Back End]: [kern.warning] [<ffffffff81310844>] dump_stack+0x58/0x84
[DR Back End]: [kern.warning] [<ffffffff8117c8e6>] dump_header.isra.10+0x54/0x19e
[DR Back End]: [kern.warning] [<ffffffff8161089e>] ? _raw_spin_unlock_irqrestore+0xe/0x10
[DR Back End]: [kern.warning] [<ffffffff81139a83>] oom_kill_process+0x203/0x420
[DR Back End]: [kern.warning] [<ffffffff81139f9f>] out_of_memory+0x29f/0x2e0
[DR Back End]: [kern.warning] [<ffffffff8113e91e>] __alloc_pages_nodemask+0x8de/0x930
[DR Back End]: [kern.warning] [<ffffffff81181ecc>] ? get_empty_filp+0x5c/0x1c0
[DR Back End]: [kern.warning] [<ffffffff8113ebbb>] alloc_kmem_pages_node+0x1b/0x20
[DR Back End]: [kern.warning] [<ffffffff81056232>] copy_process+0x162/0x1ae0
[DR Back End]: [kern.warning] [<ffffffff8117a2b0>] ? kmem_cache_alloc+0x140/0x150
[DR Back End]: [kern.warning] [<ffffffff81057d21>] _do_fork+0x71/0x350
[DR Back End]: [kern.warning] [<ffffffff81067fb1>] ? sigprocmask+0x51/0x80
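To check whether an exported Edge log shows the same signature, the two key messages above can be matched with a short script. The following is a minimal Python sketch, not a VMware tool: the function name scan_edge_log and the sample list are hypothetical, and the patterns are taken only from the messages quoted in this article (including the "threshhold" spelling that the log itself uses).

```python
import re

# Patterns derived from the messages quoted in this article.
# "threshhold" is spelled exactly as it appears in the Edge log.
MEMORY_RE = re.compile(r"Memory overloaded: ([\d.]+) used, threshhold: ([\d.]+)")
OOM_RE = re.compile(r"(\S+) invoked oom-killer")


def scan_edge_log(lines):
    """Return (memory_hits, oom_invokers) found in an iterable of log lines.

    memory_hits  -- list of (used, threshold) floats from memory warnings
    oom_invokers -- list of process names that triggered the OOM killer
    """
    memory_hits = []
    oom_invokers = []
    for line in lines:
        match = MEMORY_RE.search(line)
        if match:
            memory_hits.append((float(match.group(1)), float(match.group(2))))
        match = OOM_RE.search(line)
        if match:
            oom_invokers.append(match.group(1))
    return memory_hits, oom_invokers


# Hypothetical input: two lines copied from the symptom output above.
sample = [
    "vserrdd[1729]: [daemon.warning] Memory overloaded: 96.29 used, threshhold: 90.00",
    "[DR Back End]: [kern.warning] log_cleanup.sh invoked oom-killer: "
    "gfp_mask=0x26000c0, order=2, oom_score_adj=0",
]
memory_hits, oom_invokers = scan_edge_log(sample)
print(memory_hits)
print(oom_invokers)
```

A log that yields both a memory warning above the threshold and at least one oom-killer invocation is consistent with the symptoms described here, although a match alone does not confirm the root cause.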
Cause
A memory issue within the nagios process on the NSX Edge causes the appliance to crash.
Impact / Risks
The NSX Edge crash triggers a reboot of the Edge.
Resolution
This issue affects both NSX 6.3.x and 6.4.x. A fix is expected in 6.4.2 and possibly in 6.3.7.
Workaround
Change the load balancer from L4 to L7 by setting "Acceleration Status" to disabled on the Global Configuration page under the Load Balancer tab.