Symptoms
Impact:
Switch panic due to Out of Memory (OOM) with cached memory going up during I/O, with no apparent memory leak.

Environment:
EMC Hardware: Connectrix DS-5100B
EMC Hardware: Connectrix DS-5300B
EMC Hardware: Connectrix DS-6505B
EMC Hardware: Connectrix DS-6510B
EMC Hardware: Connectrix DS-6520B
EMC Hardware: Connectrix MP-7840B
EMC Hardware: Connectrix ED-DCX-B
EMC Hardware: Connectrix ED-DCX-4S
EMC Hardware: Connectrix ED-8510-4B
EMC Hardware: Connectrix ED-8510-8B
Brocade Software: Fabric OS 7.3.2

Problem:
Kernel Panic due to Out of Memory
Out Of Memory (OOM) condition with cached memory going up very high (>70% of total memory) during I/O, with no apparent memory leak.

Errdump log:
[HAM-1004], 164186, SLOT 4 | CHASSIS, INFO, DCX, Processor rebooted - Software Fault:Kernel Panic.
[TRCE-1001], 164187, SLOT 4 | CHASSIS, WARNING, DCX, Trace dump available (Slot 4)! (reason: PANIC).

Pdshow output:
---- OOM Alert: System is about to reboot ! ----
Kernel panic - not syncing: ---- OOM : Hafailover ----
PD Start PowerPC Book-E Watchdog Shutdown soft timer PLATFORM_FIRST MTRACER mtracer_panicdump mtracer_panicdump write=0x4c50 reboot_reason set_reboot_reason reason=Software Fault:Kernel Panic PD_MISC CONSOLE_LOG
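The ">70% of total memory" symptom can be checked directly from /proc/meminfo. The following is a minimal sketch, not a vendor-supplied tool: the function names cached_pct and check_cached are hypothetical, and it assumes that awk is available in the switch's root shell and that the Cached and MemTotal fields are the right ones to compare.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Fabric OS); assumes awk is available.

# Print Cached as an integer percentage of MemTotal.
# $1: path to a meminfo-format file (defaults to /proc/meminfo).
cached_pct() {
    awk '/^MemTotal:/ {total = $2}
         /^Cached:/   {cached = $2}
         END {
             if (total > 0) printf "%d\n", cached * 100 / total
             else exit 1
         }' "${1:-/proc/meminfo}"
}

# Return 1 with a warning when cached memory exceeds 70% of total,
# 0 when it does not, and 2 when the file could not be parsed.
check_cached() {
    pct=$(cached_pct "$1") || return 2
    if [ "$pct" -gt 70 ]; then
        echo "WARNING: cached memory at ${pct}% of total"
        return 1
    fi
    echo "OK: cached memory at ${pct}% of total"
}
```

Calling check_cached with no argument reads the live /proc/meminfo; passing a saved copy of the file allows the same check to be run off-switch.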
Cause
This Out of Memory condition may be encountered more often with CMCNE polling FCR fabrics.

Brocade DEFECT000603672
Resolution
Fix:
Upgrade to Fabric OS 7.3.2 or 7.4.1a.

Workaround:
Several system variables can be changed on a switch to eliminate the Out of Memory condition before upgrading to a fixed Fabric OS release.

To change the parameters, run the following commands with root-level access:
sysctl -w vm.dirty_background_ratio=5
sysctl -w vm.dirty_ratio=10
sysctl -w vm.dirty_expire_centisecs=300

To validate these settings on a switch, run the following commands:
sysctl vm.dirty_background_ratio
sysctl vm.dirty_ratio
sysctl vm.dirty_expire_centisecs

After making the changes, monitor memory as follows, with root-level access:
1.) Monitor the memory every day or two with the command cat /proc/meminfo
2.) If memory drops below 25MB, capture another supportsave.
3.) Then perform an hafailover to prevent another OOM panic.

These changes are removed by an hafailover, reboot, firmware upgrade, etc., and the workaround parameters must be reapplied after any of these events.

Brocade DEFECT000603672
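Because the settings do not persist and the memory check is manual, the workaround steps can be wrapped in small shell helpers. This is a sketch under stated assumptions rather than a supported procedure: the function names are hypothetical, awk is assumed to be available in the root shell, and "memory drops below 25MB" is interpreted here as the MemFree field of /proc/meminfo, which the article does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Fabric OS); run with root-level access.

# Re-apply the three workaround settings listed above.
apply_workaround() {
    sysctl -w vm.dirty_background_ratio=5 &&
    sysctl -w vm.dirty_ratio=10 &&
    sysctl -w vm.dirty_expire_centisecs=300
}

# Print MemFree in whole MB (assumption: MemFree is the value to watch).
# $1: path to a meminfo-format file (defaults to /proc/meminfo).
mem_free_mb() {
    awk '/^MemFree:/ {printf "%d\n", $2 / 1024; found = 1}
         END {exit !found}' "${1:-/proc/meminfo}"
}

# Return 1 with a reminder when free memory is below the threshold.
# $1: meminfo-format file, $2: threshold in MB (defaults to 25).
check_free() {
    free_mb=$(mem_free_mb "$1") || return 2
    if [ "$free_mb" -lt "${2:-25}" ]; then
        echo "LOW: ${free_mb} MB free; capture a supportsave, then run hafailover"
        return 1
    fi
    echo "OK: ${free_mb} MB free"
}
```

After an hafailover, reboot, or firmware upgrade, apply_workaround restores the settings; check_free /proc/meminfo can then stand in for reading the cat /proc/meminfo output by eye.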