Symptom
This fix improves the debuggability of the system so that more data can be captured for root cause analysis (RCA) when the ext3 disk cannot be read, as described below.
HW: N77-C7710 with SUP3E
SW: 8.4(5)
Issue:
The device reloads after logging the following messages:
%AAA-3-AAA_ERR_MSG: File Writing Failed for /mnt/pss/aaa_vdc_1.seqnum
%KERN-2-SYSTEM_MSG: [1798760.913362] EXT3-fs error (device sdb4): ext3_get_inode_loc: unable to read inode block - inode=31, block=1315123763 - kernel
%KERN-2-SYSTEM_MSG: [1798760.914313] EXT3-fs (sdb4): error in ext3_reserve_inode_write: Out of memory - kernel
%KERN-2-SYSTEM_MSG: [1798760.915187] EXT3-fs (sdb4): error in ext3_orphan_add: Out of memory - kernel
%KERN-2-SYSTEM_MSG: [1798760.916057] EXT3-fs (sdb4): error in ext3_setattr: Out of memory - kernel
This fix does not prevent the issue from happening; it only provides data for future RCA.
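To check collected syslog for the signature above, a small script along the following lines may help. This is only a sketch: the regular expressions are derived from the log lines quoted in this note, and the function name find_signature and the result labels are hypothetical, not part of any Cisco tool.

```python
import re

# Patterns derived from the log lines quoted in this note.
# The names below are illustrative only (assumptions, not a Cisco API).
EXT3_READ_ERROR = re.compile(
    r"EXT3-fs error \(device (?P<dev>\w+)\): ext3_get_inode_loc: "
    r"unable to read inode block - inode=(?P<inode>\d+), block=(?P<block>\d+)"
)
AAA_WRITE_FAILURE = re.compile(
    r"%AAA-3-AAA_ERR_MSG: File Writing Failed for (?P<path>\S+)"
)


def find_signature(log_text):
    """Return (kind, details) tuples for every line matching the signature."""
    hits = []
    for line in log_text.splitlines():
        match = EXT3_READ_ERROR.search(line)
        if match:
            hits.append(("ext3_read_error", match.groupdict()))
            continue
        match = AAA_WRITE_FAILURE.search(line)
        if match:
            hits.append(("aaa_write_failure", match.groupdict()))
    return hits
```

An empty result only means these two message types were not found in the text supplied; it does not rule out the issue.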
Further Problem Description
- This is not a fix but a debug improvement that collects more data if the issue happens.
- You MUST enter the following CLI commands to enable the debug and collect data for RCA; this is not done by default.
- To enable data collection:
1. Enable kernel core:
n7k# conf t
Enter configuration commands, one per line. End with CNTL/Z.
n7k(config)# system kernel core
n7k(config)# end
2. Enable debug for SDB on the active and standby supervisors:
- This is NOT configuration, so it must be re-enabled after each reload or SSO switchover.
- Use an EEM script or NMS to re-enable this CLI after every reload or switchover.
2A. Active sup:
n7k# system kernel sdb4_ext3_desc_watch
2B. Standby sup:
n7k# sh mod | in stand
6 0 Supervisor Module-3 N77-SUP3E ha-standby
n7k# att m 6
Attaching to module 6 ...
To exit type 'exit', to abort type '$.'
n7k(standby)# system kernel sdb4_ext3_desc_watch
n7k(standby)# exit
3. When the issue happens, a kernel core is created. Collect it from the core directory:
arlon# dir logflash:core
arlon# dir logflash://sup-remote/core
Collect the kernel core and provide it to CX TAC.
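If step 2B is automated from an NMS, the standby supervisor slot has to be found first. The sketch below assumes the module listing has the same column layout as the sample line in step 2B (slot number first, status last); the function name find_standby_slot is hypothetical.

```python
def find_standby_slot(show_module_output):
    """Return the slot number of the ha-standby supervisor, or None.

    Assumes each module line starts with the slot number and ends with
    the status, as in the sample output shown in step 2B.
    """
    for line in show_module_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit() and fields[-1] == "ha-standby":
            return int(fields[0])
    return None
```

The returned slot is the module number to attach to before entering the debug command on the standby supervisor.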