...
Diagnostic Port Loopback test fails intermittently on some ports of the N7K-F248XT-25E module. However, the diag test continues to run and results are good (PASS).

N7K-175(config)# sh module
Mod  Ports  Module-Type                         Model              Status
---  -----  ----------------------------------- ------------------ ----------
10   48     1/10 Gbps BASE-T Ethernet Module    N7K-F248XT-25E     ok

N7K-175(config)# sh diagnostic result module 10 test 6 detail

Current bootup diagnostic level: complete
Module 10: 1/10 Gbps BASE-T Ethernet Module

    Diagnostic level at card bootup: complete

    Test results: (. = Pass, F = Fail, I = Incomplete,
    U = Untested, A = Abort, E = Error disabled)
    ______________________________________________________________________

    6) PortLoopback:

      Port   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
      -----------------------------------------------------
             .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .

      Port  17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
      -----------------------------------------------------
             .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .

      Port  33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
      -----------------------------------------------------
             .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .

      Error code ------------------> DIAG TEST SUCCESS
      Total run count -------------> 437
      Last test execution time ----> Sat Feb 27 11:47:54 2016
      First test failure time -----> Sat Feb 27 04:11:09 2016
      Last test failure time ------> Sat Feb 27 06:07:09 2016
      Last test pass time ---------> Sat Feb 27 11:51:06 2016
      Total failure count ---------> 2
      Consecutive failure count ---> 0
      Last failure reason ---------> Loopback test failed.
                                     Unable to analyze the reason for failure
      Next Execution time ---------> Sat Feb 27 11:48:54 2016
    ______________________________________________________________________

module-10# sh hard inter statistics dev mac error port 15 | no-more

 6147  GD Rx bad CRC frames, excluding RUNT/JABBER    0000000000000010  15 -
 6193  GD Received frames with symbol/sequence errs   0000000000000002  15 -
 6284  GD XGMAC rx CRC error interrupt                0000000000000003  15 -
 6285  GD XGMAC rx code violation interrupt           0000000000000002  15 -
 6286  GD XGMAC rx code error interrupt               0000000000000003  15 -
 6323  GD Received frame with code error interrupt    0000000000000002  15 -
 6324  GD Received frame with CRC error interrupt     0000000000000003  15 -
16398  PL ingress_rx_err                              0000000000000010  15 -
16400  PL ingress_pl_drop (truncated or crc etc)      0000000000000008  15 -
16423  PL ingress_cbl_drop                            0000000000000002  15 -
18472  IB ingress_ib_drop (small cnt)                 0000000000000008  15 -
The port is in "Administrative Down" state and the GOLD diagnostic Port Loopback test is run on these ports. The Port Loopback test fails intermittently, but the same test runs successfully in the next cycle. There is no functional impact because:
1) The port does not fail continuously; it fails very rarely.
2) The port is in Admin Shut, and it behaves well when it is in link UP state.
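As an illustration of the "fails rarely, not continuously" condition, the summary counters in the "show diagnostic result ... detail" output can be checked with a short script. This is only a sketch: the helper name summarize_port_loopback is hypothetical (not an NX-OS tool), and it assumes that a "Consecutive failure count" of 0 alongside a non-zero "Total failure count" means the most recent run passed.

```python
import re

# Hypothetical helper, not part of NX-OS. It extracts the numeric
# "name ---> value" summary fields from the diag result text.
FIELD_RE = re.compile(
    r"(Total run count|Total failure count|Consecutive failure count) -+> (\d+)"
)

def summarize_port_loopback(output: str) -> dict:
    fields = {name: int(value) for name, value in FIELD_RE.findall(output)}
    runs = fields.get("Total run count", 0)
    failures = fields.get("Total failure count", 0)
    consecutive = fields.get("Consecutive failure count", 0)
    return {
        "runs": runs,
        "failures": failures,
        "consecutive": consecutive,
        # Assumption: failures recorded but none consecutive means the
        # failure was intermittent rather than continuous.
        "intermittent": failures > 0 and consecutive == 0,
    }

# Counter values taken from the module 10 output in this report.
sample = """
Total run count -------------> 437
Total failure count ---------> 2
Consecutive failure count ---> 0
"""
print(summarize_port_loopback(sample))
```

With the counters from module 10 above (437 runs, 2 failures, 0 consecutive), the sketch flags the failure as intermittent.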
Please do "unshut, shut" and then run the tests.
~~~~~~~~~
Note down the configs so that they can be restored for the module:
    show run | inc diag

Now clear the result status of the diag tests (module 7, port 6 is used as an example):
    config t
    no diagnostic monitor module 7 test all
    diagnostic monitor module 7 test all
    no diagnostic monitor module 7 test all
    diagnostic clear result module 7 test all
    diagnostic ondemand iteration 100
    diagnostic ondemand action-on-failure stop
    show diagnostic internal port_lb info

Verify whether any port_lb test is running for module 7:
    show diagnostic internal port_lb info

Start the diag port loopback test (this can be repeated whenever required):
    diagnostic start module 7 test 6 port 6

Verify that the test on module 7 is complete; wait until the steps complete:
    show diagnostic internal port_lb info

Verify the detailed result:
    show diagnostic result module 7 test 6 detail

Restore the configs noted in the first step, using the output of:
    show run | inc diag
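The steps above can be sketched as a small generator that produces the same command sequence for another module and port. The function name port_loopback_commands and its parameters are hypothetical; the command strings are the ones listed in the workaround, and test ID 6 is assumed to be PortLoopback as on the modules shown in this report (it may differ elsewhere). The sketch only builds the text and does not connect to a device.

```python
from typing import List

def port_loopback_commands(module: int, port: int,
                           test_id: int = 6, iterations: int = 100) -> List[str]:
    """Return the workaround's CLI lines for the given module and port."""
    return [
        "config t",
        # Monitor toggle sequence reproduced as written in the workaround.
        f"no diagnostic monitor module {module} test all",
        f"diagnostic monitor module {module} test all",
        f"no diagnostic monitor module {module} test all",
        f"diagnostic clear result module {module} test all",
        f"diagnostic ondemand iteration {iterations}",
        "diagnostic ondemand action-on-failure stop",
        # Check that no port_lb test is already running.
        "show diagnostic internal port_lb info",
        f"diagnostic start module {module} test {test_id} port {port}",
        # Poll until the on-demand run has completed.
        "show diagnostic internal port_lb info",
        f"show diagnostic result module {module} test {test_id} detail",
    ]

for line in port_loopback_commands(7, 6):
    print(line)
```

The saved "show run | inc diag" output still has to be reapplied by hand afterwards, since the original monitoring configuration is site-specific.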
We reproduced this in the lab and saw the following:

N7K-252-40# show diagnostic result module 15 test 6 detail

Current bootup diagnostic level: complete
Module 15: 1/10 Gbps BASE-T Ethernet Module

    Diagnostic level at card bootup: complete

    Test results: (. = Pass, F = Fail, I = Incomplete,
    U = Untested, A = Abort, E = Error disabled)
    ______________________________________________________________________

    6) PortLoopback:

      Port   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
      -----------------------------------------------------
             .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .

      Port  17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
      -----------------------------------------------------
             .  .  .  .  .  F  .  .  .  .  .  .  .  .  .  .

      Port  33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
      -----------------------------------------------------
             .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .

      Error code ------------------> DIAG TEST FAIL
      Total run count -------------> 30
      Last test execution time ----> Wed Sep 7 10:52:46 2016
      First test failure time -----> Wed Sep 7 10:56:01 2016
      Last test failure time ------> Wed Sep 7 10:56:01 2016
      Last test pass time ---------> Wed Sep 7 10:52:46 2016
      Total failure count ---------> 1
      Consecutive failure count ---> 0
      Last failure reason ---------> Loopback test failed.
                                     Unable to analyze the reason for failure
      Next Execution time ---------> n/a
    ______________________________________________________________________

N7K-252-40#
module-15# sh hard inter statistics dev mac error port 22 | no-more

|------------------------------------------------------------------------|
| Device:Clipper MAC Role:MAC Mod:15                                     |
| Last cleared @ Wed Sep 7 09:25:59 2016                                 |
| Device Statistics Category :: ERROR                                    |
|------------------------------------------------------------------------|

Instance:5
Cntr   Name                                               Value             Ports
-----  ----                                               -----             -----
 2051  GD Rx bad CRC frames, excluding RUNT/JABBER        0000000000000008  22 -
 2188  GD XGMAC rx CRC error interrupt                    0000000000000003  22 -
 2189  GD XGMAC rx code violation interrupt               0000000000000002  22 -
 2190  GD XGMAC rx code error interrupt                   0000000000000002  22 -
 2228  GD Received frame with CRC error interrupt         0000000000000003  22 -
12302  PL ingress_rx_err                                  0000000000000008  22 -
12304  PL ingress_pl_drop (truncated or crc etc)          0000000000000008  22 -
18470  IB ingress_ib_drop (small cnt)                     0000000000000008  22 -
18474  IB ingress_ib_de_and_pl_drop (small cnt)           0000000000000008  21-24 -
22745  IB INT DE packet drop (cr_type = 0, all fpoe = 0)  0000000000000001  21-24 -
22746  IB INT PL packet drop                              0000000000000001  21-24 -
22803  IB INT port inst1 packet drop                      0000000000000001  21-24 -
22810  IB INT PL packet drop                              0000000000000001  21-24 -

A possible reason is that the PHY takes a long time to execute a command in a few instances. Fixing this timing issue by adding more delays may have other undesirable impacts, such as on port bring-up time and port-channel convergence time. The Driver and Diag teams discussed this and concluded that it is a rare issue with no functional impact. Fixing it may be complex and may require adding delays in the PHY driver, which may result in other issues. The mail discussion is attached to the CDETS.