...
The VPN flaps during IPsec SA rekey. The VPN recovers quickly, but applications connecting over the VPN lose their TCP connections. For example, a database job that cannot recover after its TCP connection times out terminates.
IKEv2 L2L VPN.
The FTD initiates the IPsec SA rekey.
The peer starts using the new outbound (FTD inbound) SPI immediately after sending the CREATE_CHILD_SA response.
This was seen with a Palo Alto peer.
Increase the IPsec SA lifetime on the FTD so that the peer has the shorter lifetime and the FTD acts as the rekey responder, not the initiator.
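The reasoning behind the workaround can be sketched as follows. In IKEv2 each side enforces its own SA lifetime, so the side with the shorter lifetime is normally the one that starts the rekey. The function name, the `margin` parameter, and the lifetime values below are hypothetical and only illustrate the comparison; actual rekey timing depends on each vendor's soft-expiry and jitter behavior, so lifetimes that are close together should not be relied on.

```python
def expected_rekey_initiator(ftd_lifetime_s, peer_lifetime_s, margin=0.1):
    """Guess which side starts the IPsec SA rekey.

    Assumption: the side with the clearly shorter lifetime rekeys first.
    If the lifetimes differ by less than `margin` (a hypothetical safety
    fraction of the longer lifetime), the result is reported as "either".
    """
    if ftd_lifetime_s <= 0 or peer_lifetime_s <= 0:
        raise ValueError("lifetimes must be positive")
    longer = max(ftd_lifetime_s, peer_lifetime_s)
    shorter = min(ftd_lifetime_s, peer_lifetime_s)
    if longer - shorter < margin * longer:
        return "either"
    return "ftd" if ftd_lifetime_s < peer_lifetime_s else "peer"


# Hypothetical values: FTD lifetime raised well above the peer's lifetime.
print(expected_rekey_initiator(28800, 3600))  # peer
```

With the workaround applied, the goal is for this check to return "peer", meaning the FTD only ever answers rekey requests.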
In the IKEv2 debugs (debug crypto ikev2 platform 15, debug crypto ikev2 protocol 15, debug crypto ipsec 15) we see that during a rekey initiated by the FTD, the new inbound SPI is created first, and then the FTD sends a notification that the same SPI is invalid. This is not expected.

%FTD-5-711001: IPSEC: Completed inbound permit rule, SPI 0xFEFD3E1E <------- new inbound SPI
    Rule ID: 0x00002b4937d47980
%FTD-5-711001: IPSEC INFO: Destroying an IPSec timer of type SA Purge Timer
%FTD-5-711001: IPSEC INFO: Destroying an IPSec timer of type SA Purge Timer
%FTD-5-711001: IKEv2-PLAT-4: Received PFKEY Invalid SPI for SPI 0x1E3EFDFE, error FALSE
%FTD-5-711001: IKEv2-PLAT-4: Received PFKEY delete SA for SPI 0x9E3A169C error FALSE
%FTD-5-711001: IKEv2-PLAT-4: PFKEY Delete Ack from IPSec
%FTD-5-711001: IKEv2-PLAT-4: Received PFKEY add SA for SPI 0x92EC7FC4, error FALSE
%FTD-5-711001: IKEv2-PLAT-4: Received PFKEY update SA for SPI 0xFEFD3E1E, error FALSE
%FTD-5-711001: IKEv2-PLAT-4: Success on pfkey update
%FTD-5-711001: IKEv2-PLAT-4: (3276): PSH added CTM sa hdl 452272917
%FTD-5-711001: IKEv2-PLAT-4: Received PFKEY Active SA for SPI 0xFEFD3E1E, error FALSE
%FTD-5-711001: IKEv2-PROTO-7: (3276): SM Trace-> SA: I_SPI=16E0EB3D1B795DA7 R_SPI=611FCB1009386D19 (I) MsgID = 0000000B CurState: CHILD_I_PROC Event: EV_OK_RECD_LOAD_IPSEC
%FTD-5-711001: IKEv2-PROTO-7: (3276): Action: Action_Null
%FTD-5-711001: IKEv2-PROTO-7: (3276): SM Trace-> SA: I_SPI=16E0EB3D1B795DA7 R_SPI=611FCB1009386D19 (I) MsgID = 0000000B CurState: CHILD_I_DONE Event: EV_OK
%FTD-5-711001: IKEv2-PROTO-4: (3276): Have accepted policies
%FTD-5-711001: IKEv2-PROTO-4: (3276): Child FO event generated - success
%FTD-5-711001: IKEv2-PROTO-4: (3276): IKEV2 SA created; inserting SA into database. SA lifetime timer (86400 sec) started
%FTD-5-711001: IKEv2-PROTO-7: (3276): SM Trace-> SA: I_SPI=16E0EB3D1B795DA7 R_SPI=611FCB1009386D19 (I) MsgID = 0000000B CurState: EXIT Event: EV_CHK_PENDING
%FTD-5-711001: IKEv2-PROTO-7: (3276): Processed response with message id 11, Requests can be sent from range 12 to 12
%FTD-5-711001: IKEv2-PROTO-7: (3276): SM Trace-> SA: I_SPI=16E0EB3D1B795DA7 R_SPI=611FCB1009386D19 (I) MsgID = 0000000B CurState: EXIT Event: EV_NO_EVENT
%FTD-5-711001: IKEv2-PROTO-7: (3276): SM Trace-> SA: I_SPI=16E0EB3D1B795DA7 R_SPI=611FCB1009386D19 (I) MsgID = 0000000B CurState: EXIT Event: EV_FREE_NEG
%FTD-5-711001: IKEv2-PROTO-7: (3276): Deleting negotiation context for my message ID: 0xb
%FTD-5-711001: IKEv2-PROTO-7: Process delete IPSec API
%FTD-5-711001: IKEv2-PROTO-7: (3276): SM Trace-> SA: I_SPI=16E0EB3D1B795DA7 R_SPI=611FCB1009386D19 (I) MsgID = 00000001 CurState: READY Event: EV_SEND_INVALID_SPI
%FTD-5-711001: IKEv2-PROTO-7: (3276): Action: Action_Null
%FTD-5-711001: IKEv2-PROTO-7: (3276): SM Trace-> SA: I_SPI=16E0EB3D1B795DA7 R_SPI=611FCB1009386D19 (I) MsgID = 00000001 CurState: INFO_I_BLD_INFO Event: EV_SEND_INVALID_SPI
%FTD-5-711001: IKEv2-PROTO-4: (3276): Sending INVALID_SPI notify
%FTD-5-711001: IKEv2-PROTO-7: Construct Notify Payload: INVALID_SPI
%FTD-5-711001: IKEv2-PROTO-4: (3276): Building packet for encryption.
%FTD-5-711001: (3276): Payload contents:
%FTD-5-711001: (3276): NOTIFY(INVALID_SPI)
%FTD-5-711001: (3276): Next payload: NONE, reserved: 0x0, length: 12
%FTD-5-711001: (3276): Security protocol id: ESP, spi size: 0, type: INVALID_SPI
%FTD-5-711001: (3276):
%FTD-5-711001: (3276): fe fd 3e 1e <------- same new inbound SPI is deleted

After receiving the delete, the PA device initiates a new VPN, and the FTD terminates the old one with Initial Contact.

Syslog:
%FTD-4-113019: Group = x.x.x.x, Username = x.x.x.x, IP = x.x.x.x, Session disconnected. Session Type: LAN-to-LAN, Duration: 5h:56m:56s, Bytes xmt: 680819287, Bytes rcv: 1075508562, Reason: Peer Reconnected
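To check a captured debug for the same pattern, the sequence above can be searched for an INVALID_SPI notify whose data matches an inbound SPI installed earlier in the same capture. The following is a minimal sketch, not a Cisco tool: the function names and regular expressions are assumptions based only on the message formats shown in this output, and they may need adjusting for other software versions. The `swap32` helper is included because, in the output above, the SPI in the "Received PFKEY Invalid SPI" line (0x1E3EFDFE) is the byte-reversed form of the new inbound SPI (0xFEFD3E1E).

```python
import re

# Patterns derived from the debug lines shown above (assumed stable formats).
INBOUND_RE = re.compile(r"Completed inbound permit rule, SPI 0x([0-9A-Fa-f]{8})")
NOTIFY_RE = re.compile(r"Sending INVALID_SPI notify")
# Notify data as dumped in the debug: four space-separated lowercase hex bytes.
SPI_BYTES_RE = re.compile(r"\(\d+\):\s+((?:[0-9a-f]{2} ){3}[0-9a-f]{2})\b")


def swap32(spi):
    """Return the 32-bit SPI with its byte order reversed."""
    return int.from_bytes(spi.to_bytes(4, "big"), "little")


def find_self_invalidated_spis(lines):
    """Return inbound SPIs that were installed and later sent in INVALID_SPI."""
    installed = set()
    awaiting_payload = False
    hits = []
    for line in lines:
        m = INBOUND_RE.search(line)
        if m:
            installed.add(int(m.group(1), 16))
            continue
        if NOTIFY_RE.search(line):
            awaiting_payload = True
            continue
        if awaiting_payload:
            m = SPI_BYTES_RE.search(line)
            if m:
                spi = int(m.group(1).replace(" ", ""), 16)
                if spi in installed:
                    hits.append(spi)
                awaiting_payload = False
    return hits


# A few lines copied from the debug output above.
SAMPLE = [
    "%FTD-5-711001: IPSEC: Completed inbound permit rule, SPI 0xFEFD3E1E",
    "%FTD-5-711001: IKEv2-PLAT-4: Received PFKEY Invalid SPI for SPI 0x1E3EFDFE, error FALSE",
    "%FTD-5-711001: IKEv2-PROTO-4: (3276): Sending INVALID_SPI notify",
    "%FTD-5-711001: (3276): Security protocol id: ESP, spi size: 0, type: INVALID_SPI",
    "%FTD-5-711001: (3276):",
    "%FTD-5-711001: (3276): fe fd 3e 1e",
]

print([hex(s) for s in find_self_invalidated_spis(SAMPLE)])  # ['0xfefd3e1e']
print(hex(swap32(0xFEFD3E1E)))  # 0x1e3efdfe
```

A non-empty result suggests that the FTD invalidated an SPI it had just installed, which matches the behavior described here.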