...
Triggered by PyMongo's test suite, on my branch where I'm developing sessions. This is a mongos error from a sharded cluster with auth, when PyMongo is calling "getMore" on an aggregation cursor with "lsid":

2017-09-17T14:19:16.577-0400 F - [conn187] Invariant failure remotesExhausted_inlock() || _lifecycleState == kKillComplete src/mongo/s/query/async_results_merger.cpp 84

mongos(_ZN5mongo15invariantFailedEPKcS1_j+0x2E6) [0x10e848a46]
mongos(_ZN5mongo18AsyncResultsMergerD2Ev+0x196) [0x10dfed976]
mongos(_ZN5mongo16RouterStageMergeD0Ev+0x1C) [0x10dfea69c]
mongos(_ZN5mongo23ClusterClientCursorImplD0Ev+0xB6) [0x10dfe93c6]
mongos(_ZN5mongo20ClusterCursorManager14checkOutCursorERKNS_15NamespaceStringExPNS_16OperationContextE+0x3D3) [0x10e1b0d93]
mongos(_ZN5mongo11ClusterFind10runGetMoreEPNS_16OperationContextERKNS_14GetMoreRequestE+0x4A) [0x10dfe30ea]
mongos(_ZN5mongo12_GLOBAL__N_117ClusterGetMoreCmd3runEPNS_16OperationContextERKNSt3__112basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEERKNS_7BSONObjERNS_14BSONObjBuilderE+0x116) [0x10df87ab6]
mongos(_ZN5mongo12BasicCommand11enhancedRunEPNS_16OperationContextERKNS_12OpMsgRequestERNS_14BSONObjBuilderE+0x77) [0x10e2f0037]
mongos(_ZN5mongo7Command9publicRunEPNS_16OperationContextERKNS_12OpMsgRequestERNS_14BSONObjBuilderE+0x20) [0x10e2ee530]
mongos(_ZN5mongo12_GLOBAL__N_110runCommandEPNS_16OperationContextERKNS_12OpMsgRequestEONS_14BSONObjBuilderE+0xC8F) [0x10dfc6a7f]
mongos(_ZN5mongo8Strategy13clientCommandEPNS_16OperationContextERKNS_7MessageE+0x341) [0x10dfc32d1]
mongos(_ZN5mongo23ServiceEntryPointMongos13handleRequestEPNS_16OperationContextERKNS_7MessageE+0x2E5) [0x10df21c25]
mongos(_ZN5mongo19ServiceStateMachine15_processMessageERNS0_11ThreadGuardE+0x18A) [0x10df2967a]
mongos(_ZN5mongo19ServiceStateMachine15_runNextInGuardERNS0_11ThreadGuardE+0x175) [0x10df28b35]
mongos(_ZN5mongo19ServiceStateMachine7runNextEv+0x38) [0x10df294a8]

Log attached.
PyMongo was executing:

    # Use batchSize to ensure multiple getMore messages
    cursor = db.test.aggregate(
        [{'$project': {'_id': '$_id'}}],
        batchSize=5)
    self.assertEqual(
        expected_sum,
        sum(doc['_id'] for doc in cursor))
xgen-internal-githook commented on Wed, 20 Sep 2017 16:26:09 +0000:
Author: {'email': 'jcarey@argv.me', 'name': 'Jason Carey', 'username': 'hanumantmk'}
Message: SERVER-31120 fix invalid session getMore invariant

When passing the wrong lsid to a cursor (not the lsid used to create it) we invariant in sharding. This appears to be about poor lifetime issues in mongos cursors. This papers over the bad api and adds a test for the fix.
Branch: master
https://github.com/mongodb/mongo/commit/b4fa6b5c46612b7943230dc1a4b24ce7867aa681

jesse commented on Tue, 19 Sep 2017 15:05:53 +0000:
That's great! Very helpful for driver testing if the server uasserts when getMore doesn't have the right lsid. I think the PyMongo code I was testing at the time did send a different lsid with getMore than with aggregate; that's fixed in my code now.

jason.carey commented on Mon, 18 Sep 2017 19:21:58 +0000:
I was able to reproduce this by issuing a getMore with a different lsid than the lsid used to create a cursor. It may also happen if a getMore is issued without an lsid for a cursor that was created with one. The fix will clean that up (so that a helpful uassert shows up instead of an invariant).
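
To illustrate the behavior described in the comments above, here is a minimal pure-Python sketch of a cursor table that remembers the lsid a cursor was created with and rejects a getMore carrying a different (or missing) lsid with an ordinary error. This is a toy model, not mongos code: ToyCursorManager, CursorSessionMismatch, open_cursor, and get_more are hypothetical names invented for the example, and the real server's checks and error codes may differ.

```python
import uuid


class CursorSessionMismatch(Exception):
    """Stands in for a user-facing error (uassert) rather than a crash."""


class ToyCursorManager:
    """Hypothetical model of a router's cursor table; not the mongos implementation."""

    def __init__(self):
        self._cursors = {}  # cursor id -> (lsid or None, remaining documents)
        self._next_id = 1

    def open_cursor(self, docs, batch_size, lsid=None):
        """Register a cursor, remember its lsid, and return (cursor_id, first batch)."""
        cursor_id = self._next_id
        self._next_id += 1
        docs = list(docs)
        self._cursors[cursor_id] = (lsid, docs[batch_size:])
        return cursor_id, docs[:batch_size]

    def get_more(self, cursor_id, batch_size, lsid=None):
        """Return the next batch, but only if the lsid matches the creator's."""
        owner, remaining = self._cursors[cursor_id]
        if lsid != owner:
            # Reject the request and leave the cursor intact for its real owner.
            raise CursorSessionMismatch(
                "getMore lsid does not match the lsid that created the cursor")
        batch, rest = remaining[:batch_size], remaining[batch_size:]
        if rest:
            self._cursors[cursor_id] = (owner, rest)
        else:
            del self._cursors[cursor_id]  # exhausted, so clean up
        return batch


manager = ToyCursorManager()
session_a = {"id": uuid.uuid4()}  # shaped like an lsid document
session_b = {"id": uuid.uuid4()}
cursor_id, first_batch = manager.open_cursor(range(12), batch_size=5, lsid=session_a)
```

In this model, calling manager.get_more(cursor_id, 5, lsid=session_b), or omitting lsid entirely, raises CursorSessionMismatch and leaves the cursor usable, while the same call with session_a returns the next batch.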