Info
Currently, cluster_aggregate.cpp asserts that, for each involved namespace, either all of the pipeline stages work with sharded collections or the namespace is not sharded (via this function that is passed in).
However, it is not correct for a router to use its routing table cache to check if a collection is sharded, because the cache can be stale: the collection could have been dropped and recreated as unsharded.
Instead, the router could assume the collection is unsharded and the data node could error if the collection is actually sharded.
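To make the difference between the two approaches concrete, here is a minimal toy model in Python. It is not the server code: `DataNode`, `Router`, `StageNotSupportedOnSharded`, and the namespace `test.coll` are all hypothetical names invented for this sketch, and the "cache" is just a dictionary copy that goes stale when the authoritative metadata changes.

```python
from dataclasses import dataclass, field


class StageNotSupportedOnSharded(Exception):
    """Hypothetical error: a stage that needs an unsharded collection saw a sharded one."""


@dataclass
class DataNode:
    # Authoritative metadata: namespace -> True if the collection is sharded.
    sharded: dict = field(default_factory=dict)

    def run_unsharded_only_stage(self, ns):
        # The data node checks its own (authoritative) view of the collection.
        if self.sharded.get(ns, False):
            raise StageNotSupportedOnSharded(ns)
        return f"ok: {ns}"


@dataclass
class Router:
    node: DataNode
    # Copy of the node's metadata; it is only as fresh as the last refresh().
    cache: dict = field(default_factory=dict)

    def refresh(self):
        self.cache = dict(self.node.sharded)

    def run_with_router_check(self, ns):
        # Behaviour described in this ticket: trust the possibly stale cache.
        if self.cache.get(ns, False):
            raise StageNotSupportedOnSharded(ns)
        return self.node.run_unsharded_only_stage(ns)

    def run_with_node_check(self, ns):
        # Proposed behaviour: assume unsharded and let the data node decide.
        return self.node.run_unsharded_only_stage(ns)


# Collection starts out sharded and the router caches that fact.
node = DataNode({"test.coll": True})
router = Router(node)
router.refresh()

# Collection is dropped and recreated as unsharded; the router is not told.
node.sharded["test.coll"] = False

try:
    router.run_with_router_check("test.coll")
    stale_result = "ok"
except StageNotSupportedOnSharded:
    stale_result = "spurious error"

proposed_result = router.run_with_node_check("test.coll")
```

In this model the router-side check fails the query because of stale state, while the node-side check succeeds, and it would still reject the query if the collection really were sharded.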
Top User Comments
esha.maharishi@10gen.com commented on Mon, 13 Dec 2021 21:42:25 +0000:
steve.la, an example user setup is:
1. User creates and shards a collection.
2. User drops the collection and recreates it as unsharded.
3. User runs a query, through a mongos whose routing table cache is stale, that uses the collection in a stage not supported on sharded collections, and gets back an error even though the collection is now unsharded.
It does not bring down any node in the cluster, just fails the query.
It seems like this code is at least 3 years old.
There is a workaround (the user can call flushRouterConfig against the mongos or restart the mongos, after which mongos will load a fresh list of the sharded collections). I also hope all stages will work with sharded collections within the next one or two years.
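As a rough illustration of the flushRouterConfig workaround, the sketch below sends the command through a driver client. It assumes a pymongo-style client connected to the mongos (anything exposing `client.admin.command(...)`); the helper name `flush_router_config`, the host name, and the namespace `test.coll` are hypothetical. Passing a namespace instead of `1` is assumed to be supported by the server version in use.

```python
def flush_router_config(client, namespace=None):
    """Ask a mongos to discard its cached routing table.

    `client` is assumed to be a pymongo-style client connected to the mongos.
    With no namespace, the whole cache is flushed; with "db.coll", only that
    collection's entry is targeted (assumed to be supported by the server).
    """
    target = namespace if namespace is not None else 1
    return client.admin.command({"flushRouterConfig": target})


# Example usage (hypothetical host; requires pymongo and a running mongos):
# from pymongo import MongoClient
# mongos = MongoClient("mongodb://mongos.example.net:27017")
# flush_router_config(mongos, "test.coll")
```

The command has to be run against each mongos that may hold a stale entry, since each router keeps its own cache.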
I mainly wanted to raise this because I've seen similar bugs a few times (SERVER-61333, SERVER-42788). I see SERVER-45186 is a generic ticket about this. It sounds fine to close this ticket, SERVER-61333, and SERVER-42788 as dupes of SERVER-45186 and just prioritize SERVER-45186.