...
We are having a problem with duplicate documents (orphans) being returned by the secondary nodes. I saw that this problem was resolved in https://jira.mongodb.org/browse/SERVER-5931; however, we currently have a cluster with 9 shards, each a 3-member replica set, all nodes on version 3.6.8, and the problem persists. The collection has 2.2 billion documents and a hashed shard key. I have been periodically deleting orphans, and even moments after that process I find duplicates.

I performed this aggregation as a way of debugging and got the output below:

db.investigation_cards.aggregate([
    { $match: { _id: { $gt: ObjectId("5e43dab00000000000000000") } } },
    { $group: { _id: "$_id", count: { $sum: 1 } } },
    { $match: { count: { $gt: 1 } } }
])

{"_id":{"$oid":"5e43f3a8ae813900169e6156"},"count":2}
{"_id":{"$oid":"5e43f3a8ae813900169e6155"},"count":2}
{"_id":{"$oid":"5e43f08b96d31b0015e73a00"},"count":2}
{"_id":{"$oid":"5e43e506d914da00158f1cc3"},"count":2}
{"_id":{"$oid":"5e43e508e1a3d00016b249d9"},"count":2}
{"_id":{"$oid":"5e43e5bfc048ba0015973c58"},"count":2}
{"_id":{"$oid":"5e43e25f5eea640015f12ea1"},"count":2}
{"_id":{"$oid":"5e4400a82656d10015a9397d"},"count":2}
{"_id":{"$oid":"5e43e5bfc048ba0015973c5a"},"count":2}
{"_id":{"$oid":"5e43e508e1a3d00016b249da"},"count":2}
{"_id":{"$oid":"5e43dbfa7e15b900156d0f9f"},"count":2}
{"_id":{"$oid":"5e43dbfa7e15b900156d0f9b"},"count":2}
{"_id":{"$oid":"5e43e5bfc048ba0015973c5b"},"count":2}
{"_id":{"$oid":"5e43e9f464e8b30015fc7f24"},"count":2}
{"_id":{"$oid":"5e43eb009c38d3103c3bcce4"},"count":2}
{"_id":{"$oid":"5e43e7e364e8b30015fc7906"},"count":2}
{"_id":{"$oid":"5e43f3a8ae813900169e6153"},"count":2}
{"_id":{"$oid":"5e43e508e1a3d00016b249dc"},"count":2}
{"_id":{"$oid":"5e43e7e364e8b30015fc790a"},"count":2}
{"_id":{"$oid":"5e43dbfa7e15b900156d0f9d"},"count":2}

If I search for any of these _id values I get two identical documents in response. Is there any other way to solve this problem?
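For context on the periodic orphan removal mentioned above, this is a minimal sketch of the documented cleanupOrphaned loop, assuming the namespace shipyard.investigation_cards (the database name is given in the comments below) and assuming it is run against the admin database of each shard's primary, not through a mongos:

    // Sketch only: run on each shard PRIMARY directly, one shard at a time.
    // cleanupOrphaned deletes documents in shard-key ranges not owned by this shard,
    // resuming from stoppedAtKey until the whole key space has been scanned.
    var nextKey = {};
    var result;
    while (nextKey != null) {
        result = db.adminCommand({
            cleanupOrphaned: "shipyard.investigation_cards",   // assumed namespace
            startingFromKey: nextKey
        });
        if (result.ok != 1) {
            print("cleanupOrphaned failed or timed out; stopping.");
            break;
        }
        printjson(result);
        nextKey = result.stoppedAtKey;
    }

As the reporter notes, duplicates reappear shortly after this cleanup, which is why the thread below focuses on read concern and routing metadata rather than on the cleanup itself.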
cintyfmoura@yahoo.com.br commented on Thu, 5 Mar 2020 14:34:35 +0000:
I sent the original files. The information security team had made some changes to the previous ones.

josef.ahmad commented on Wed, 4 Mar 2020 09:37:34 +0000:
cintyfmoura@yahoo.com.br, unfortunately restoring the dumps returns BSON and demultiplexing errors. Can you reattempt to obtain the dumps following the instructions above?

cintyfmoura@yahoo.com.br commented on Fri, 28 Feb 2020 17:38:48 +0000:
Hello, I sent the requested files in the attachment.
Database: shipyard
Collection: investigation_cards
Some duplicate pids:
[{"pid":"1223203089928290304"}, {"pid":"1217150675483791361"}, {"pid":"1214120659103686656"}, {"pid":"1217371434126843906"}, {"pid":"1222825232358170625"},
 {"pid":"1216433939746824194"}, {"pid":"1220050955766575106"}, {"pid":"1215620540188282881"}, {"pid":"1215002543799095297"}, {"pid":"1216728683651780608"},
 {"pid":"1215732961104080897"}, {"pid":"1214199271085490177"}, {"pid":"1218215549760458752"}, {"pid":"1215688826083053570"}, {"pid":"1213622938475401221"},
 {"pid":"1219943313266089985"}, {"pid":"1214606205781495809"}, {"pid":"1217843334846259200"}, {"pid":"1214969449402585089"}, {"pid":"1217061651633131521"},
 {"pid":"1213692044515909632"}, {"pid":"1219627243699392515"}, {"pid":"1220032228799000577"}, {"pid":"1214229325291110400"}, {"pid":"1213883846342725632"},
 {"pid":"1215667764813275136"}, {"pid":"1217143384281878529"}, {"pid":"1217105376384110593"}, {"pid":"1217087430261690369"}, {"pid":"1215619338293645312"}]
We are keeping the balancer and autosplit off.

josef.ahmad commented on Tue, 25 Feb 2020 09:21:09 +0000:
Thank you Cintia. We would need to analyse the query routing metadata. Can you please run the instructions below in order and upload the attachments to our support uploader? Please note that only MongoDB engineers will be able to read the files therein, and they will be automatically deleted after 180 days.
1. Share the name of the database and the collection.
2. Connect to a mongos.
3. Run sh.stopBalancer() and wait until sh.isBalancerRunning() returns false.
4. Run sh.disableAutoSplit().
5. Run the aggregation with secondary read preference and local read concern, and verify that it currently returns duplicate documents.
6. Share with us one "pid" value of a duplicate document returned by the aggregation above.
7. Share the output of db.getSiblingDB('admin').runCommand({getShardVersion: 'MYDB.MYCOLL'}). Replace MYDB.MYCOLL with the actual database.collection.
8. Attach a dump of the config database:
   mongodump --uri="mongodb://PATH_TO_MONGOS" --gzip --archive=config.gz -d config
9. For each of the 9 shards, attach a dump of the config database:
   mongodump --uri="mongodb://PATH_TO_SHARD1_PRIMARY" --gzip --archive=config-shard1.gz -d config

cintyfmoura@yahoo.com.br commented on Thu, 20 Feb 2020 16:52:45 +0000:
Hi Josef.
1. My shard key is a hashed index on a custom string field named "pid". This field basically identifies one object, but the same value can repeat thousands of times.
2. I see duplicate documents independently of chunk migrations; we are removing orphans periodically and they keep coming back.
3. The duplicates appear only with the secondary read preference.

josef.ahmad commented on Thu, 20 Feb 2020 16:39:58 +0000:
Hi cintyfmoura@yahoo.com.br, thank you for the output. To progress the diagnosis, please confirm:
1. Is the collection sharded on _id: "hashed", or on another field?
2. Do you see duplicate documents while the collection is running chunk migrations, or also at a time when no chunk migration occurs?
3. Do you see duplicate documents when using primary read preference and local read concern, or are the duplicates only observed with secondary read preference?
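A minimal shell sketch of how these three points can be checked, assuming a connection through a mongos and the namespace shipyard.investigation_cards; the ObjectId below is one of the duplicated _id values from the description and is only illustrative:

    // 1. Confirm the shard key (config.collections stores it in the "key" field).
    db.getSiblingDB("config").collections.find(
        { _id: "shipyard.investigation_cards" }
    ).pretty()

    // 2. Look for recent chunk migrations on this namespace.
    db.getSiblingDB("config").changelog.find(
        { ns: "shipyard.investigation_cards", what: /moveChunk/ }
    ).sort({ time: -1 }).limit(5).pretty()

    // 3. Compare primary vs secondary reads for one duplicated _id.
    var coll = db.getSiblingDB("shipyard").investigation_cards;
    db.getMongo().setReadPref("primary");
    print("primary copies:   " + coll.find({ _id: ObjectId("5e43f3a8ae813900169e6156") }).itcount());
    db.getMongo().setReadPref("secondary");
    print("secondary copies: " + coll.find({ _id: ObjectId("5e43f3a8ae813900169e6156") }).itcount());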
cintyfmoura@yahoo.com.br commented on Mon, 17 Feb 2020 13:37:19 +0000:
Hi Kelsey,
The aggregation:

db.investigation_cards.aggregate([
    { $match: { _id: { $gt: ObjectId("5e4a64200000000000000000") } } },
    { $group: { _id: "$_id", count: { $sum: 1 } } },
    { $match: { count: { $gt: 1 } } }
], { readConcern: { level: "local" } })

The output:

{"_id":{"$oid":"5e4a8a5f197a810015c8128a"},"count":2}
{"_id":{"$oid":"5e4a8a5f197a810015c81284"},"count":2}
{"_id":{"$oid":"5e4a86049c38d3103c0be74c"},"count":2}
{"_id":{"$oid":"5e4a679c46501000155776c5"},"count":2}
{"_id":{"$oid":"5e4a7b59b413bd00153307bf"},"count":2}
{"_id":{"$oid":"5e4a679c46501000155776c6"},"count":2}
{"_id":{"$oid":"5e4a780d89cca20015647953"},"count":2}
{"_id":{"$oid":"5e4a90c24d1cd8001595db3f"},"count":2}
{"_id":{"$oid":"5e4a8dbd46501000155895b2"},"count":2}
{"_id":{"$oid":"5e4a8df1465010001558978c"},"count":2}
{"_id":{"$oid":"5e4a90c24ae8a800153624a6"},"count":2}
{"_id":{"$oid":"5e4a90c24ae8a800153624a9"},"count":2}
{"_id":{"$oid":"5e4a90c24d1cd8001595db3e"},"count":2}
{"_id":{"$oid":"5e4a6b62b413bd0015329179"},"count":2}
{"_id":{"$oid":"5e4a691dde09dc001687ce26"},"count":2}
{"_id":{"$oid":"5e4a6641d3a1630015745662"},"count":2}
{"_id":{"$oid":"5e4a90c24d1cd8001595db3b"},"count":2}
{"_id":{"$oid":"5e4a90bfb6d62e00155d9c9e"},"count":2}
{"_id":{"$oid":"5e4a8dbd46501000155895ae"},"count":2}
{"_id":{"$oid":"5e4a90c24ae8a800153624a5"},"count":2}
{"_id":{"$oid":"5e4a68cc9c38d3103cc56341"},"count":2}
{"_id":{"$oid":"5e4a8dbd46501000155895b0"},"count":2}
{"_id":{"$oid":"5e4a8a5f197a810015c81286"},"count":2}
{"_id":{"$oid":"5e4a8df14650100015589789"},"count":2}
{"_id":{"$oid":"5e4a7a69faa3bb001592c82b"},"count":2}
{"_id":{"$oid":"5e4a8a5f197a810015c81282"},"count":2}
{"_id":{"$oid":"5e4a90c24ae8a800153624a7"},"count":2}
{"_id":{"$oid":"5e4a8a5f197a810015c8127f"},"count":2}
{"_id":{"$oid":"5e4a7124465010001557b82b"},"count":2}
{"_id":{"$oid":"5e4a79419c38d3103cec427a"},"count":2}
{"_id":{"$oid":"5e4a7d49b413bd0015331511"},"count":2}
{"_id":{"$oid":"5e4a8a57197a810015c81238"},"count":2}
{"_id":{"$oid":"5e4a90c24d1cd8001595db3a"},"count":2}
{"_id":{"$oid":"5e4a8012871ed70016ace226"},"count":2}
{"_id":{"$oid":"5e4a679c46501000155776c8"},"count":2}
{"_id":{"$oid":"5e4a679c46501000155776c7"},"count":2}
{"_id":{"$oid":"5e4a8dbd46501000155895ac"},"count":2}
{"_id":{"$oid":"5e4a90c24ae8a800153624a4"},"count":2}
{"_id":{"$oid":"5e4a90c24d1cd8001595db3c"},"count":2}

thomas.schubert commented on Fri, 14 Feb 2020 21:59:05 +0000:
Hi cintyfmoura@yahoo.com.br, so we can continue to investigate, would you please provide the aggregation command with readConcern: local set, along with its output? Thank you, Kelsey

cintyfmoura@yahoo.com.br commented on Wed, 12 Feb 2020 21:33:12 +0000:
Using readConcern: local, a find does not return the duplicate document, but the aggregation still returns the duplicates.

renctan commented on Wed, 12 Feb 2020 16:48:16 +0000:
Can you try using readConcern: local? I suspect your query is running with readConcern: available.
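For completeness, a sketch of making the read concern explicit from the shell against a mongos, matching the suggestion above; the ObjectId values are taken from the outputs earlier in the thread and are only illustrative:

    // Route reads to secondaries, as in the reports above.
    db.getMongo().setReadPref("secondary");
    var coll = db.getSiblingDB("shipyard").investigation_cards;

    // find() with an explicit local read concern (reported not to return the orphan copy).
    coll.find({ _id: ObjectId("5e4a8a5f197a810015c8128a") }).readConcern("local")

    // aggregate() with an explicit local read concern (reported to still return both copies).
    coll.aggregate(
        [
            { $match: { _id: { $gt: ObjectId("5e4a64200000000000000000") } } },
            { $group: { _id: "$_id", count: { $sum: 1 } } },
            { $match: { count: { $gt: 1 } } }
        ],
        { readConcern: { level: "local" } }
    )

The difference between these two code paths, reported in the 12 Feb comment, is what prompted the request for the routing metadata dumps earlier in the thread.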