...
Hello MongoDB Team. We are facing an issue with the moveChunk process. This is the error from one of the shard's logs:

2019-10-22T11:18:45.921+0200 I SHARDING [conn132] Starting chunk migration ns: db.col, [{ productId: MinKey }, { productId: -9062096908397525382 }), fromShard: shard3, toShard: shard2 with expected collection version epoch 5bb4b060aec28d86b2174007
2019-10-22T11:18:45.924+0200 I SHARDING [conn132] about to log metadata event into changelog: { _id: "prulmcoembb09-2019-10-22T11:18:45.924+0200-5daec975680ef73c96f4be36", server: "prulmcoembb09", clientAddr: "10.217.1.252:50292", time: new Date(1571735925924), what: "moveChunk.start", ns: "db.col", details: { min: { productId: MinKey }, max: { productId: -9062096908397525382 }, from: "shard3", to: "shard2" } }
2019-10-22T11:18:45.955+0200 I SHARDING [conn132] moveChunk data transfer progress: { waited: true, active: false, ns: "db.col", from: "shard3/IP:PORT", fromShardId: "shard3", min: { productId: MinKey }, max: { productId: -9062096908397525382 }, shardKeyPattern: { productId: "hashed" }, state: "fail", errmsg: "migrate failed: InvalidUUID: Cannot create collection db.col because we already have an identically named collection with UUID 55a...", counts: { cloned: 0, clonedBytes: 0, catchup: 0, steady: 0 }, ok: 1.0, operationTime: Timestamp(1571735921, 1), $gleStats: { lastOpTime: Timestamp(0, 0), electionId: ObjectId('7fffffff000000000000001f') }, lastCommittedOpTime: Timestamp(1571735921, 1), $configServerState: { opTime: { ts: Timestamp(1571735925, 6), t: 18 } }, $clusterTime: { clusterTime: Timestamp(1571735925, 6), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } } } mem used: 0 documents remaining to clone: 2731
2019-10-22T11:18:45.955+0200 I SHARDING [conn132] about to log metadata event into changelog: { _id: "prulmcoembb09-2019-10-22T11:18:45.955+0200-5daec975680ef73c96f4be5d", server: "prulmcoembb09", clientAddr: "10.217.1.252:50292", time: new Date(1571735925955), what: "moveChunk.error", ns: "db.col", details: { min: { productId: MinKey }, max: { productId: -9062096908397525382 }, from: "shard3", to: "shard2" } }
2019-10-22T11:18:45.961+0200 W SHARDING [conn132] Chunk move failed :: caused by :: OperationFailed: Data transfer error: migrate failed: InvalidUUID: Cannot create collection db.col because we already have an identically named collection with UUID 55ab81fa-7d21-4742-8d71-f4ef8f741ec2, which differs from the donor's UUID 3db9aaae-c037-4162-b0a8-9eec312df936. Manually drop the collection on this shard if it contains data from a previous incarnation of db.col

Sorry, I've changed the IPs and the db and collection names due to security policy.
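For anyone hitting the same InvalidUUID mismatch, here is a minimal sketch of how the two UUIDs can be compared, assuming shell access to mongos and to each shard's primary; the namespace db.coll and the shard names below are the redacted ones from this ticket:

// On mongos: the UUID recorded in the sharding metadata (config.collections carries a uuid field on 3.6+/4.0).
mongos> use config
mongos> db.collections.findOne({ _id: "db.coll" }).uuid

// Directly on each shard's primary: the UUID of the local copy of the collection.
shard3:PRIMARY> use db
shard3:PRIMARY> db.getCollectionInfos({ name: "coll" })[0].info.uuid
// Repeat on shard1 and shard2. The migration above fails because the recipient's local
// UUID (55ab81fa-...) differs from the donor's UUID (3db9aaae-...).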
Here is the sharded collection status:

mongos> db.coll.getShardDistribution()

Shard shard3 at shard3/IP:PORT
 data : 1.26GiB docs : 109259 chunks : 42
 estimated data per chunk : 30.87MiB
 estimated docs per chunk : 2601

Shard shard1 at shard1/IP:PORT
 data : 1.18GiB docs : 102091 chunks : 36
 estimated data per chunk : 33.61MiB
 estimated docs per chunk : 2835

Shard shard2 at shard2/IP:PORT
 data : 1.17GiB docs : 101178 chunks : 34
 estimated data per chunk : 35.27MiB
 estimated docs per chunk : 2975

Totals
 data : 3.61GiB docs : 312528 chunks : 112
 Shard shard3 contains 34.99% data, 34.95% docs in cluster, avg obj size on shard : 12KiB
 Shard shard1 contains 32.64% data, 32.66% docs in cluster, avg obj size on shard : 12KiB
 Shard shard2 contains 32.36% data, 32.37% docs in cluster, avg obj size on shard : 12KiB

The configuration is the following: 3 shards with 5 nodes each (PRIMARY + 4 Secondaries + 1 Arbiter).

Here is the config server log error related to the issue:

2019-10-20T02:00:04.541+0200 I SHARDING [Balancer] distributed lock 'db.coll' acquired for 'Migrating chunk(s) in collection db.coll', ts : 5daab1a4d74e87bf1a4ee992
2019-10-20T02:00:04.911+0200 I SHARDING [ShardRegistry] distributed lock with ts: '5daab1a4d74e87bf1a4ee992' and _id: 'db.coll' unlocked.
2019-10-20T02:00:04.922+0200 I SHARDING [Balancer] Balancer move db.coll: [{ productId: MinKey }, { productId: -9062096908397525382 }), from shard3, to shard2 failed :: caused by :: OperationFailed: Data transfer error: migrate failed: InvalidUUID: Cannot create collection db.coll because we already have an identically named collection with UUID 55ab81fa-7d21-4742-8d71-f4ef8f741ec2, which differs from the donor's UUID 3db9aaae-c037-4162-b0a8-9eec312df936. Manually drop the collection on this shard if it contains data from a previous incarnation of db.coll
2019-10-20T02:00:04.922+0200 I SHARDING [Balancer] about to log metadata event into actionlog: { _id: "host-2019-10-20T02:00:04.922+0200-5daba384d74e87bf1a699d86", server: "host", clientAddr: "", time: new Date(1571529604922), what: "balancer.round", ns: "", details: { executionTimeMillis: 465, errorOccured: false, candidateChunks: 1, chunksMoved: 0 } }
2019-10-20T02:00:14.962+0200 I SHARDING [Balancer] distributed lock 'db.coll' acquired for 'Migrating chunk(s) in collection db.coll', ts : 5daab1a4d74e87bf1a4ee992
2019-10-20T02:00:15.009+0200 I SHARDING [ShardRegistry] distributed lock with ts: '5daab1a4d74e87bf1a4ee992' and _id: 'db.coll' unlocked.
2019-10-20T02:00:15.012+0200 I SHARDING [Balancer] Balancer move db.coll: [{ productId: MinKey }, { productId: -9062096908397525382 }), from shard3, to shard2 failed :: caused by :: OperationFailed: Data transfer error: migrate failed: InvalidUUID: Cannot create collection db.coll because we already have an identically named collection with UUID 55ab81fa-7d21-4742-8d71-f4ef8f741ec2, which differs from the donor's UUID 3db9aaae-c037-4162-b0a8-9eec312df936. Manually drop the collection on this shard if it contains data from a previous incarnation of db.coll
2019-10-20T02:00:15.013+0200 I SHARDING [Balancer] about to log metadata event into actionlog: { _id: "host-2019-10-20T02:00:15.013+0200-5daba38fd74e87bf1a699e3d", server: "host", clientAddr: "", time: new Date(1571529615013), what: "balancer.round", ns: "", details: { executionTimeMillis: 82, errorOccured: false, candidateChunks: 1, chunksMoved: 0 } }

Please let me know if any additional info is required from my side. Thank you.
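For reference, a small sketch of how the same repeated balancer failures can be confirmed from mongos instead of grepping the config server log, by querying the sharding changelog and actionlog (db.coll is the redacted namespace used above):

mongos> use config
// Migration events for the affected namespace; the moveChunk.error entries should line up
// with the "Balancer move ... failed" lines from the config server log.
mongos> db.changelog.find(
...       { ns: "db.coll", what: { $in: [ "moveChunk.start", "moveChunk.error", "moveChunk.commit" ] } }
...     ).sort({ time: -1 }).limit(10).pretty()
// Balancer rounds (the "balancer.round" events above) are recorded in config.actionlog.
mongos> db.actionlog.find({ what: "balancer.round" }).sort({ time: -1 }).limit(5).pretty()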
dmitry.agranat commented on Thu, 31 Oct 2019 15:32:18 +0000:
Hi Gennadiy, I am going to close this ticket, but feel free to reopen it if this happens again. Thanks, Dima

dmitry.agranat commented on Thu, 24 Oct 2019 08:06:54 +0000:
Hi Gennadiy, There might be other ways for this issue to manifest itself. For example, the collection may have already existed on a replica set before it was added as a shard, or it might be a manifestation of SERVER-17397. But given that we do not have the logs or a reproducer to back this up, we have exhausted all avenues of investigation. If this happens again, or you can reproduce it, please save the logs and we'll be happy to take a look. Thanks, Dima

genacvali91 commented on Wed, 23 Oct 2019 09:17:50 +0000:
Also, regarding the migration - we migrated it using backup/restore, shard by shard. The cluster was migrated from 4.0.0 to 4.0.4 mongod from Docker containers. Thank you, Gennadiy

dmitry.agranat commented on Wed, 23 Oct 2019 08:22:48 +0000:
Thanks, Gennadiy, for the additional information. It is unfortunate that we do not have any evidence of this collection's creation. Based on the information so far, I have another question: when was the cluster upgraded to the 4.0.x version, and which MongoDB version was it upgraded from? Thanks, Dima

genacvali91 commented on Wed, 23 Oct 2019 07:54:11 +0000:
Hello Dmitriy, Unfortunately we do not have such logs. The cluster was migrated from Docker containers and there were no issues before or after the migration. But now there are... If the collection was not created through mongoS, how can it be sharded, as you can see from the output:

mongos> db.coll.getShardDistribution()
Shard shard3 at shard3/IP:PORT
 data : 1.26GiB docs : 109259 chunks : 42
 estimated data per chunk : 30.87MiB
 estimated docs per chunk : 2601
Shard shard1 at shard1/IP:PORT
 data : 1.18GiB docs : 102091 chunks : 36
 estimated data per chunk : 33.61MiB
 estimated docs per chunk : 2835
Shard shard2 at shard2/IP:PORT
 data : 1.17GiB docs : 101178 chunks : 34
 estimated data per chunk : 35.27MiB
 estimated docs per chunk : 2975
Totals
 data : 3.61GiB docs : 312528 chunks : 112
 Shard shard3 contains 34.99% data, 34.95% docs in cluster, avg obj size on shard : 12KiB
 Shard shard1 contains 32.64% data, 32.66% docs in cluster, avg obj size on shard : 12KiB
 Shard shard2 contains 32.36% data, 32.37% docs in cluster, avg obj size on shard : 12KiB

The cluster has been working for about a year since the migration. So, are there any known cases or resolutions for this situation? This is a PROD environment. If there are any other questions - please ask. Thank you, Gennadiy

dmitry.agranat commented on Wed, 23 Oct 2019 07:38:54 +0000:
Hi Gennadiy, Could you provide evidence of this collection's creation, specifically showing when and from which source this collection was created? If logs are still available to back this up, please upload them to this secure portal. What we are looking for is to determine whether this collection was created through mongoS or while directly connected to the shards. Thanks, Dima

genacvali91 commented on Tue, 22 Oct 2019 13:35:47 +0000:
Hello Dmitriy, Regarding "usually arise if you connected directly to the shards and restored there instead of going through mongos" - this is a PROD cluster which has never been restored. So, this is not the root cause of the issue. The collection name is not config.system.sessions; it is a standard collection with site information. If there are any other questions - please ask. Thank you, Gennadiy

dmitry.agranat commented on Tue, 22 Oct 2019 13:29:09 +0000:
Hi genacvali91, The situation in which different shards get different UUIDs can usually arise if you connected directly to the shards and restored there instead of going through mongos. Could you confirm whether this is the case, or otherwise provide more details about the procedure that might have caused this issue? I understand that you have redacted the original collection name; could you confirm that the original collection name is not config.system.sessions? Thanks, Dima
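For readers who land on this ticket with the same error: the InvalidUUID message itself points at the usual cleanup ("Manually drop the collection on this shard if it contains data from a previous incarnation"). Below is a minimal sketch of that path, not advice given by MongoDB support in this thread. It assumes the UUID check earlier has identified which shard holds the divergent copy, that the copy really is a stale previous incarnation, and that a verified backup exists; in this particular ticket the recipient shard (shard2) also owns live chunks, so a bare drop would not be safe without first dumping or moving that data.

// 1. Stop the balancer from mongos so no migration races with the cleanup.
mongos> sh.stopBalancer()
mongos> sh.getBalancerState()     // should now report false

// 2. Connect DIRECTLY to the primary of the shard whose UUID does not match the sharding
//    metadata and double-check the copy before touching it.
shard2:PRIMARY> use db
shard2:PRIMARY> db.coll.stats().count     // sanity check against what mongos reports for this shard
shard2:PRIMARY> db.coll.drop()            // only if it is confirmed to be a stale previous incarnation

// 3. Re-enable the balancer; the previously failing chunk move should be retried.
mongos> sh.startBalancer()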