...
CC mark.benvenuto + jeff.yemin. See repro steps.
xgen-internal-githook commented on Mon, 12 Aug 2019 13:47:23 +0000:
Author: {'name': 'Dan Aprahamian', 'username': 'daprahamian', 'email': 'dan.aprahamian@gmail.com'}
Message: NODE: remove connecting on linux socket
Remove connecting to mongocryptd on /tmp/mongocryptd.sock until SERVER-41029 is resolved
Branch: master
https://github.com/mongodb/libmongocrypt/commit/7e604382b43d2aabcd69512b140016690b1443d8

jason.carey commented on Thu, 27 Jun 2019 18:01:15 +0000:
I'm going to close this out as wontfix, in preference for SERVER-41826.

jason.carey commented on Wed, 19 Jun 2019 20:08:39 +0000:
I've filed SERVER-41826 with a strategy I believe we can use to avoid stealing the domain socket.
Couple of other thoughts:
I'm not sure how we're exposing the unix domain socket work in mongocryptd, but have you considered passing --nounixsocket and --bind_ip "./mysock"? That'll let you put the domain socket wherever you'd like. It's still a little error prone, but much less so (and it lets servers started in different directories avoid collisions more easily).
I think we don't currently support it, but Linux has support for abstract domain sockets (a domain socket with a leading '\0' byte). Those aren't on the file system and go away with process death. I'd have to think a bit about introducing the syntax for those, but I think they're a strictly better solution to your problem. If that sounds interesting to you (and if Linux-only support would still be useful), I can file a ticket to go that route as well.

kevin.albertson commented on Wed, 19 Jun 2019 17:41:45 +0000:
I want to avoid stealing the UNIX domain socket, to avoid the user experience described in the repro.
"Would it also be a problem if the second mongod managed to bind all ports, but then failed for some other reason?"
Hmm, I think so. I guess it's just a matter of the first mongod to terminate deleting the UNIX domain socket.
Perhaps there's no reasonable way to enforce that the socket file is only deleted if no mongod is bound to it. If that is the case, then perhaps we should close this as "Won't Fix", and that would be more reason for us to choose a sensible user-wide pidfile path. By creating it in the current working directory like we currently do, it's easy to hit this issue by running your application in two different directories.

jason.carey commented on Tue, 18 Jun 2019 21:39:42 +0000:
kevin.albertson, do you actually want what's in this ticket? Or do you want to avoid stealing the unix domain socket from a running process?
A few thoughts:
This problem shows up even if a subsequent mongod does start up (i.e. if you have different hosts bound to different IP addresses), because the socket name only includes binary+port.
Would it also be a problem if the second mongod managed to bind all ports, but then failed for some other reason? (It would still override the unix domain socket.)
I'm trying to figure out if the narrow problem this ticket describes is actually worth solving, or if you want something more complicated in the "don't overwrite others' unix domain sockets" kind of vein.

mark.benvenuto commented on Fri, 31 May 2019 15:11:50 +0000:
The unix domain socket is simply being bound before the TCP/IP sockets. This is not a problem specific to mongocryptd. Assigning to service arch. The code in question is here: https://github.com/mongodb/mongo/blob/933c6ad19c3f19a964c74a5174cbcf11cde0a66e/src/mongo/transport/transport_layer_asio.cpp#L678-L686
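
For readers unfamiliar with the abstract domain sockets mentioned in the thread, here is a minimal Linux-only sketch of how they behave. It is not how mongod or mongocryptd bind sockets; the function name `abstract_socket_demo` and the socket name are hypothetical. It only illustrates that a name with a leading '\0' byte never appears on the file system and is released once the listener is closed.

```python
import socket
import uuid


def abstract_socket_demo():
    # Hypothetical name for illustration; the leading NUL byte selects the
    # Linux abstract namespace, so no file is created on disk.
    name = b"\0mongocryptd-demo-" + uuid.uuid4().hex.encode()

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(name)
    server.listen(1)

    # A client can reach the listener by the same abstract name.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(name)
    connected = True
    client.close()

    # Closing the listener (or the process dying) releases the name;
    # there is no stale socket file left to clean up.
    server.close()

    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(name)
        error_after_close = None
    except OSError as exc:
        error_after_close = type(exc).__name__
    finally:
        probe.close()

    return connected, error_after_close


if __name__ == "__main__":
    print(abstract_socket_demo())
```

Because no path is involved, a second process cannot unlink the first one's socket; its bind to a name that is still in use should simply fail.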
1. start mongocryptd
2. verify you can connect with mongo mongodb://%2Ftmp%2Fmongocryptd.sock/
3. cd to another directory (so pid file differs)
4. start mongocryptd again, which will fail with "SocketException: Address already in use"
5. although the other mongocryptd is still running, mongo mongodb://%2Ftmp%2Fmongocryptd.sock/ fails with Connection refused
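
The mechanism behind these steps can be imitated with plain sockets. The sketch below is a simulation under assumptions, not mongocryptd's actual code: it assumes the second process removes the existing socket path before binding, binds the unix domain socket before the TCP socket, and leaves the path behind when it exits. The names `try_connect`, `simulate_socket_steal` and `demo.sock` are hypothetical.

```python
import errno
import os
import socket
import tempfile


def try_connect(path):
    """Return None on success, or the exception class name on failure."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
        return None
    except OSError as exc:
        return type(exc).__name__
    finally:
        client.close()


def simulate_socket_steal():
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "demo.sock")  # stand-in for /tmp/mongocryptd.sock

    # First "server": owns both the unix domain socket and a TCP port.
    first_unix = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    first_unix.bind(path)
    first_unix.listen(1)
    first_tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    first_tcp.bind(("127.0.0.1", 0))
    first_tcp.listen(1)
    port = first_tcp.getsockname()[1]

    before = try_connect(path)

    # Second "server": takes over the unix socket path first (assumed
    # unlink-then-bind), and only then discovers the TCP port is taken.
    os.unlink(path)
    second_unix = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    second_unix.bind(path)
    second_tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        second_tcp.bind(("127.0.0.1", port))
        tcp_error = None
    except OSError as exc:
        tcp_error = errno.errorcode.get(exc.errno)
    finally:
        second_tcp.close()
    second_unix.close()  # second server exits; the path now points at nothing

    # The first server is still running, but the path no longer reaches it.
    after = try_connect(path)

    first_unix.close()
    first_tcp.close()
    os.unlink(path)
    os.rmdir(workdir)
    return before, tcp_error, after


if __name__ == "__main__":
    print(simulate_socket_steal())
```

If the second process also deleted the path on exit, the final connect would fail because the file is missing rather than with a refused connection; either way the first server's socket becomes unreachable by path.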