...
Per SPEC-1338, the shell does not serialize BSON binary subtype 2 the same way drivers do, which drivers do not recognize as valid. bsonspec.org states: \x02 Binary (Old) - This used to be the default subtype, but was deprecated in favor of \x00. Drivers and tools should be sure to handle \x02 appropriately. The structure of the binary data (the byte* array in the binary non-terminal) must be an int32 followed by a (byte*). The int32 is the number of bytes in the repetition. But the shell does not appear to add the four byte integer length.
JIRAUSER1268551 commented on Mon, 19 Jun 2023 18:35:15 +0000: This has been identified as work that the SDP team won't do in the near term. Please reopen with a comment if you feel this work should be reprioritized and explain why. steven.vannelli commented on Tue, 10 May 2022 21:24:17 +0000: Moving this ticket to the Backlog and removing the "Backlog" fixVersion as per our latest policy for using fixVersions. xgen-internal-githook commented on Tue, 15 Oct 2019 19:53:55 +0000: Author: {'name': 'Gabriel Russell', 'username': 'gabrielrussell', 'email': 'gabriel.russell@mongodb.com'} Message: Revert "SERVER-41994 correctly create and show type 2 binary elements" This reverts commit 3f6ba750fc8374968c3c06e31c67ac2839e9a736. Branch: master https://github.com/mongodb/mongo/commit/a6e3d841fcb4bc219ad56e3defc898ae18de9dbd xgen-internal-githook commented on Tue, 10 Sep 2019 20:42:53 +0000: Author: {'name': 'Gabriel Russell', 'username': 'gabrielrussell', 'email': 'gabriel.russell@mongodb.com'} Message: SERVER-41994 correctly create and show type 2 binary elements Branch: master https://github.com/mongodb/mongo/commit/3f6ba750fc8374968c3c06e31c67ac2839e9a736 behackett commented on Tue, 9 Jul 2019 17:57:43 +0000: The shell would have to encode subtype 2 according to the spec and support decoding data according to the spec, but also support decoding this broken format. Note that no driver ever written can decode what the shell currently generates. mark.benvenuto commented on Tue, 9 Jul 2019 17:19:22 +0000: The function I highlighted in my first example is not called. While we can fix this, the shell has been doing the broken behavior for many years at this point. It would be a breaking change for the shell to fix this since the shell will no longer be able to decode data with bindata type 2 that it inserted in previous versions. CCing gabriel.russell and mira.carey@mongodb.com for their opinion. Old javascript engines had the same broken behavior: https://github.com/mongodb/mongo/commit/eb7ba8adc1ea407ebd88a8b829a6adbaf109e681 https://github.com/mongodb/mongo/commit/3184194f9e0eb63fc3f6d73579681c4007022e9b Example Fix: diff --git a/src/mongo/scripting/mozjs/valuereader.cpp b/src/mongo/scripting/mozjs/valuereader.cpp index 0a5a547ccc..264bc3b924 100644 --- a/src/mongo/scripting/mozjs/valuereader.cpp +++ b/src/mongo/scripting/mozjs/valuereader.cpp @@ -135,7 +135,7 @@ void ValueReader::fromBSONElement(const BSONElement& elem, const BSONObj& parent } case mongo::BinData: { int len; - const char* data = elem.binData(len); + const char* data = elem.binDataClean(len); std::stringstream ss; base64::encode(ss, data, len); diff --git a/src/mongo/scripting/mozjs/valuewriter.cpp b/src/mongo/scripting/mozjs/valuewriter.cpp index ef3b0b4d42..0fa06c5fe1 100644 --- a/src/mongo/scripting/mozjs/valuewriter.cpp +++ b/src/mongo/scripting/mozjs/valuewriter.cpp @@ -360,11 +360,15 @@ void ValueWriter::_writeObject(BSONObjBuilder* b, auto binData = base64::decode(*str); - b->appendBinData(sd, - binData.size(), - static_cast( - static_cast(o.getNumber(InternedString::type))), - binData.c_str()); + auto binDataType = static_cast( + static_cast(o.getNumber(InternedString::type))); + + if (binDataType == ByteArrayDeprecated) { + b->appendBinDataArrayDeprecated( + sd.toString().c_str(), binData.c_str(), binData.size()); + } else { + b->appendBinData(sd, binData.size(), binDataType, binData.c_str()); + } return; } shane.harvey commented on Tue, 2 Jul 2019 20:55:31 +0000: Indeed it looks like the shell does not add the length portion of subtype 2. In the shell: db.coll.insert({_id: NumberInt(1), x: BinData(2, '1234')}) In python: >>>> import base64 >>>> from bson import raw_bson >>>> from bson import BSON >>>> raw = client.test.get_collection("coll", raw_bson.DEFAULT_RAW_BSON_OPTIONS).find_one() >>>> raw['x'] Traceback (most recent call last): File "", line 1, in File "/Users/shane/git/mongo-python-driver/bson/raw_bson.py", line 103, in __getitem__ return self.__inflated[item] File "/Users/shane/git/mongo-python-driver/bson/raw_bson.py", line 99, in __inflated self.__raw, 4, len(self.__raw)-1, self.__codec_options, SON()) File "/Users/shane/git/mongo-python-driver/bson/__init__.py", line 427, in _elements_to_dict key, value, position = _element_to_dict(data, position, obj_end, opts) File "/Users/shane/git/mongo-python-driver/bson/__init__.py", line 407, in _element_to_dict element_name) File "/Users/shane/git/mongo-python-driver/bson/__init__.py", line 266, in _get_binary raise InvalidBSON("invalid binary (st 2) - lengths don't match!") bson.errors.InvalidBSON: invalid binary (st 2) - lengths don't match! >>>> raw.raw b'\x19\x00\x00\x00\x10_id\x00\x01\x00\x00\x00\x05x\x00\x03\x00\x00\x00\x02\xd7m\xf8\x00' >>>> BSON.encode({'_id': 1, 'x': Binary(base64.b64decode('1234'), subtype=2)}) b'\x1d\x00\x00\x00\x10_id\x00\x01\x00\x00\x00\x05x\x00\x07\x00\x00\x00\x02\x03\x00\x00\x00\xd7m\xf8\x00' mark.benvenuto commented on Tue, 2 Jul 2019 19:52:26 +0000: SubType 2 was deprecated here - https://github.com/mongodb/mongo/commit/23882959f5f7a66b39524af8710277ee0dfc29bb I think we are appending the extra length in the (byte*). See https://github.com/mongodb/mongo/commit/23882959f5f7a66b39524af8710277ee0dfc29bb
In the shell insert a BinData with subtype 2: > db.coll.insert({x: BinData(2, '1234')}) And in the driver you try to read that, you get a parsing exception: > python -i >>> from pymongo import MongoClient >>> c = MongoClient() >>> c.test.test.find_one() Traceback (most recent call last): File "", line 1, in File "/Users/kevinalbertson/code/mongo-python-driver/pymongo/collection.py", line 1267, in find_one for result in cursor.limit(-1): File "/Users/kevinalbertson/code/mongo-python-driver/pymongo/cursor.py", line 1200, in next if len(self.__data) or self._refresh(): File "/Users/kevinalbertson/code/mongo-python-driver/pymongo/cursor.py", line 1114, in _refresh self.__send_message(q) File "/Users/kevinalbertson/code/mongo-python-driver/pymongo/cursor.py", line 988, in __send_message codec_options=self.__codec_options) File "/Users/kevinalbertson/code/mongo-python-driver/pymongo/cursor.py", line 1077, in _unpack_response return response.unpack_response(cursor_id, codec_options) File "/Users/kevinalbertson/code/mongo-python-driver/pymongo/message.py", line 1474, in unpack_response return bson.decode_all(self.payload_document, codec_options) bson.errors.InvalidBSON: invalid length or type code