bson – BSON (Binary JSON) Encoding and Decoding
BSON (Binary JSON) encoding and decoding.
The mapping from Python types to BSON types is as follows:
Python Type | BSON Type | Supported Direction |
---|---|---|
None | null | both |
bool | boolean | both |
int [1] | int32 / int64 | py -> bson |
long | int64 | py -> bson |
bson.int64.Int64 | int64 | both |
float | number (real) | both |
string | string | py -> bson |
unicode | string | both |
list | array | both |
dict / SON | object | both |
datetime.datetime [2] [3] | date | both |
bson.regex.Regex | regex | both |
compiled re [4] | regex | py -> bson |
bson.binary.Binary | binary | both |
bson.objectid.ObjectId | oid | both |
bson.dbref.DBRef | dbref | both |
None | undefined | bson -> py |
unicode | code | bson -> py |
bson.code.Code | code | py -> bson |
unicode | symbol | bson -> py |
bytes (Python 3) [5] | binary | both |
Note that, when using Python 2.x, to save binary data it must be wrapped as an instance of bson.binary.Binary. Otherwise it will be saved as a BSON string and retrieved as unicode. Users of Python 3.x can use the Python bytes type.
[1] A Python int will be saved as a BSON int32 or BSON int64 depending on its size. A BSON int32 will always decode to a Python int. A BSON int64 will always decode to an Int64.
[2] datetime.datetime instances will be rounded to the nearest millisecond when saved.
[3] All datetime.datetime instances are treated as naive. Clients should always use UTC.
[4] Regex instances and regular expression objects from re.compile() are both saved as BSON regular expressions. BSON regular expressions are decoded as Regex instances.
[5] The bytes type from Python 3.x is encoded as BSON binary with subtype 0. In Python 3.x it will be decoded back to bytes. In Python 2.x it will be decoded to an instance of Binary with subtype 0.
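Footnote [1] says the integer width is chosen by size. A minimal sketch of that rule, using only the standard library struct module, might look like the following. The helper name pack_bson_int is hypothetical and not part of the bson package, and raising OverflowError for values wider than 64 bits is an assumption of this sketch. It only illustrates the range check against the usual signed 32-bit and 64-bit bounds.

```python
import struct

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def pack_bson_int(value):
    """Return (type_name, little-endian payload) for a Python int.

    Hypothetical helper for illustration; not the library's encoder.
    """
    if INT32_MIN <= value <= INT32_MAX:
        return "int32", struct.pack("<i", value)
    if INT64_MIN <= value <= INT64_MAX:
        return "int64", struct.pack("<q", value)
    # Assumption: values that do not fit in 8 bytes are rejected.
    raise OverflowError("value does not fit in a BSON int64")


small = pack_bson_int(1)        # fits in 4 bytes
large = pack_bson_int(2**31)    # one past the int32 maximum, needs 8 bytes
```

Because an int64 decodes to Int64 rather than int, a value that crosses the 32-bit boundary comes back as a different Python type than a value that does not.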
class bson.BSON
BSON (Binary JSON) data.
decode(codec_options=CodecOptions(document_class=dict, tz_aware=False, uuid_representation=PYTHON_LEGACY, unicode_decode_error_handler='strict', tzinfo=None))
Decode this BSON data.
By default, returns a BSON document represented as a Python dict. To use a different MutableMapping class, configure a CodecOptions:
    >>> import collections  # From Python standard library.
    >>> import bson
    >>> from bson.codec_options import CodecOptions
    >>> data = bson.BSON.encode({'a': 1})
    >>> decoded_doc = bson.BSON.decode(data)
    >>> type(decoded_doc)
    <type 'dict'>
    >>> options = CodecOptions(document_class=collections.OrderedDict)
    >>> decoded_doc = bson.BSON.decode(data, codec_options=options)
    >>> type(decoded_doc)
    <class 'collections.OrderedDict'>
Parameters:
- codec_options (optional): An instance of CodecOptions.
Changed in version 3.0: Removed compile_re option: PyMongo now always represents BSON regular expressions as Regex objects. Use try_compile() to attempt to convert from a BSON regular expression to a Python regular expression object. Replaced as_class, tz_aware, and uuid_subtype options with codec_options.
Changed in version 2.7: Added compile_re option. If set to False, PyMongo represented BSON regular expressions as Regex objects instead of attempting to compile BSON regular expressions as Python native regular expressions, thus preventing errors for some incompatible patterns, see PYTHON-500.
classmethod encode(document, check_keys=False, codec_options=CodecOptions(document_class=dict, tz_aware=False, uuid_representation=PYTHON_LEGACY, unicode_decode_error_handler='strict', tzinfo=None))
Encode a document to a new BSON instance.
A document can be any mapping type (like dict).
Raises TypeError if document is not a mapping type, or contains keys that are not instances of basestring (str in Python 3). Raises InvalidDocument if document cannot be converted to BSON.
Parameters:
- document: mapping type representing a document
- check_keys (optional): check if keys start with '$' or contain '.', raising InvalidDocument in either case
- codec_options (optional): An instance of CodecOptions.
Changed in version 3.0: Replaced uuid_subtype option with codec_options.
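The check_keys rule can be pictured as a small pre-flight check on the document. The sketch below is a standalone illustration for Python 3, not the encoder's own code: check_document_keys is a hypothetical name, it raises ValueError where the library raises InvalidDocument, and its descent into nested mappings is an assumption of the sketch.

```python
from collections.abc import Mapping


def check_document_keys(document):
    """Reject keys that start with '$' or contain '.'.

    Hypothetical helper: raises ValueError, whereas the real encoder
    raises InvalidDocument.
    """
    if not isinstance(document, Mapping):
        raise TypeError("document must be a mapping type")
    for key, value in document.items():
        if not isinstance(key, str):
            raise TypeError("keys must be strings, got %r" % (key,))
        if key.startswith("$"):
            raise ValueError("key %r must not start with '$'" % key)
        if "." in key:
            raise ValueError("key %r must not contain '.'" % key)
        # Assumption: nested mappings are checked with the same rule.
        if isinstance(value, Mapping):
            check_document_keys(value)


# A document with ordinary keys passes silently.
check_document_keys({"name": "ok", "nested": {"count": 1}})
```

A key such as '$set' or 'a.b' would be rejected by this check, which mirrors the two conditions listed for check_keys.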
bson.decode_all(data, codec_options=CodecOptions(document_class=dict, tz_aware=False, uuid_representation=PYTHON_LEGACY, unicode_decode_error_handler='strict', tzinfo=None))
Decode BSON data to multiple documents.
data must be a string of concatenated, valid, BSON-encoded documents.
Parameters:
- data: BSON data
- codec_options (optional): An instance of CodecOptions.
Changed in version 3.0: Removed compile_re option: PyMongo now always represents BSON regular expressions as Regex objects. Use try_compile() to attempt to convert from a BSON regular expression to a Python regular expression object. Replaced as_class, tz_aware, and uuid_subtype options with codec_options.
Changed in version 2.7: Added compile_re option. If set to False, PyMongo represented BSON regular expressions as Regex objects instead of attempting to compile BSON regular expressions as Python native regular expressions, thus preventing errors for some incompatible patterns, see PYTHON-500.
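Concatenated documents can be told apart because every BSON document starts with a little-endian int32 giving its total size, prefix included. The sketch below only splits the input at those boundaries and does not decode any fields. split_bson_documents is a hypothetical helper written with the standard library, not a function of the bson package.

```python
import struct


def split_bson_documents(data):
    """Split concatenated BSON documents into a list of raw byte strings."""
    documents = []
    position = 0
    while position < len(data):
        if len(data) - position < 4:
            raise ValueError("truncated length prefix")
        # The first four bytes hold the document size, prefix included.
        (size,) = struct.unpack_from("<i", data, position)
        # The smallest possible document is 5 bytes: prefix plus terminator.
        if size < 5 or position + size > len(data):
            raise ValueError("invalid document size: %d" % size)
        documents.append(data[position:position + size])
        position += size
    return documents


# Hand-encoded {'a': 1}: size 12, int32 element "a" = 1, terminator byte.
DOC_A = b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00"
parts = split_bson_documents(DOC_A + DOC_A)
print(len(parts))  # 2
```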
bson.decode_file_iter(file_obj, codec_options=CodecOptions(document_class=dict, tz_aware=False, uuid_representation=PYTHON_LEGACY, unicode_decode_error_handler='strict', tzinfo=None))
Decode BSON data from a file to multiple documents as a generator.
Works similarly to the decode_all function, but reads from the file object in chunks and parses BSON in chunks, yielding one document at a time.
Parameters:
- file_obj: A file object containing BSON data.
- codec_options (optional): An instance of CodecOptions.
Changed in version 3.0: Replaced as_class, tz_aware, and uuid_subtype options with codec_options.
New in version 2.8.
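Reading in chunks means the whole file never has to sit in memory at once. One way to picture it, assuming the usual BSON framing of a 4-byte little-endian size prefix, is the generator below. iter_raw_documents is a hypothetical name, and the sketch yields undecoded bytes rather than Python documents.

```python
import io
import struct


def iter_raw_documents(file_obj):
    """Yield raw BSON documents one at a time from a binary file object."""
    while True:
        prefix = file_obj.read(4)
        if not prefix:
            return  # Clean end of file.
        if len(prefix) < 4:
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack("<i", prefix)
        if size < 5:
            raise ValueError("invalid document size: %d" % size)
        # Read only the remainder of this one document.
        body = file_obj.read(size - 4)
        if len(body) < size - 4:
            raise ValueError("truncated document")
        yield prefix + body


# Hand-encoded {'a': 1}, written twice to an in-memory file.
doc = b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00"
stream = io.BytesIO(doc * 2)
raw_documents = list(iter_raw_documents(stream))
```

Each iteration touches only as many bytes as one document needs, so a consumer can stop early without reading the rest of the file.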
bson.decode_iter(data, codec_options=CodecOptions(document_class=dict, tz_aware=False, uuid_representation=PYTHON_LEGACY, unicode_decode_error_handler='strict', tzinfo=None))
Decode BSON data to multiple documents as a generator.
Works similarly to the decode_all function, but yields one document at a time.
data must be a string of concatenated, valid, BSON-encoded documents.
Parameters:
- data: BSON data
- codec_options (optional): An instance of CodecOptions.
Changed in version 3.0: Replaced as_class, tz_aware, and uuid_subtype options with codec_options.
New in version 2.8.
bson.gen_list_name()
Generate "keys" for encoded lists in the sequence b"0", b"1", b"2", ...
The first 1000 keys are returned from a pre-built cache. All subsequent keys are generated on the fly.
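The cache-then-generate behaviour can be sketched with a tuple and itertools.count. This is an illustration only: list_keys and _CACHED_KEYS are hypothetical names, and the exact byte layout of the keys produced by gen_list_name may differ from the plain digits shown here.

```python
import itertools

# Pre-built cache for the first 1000 keys: b"0" through b"999".
_CACHED_KEYS = tuple(str(i).encode("ascii") for i in range(1000))


def list_keys():
    """Yield b"0", b"1", b"2", ... without end."""
    for key in _CACHED_KEYS:
        yield key
    # Past the cache, build each key only when it is requested.
    for i in itertools.count(1000):
        yield str(i).encode("ascii")


first_three = list(itertools.islice(list_keys(), 3))
```

These keys exist because BSON stores an array as a document whose field names are the element indexes, so ['x', 'y'] is laid out like {'0': 'x', '1': 'y'}.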
bson.has_c()
Is the C extension installed?
bson.is_valid(bson)
Check that the given string represents valid BSON data.
Raises TypeError if bson is not an instance of str (bytes in Python 3). Returns True if bson is valid BSON, False otherwise.
Parameters:
- bson: the data to be validated
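The contract has two distinct failure paths: a wrong argument type raises TypeError, while malformed data of the right type returns False. The sketch below follows that contract for Python 3, but it checks only the outer framing of a single document (size prefix and trailing null byte), which is much weaker than full validation. has_valid_framing is a hypothetical helper, not part of the bson package.

```python
import struct


def has_valid_framing(data):
    """Return True if data is framed like one BSON document.

    Hypothetical, partial check: element contents are not inspected.
    """
    if not isinstance(data, bytes):
        raise TypeError("BSON data must be an instance of bytes")
    if len(data) < 5:
        return False  # Too short for a size prefix plus terminator.
    (size,) = struct.unpack_from("<i", data)
    # The prefix must match the real length and the last byte must be null.
    return size == len(data) and data[-1:] == b"\x00"


# Hand-encoded {'a': 1}.
sample = b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00"
framing_ok = has_valid_framing(sample)
framing_bad = has_valid_framing(sample[:-1])
```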
Sub-modules:
- binary – Tools for representing binary data to be stored in MongoDB
- code – Tools for representing JavaScript code
- codec_options – Tools for specifying BSON codec options
- dbref – Tools for manipulating DBRefs (references to documents stored in MongoDB)
- decimal128 – Support for BSON Decimal128
- errors – Exceptions raised by the bson package
- int64 – Tools for representing BSON int64
- json_util – Tools for using Python's json module with BSON documents
- max_key – Representation for the MongoDB internal MaxKey type
- min_key – Representation for the MongoDB internal MinKey type
- objectid – Tools for working with MongoDB ObjectIds
- raw_bson – Tools for representing raw BSON documents.
- regex – Tools for representing MongoDB regular expressions
- son – Tools for working with SON, an ordered mapping
- timestamp – Tools for representing MongoDB internal Timestamps
- tz_util – Utilities for dealing with timezones in Python