Franck Pachot

Which Document class is best to use in Java to read MongoDB documents?

TL;DR: the answer is in the title, use Document.

BSON is a serialization format, similar to protobuf, designed for efficient document storage on disk and transfer over the network as a byte stream. Instead of scanning and rewriting the entire byte sequence to access its contents (fields, arrays, subdocuments), you work with an in-memory object that exposes methods to read and write fields efficiently. On MongoDB's server side, that's the mutable BSON object. On the client side, the drivers provide a similar API. Here are the five classes that implement the Bson interface in the Java driver:

  • Document is the recommendation for most applications. It provides the best balance of flexibility, ease of use, and functionality.

Only consider the other classes when you have specific requirements:

  • BsonDocument: When you need strict BSON type safety rather than application types.

  • RawBsonDocument: When you need to access the raw bytes of the document rather than field values.

  • JsonObject: When working exclusively with JSON strings, as plain text.

  • BasicDBObject: Only for legacy code migration.

RawBsonDocument is mostly used internally by the driver (for example, for client-side encryption and change streams), whereas the other classes are the ones you typically use directly in MongoDB driver operations. Your choice mainly affects how you construct, manipulate, and access document data in your application code. All five classes are described in the MongoDB Java driver documentation.

I reviewed the documentation and code to better understand their differences, especially in comparison with other databases that have a different design and API. In this post, I provide a detailed description of those classes: what they are, how they are implemented, what they provide, and when to use them.

Detailed description

Here's a detailed comparison of the five document classes available in the MongoDB Java Driver and when to use each:

Document (org.bson)

Document is a flexible representation of a BSON document that implements Map<String, Object>. It uses a LinkedHashMap<String, Object> internally to maintain insertion order.

Key characteristics:

  • Loosely-typed: Values are stored as Object, allowing you to use standard Java types (String, Integer, Date, etc.)
  • Flexible: Easy to work with dynamically structured documents
  • Map interface: Provides all standard Map operations

Document is the recommended default choice for most applications. Use it when you want a flexible and concise data representation that's easy to work with.

BsonDocument (org.bson)

BsonDocument is a type-safe container for BSON documents that implements Map<String, BsonValue>. It also uses a LinkedHashMap, but stores BsonValue types.

Key characteristics:

  • Type-safe: All values must be wrapped in BSON library types (BsonString, BsonInt32, BsonDocument, etc.)
  • Stricter API: Provides compile-time type safety but requires more verbose code
  • Map interface: Implements Map<String, BsonValue>

Use BsonDocument when you need a type-safe API and want explicit control over BSON types. This is particularly useful when you need to ensure precise type handling or when working with APIs that require BsonDocument.
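To make the contrast between the two map-based classes concrete, here is a minimal sketch that builds the same document both ways. It assumes the org.bson library that ships with the MongoDB Java driver is on the classpath. The class name DocumentVersusBsonDocument and the sample field values are made up for illustration.

```java
import org.bson.BsonDocument;
import org.bson.BsonInt32;
import org.bson.BsonString;
import org.bson.Document;

// Illustrative only: requires the org.bson library on the classpath.
final class DocumentVersusBsonDocument {

    // Loosely typed: values are plain Java objects, stored as Object.
    static Document looselyTyped() {
        return new Document("name", "Alice")
                .append("age", 30)
                .append("address", new Document("city", "Lausanne"));
    }

    // Strictly typed: every value is wrapped in a BsonValue subclass.
    static BsonDocument strictlyTyped() {
        return new BsonDocument("name", new BsonString("Alice"))
                .append("age", new BsonInt32(30))
                .append("address", new BsonDocument("city", new BsonString("Lausanne")));
    }

    public static void main(String[] args) {
        Document doc = looselyTyped();
        String name = doc.getString("name");   // casts the stored Object to String
        int age = doc.getInteger("age");       // unboxes the stored Integer
        String city = doc.get("address", Document.class).getString("city");

        BsonDocument bson = strictlyTyped();
        String bsonName = bson.getString("name").getValue();   // BsonString -> String
        int bsonAge = bson.getInt32("age").getValue();          // BsonInt32 -> int
        String bsonCity = bson.getDocument("address").getString("city").getValue();

        System.out.println(name + " " + age + " " + city);
        System.out.println(bsonName + " " + bsonAge + " " + bsonCity);
    }
}
```

With Document, a wrong assumption about a field's type only surfaces at runtime as a ClassCastException. With BsonDocument, the BSON type is part of each value, at the cost of wrapping and unwrapping every field.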

RawBsonDocument (org.bson)

RawBsonDocument is an immutable BSON document represented using only raw bytes. It stores the BSON document as a byte array without parsing it.

Key characteristics:

  • Immutable: All mutation operations throw UnsupportedOperationException
  • Lazy parsing: Data is only parsed when accessed, which makes it very efficient for pass-through scenarios but not for reading many individual fields
  • Memory efficient: Stores raw bytes, avoiding object allocation overhead
  • Can decode to other types: Provides decode() method to convert to other document types when needed

Use RawBsonDocument when you need maximum performance and memory efficiency for whole-document operations. It is particularly useful when reading documents from MongoDB and passing them to another system unchanged, when working with large documents that you don’t need to parse, when building high-performance data pipelines where parsing overhead matters, and when you need an immutable document representation.

JsonObject (org.bson.json)

JsonObject is a wrapper class that holds a JSON object string. It simply stores the JSON as a String.

Key characteristics:

  • Does not implement Map: It's just a string wrapper with validation
  • No parsing required: Avoids conversion to Map structure if you're just working with JSON
  • JSON-focused: Designed for applications that primarily work with JSON strings
  • Supports Extended JSON: Works with MongoDB Extended JSON format

Use JsonObject when you want to work directly with JSON strings and avoid the overhead of converting to and from Map objects. This is ideal for REST APIs that consume and produce JSON, for logging or persisting documents as JSON strings, and for applications that primarily handle JSON and do not require programmatic field-level access.

BasicDBObject (com.mongodb)

BasicDBObject is a legacy BSON object implementation that extends BasicBSONObject and implements the DBObject interface.

Key characteristics:

  • Legacy class: Exists for backward compatibility with older driver versions
  • Tied to DBObject: Its API is defined by the legacy DBObject interface; it is a Map only because BasicBSONObject extends LinkedHashMap<String, Object>
  • Binary compatibility concerns: DBObject is an interface rather than a class, so it cannot gain new methods without breaking binary compatibility

Only use BasicDBObject when migrating from a legacy driver version (pre-3.0). The documentation explicitly recommends avoiding this class for new development due to its limitations.

Conversion between types

All of these classes implement the Bson interface, which allows them to be used interchangeably in MongoDB operations, although not with the same performance. You can also convert between types:

  • BsonDocument.parse(json) to create a BsonDocument from a JSON string
  • decode(codec), called on a RawBsonDocument instance, to convert it to another type
  • Document can be converted to BsonDocument through codec operations
  • JsonObject.getJson() to extract the JSON string
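The conversions listed above can be sketched as follows. This is a minimal example that assumes the org.bson library is on the classpath and uses DocumentCodec with its default settings. The class name ConversionSketch, its helper methods, and the sample JSON are hypothetical.

```java
import org.bson.BsonDocument;
import org.bson.Document;
import org.bson.RawBsonDocument;
import org.bson.codecs.DocumentCodec;
import org.bson.json.JsonObject;

// Illustrative only: requires the org.bson library on the classpath.
final class ConversionSketch {

    static final String JSON = "{\"name\": \"Alice\", \"age\": 30}";

    // JSON text -> BsonDocument (values become BsonString, BsonInt32, ...).
    static BsonDocument parseToBsonDocument() {
        return BsonDocument.parse(JSON);
    }

    // Document -> RawBsonDocument: the codec serializes the map into BSON bytes.
    // RawBsonDocument extends BsonDocument, so the result is also a BsonDocument.
    static RawBsonDocument encode(Document document) {
        return new RawBsonDocument(document, new DocumentCodec());
    }

    // RawBsonDocument -> Document: the codec decodes the bytes into Java objects.
    static Document decode(RawBsonDocument raw) {
        return raw.decode(new DocumentCodec());
    }

    // JsonObject only wraps the text; getJson() hands the string back.
    static String jsonText() {
        return new JsonObject(JSON).getJson();
    }

    public static void main(String[] args) {
        Document original = Document.parse(JSON);
        Document roundTripped = decode(encode(original));

        System.out.println(parseToBsonDocument().getInt32("age").getValue());
        System.out.println(roundTripped.equals(original));
        System.out.println(jsonText());
    }
}
```

Each conversion through a codec is a full serialization or deserialization pass, which is why choosing the right class up front matters more than converting later.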

Comparing RawBsonDocument and Document

RawBsonDocument stores BSON as a raw byte array and does not deserialize it in advance. When you access a field, it creates a BsonBinaryReader and scans the document sequentially, reading each field’s type and name until it finds the requested key. Only the matching field is decoded using RawBsonValueHelper.decode, while all other fields are skipped without parsing. For nested documents and arrays, it reads only their sizes and wraps the corresponding byte ranges in new RawBsonDocument or RawBsonArray instances, keeping their contents as raw bytes. This approach provides fast access for a single field lookup, while being memory-efficient and keeping the document immutable, which is ideal for large documents where only a few fields are needed or for documents that are mostly passed through without inspection.
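The sequential scan described above can be illustrated without the driver. The following is a simplified model written against the BSON layout (a length-prefixed list of typed, named elements), not the driver's actual code. The class RawBsonScanSketch and its methods are hypothetical, it handles only a handful of BSON types, and it looks up int32 fields only.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.OptionalInt;

// Simplified model of a lazy field lookup over raw BSON bytes (not the driver's code).
final class RawBsonScanSketch {

    // Hand-encodes the BSON bytes for {"a": 1, "s": "hi", "b": 2}.
    static byte[] sample() {
        ByteBuffer buf = ByteBuffer.allocate(64).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(0);                                   // placeholder for the total size
        buf.put((byte) 0x10); putCString(buf, "a"); buf.putInt(1);
        byte[] hi = "hi".getBytes(StandardCharsets.UTF_8);
        buf.put((byte) 0x02); putCString(buf, "s");
        buf.putInt(hi.length + 1); buf.put(hi); buf.put((byte) 0);
        buf.put((byte) 0x10); putCString(buf, "b"); buf.putInt(2);
        buf.put((byte) 0);                               // document terminator
        int size = buf.position();
        buf.putInt(0, size);                             // patch the size prefix
        return Arrays.copyOf(buf.array(), size);
    }

    // Scans elements in order and decodes only the requested int32 field.
    static OptionalInt findInt32(byte[] bson, String key) {
        ByteBuffer buf = ByteBuffer.wrap(bson).order(ByteOrder.LITTLE_ENDIAN);
        buf.getInt();                                    // skip the total document size
        while (buf.hasRemaining()) {
            byte type = buf.get();
            if (type == 0x00) break;                     // end of document
            String name = readCString(buf);
            if (name.equals(key)) {
                if (type != 0x10) throw new IllegalStateException("not an int32: " + key);
                return OptionalInt.of(buf.getInt());
            }
            skipValue(buf, type);                        // jump over the value without decoding it
        }
        return OptionalInt.empty();
    }

    // Advances past a value using only its type and, where needed, its length prefix.
    static void skipValue(ByteBuffer buf, byte type) {
        int skip;
        switch (type) {
            case 0x01: case 0x12: skip = 8; break;                        // double, int64
            case 0x08: skip = 1; break;                                   // boolean
            case 0x10: skip = 4; break;                                   // int32
            case 0x02: skip = 4 + buf.getInt(buf.position()); break;      // string: prefix + bytes
            case 0x03: case 0x04: skip = buf.getInt(buf.position()); break; // subdocument, array
            default: throw new UnsupportedOperationException("type not handled in this sketch");
        }
        buf.position(buf.position() + skip);
    }

    static void putCString(ByteBuffer buf, String s) {
        buf.put(s.getBytes(StandardCharsets.UTF_8));
        buf.put((byte) 0);
    }

    static String readCString(ByteBuffer buf) {
        int start = buf.position();
        while (buf.get() != 0) { /* advance to the NUL terminator */ }
        int length = buf.position() - 1 - start;
        return new String(buf.array(), buf.arrayOffset() + start, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(findInt32(sample(), "b"));
    }
}
```

Note that the lookup cost grows with the position of the field in the document: finding "b" requires reading the names of "a" and "s" first, even though their values are never decoded.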

In contrast, Document uses a fully deserialized LinkedHashMap<String, Object>. When a Document is created, all fields are eagerly parsed into Java objects. Field access and containsKey operations are simple HashMap lookups, and the document is fully mutable, supporting standard map operations such as put, remove, and clear. This design consumes more memory but is better suited to small or medium-sized documents, scenarios where many fields are accessed, or cases where the document needs to be modified frequently.

Finally, Document doesn't use RawBsonDocument for parsing or accessing fields since doing so would be inefficient, and the two serve different purposes.

Comparison with Oracle (OSON) and PostgreSQL (JSONB)

Neither Oracle nor PostgreSQL provides BSON, as they use OSON and JSONB respectively, so there's no BsonDocument or RawBsonDocument equivalent.

In Oracle’s JDBC driver, the closest equivalent to a Document is OracleJsonObject, one of the OracleJsonValue types, which can be exposed directly as a javax.json.JsonObject or mapped to a domain object. This API works directly on the underlying OSON bytes without fully parsing the entire document into an intermediate data structure. OSON is more than a raw serialization: it carries its own local dictionary of distinct field names, a sorted array of hash codes for those names, and compact field ID arrays and value offsets. This enables the driver to locate a field in place by binary-searching the field ID array and jumping to the right offset.
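To show why a sorted hash array makes in-place lookup cheap, here is a toy model of the idea. It is not the OSON format: it collapses the dictionary and the per-object field ID arrays into a single table, uses Java's String.hashCode() as a stand-in hash function, and the class SortedHashLookupSketch and the offsets in the usage are invented for illustration.

```java
import java.util.Arrays;

// Toy model of "binary-search a sorted hash array, then jump to an offset".
// This is NOT the OSON layout; names, hashing, and structure are assumptions.
final class SortedHashLookupSketch {

    private final int[] hashes;     // hash codes of the field names, ascending
    private final String[] names;   // names[i] is the field whose hash is hashes[i]
    private final int[] offsets;    // offsets[i] is where the value of names[i] starts

    SortedHashLookupSketch(String[] fieldNames, int[] valueOffsets) {
        int n = fieldNames.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Sort field positions by name hash so the hash array can be binary-searched.
        Arrays.sort(order, (x, y) ->
                Integer.compare(fieldNames[x].hashCode(), fieldNames[y].hashCode()));
        hashes = new int[n];
        names = new String[n];
        offsets = new int[n];
        for (int i = 0; i < n; i++) {
            names[i] = fieldNames[order[i]];
            hashes[i] = names[i].hashCode();
            offsets[i] = valueOffsets[order[i]];
        }
    }

    // Returns the value offset for the field, or -1 if the field is absent.
    int offsetOf(String name) {
        int hash = name.hashCode();
        int i = Arrays.binarySearch(hashes, hash);
        if (i < 0) return -1;
        while (i > 0 && hashes[i - 1] == hash) i--;          // rewind to the first equal hash
        for (; i < hashes.length && hashes[i] == hash; i++) {
            if (names[i].equals(name)) return offsets[i];    // resolve hash collisions by name
        }
        return -1;
    }

    public static void main(String[] args) {
        SortedHashLookupSketch table = new SortedHashLookupSketch(
                new String[] {"title", "year", "author"}, new int[] {16, 48, 80});
        System.out.println(table.offsetOf("year"));
        System.out.println(table.offsetOf("missing"));
    }
}
```

Compared with a sequential scan over BSON elements, the number of comparisons here grows logarithmically with the number of fields rather than linearly.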

If JSON text is needed instead of the object model, the equivalent is simply to use ResultSet.getString(), which will convert the OSON image to JSON text on the client.

PostgreSQL’s JDBC driver, by contrast, offers no native Java JSON object API for either json or jsonb columns: values are always returned as text, so the application must parse the string into its own document model using a separate JSON library. Even when using PostgreSQL’s binary JSONB storage, none of the binary efficiency crosses the wire (see JSONB vs. BSON: Tracing PostgreSQL and MongoDB Wire Protocols), and client code still performs a full parse before accessing individual fields.

Conclusion

MongoDB’s main advantage for modern applications—whatever the data types or workloads—is the ability to work with data directly through your domain objects, without an intermediate object-relational mapping layer or view. Use the Document class as your document object model (DOM). It offers flexibility, map-style access, and a natural Java interface, while the driver transparently converts BSON from the network into objects your application can use immediately.
