DocDB is the underlying document storage engine of YugabyteDB and is built on top of a highly customized and optimized version of RocksDB, a log-structured merge tree (LSM)-based key-value store. Several enhancements and customizations have been made on top of the vanilla RocksDB to make DocDB highly performant and scalable. DocDB in essence manages multiple RocksDB instances which are created one per tablet.

DocDB Document Storage Layer

Data model

DocDB is a persistent key-value store, which means that data is stored and retrieved using unique keys. It supports ordered data operations, allowing efficient range queries and iteration over keys. The keys are designed for fast lookups and efficient range scans.

To understand more about how row data is stored as keys and values in DocDB, see DocDB data model.

Storage engine

DocDB is a log-structured merge-tree (LSM) based storage engine. This design is optimized for high write throughput and efficient storage utilization. Data is stored in multiple SSTs (Sorted String Tables) to store key-value data on disk. It is designed to be efficient for both sequential and random access patterns. DocDB periodically compacts data by merging and sorting multiple SST files into a smaller set of files. This process helps to maintain a consistent on-disk format and reclaim space from obsolete data.

To understand more about how LSM tree stores data in SSTs, see LSM and SST.

Performance

DocDB is written in C++ and is designed to be highly performant across various platforms and architectures, including Linux, macOS, Windows, and various UNIX-like systems.

To understand more about how DocDB enhances RocksDB, see Performance.