Releasing BadgerDB v2.0

Dgraph is an open-source, transactional, distributed, native Graph database. Dgraph is optimized for high-performance reads and writes. It can serve queries and mutations with low latency and high throughput, even when they involve deep joins and traversals.

Much of Dgraph's performance comes from Badger, the embedded key-value store responsible for storing all of Dgraph's data.

Badger itself is not distributed; Dgraph implements a layer on top of it to provide the distributed capabilities.

We didn't build Badger at the outset. We started with RocksDB as the datastore underneath. But it became evident within a short while that Dgraph needed specialized storage like Badger. If you're curious to know the motivations for building Badger, here is an article from Dgraph's founder Manish R Jain.

A few months ago, v1.6.0 of Badger was released. It had some significant performance improvements.

Now, Badger v2.0 is here! This release ships with some exciting features like encryption at rest, compression, and caching.

There are no breaking API changes since v1.6.0. If you are upgrading from a version older than v1.6.0, look at the release notes for v1.6.0, since that version did contain breaking changes.

There is one breaking change outside the API: the adoption of Go modules. To use Badger v2, you'll have to use Go modules in your project too.

Let's glance through the notable features and some noteworthy enhancements introduced with this release. If you want to find the complete list of fixes, enhancements, and features, check out the change log.

Let's start with the brand new features introduced in this release of Badger.

Data cache with Ristretto

We recently released Ristretto, a high-performance cache in Go.

With this release, Ristretto is integrated with Badger. This integration speeds up lookups and iterations significantly.

The cache is enabled by default, and you can set its size using the WithMaxCacheSize API.

Data compression

Data compression is another key feature introduced in this release.

It saves storage space by compressing every block of data using one of the two compression algorithms provided. Only the SST files are compressed, not the files in the vlog.

This option doesn't affect existing tables. Only the newly created tables are compressed.

Compression is enabled by default.

Badger supports two compression algorithms: zstd and snappy.

It uses the zstd compression algorithm when Badger is built with Cgo enabled. When built without Cgo enabled, it uses the snappy algorithm.

Encryption at rest

With a key focus on security, Badger now provides an option to encrypt its data!

To use encryption, you need to provide Badger an encryption key using the Options.WithEncryptionKey API.

Badger does not encrypt the data with this key directly. Instead, it uses separate, auto-generated keys called data keys.

A data key expires at regular intervals, and Badger auto-generates a new one every time the current one expires. New data is encrypted using the new data key, while the rest of the data stays encrypted with the data keys generated before. The expiry time for the data key is configurable using the Options.WithEncryptionKeyRotationDuration function.

Badger stores the history of the data keys generated. All of them are encrypted using the encryption key, which you provided initially.

The two-step encryption process, by design, provides an additional layer of security by allowing you to rotate keys frequently without having to re-encrypt your dataset.
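To make the two-step idea concrete, here is a small, self-contained sketch using only the Go standard library. It is not Badger's implementation: the type and function names are hypothetical, AES-GCM stands in for whatever cipher mode Badger uses internally, and nothing here reflects Badger's on-disk format. It only shows why rotating the data key leaves previously written data untouched.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM cipher for the given key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext under key and prepends the random nonce.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

// keyRegistry keeps every data key ever generated, each one stored only in
// encrypted form under the user-supplied master key.
type keyRegistry struct {
	masterKey   []byte
	wrappedKeys [][]byte // slice index doubles as the data key ID
}

// rotate generates a fresh data key, stores it encrypted, and returns its ID.
func (r *keyRegistry) rotate() (int, error) {
	dataKey := make([]byte, 32)
	if _, err := rand.Read(dataKey); err != nil {
		return 0, err
	}
	wrapped, err := seal(r.masterKey, dataKey)
	if err != nil {
		return 0, err
	}
	r.wrappedKeys = append(r.wrappedKeys, wrapped)
	return len(r.wrappedKeys) - 1, nil
}

// record is a stored value tagged with the ID of the data key that sealed it.
type record struct {
	keyID  int
	sealed []byte
}

// encrypt seals plaintext with the data key identified by id.
func (r *keyRegistry) encrypt(id int, plaintext []byte) (record, error) {
	if id < 0 || id >= len(r.wrappedKeys) {
		return record{}, errors.New("unknown data key ID")
	}
	dataKey, err := open(r.masterKey, r.wrappedKeys[id])
	if err != nil {
		return record{}, err
	}
	sealed, err := seal(dataKey, plaintext)
	return record{keyID: id, sealed: sealed}, err
}

// decrypt looks up the data key recorded on rec and opens the payload.
func (r *keyRegistry) decrypt(rec record) ([]byte, error) {
	if rec.keyID < 0 || rec.keyID >= len(r.wrappedKeys) {
		return nil, errors.New("unknown data key ID")
	}
	dataKey, err := open(r.masterKey, r.wrappedKeys[rec.keyID])
	if err != nil {
		return nil, err
	}
	return open(dataKey, rec.sealed)
}

func check(err error) {
	if err != nil {
		panic(err)
	}
}

func main() {
	master := make([]byte, 32)
	_, err := rand.Read(master)
	check(err)
	reg := &keyRegistry{masterKey: master}

	first, err := reg.rotate()
	check(err)
	old, err := reg.encrypt(first, []byte("written before rotation"))
	check(err)

	second, err := reg.rotate() // the data key "expired"; new writes use the new key
	check(err)
	fresh, err := reg.encrypt(second, []byte("written after rotation"))
	check(err)

	// Old and new records both decrypt; the old one was never re-encrypted.
	for _, rec := range []record{old, fresh} {
		plaintext, err := reg.decrypt(rec)
		check(err)
		fmt.Printf("data key %d: %s\n", rec.keyID, plaintext)
	}
}
```

Each record carries the ID of the data key that sealed it, so a rotation only appends a new entry to the registry. Existing records keep pointing at their original key and are never rewritten.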

Ready to migrate?

Some of the performance enhancements have resulted in data format changes. Hence, you cannot run this new version of Badger right out of the box on top of data written by older versions. You need to migrate the data to v2.0 first.

Here's how to do it:

Step 1: Export the data from the older version of Badger.

badger backup --dir path/to/badger/directory -f badger.backup

Step 2: Install v2.0 of Badger. You can find the new binary in $GOBIN after the installation.

go get -u github.com/dgraph-io/badger

cd $GOPATH/src/github.com/dgraph-io/badger

git checkout v2.0.0

cd badger && go install

Step 3: Import the data back to Badger 2.0.

badger restore --dir path/to/new/badger/directory -f badger.backup

Thanks

The new features in this release are born out of requests from our community and enterprise users.

We take this opportunity to thank all of our contributors and users. It's because of all of you that BadgerDB keeps getting better, with useful features, enhancements, and fixes in every release.

Do give BadgerDB 2.0 a try and let us know your feedback.