DEV Community

Johannes Lichtenberger

Version Your Database / Future Directions

Hi all,

I want to release version 1.0.0 of SirixDB[1] soon, but it sadly lacks an open source community, so I'd like to discuss here what you think is most important for its future direction.

To keep it short: SirixDB keeps the history of each resource in a database through a huge trie-based index structure that is completely copy-on-write. This means unchanged database pages are shared between revisions. SirixDB allows sophisticated time-travel queries and implements diffing algorithms. It natively stores XML and JSON in a binary format, but it could just as well store graphs or other kinds of data.
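
To make the page sharing concrete, here is a minimal sketch in Python. It is not SirixDB's actual implementation or API: the `Page` and `Resource` classes are hypothetical, and a flat dictionary stands in for the trie (a real trie would copy only the path from the root to the modified page instead of the whole mapping).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Page:
    """Immutable page; a hypothetical stand-in for a database page."""
    content: str


class Resource:
    """Keeps every revision as a mapping from page key to Page."""

    def __init__(self) -> None:
        # Revision 0 is the empty resource.
        self.revisions: List[Dict[int, Page]] = [{}]

    def commit(self, changes: Dict[int, str]) -> int:
        # Copy only the references of the latest revision,
        # then replace the pages that actually changed.
        latest = dict(self.revisions[-1])
        for key, content in changes.items():
            latest[key] = Page(content)
        self.revisions.append(latest)
        return len(self.revisions) - 1

    def read(self, revision: int, key: int) -> str:
        # A "time-travel" read: look up a page as of a given revision.
        return self.revisions[revision][key].content


resource = Resource()
rev1 = resource.commit({1: "a", 2: "b"})
rev2 = resource.commit({2: "b2"})  # page 1 is untouched
```

Here `resource.revisions[rev1][1]` and `resource.revisions[rev2][1]` are the very same object, while page 2 exists in two versions, so `resource.read(rev1, 2)` still returns the old content.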

Ideas for the future would be:

  • Horizontal scaling: writing through a single master, providing read-your-own-writes consistency, and replicating resources on a few cluster nodes, most probably using ZooKeeper and Apache BookKeeper with exactly-once delivery semantics.
  • Interactive visualizations of the differences between revisions of a resource. SirixDB currently stores tree-structured data, both XML and JSON, in a binary format, and diffing capabilities are already there. There are also some outdated visualizations[2] written in Processing, which I'd love to port to D3 for the web. Furthermore, a web interface would be nice.
  • Adding cost-based query optimizer rules and index-rewrite rules to improve query performance considerably.
  • Looking into how to cleverly delete old revisions (I have to look up how ZFS allows deletion of snapshots). As a somewhat ugly hack for now, a background process could, for instance, copy the most recent revision to a new resource. It gets tricky, I guess, because unchanged database pages are shared between revisions and record pages are even versioned themselves. Thus, a page has to be reconstructed from page fragments of different revisions, depending on the versioning algorithm used.
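
To illustrate why deleting old revisions in the last point is tricky, here is a simplified Python sketch. It assumes a purely incremental scheme in which each revision stores only the records that changed; the function names and the data layout are made up for illustration and are not how SirixDB actually stores page fragments.

```python
from typing import Dict, List, Optional

# One fragment per revision: only the records changed in that revision.
# A value of None marks a deleted record.
Fragment = Dict[str, Optional[str]]


def reconstruct(fragments: List[Optional[Fragment]], revision: int) -> Dict[str, str]:
    """Rebuild the page as of `revision` by replaying fragments, oldest first."""
    page: Dict[str, str] = {}
    for fragment in fragments[: revision + 1]:
        if fragment is None:
            # An older fragment was thrown away, so the page cannot be rebuilt.
            raise LookupError("a required fragment was deleted")
        for key, value in fragment.items():
            if value is None:
                page.pop(key, None)
            else:
                page[key] = value
    return page


def copy_latest_to_new_resource(fragments: List[Optional[Fragment]]) -> List[Fragment]:
    """The 'ugly hack': materialize the latest revision as a new revision 0."""
    full_page = reconstruct(fragments, len(fragments) - 1)
    return [dict(full_page)]


fragments: List[Optional[Fragment]] = [
    {"a": "1", "b": "2"},   # revision 0: full page
    {"b": "3"},             # revision 1: only "b" changed
    {"a": None, "c": "4"},  # revision 2: "a" deleted, "c" added
]
latest = reconstruct(fragments, 2)
new_resource = copy_latest_to_new_resource(fragments)
```

Simply dropping the fragment of revision 0 would make revisions 1 and 2 unreadable in this model, because they depend on it. Copying the fully reconstructed latest revision into a new resource avoids that, at the price of losing the history.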

Besides that, I want to finish the work on versioning the whole database, not just the resources in a database.

Until recently I thought I'd look into horizontal scaling, use GraalVM to build native images (that is, to provide super fast startup times in Docker containers), work on writing to and reading from a BookKeeper cluster, and deploy everything to a Kubernetes cluster.

But maybe showcasing what's possible with beautiful interactive visualizations would get more attention, and I think it would be great for me to learn front-end stuff, too. It might also be more useful: given the complete lack of users, horizontal scaling is really only interesting from an engineering perspective ;-)

Kind regards and have a great weekend
Johannes

[1] https://sirix.io and https://github.com/sirixdb/sirix
[2] https://m.youtube.com/watch?feature=youtu.be&v=l9CXXBkl5vI
