Last week I started working with HBase. As someone who has worked almost exclusively with relational databases so far, it was not easy for me to understand the concepts behind HBase.
There are several tutorials for HBase on the internet, but very few of them helped me to understand it. In this blog post I want to gather the links that have been useful to me:
This blog post is also linked in the official documentation. Jim uses JSON to explain how data is structured in HBase. I found this explanation particularly clear.
Amandeep begins by explaining the structure of HBase. In the second part, he uses an example to show how HBase is applied, compares different solution approaches, and weighs their advantages and disadvantages. The second part in particular helped me to design my own HBase schema.
To be honest, I have not read the complete HBase documentation. It is far too extensive if you only want to understand the fundamentals. But the chapter Data Model gives a good impression of how HBase is structured.
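What helped me most was picturing the data model described in these resources as a nested, sorted map: row key, then column family, then column qualifier, then timestamp, then value. The following is only a conceptual sketch with plain Python dictionaries. It is not the HBase API, and the names table, put, get and scan as well as the sample rows are made up for illustration.

```python
# Conceptual sketch only: plain Python dicts standing in for one HBase table.
# All names and sample data here are hypothetical, not part of any HBase API.

table = {}  # row key -> column family -> qualifier -> {timestamp: value}


def put(row, family, qualifier, value, ts):
    """Store one cell version under row / family:qualifier at timestamp ts."""
    versions = (
        table.setdefault(row, {})
        .setdefault(family, {})
        .setdefault(qualifier, {})
    )
    versions[ts] = value


def get(row, family, qualifier):
    """Return the newest version of a cell, or None if the cell is missing."""
    versions = table.get(row, {}).get(family, {}).get(qualifier)
    if not versions:
        return None
    return versions[max(versions)]


def scan(start_row, stop_row):
    """Yield (row key, row data) in sorted key order; stop_row is excluded."""
    for row in sorted(table):
        if start_row <= row < stop_row:
            yield row, table[row]


put("user#002", "info", "name", "Bob", ts=1)
put("user#001", "info", "name", "Alice", ts=1)
put("user#001", "info", "name", "Alicia", ts=2)  # newer version, same cell
put("user#001", "stats", "logins", "7", ts=1)

print(get("user#001", "info", "name"))                    # Alicia
print([row for row, _ in scan("user#001", "user#003")])   # ['user#001', 'user#002']
```

The sketch mirrors three properties that matter for schema design: rows are read back in sorted row-key order, a cell can hold several timestamped versions with the newest returned by default, and columns are grouped into families.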
To play around a bit with HBase, I used the following git repository: https://github.com/big-data-europe/docker-hbase/
I should mention that for performance reasons HBase should not be used with Docker in production. To learn HBase, however, Docker is perfectly sufficient.
I used the docker-compose-standalone.yml file. After a docker-compose up, I logged in to the container with docker exec -it hbase bash. Running hbase shell inside the container then connected me to a working HBase environment.
I played around with the HBase Shell commands I found here: https://sparkbyexamples.com/hbase/hbase-shell-commands-cheat-sheet