We can install Hadoop ecosystem to implement Data Lake system. But there are a dozen more frameworks to be installed after the installation of Hadoop itself.
So instead of installing frameworks one by one, we can install the whole stack. There are different types of Hadoop based stacks, such as Yahoo stack, Google stack and so on.
Cloudera stack is one of them.
First install the Oracle Virtual Box following this guide.
https://linuxize.com/post/how-to-install-virtualbox-on-ubuntu-20-04/
Then download the Cloudera VM from this link
https://downloads.cloudera.com/demo_vm/virtualbox/cloudera-quickstart-vm-5.12.0-0-virtualbox.zip
Then install the Cloudera VM in the Oracle Virtual Box
https://www.simplilearn.com/tutorials/big-data-tutorial/cloudera-quickstart-vm
Top comments (0)