Nishkarsh Raj
The Ultimate Hadoop Installation Cheat Sheet

1. Install Java

$ sudo apt-get update && sudo apt-get -y upgrade
$ sudo apt-get install -y default-jdk
$ java --version

2. Create a dedicated Hadoop user

$ sudo addgroup [groupname]
$ sudo adduser --ingroup [groupname] [username]
$ sudo adduser [username] sudo # Add the user to the sudoers group

3. Set up passwordless SSH for local and HDFS connections

$ sudo apt-get install openssh-client openssh-server
$ su - [username]
$ ssh-keygen -t rsa -P ""
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
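To see what these two commands actually produce, here is a rehearsal of the same key mechanics in a scratch directory, so nothing in your real ~/.ssh is touched; the scratch path is illustrative only.

```shell
# Generate a passphrase-less keypair in a throwaway directory
tmpdir=$(mktemp -d)
ssh-keygen -t rsa -P "" -f "$tmpdir/id_rsa" -q
# Authorize the public key, exactly as done for $HOME/.ssh above
cat "$tmpdir/id_rsa.pub" >> "$tmpdir/authorized_keys"
chmod 600 "$tmpdir/authorized_keys"   # sshd rejects lax permissions
```

Once the real key is appended to $HOME/.ssh/authorized_keys, `ssh localhost` should log you in without a password; run it once to accept the host key before starting Hadoop.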

4. Download Hadoop Tar file from official registry

Download the binary tarball from the official Apache Hadoop releases page (hadoop.apache.org).

$ cd [download directory]
$ sudo tar xvzf [hadoop tarball]
$ sudo mv [extracted folder] /usr/local/hadoop
$ sudo chown -R [username] /usr/local/hadoop

5. Perform configurations

1. ~/.bashrc

Add the following lines at the end of the file:

export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
Source the file to apply the changes.
$ source ~/.bashrc
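After sourcing, it is worth sanity-checking that the variables landed. A minimal check, with the relevant exports inlined here so the snippet is self-contained (paths assume the install locations used above):

```shell
# Re-create the two exports this check depends on
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# Confirm the variable and the PATH entry
echo "HADOOP_HOME=$HADOOP_HOME"
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "hadoop bin is on PATH" ;;
  *)                      echo "hadoop bin is MISSING from PATH" ;;
esac
```

If the second line reports MISSING, re-check the `export PATH` lines in ~/.bashrc and source it again.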

2. /usr/local/hadoop/etc/hadoop/hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64

3. /usr/local/hadoop/etc/hadoop/core-site.xml

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

4. /usr/local/hadoop/etc/hadoop/hdfs-site.xml

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_tmp/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_tmp/hdfs/datanode</value>
</property>
</configuration>

5. /usr/local/hadoop/etc/hadoop/yarn-site.xml

<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>

6. /usr/local/hadoop/etc/hadoop/mapred-site.xml

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
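Note that every `<property>` entry in these files must sit inside a single `<configuration>` element, and a stray typo in any of them will stop the daemons from starting. An optional sanity check, assuming python3 is installed, is to confirm each edited file still parses as XML:

```shell
# Parse each edited config file; report the first one that is malformed
conf_dir=/usr/local/hadoop/etc/hadoop
for f in core-site.xml hdfs-site.xml yarn-site.xml mapred-site.xml; do
  python3 -c "import sys, xml.etree.ElementTree as ET; ET.parse(sys.argv[1])" "$conf_dir/$f" \
    && echo "$f: OK" || echo "$f: failed to parse"
done
```

`xmllint --noout [file]` does the same job if libxml2-utils is installed.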

6. Create directories for the name node and data node

These paths must match the dfs.namenode.name.dir and dfs.datanode.data.dir values set in hdfs-site.xml above.

$ sudo mkdir -p /usr/local/hadoop_tmp/hdfs/namenode
$ sudo mkdir -p /usr/local/hadoop_tmp/hdfs/datanode
$ sudo chown -R [username] /usr/local/hadoop_tmp

7. Running Hadoop

i. Format Name node

$ hdfs namenode -format

ii. Start the HDFS daemons

$ start-dfs.sh

iii. Start YARN

$ start-yarn.sh

iv. Check which components are up

$ jps
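On a healthy single-node setup, `jps` should list a NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager (exact daemon names can vary slightly between Hadoop versions). A small sketch that flags any missing daemon:

```shell
# Check each expected daemon against the jps listing
if command -v jps > /dev/null 2>&1; then
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    jps | grep -q "$d" && echo "$d: running" || echo "$d: NOT running"
  done
else
  echo "jps not found - is the JDK on your PATH?"
fi
```

If a daemon is missing, its log under $HADOOP_HOME/logs usually names the misconfigured property.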
