Ubuntu 18.04, AWS m3.large instance, 8GB memory
Install OpenJDK (the full JDK, not just the JRE)
sudo apt-get install openjdk-8-jdk
Get Hadoop 2.9.0
wget https://archive.apache.org/dist/hadoop/core/hadoop-2.9.0/hadoop-2.9.0.tar.gz
Extract Hadoop in your home folder
tar -xvf hadoop-2.9.0.tar.gz
Create a folder for Hadoop
sudo mkdir /usr/lib/hadoop
Move the extracted Hadoop folder to /usr/lib/hadoop and give your user ownership (the daemons write logs under this directory, and /usr/lib/hadoop was created as root)
sudo mv hadoop-2.9.0 /usr/lib/hadoop/
sudo chown -R ubuntu:ubuntu /usr/lib/hadoop
Find the JDK 8 path and note it down, e.g. /usr/lib/jvm/java-1.8.0-openjdk-amd64
Open ~/.bashrc and add the following lines at the end of the file (HADOOP_HOME and the PATH entries are needed for the hdfs command and the sbin scripts used later):
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
export HADOOP_HOME=/usr/lib/hadoop/hadoop-2.9.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Load the new environment
source ~/.bashrc
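If you are unsure where apt placed the JDK, a small helper along these lines can derive the home directory from the javac path (a sketch; `jdk_home_from_javac` is a hypothetical name, and `readlink -m` assumes GNU coreutils, which Ubuntu ships):

```shell
# Hypothetical helper: derive JAVA_HOME from a javac path,
# assuming the standard <JAVA_HOME>/bin/javac layout.
jdk_home_from_javac() {
  local p
  p="$(readlink -m "$1")"   # normalise/resolve the path (GNU coreutils)
  echo "${p%/bin/javac}"
}

# On Ubuntu, resolve the real javac behind the /usr/bin symlink chain;
# falls back to the path used in this tutorial if javac is not on PATH yet.
jdk_home_from_javac "$(command -v javac || echo /usr/lib/jvm/java-1.8.0-openjdk-amd64/bin/javac)"
```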
Generate an SSH key and enable passwordless login to localhost
ssh-keygen -t rsa
cd ~
cat .ssh/id_rsa.pub >> .ssh/authorized_keys
ssh-copy-id -i .ssh/id_rsa.pub ubuntu@localhost
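A quick sanity check (a sketch): passwordless SSH is working if the next command prints the hostname without prompting for a password.

```shell
# Should print the hostname with no password prompt; BatchMode makes
# ssh fail fast instead of prompting if key auth is not set up yet.
ssh -o BatchMode=yes -o ConnectTimeout=3 -o StrictHostKeyChecking=no localhost hostname \
  || echo "passwordless SSH to localhost is NOT working yet"
```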
Create hadoopdata folder in home directory
cd ~
mkdir hadoopdata
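It also helps to pre-create the name and data subdirectories that hdfs-site.xml (configured below) points at:

```shell
# Pre-create the NameNode and DataNode storage directories
# referenced by dfs.namenode.name.dir and dfs.datanode.data.dir
mkdir -p ~/hadoopdata/hdfs/name ~/hadoopdata/hdfs/data
```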
Go to the configuration XML files
cd /usr/lib/hadoop/hadoop-2.9.0/etc/hadoop
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/ubuntu/hadoopdata/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/ubuntu/hadoopdata/hdfs/data</value>
  </property>
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
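A stray or unclosed tag in any of these files will make the daemons fail at startup, so after editing it is worth confirming the four files are still well-formed XML. A small sketch (hypothetical `check_xml` helper; relies on python3, which Ubuntu 18.04 ships):

```shell
# Hypothetical helper: report whether each given file parses as XML.
check_xml() {
  for f in "$@"; do
    if python3 -c 'import sys, xml.etree.ElementTree as ET; ET.parse(sys.argv[1])' "$f" 2>/dev/null; then
      echo "$f: OK"
    else
      echo "$f: NOT well-formed"
    fi
  done
}

# On the instance you would run, for example:
# check_xml /usr/lib/hadoop/hadoop-2.9.0/etc/hadoop/*-site.xml
```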
Format the name node
hdfs namenode -format
Go to the sbin directory of Hadoop:
cd $HADOOP_HOME/sbin
Start the name node
./hadoop-daemon.sh start namenode
Start HDFS components
./start-dfs.sh
Stop all daemons
./stop-all.sh
Start all daemons (HDFS and YARN)
./start-all.sh
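To verify the daemons actually came up, jps (shipped with the JDK) lists the running Java processes; with everything started you should see NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager alongside Jps itself:

```shell
# List running Java processes; each Hadoop daemon should appear here.
jps || echo "jps not found - check that the JDK bin directory is on your PATH"
```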
Then access the Hadoop web UIs at the following addresses (replace aws_ip_address with your instance's public IP, and make sure your EC2 security group allows inbound traffic on these ports).
NameNode – aws_ip_address:50070
DataNode – aws_ip_address:50075
SecondaryNameNode – aws_ip_address:50090
ResourceManager – aws_ip_address:8088
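From the instance itself you can also probe the NameNode UI with curl before opening it in a browser (a sketch; an HTTP 200 status means the page is serving):

```shell
# Prints the HTTP status code; 200 means the NameNode web UI is serving.
curl -s -o /dev/null -w '%{http_code}\n' --connect-timeout 3 http://localhost:50070/ \
  || echo "NameNode UI not reachable on port 50070"
```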
In the next tutorial, we will install Sqoop.
https://dev.to/zawhtutwin/installing-sqoop-on-hadoop-14n8