<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: anujd64</title>
    <description>The latest articles on DEV Community by anujd64 (@anujd64).</description>
    <link>https://dev.to/anujd64</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F952998%2Fe22f6bbc-faf2-47fa-9435-190f23f93164.png</url>
      <title>DEV Community: anujd64</title>
      <link>https://dev.to/anujd64</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/anujd64"/>
    <language>en</language>
    <item>
      <title>Access Logs Count</title>
      <dc:creator>anujd64</dc:creator>
      <pubDate>Wed, 13 Mar 2024 04:50:52 +0000</pubDate>
      <link>https://dev.to/anujd64/access-logs-count-4782</link>
      <guid>https://dev.to/anujd64/access-logs-count-4782</guid>
      <description>&lt;p&gt;This practical is very similar to &lt;a href="https://dev.to/anujd64/word-count-hadoop-5b11"&gt;WordCount&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Only changes are in input filetype and the code.&lt;br&gt;
This time we will copy access_log_short.csv into hdfs and rest everything is identical.&lt;/p&gt;

&lt;p&gt;The code and input file for this practical can be found here:&lt;br&gt;
&lt;a href="https://drive.google.com/drive/u/5/folders/1ea03Rjio44iKt7d_Fl35f8a6sILnEnd_"&gt;Code &amp;amp; Input&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Word Count Hadoop</title>
      <dc:creator>anujd64</dc:creator>
      <pubDate>Wed, 13 Mar 2024 04:11:19 +0000</pubDate>
      <link>https://dev.to/anujd64/word-count-hadoop-5b11</link>
      <guid>https://dev.to/anujd64/word-count-hadoop-5b11</guid>
      <description>&lt;p&gt;Video version of this article: &lt;a href="https://www.youtube.com/watch?v=wTkffAYsCBw"&gt;https://www.youtube.com/watch?v=wTkffAYsCBw&lt;/a&gt;&lt;br&gt;
Credits: @UnboxingBigData&lt;/p&gt;

&lt;p&gt;&lt;a href="https://drive.google.com/drive/folders/1ea03Rjio44iKt7d_Fl35f8a6sILnEnd_?usp=sharing"&gt;Code&lt;/a&gt;&lt;br&gt;
Open Eclipse and create a new Java project.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feavtgrqicuqvdiv0pk6t.png" alt="Image description" width="800" height="500"&gt;&lt;/p&gt;

&lt;p&gt;Right-click on the project, click on Build Path &amp;gt; and select Configure Build Path...&lt;/p&gt;
&lt;p&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi05b2y8lmhsknpbviez1.png" alt="Image description" width="800" height="500"&gt;&lt;/p&gt;

&lt;p&gt;Click on the Libraries tab and click on Add External JARs...&lt;/p&gt;
&lt;p&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2mu9ark5vl5yyjdbdyhz.png" alt="Image description" width="800" height="500"&gt;&lt;/p&gt;

&lt;p&gt;Navigate to the Hadoop common folder (in my case it is &lt;code&gt;home/hadoop-3.3.6/share/hadoop/common&lt;/code&gt;) and select all the JAR files.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2qvck1kn8euvrkcsiouo.png" alt="Image description" width="800" height="500"&gt;&lt;/p&gt;

&lt;p&gt;Again, click on Add External JARs...&lt;/p&gt;
&lt;p&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvpwmz4kxa24ylgk5pfrw.png" alt="Image description" width="800" height="500"&gt;&lt;/p&gt;

&lt;p&gt;Navigate to the Hadoop mapreduce folder (in my case it is &lt;code&gt;home/hadoop-3.3.6/share/hadoop/mapreduce&lt;/code&gt;) and select all the JAR files.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff4nfj1kyn2ulsh1u7uul.png" alt="Image description" width="800" height="500"&gt;&lt;/p&gt;

&lt;p&gt;Now the Libraries tab will look like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft55zbl3444z24u7vqzxv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft55zbl3444z24u7vqzxv.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Click Apply and Close.&lt;/p&gt;

&lt;p&gt;Now create the three classes; the code for them is given at the end of the article. I am just copy-pasting the classes into the src folder of our Eclipse project.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7r1ibja6ylmnzaa7jx7h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7r1ibja6ylmnzaa7jx7h.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fytzcrg04jb3zov7h6kag.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fytzcrg04jb3zov7h6kag.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbjvg8gbymgdn6ld5znah.png" alt="Image description" width="800" height="500"&gt;&lt;/p&gt;

&lt;p&gt;Now we will export a JAR for our project: right-click on the project and select Export.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13qx0yezsbrom5z1posd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13qx0yezsbrom5z1posd.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6i8lvx8ytyr8ym6dswy1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6i8lvx8ytyr8ym6dswy1.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Select the export destination: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36muq8xrki3m42txl7n8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36muq8xrki3m42txl7n8.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjehvphyk75g4iges4tos.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjehvphyk75g4iges4tos.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsjjggnvw1wmmbaxj9ix1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsjjggnvw1wmmbaxj9ix1.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Select the main class for the project:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zpwtxkipqo1ir6t9rp9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zpwtxkipqo1ir6t9rp9.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fakytcxwt721eelj872jl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fakytcxwt721eelj872jl.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F85ly33nrguzp05qjcs5n.png" alt="Image description" width="800" height="500"&gt;&lt;/p&gt;

&lt;p&gt;You will find the JAR at the selected location:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4zlcoj6fs8vinhw2q6qk.png" alt="Image description" width="800" height="500"&gt;&lt;/p&gt;

&lt;p&gt;Create an input.txt file like so:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwvnqp8sn18co9bbbprsr.png" alt="Image description" width="800" height="500"&gt;&lt;/p&gt;
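
&lt;p&gt;If the screenshot is hard to read, any plain text file will do; for example, a minimal input.txt could be:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hello Hadoop Hello
World
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;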

&lt;p&gt;Now execute the following commands one by one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd Desktop
start-all.sh
hadoop fs -mkdir /wc_input
hadoop fs -put input.txt /wc_input
hadoop jar WordCount.jar /wc_input/input.txt /wc_output/
hadoop fs -cat /wc_output/part-00000
stop-all.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What do these commands do?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The -mkdir command creates the folder /wc_input in HDFS (Hadoop's distributed file system).&lt;/li&gt;
&lt;li&gt;The -put command copies the input.txt file into the folder we just created.&lt;/li&gt;
&lt;li&gt;The jar command uses the code in the exported JAR file to run MapReduce on the input.txt file and saves the output in the /wc_output directory.&lt;/li&gt;
&lt;li&gt;The -cat command prints out the contents of the output produced by the MapReduce job.&lt;/li&gt;
&lt;li&gt;To delete a directory in HDFS, use the command shown below:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;hadoop fs -rm -r &amp;lt;DIRECTORY_PATH&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
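
&lt;p&gt;For example, before re-running the job you must remove the old output directory, because the job fails if the output directory already exists:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;hadoop fs -rm -r /wc_output
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;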






&lt;p&gt;Your terminal should look like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0hl5tg0o1qb4mzt99jku.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0hl5tg0o1qb4mzt99jku.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Code for the classes:&lt;/p&gt;

&lt;p&gt;WC_Runner.java&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
public class WC_Runner {
    public static void main(String[] args) throws IOException{
        JobConf conf = new JobConf(WC_Runner.class);
        conf.setJobName("WordCount");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(WC_Mapper.class);
        conf.setCombinerClass(WC_Reducer.class);
        conf.setReducerClass(WC_Reducer.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf,new Path(args[0]));
        FileOutputFormat.setOutputPath(conf,new Path(args[1]));
        JobClient.runJob(conf);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;WC_Reducer.java&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WC_Reducer  extends MapReduceBase implements Reducer&amp;lt;Text,IntWritable,Text,IntWritable&amp;gt; {
    public void reduce(Text key, Iterator&amp;lt;IntWritable&amp;gt; values,OutputCollector&amp;lt;Text,IntWritable&amp;gt; output,
                       Reporter reporter) throws IOException {
        int sum=0;
        while (values.hasNext()) {
            sum+=values.next().get();
        }
        output.collect(key,new IntWritable(sum));
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;WC_Mapper.java&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
public class WC_Mapper extends MapReduceBase implements Mapper&amp;lt;LongWritable,Text,Text,IntWritable&amp;gt;{
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    public void map(LongWritable key, Text value,OutputCollector&amp;lt;Text,IntWritable&amp;gt; output,
                    Reporter reporter) throws IOException{
        String line = value.toString();
        StringTokenizer  tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()){
            word.set(tokenizer.nextToken());
            output.collect(word, one);
        }
    }

}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
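
&lt;p&gt;To see how the three classes fit together, here is a sketch of the data flow for the example line "Hello Hadoop Hello": the mapper emits a (word, 1) pair per token, the shuffle phase groups the pairs by word, and the reducer (also used as the combiner) sums each group:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;map:     "Hello Hadoop Hello"  -&amp;gt;  (Hello,1) (Hadoop,1) (Hello,1)
shuffle: (Hadoop,[1]) (Hello,[1,1])
reduce:  (Hadoop,1) (Hello,2)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;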



</description>
    </item>
    <item>
      <title>Hadoop configuration</title>
      <dc:creator>anujd64</dc:creator>
      <pubDate>Sun, 03 Mar 2024 08:42:45 +0000</pubDate>
      <link>https://dev.to/anujd64/hadoop-configuration-bcf</link>
      <guid>https://dev.to/anujd64/hadoop-configuration-bcf</guid>
      <description>&lt;p&gt;Video version of this article: &lt;a href="https://www.youtube.com/watch?v=Slbi-uzPtnw"&gt;https://www.youtube.com/watch?v=Slbi-uzPtnw&lt;/a&gt;&lt;br&gt;
Credits:&lt;a class="mentioned-user" href="https://dev.to/codewitharjun"&gt;@codewitharjun&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;(The video uses a CLI text editor to edit the config files; this tutorial will use a regular GUI text editor.)&lt;/p&gt;

&lt;p&gt;First, install the Java JDK 8:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sudo apt install openjdk-8-jdk&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;(Optional) To check that it is installed:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;cd /usr/lib/jvm&lt;/code&gt;&lt;/p&gt;
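
&lt;p&gt;(On amd64 systems you should see a java-8-openjdk-amd64 folder listed there.) Alternatively, you can check the version directly:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;java -version
# if JDK 8 is the default JVM, it reports something like:
# openjdk version "1.8.0_..."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;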

&lt;p&gt;Now make sure you are in your home directory; if not, run&lt;br&gt;
&lt;code&gt;cd ~&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8vgh2wfe7t3i9gldh3re.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8vgh2wfe7t3i9gldh3re.png" alt="Image description" width="403" height="46"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Open the .bashrc file:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sudo gedit .bashrc&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;And paste in the following block&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 
export PATH=$PATH:/usr/lib/jvm/java-8-openjdk-amd64/bin 
export HADOOP_HOME=~/hadoop-3.3.6/ 
export PATH=$PATH:$HADOOP_HOME/bin 
export PATH=$PATH:$HADOOP_HOME/sbin 
export HADOOP_MAPRED_HOME=$HADOOP_HOME 
export YARN_HOME=$HADOOP_HOME 
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop 
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native 
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native" 
export HADOOP_STREAMING=$HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-3.3.6.jar
export HADOOP_LOG_DIR=$HADOOP_HOME/logs 
export PDSH_RCMD_TYPE=ssh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
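
&lt;p&gt;Then reload the shell configuration so the new variables take effect in the current terminal:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;source ~/.bashrc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;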



&lt;p&gt;Next, install ssh (Hadoop's scripts use it to start the daemons):&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sudo apt-get install ssh&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;For the following commands, check your Hadoop version number; at the time of writing it is 3.3.6.&lt;/p&gt;

&lt;p&gt;Now go to the hadoop.apache.org website and download the tar file.&lt;br&gt;
&lt;a href="https://dlcdn.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz"&gt;Direct Link&lt;/a&gt;&lt;br&gt;
&lt;a href="https://hadoop.apache.org/releases.html"&gt;Website Link&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once downloaded, execute the following to extract the tar file:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;tar -zxvf ~/Downloads/hadoop-3.3.6.tar.gz&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;For all of the configuration below, ensure you are in the &lt;code&gt;hadoop-3.3.6/etc/hadoop&lt;/code&gt; directory.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;cd hadoop-3.3.6/etc/hadoop&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Many of the files may already contain a &lt;code&gt;&amp;lt;configuration&amp;gt;&lt;/code&gt; tag, so check before you paste in the new configuration.&lt;/p&gt;

&lt;p&gt;Now open hadoop-env.sh:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sudo gedit hadoop-env.sh&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Paste the following into hadoop-env.sh:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;(This sets the path for JAVA_HOME.)&lt;br&gt;
You might not need sudo in the following commands, but to avoid permission issues I have added it everywhere.&lt;br&gt;
Let's configure the other files similarly:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sudo gedit core-site.xml&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;configuration&amp;gt; 
 &amp;lt;property&amp;gt; 
 &amp;lt;name&amp;gt;fs.defaultFS&amp;lt;/name&amp;gt; 
 &amp;lt;value&amp;gt;hdfs://localhost:9000&amp;lt;/value&amp;gt;  &amp;lt;/property&amp;gt; 
 &amp;lt;property&amp;gt; 
&amp;lt;name&amp;gt;hadoop.proxyuser.dataflair.groups&amp;lt;/name&amp;gt; &amp;lt;value&amp;gt;*&amp;lt;/value&amp;gt; 
 &amp;lt;/property&amp;gt; 
 &amp;lt;property&amp;gt; 
&amp;lt;name&amp;gt;hadoop.proxyuser.dataflair.hosts&amp;lt;/name&amp;gt; &amp;lt;value&amp;gt;*&amp;lt;/value&amp;gt; 
 &amp;lt;/property&amp;gt; 
 &amp;lt;property&amp;gt; 
&amp;lt;name&amp;gt;hadoop.proxyuser.server.hosts&amp;lt;/name&amp;gt; &amp;lt;value&amp;gt;*&amp;lt;/value&amp;gt; 
 &amp;lt;/property&amp;gt; 
 &amp;lt;property&amp;gt; 
&amp;lt;name&amp;gt;hadoop.proxyuser.server.groups&amp;lt;/name&amp;gt; &amp;lt;value&amp;gt;*&amp;lt;/value&amp;gt; 
 &amp;lt;/property&amp;gt; 
&amp;lt;/configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;sudo gedit hdfs-site.xml&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Replace &amp;lt;USER&amp;gt; with your Ubuntu username!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;?xml version="1.0"?&amp;gt;
&amp;lt;?xml-stylesheet type="text/xsl" href=''?&amp;gt;
&amp;lt;configuration&amp;gt;
&amp;lt;property&amp;gt;
  &amp;lt;name&amp;gt;dfs.name.dir&amp;lt;/name&amp;gt;
  &amp;lt;value&amp;gt;file:///home/&amp;lt;USER&amp;gt;/pseudo/dfs/name&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;
&amp;lt;property&amp;gt;
  &amp;lt;name&amp;gt;dfs.data.dir&amp;lt;/name&amp;gt;
  &amp;lt;value&amp;gt;file:///home/&amp;lt;USER&amp;gt;/pseudo/dfs/data&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;
&amp;lt;property&amp;gt;
    &amp;lt;name&amp;gt;dfs.replication&amp;lt;/name&amp;gt;
    &amp;lt;value&amp;gt;1&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;
&amp;lt;/configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;sudo gedit mapred-site.xml&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;configuration&amp;gt; 
 &amp;lt;property&amp;gt; 
 &amp;lt;name&amp;gt;mapreduce.framework.name&amp;lt;/name&amp;gt;  &amp;lt;value&amp;gt;yarn&amp;lt;/value&amp;gt; 
 &amp;lt;/property&amp;gt; 
 &amp;lt;property&amp;gt;
 &amp;lt;name&amp;gt;mapreduce.application.classpath&amp;lt;/name&amp;gt; 
&amp;lt;value&amp;gt;$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*&amp;lt;/value&amp;gt; 
 &amp;lt;/property&amp;gt; 
&amp;lt;/configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;sudo gedit yarn-site.xml&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;configuration&amp;gt; 
 &amp;lt;property&amp;gt; 
 &amp;lt;name&amp;gt;yarn.nodemanager.aux-services&amp;lt;/name&amp;gt; 
 &amp;lt;value&amp;gt;mapreduce_shuffle&amp;lt;/value&amp;gt; 
 &amp;lt;/property&amp;gt; 
 &amp;lt;property&amp;gt; 
 &amp;lt;name&amp;gt;yarn.nodemanager.env-whitelist&amp;lt;/name&amp;gt; 
&amp;lt;value&amp;gt;JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREP END_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME&amp;lt;/value&amp;gt; 
 &amp;lt;/property&amp;gt; 
&amp;lt;/configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;p&gt;Hadoop is now configured.&lt;/p&gt;

&lt;p&gt;Next, execute the following one by one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh
ssh localhost 
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa 
cat ~/.ssh/id_rsa.pub &amp;gt;&amp;gt; ~/.ssh/authorized_keys 
chmod 0600 ~/.ssh/authorized_keys 
hadoop-3.3.6/bin/hdfs namenode -format
export PDSH_RCMD_TYPE=ssh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To start Hadoop:&lt;br&gt;
&lt;code&gt;start-all.sh&lt;/code&gt;&lt;/p&gt;
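
&lt;p&gt;To verify from the terminal that the daemons started, you can run &lt;code&gt;jps&lt;/code&gt;, which lists the running Java processes; after start-all.sh you should see something like the following (the process IDs will differ):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jps
# example output:
# 12001 NameNode
# 12150 DataNode
# 12352 SecondaryNameNode
# 12540 ResourceManager
# 12688 NodeManager
# 12944 Jps
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;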

&lt;p&gt;To check that Hadoop is running, open the NameNode web UI at &lt;a href="http://localhost:9870/"&gt;http://localhost:9870/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To stop Hadoop:&lt;br&gt;
&lt;code&gt;stop-all.sh&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This is an updated version of this &lt;a href="https://codewitharjun.medium.com/install-hadoop-on-ubuntu-operating-system-6e0ca4ef9689"&gt;article&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>hadoop</category>
      <category>bigdata</category>
    </item>
  </channel>
</rss>
