Discussion on: Building a Raspberry Pi Hadoop / Spark Cluster

Rohit Das

Hi. I am not able to get the Hadoop or Java version from the other nodes over ssh using clustercmd. I have set up a cluster of 5 Pis (model B+), each with a 32 GB micro SD card. On running

clustercmd hadoop version | grep Hadoop

I get the following error:

bash: hadoop: command not found
bash: hadoop: command not found
bash: hadoop: command not found
bash: hadoop: command not found
Hadoop 3.2.0

I am attaching my .bashrc below. Please help. Thanks.

# ~/.bashrc: executed by bash(1) for non-login shells.
# see /usr/share/doc/bash/examples/startup-files (in the package bash-doc)
# for examples

# If not running interactively, don't do anything
case $- in
    *i*) ;;
      *) return;;
esac

# don't put duplicate lines or lines starting with space in the history.
# See bash(1) for more options
HISTCONTROL=ignoreboth

# append to the history file, don't overwrite it
shopt -s histappend

# for setting history length see HISTSIZE and HISTFILESIZE in bash(1)
HISTSIZE=1000
HISTFILESIZE=2000

# check the window size after each command and, if necessary,
# update the values of LINES and COLUMNS.
shopt -s checkwinsize

# If set, the pattern "**" used in a pathname expansion context will
# match all files and zero or more directories and subdirectories.
#shopt -s globstar

# make less more friendly for non-text input files, see lesspipe(1)
#[ -x /usr/bin/lesspipe ] && eval "$(SHELL=/bin/sh lesspipe)"

# set variable identifying the chroot you work in (used in the prompt below)
if [ -z "${debian_chroot:-}" ] && [ -r /etc/debian_chroot ]; then
    debian_chroot=$(cat /etc/debian_chroot)
fi

# set a fancy prompt (non-color, unless we know we "want" color)
case "$TERM" in
    xterm-color|*-256color) color_prompt=yes;;
esac

# uncomment for a colored prompt, if the terminal has the capability; turned
# off by default to not distract the user: the focus in a terminal window
# should be on the output of commands, not on the prompt
force_color_prompt=yes

if [ -n "$force_color_prompt" ]; then
    if [ -x /usr/bin/tput ] && tput setaf 1 >&/dev/null; then
    # We have color support; assume it's compliant with Ecma-48
    # (ISO/IEC-6429). (Lack of such support is extremely rare, and such
    # a case would tend to support setf rather than setaf.)
    color_prompt=yes
    else
    color_prompt=
    fi
fi

if [ "$color_prompt" = yes ]; then
    PS1='${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w \$\[\033[00m\] '
else
    PS1='${debian_chroot:+($debian_chroot)}\u@\h:\w\$ '
fi
unset color_prompt force_color_prompt

# If this is an xterm set the title to user@host:dir
case "$TERM" in
xterm*|rxvt*)
    PS1="\[\e]0;${debian_chroot:+($debian_chroot)}\u@\h: \w\a\]$PS1"
    ;;
*)
    ;;
esac

# enable color support of ls and also add handy aliases
if [ -x /usr/bin/dircolors ]; then
    test -r ~/.dircolors && eval "$(dircolors -b ~/.dircolors)" || eval "$(dircolors -b)"
    alias ls='ls --color=auto'
    #alias dir='dir --color=auto'
    #alias vdir='vdir --color=auto'

    alias grep='grep --color=auto'
    alias fgrep='fgrep --color=auto'
    alias egrep='egrep --color=auto'
fi

# colored GCC warnings and errors
#export GCC_COLORS='error=01;31:warning=01;35:note=01;36:caret=01;32:locus=01:quote=01'

# some more ls aliases
#alias ll='ls -l'
#alias la='ls -A'
#alias l='ls -CF'

# Alias definitions.
# You may want to put all your additions into a separate file like
# ~/.bash_aliases, instead of adding them here directly.
# See /usr/share/doc/bash-doc/examples in the bash-doc package.

if [ -f ~/.bash_aliases ]; then
    . ~/.bash_aliases
fi

# enable programmable completion features (you don't need to enable
# this, if it's already enabled in /etc/bash.bashrc and /etc/profile
# sources /etc/bash.bashrc).
if ! shopt -oq posix; then
  if [ -f /usr/share/bash-completion/bash_completion ]; then
    . /usr/share/bash-completion/bash_completion
  elif [ -f /etc/bash_completion ]; then
    . /etc/bash_completion
  fi
fi

# get hostname of other pis
function otherpis {
    grep "pi" /etc/hosts | awk '{print $2}' | grep -v $(hostname)
}

# send commands to other pis
function clustercmd {
    for pi in $(otherpis); do ssh $pi "$@"; done
    $@
}

# restart all pis
function clusterreboot {
    for pi in $(otherpis);do
        ssh $pi "sudo shutdown -r 0"
    done
}

# shutdown all pis
function clustershutdown {
    for pi in $(otherpis);do
        ssh $pi "sudo shutdown 0"
    done
}

# send files to all pis
function clusterscp {
    for pi in $(otherpis); do
        cat $1 | ssh $pi "sudo tee $1" > /dev/null 2>&1
    done
}

function clusterssh {
    for pi in $(otherpis); do
        ssh $pi $1
    done
}

# sudo htpdate -a -l time.nist.gov

export JAVA_HOME=/usr/local/jdk1.8.0/
export HADOOP_HOME=/opt/hadoop
export SPARK_HOME=/opt/spark
export PATH=/usr/local/jdk1.8.0/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SPARK_HOME/bin:$PATH
export HADOOP_HOME_WARN_SUPRESS=1
export HADOOP_ROOT_LOGGER="WARN,DRFA"
Andrew (he/him)

Did you follow these steps?


Create the Directories

Create the required directories on all other Pis using:

$ clustercmd sudo mkdir -p /opt/hadoop_tmp/hdfs
$ clustercmd sudo chown pi:pi -R /opt/hadoop_tmp
$ clustercmd sudo mkdir -p /opt/hadoop
$ clustercmd sudo chown pi:pi /opt/hadoop

Copy the Configuration

Copy the files in /opt/hadoop to each other Pi using:

$ for pi in $(otherpis); do rsync -avxP $HADOOP_HOME $pi:/opt; done

This will take quite a long time, so go grab lunch.

When you're back, verify that the files copied correctly by querying the Hadoop version on each node with the following command:

$ clustercmd hadoop version | grep Hadoop
Hadoop 3.2.0
Hadoop 3.2.0
Hadoop 3.2.0
...

You can't run the hadoop command on the other Pis until you've copied over those hadoop directories. If you have done that, you also need to make sure that directory is on the $PATH of the other Pis by including the following lines in each of their .bashrc files (sorry, I don't think I included this step in the instructions):


export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

You could also simply clusterscp the .bashrc file from Pi #1 to each of the other Pis.
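For example, assuming your .bashrc lives at ~/.bashrc on every Pi (as it does in the setup above), something like this should do it, and then you can re-check the Hadoop version across the cluster:

$ clusterscp ~/.bashrc
$ clustercmd hadoop version | grep Hadoop

The clusterscp function defined in your .bashrc pipes the local file through ssh and tee to the same path on each Pi, so every node ends up with identical exports.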


Rohit Das

Hi. Thanks for the reply. Yes, I did follow the steps you mentioned. Since Java wasn't pre-installed, I installed it manually on each Pi and checked each one individually to confirm it works. As you can see below, the env variables are configured as you suggested.

export JAVA_HOME=/usr/local/jdk1.8.0/
export HADOOP_HOME=/opt/hadoop
export SPARK_HOME=/opt/spark
export PATH=/usr/local/jdk1.8.0/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SPARK_HOME/bin:$PATH
export HADOOP_HOME_WARN_SUPRESS=1
export HADOOP_ROOT_LOGGER="WARN,DRFA"
Rohit Das

Thanks. I resolved the issue by putting the PATH exports above the following part in .bashrc:

# If not running interactively, don't do anything
case $- in
    *i*) ;;
      *) return;;
esac

I also put the export PATH commands in /etc/profile of each Pi. Thanks.
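For anyone else who hits this, the top of my .bashrc now looks roughly like this, with the exports placed before the interactivity check so they are set even in the non-interactive shells that ssh spawns for remote commands:

# Hadoop/Spark/Java paths first, so they survive the early return below
export JAVA_HOME=/usr/local/jdk1.8.0/
export HADOOP_HOME=/opt/hadoop
export SPARK_HOME=/opt/spark
export PATH=/usr/local/jdk1.8.0/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SPARK_HOME/bin:$PATH

# If not running interactively, don't do anything
case $- in
    *i*) ;;
      *) return;;
esac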