This article is written for people who build their own clusters on Gentoo Linux.
Index
- Introduction
- Installing
- Preparation
- Dependencies
- Build & Install
- Post-install settings
- Launch
- Start Database Server
- Start the Service
- Register a Node
- Check the Startup Status
- Conclusion
- References
Introduction
OpenPBS is job scheduling software for High Performance Computing (HPC) systems. It is one of the few open source job management systems for HPC, and it originated as the open source version of a proprietary product. This article walks through installing it on Gentoo Linux.
Installing
Preparation
Following the Gentoo Linux style, all software used here is built and installed from source. Installation of OpenPBS on operating systems other than Gentoo is described in openpbs/INSTALL on GitHub.
Note: This procedure was tested on a Gentoo virtual machine running under Vagrant on openSUSE Tumbleweed. Security settings are not taken into account. When actually using OpenPBS, configure security settings such as firewalld and iptables.
USE Flag Settings
The USE flag settings are described on the "HPC on Gentoo" page of the Gentoo Wiki.
# cat /etc/portage/make.conf | grep USE
USE="${USE} 3dnow fortran gnuplot gpm mmx ncurses nptl nptlonly openmp ompt
pam perl ssl tcpd unicode vim-syntax xml zlib python networkmanager"
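Before building, it can help to confirm that a given flag really is present in the USE line. The following is a minimal sketch, not part of Portage: `has_use_flag` is a hypothetical helper that only reads a `USE="..."` assignment (possibly spanning several lines, as above) from a file such as /etc/portage/make.conf.

```shell
#!/bin/bash
# has_use_flag FILE FLAG
# Hypothetical helper: succeeds if FLAG appears as a whole word inside the
# USE="..." assignment of FILE. The assignment may span several lines.
has_use_flag() {
    local file="$1" flag="$2"
    awk -v flag="$flag" '
        /^[ \t]*USE=/ { in_use = 1; quotes = 0 }
        in_use {
            line = $0
            quotes += gsub(/"/, " ", line)      # count and strip the quotes
            sub(/^[ \t]*USE=/, "", line)        # drop the variable name
            n = split(line, words, /[ \t]+/)
            for (i = 1; i <= n; i++) if (words[i] == flag) found = 1
            if (quotes >= 2) in_use = 0         # closing quote reached
        }
        END { exit (found ? 0 : 1) }
    ' "$file"
}
```

For example, `has_use_flag /etc/portage/make.conf openmp && echo "openmp is set"`. The helper ignores flags set elsewhere (package.use, profiles), so it is only a quick check of make.conf itself.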
sudo and git (optional)
Install the packages sudo and git, which are used for the main installation work.
# emerge --ask --verbose app-admin/sudo dev-vcs/git
Dependencies
Installing Packages by Portage
Install necessary packages to build OpenPBS.
#!/bin/bash
# openpbs_deps_install.sh
### Build-time dependencies
# The trailing backslashes continue the emerge command onto the next lines.
emerge --ask --verbose \
sys-devel/gcc sys-devel/make sys-devel/libtool \
sys-apps/hwloc sys-libs/ncurses dev-lang/perl dev-lang/python \
dev-libs/openssl sys-devel/autoconf sys-devel/automake \
dev-libs/libedit dev-db/postgresql dev-lang/swig \
x11-libs/libX11 x11-libs/libXt x11-libs/libXext x11-libs/libXft \
media-libs/fontconfig
### Run-time dependencies
emerge --ask --verbose \
dev-libs/libical x11-libs/libXrender mail-mta/sendmail
Execute this shell script.
# ./openpbs_deps_install.sh
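After the packages are installed, a quick check that the build tools are actually on the PATH can save a failed build later. This is a minimal sketch; `missing_commands` is a hypothetical helper, and the command names passed to it in the usage note are assumptions about what the packages above provide.

```shell
#!/bin/bash
# missing_commands CMD...
# Hypothetical helper: prints every CMD that cannot be found in PATH,
# one per line. Prints nothing when all commands are available.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}
```

For example, `missing_commands gcc make libtool autoconf automake swig perl` should print nothing if the build-time tools are in place.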
Building and Installing Tcl/Tk Manually
Tcl is a scripting language, and Tk is its GUI toolkit. Both can be installed by Portage. In that case, however, the static libraries that the OpenPBS build might depend on are not generated in /usr/lib.
Therefore, these two must be built and installed manually from the tarballs. The source code can be downloaded from the Tcl Developer Site.
First, install Tcl by executing the following shell script. /usr is the installation prefix and /usr/local is used as the working directory.
#!/bin/bash
# tcl_install.sh
### Tcl
cd /usr/local
wget https://prdownloads.sourceforge.net/tcl/tcl8.6.11-src.tar.gz
tar xzf ./tcl8.6.11-src.tar.gz
cd ./tcl8.6.11/unix
./configure --prefix=/usr --enable-static --disable-shared
make
make install
Execute this.
# ./tcl_install.sh
Next, install Tk.
#!/bin/bash
# tk_install.sh
### Tk
cd /usr/local
wget https://prdownloads.sourceforge.net/tcl/tk8.6.11.1-src.tar.gz
tar xzf ./tk8.6.11.1-src.tar.gz
cd ./tk8.6.11/unix
./configure --prefix=/usr --enable-static --disable-shared
make
make install
Execute this.
# ./tk_install.sh
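Since the point of the manual build is to obtain the static libraries, it is worth checking that they exist before moving on. The sketch below assumes the archives are named libtcl8.6.a and libtk8.6.a under /usr/lib, which is what this kind of build is expected to produce; `report_static_libs` is a hypothetical helper.

```shell
#!/bin/bash
# report_static_libs LIBDIR NAME...
# Hypothetical helper: prints "ok NAME" or "missing NAME" for each
# LIBDIR/NAME and returns non-zero if at least one file is missing.
report_static_libs() {
    local libdir="$1" name status=0
    shift
    for name in "$@"; do
        if [ -f "$libdir/$name" ]; then
            echo "ok $name"
        else
            echo "missing $name"
            status=1
        fi
    done
    return "$status"
}
```

For example, `report_static_libs /usr/lib libtcl8.6.a libtk8.6.a`. Adjust the directory or file names if the build placed the archives elsewhere.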
Build & Install
In this article, OpenPBS is installed in /opt/pbs. First, get the source code either by running git clone on openpbs/openpbs on GitHub or by extracting the tarball. Then execute the autogen.sh shell script.
The Case of git clone
# cd /opt
# git clone https://github.com/openpbs/openpbs /opt/openpbs-20.0.1
# cd openpbs-20.0.1
# git checkout v20.0.1
# ./autogen.sh
The Case of Extracting the Tarball
# cd /opt
# wget https://github.com/openpbs/openpbs/archive/refs/tags/v20.0.1.tar.gz
# tar xzf v20.0.1.tar.gz
# cd openpbs-20.0.1
# ./autogen.sh
Executing configure
Set the environment variable LDFLAGS="-ltinfo" to link libtinfo.so, and then execute ./configure with the following options.
# cd /opt/openpbs-20.0.1
# export LDFLAGS="-ltinfo"
# ./configure --prefix=/opt/pbs --libexecdir=/opt/pbs/libexec
Executing make & make install
Next, execute the commands make and make install. To avoid an error when executing the make command, set LDFLAGS as follows.
# export LDFLAGS="${LDFLAGS} -L/usr/lib64 -lfontconfig -lXft"
# make
# make install
Confirmation
Execute the following command. If the version number is displayed, the installation was successful.
# /opt/pbs/bin/pbs_hostn --version
pbs_version = 20.0.1
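When the check is scripted (for example across several nodes), the version can be extracted from that output line. This is a minimal sketch that assumes the output has the form `pbs_version = X` shown above; `pbs_version_of` is a hypothetical helper.

```shell
#!/bin/bash
# pbs_version_of
# Hypothetical helper: reads "pbs_version = X" on stdin and prints X.
# Lines that do not match are ignored.
pbs_version_of() {
    sed -n 's/^pbs_version[[:space:]]*=[[:space:]]*//p'
}
```

For example, `/opt/pbs/bin/pbs_hostn --version | pbs_version_of` can be compared with the expected version string in a test such as `[ "$(...)" = "20.0.1" ]`.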
Post-install settings
Executing Post-install script
Execute the following command on each node.
# /opt/pbs/libexec/pbs_postinstall
Then the file /etc/pbs.conf is generated. The environment variable PBS_HOME is set to /var/spool/pbs, and the configuration files and log directories are generated.
Next, set the permissions of these two executable files as follows.
# chmod 4755 /opt/pbs/sbin/pbs_iff /opt/pbs/sbin/pbs_rcp
Settings of /etc/pbs.conf
/etc/pbs.conf is a file that describes the operation and role of the local host. See the PBS Pro document "PBS Professional Installation & Upgrade Guide" for the roles of a host.
- PBS_SERVER describes the hostname of the master node.
- PBS_START_* describes the role of the host.
On the master node, the SERVER, SCHEDULER and COMMUNICATION roles are required, so set the variables as follows.
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
In this article, jobs are also executed on the master node, so set the variable PBS_START_MOM to 1.
PBS_START_MOM=1
On a separate compute node, set the three variables PBS_START_SERVER, PBS_START_SCHED and PBS_START_COMM to 0 and set PBS_START_MOM=1.
# cat /etc/pbs.conf
PBS_SERVER=master
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_COMM=1
PBS_START_MOM=1
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
PBS_CORE_LIMIT=unlimited
PBS_SCP=/usr/bin/scp
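The difference between the master node and a compute-only node comes down to the PBS_START_* values. As a minimal sketch of that rule, the hypothetical helper `print_pbs_conf` below prints the role-dependent part of the file; the role names "master" and "compute" are labels chosen for this example, and the paths are the ones used in this article.

```shell
#!/bin/bash
# print_pbs_conf ROLE SERVER
# Hypothetical helper: prints pbs.conf settings for ROLE, which is either
# "master" (server, scheduler, comm and MoM) or "compute" (MoM only).
# SERVER is the hostname of the master node.
print_pbs_conf() {
    local role="$1" server="$2" daemons
    case "$role" in
        master)  daemons=1 ;;
        compute) daemons=0 ;;
        *) echo "unknown role: $role" >&2; return 1 ;;
    esac
    cat <<EOF
PBS_SERVER=$server
PBS_START_SERVER=$daemons
PBS_START_SCHED=$daemons
PBS_START_COMM=$daemons
PBS_START_MOM=1
PBS_EXEC=/opt/pbs
PBS_HOME=/var/spool/pbs
EOF
}
```

For example, `print_pbs_conf compute master` shows what a compute node's file would contain. Review the output before writing it to /etc/pbs.conf, since pbs_postinstall may already have added other settings there.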
Settings of server_priv and mom_priv
server_priv/nodes lists the hostnames of the nodes to be added. When adding more nodes, add each one in the same way, using a hostname that can be resolved. server_priv is configured only on the node in charge of the server role, that is, the master node.
# cat /var/spool/pbs/server_priv/nodes
master np=1 allnodes
The settings of mom_priv/config are as follows:
# cat /var/spool/pbs/mom_priv/config
$clienthost master
$restrict_user_maxsysid 999
Write the hostname of the master node to the right of $clienthost.
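One pitfall when generating this file from a script is that the directive names start with `$`, so the shell expands them as (empty) variables unless they are quoted. A minimal sketch, with `print_mom_config` as a hypothetical helper and the same two directives as above:

```shell
#!/bin/bash
# print_mom_config SERVER
# Hypothetical helper: prints a mom_priv/config that names SERVER as the
# client host. Single quotes keep the literal "$clienthost" and
# "$restrict_user_maxsysid" from being expanded by the shell.
print_mom_config() {
    local server="$1"
    printf '$clienthost %s\n' "$server"
    printf '$restrict_user_maxsysid %s\n' 999
}
```

For example, `print_mom_config master` prints the two lines shown in the listing above, and the result can be redirected to /var/spool/pbs/mom_priv/config on each node.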
Launch
Start Database Server
Load the configuration of postgresql-13 and start the database service.
# emerge --config dev-db/postgresql:13
# /etc/init.d/postgresql-13 start
# rc-service postgresql-13 status
* Checking PostgreSQL 13 status ...
pg_ctl: server is running (PID: XXXX)
/usr/lib64/postgresql-13/bin/postgres "-D" "/etc/postgresql-13"
...
If you want to start the service when the OS boots, execute the following command.
# rc-update add postgresql-13 default
Start the Service
Execute the pbs_habitat command, which sets the initial configuration, and then start the PBS service.
# /opt/pbs/libexec/pbs_habitat
***
*** Setting default queue and resource limits.
***
*** End of /opt/pbs/libexec/pbs_habitat
# rc-service pbs start
Starting PBS
/opt/pbs/sbin/pbs_comm ready (pid=XXXX), Proxy Name:master:17001, Threads:4
PBS comm
PBS sched
Connecting to PBS dataservice...connected to PBS dataservice@master
PBS server
Register a Node
Use the qmgr command to register and manage nodes.
# /opt/pbs/bin/qmgr -c "create node master" #-> node registration
# rc-service pbs restart #-> service restart
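With more than one node, the same `create node` command is repeated per hostname. The sketch below only generates the command lines; `node_create_commands` is a hypothetical helper, node01 and node02 are placeholder hostnames, and piping the output into qmgr assumes that qmgr reads its directives from standard input.

```shell
#!/bin/bash
# node_create_commands HOST...
# Hypothetical helper: prints one qmgr "create node" directive per HOST.
# It does not contact the PBS server itself.
node_create_commands() {
    local host
    for host in "$@"; do
        printf 'create node %s\n' "$host"
    done
}
```

For example, inspect the output of `node_create_commands node01 node02` first, then feed it to `/opt/pbs/bin/qmgr` once it looks right.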
Check the Startup Status
The pbsnodes command displays the current status of the nodes. If it says state = free, PBS has started successfully. In this state, the node is ready to receive and run jobs.
$ pbsnodes -a
master
Mom = master
ntype = PBS
state = free
pcpus = 1
resources_available.arch = linux
resources_available.host = master
resources_available.mem = 2033580kb
resources_available.ncpus = 1
resources_available.vnode = master
resources_assigned.accelerator_memory = 0kb
resources_assigned.hbmem = 0kb
resources_assigned.mem = 0kb
resources_assigned.naccelerators = 0
resources_assigned.ncpus = 0
resources_assigned.vmem = 0kb
resv_enable = True
sharing = default_shared
license = l
last_state_change_time = Sat Sep 25 10:05:51 2021
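On a cluster with many nodes, the full listing gets long, and a one-line summary per node is easier to scan. This is a minimal sketch that assumes the output layout shown above (a line with only the node name, followed by `attribute = value` lines); `node_states` is a hypothetical helper.

```shell
#!/bin/bash
# node_states
# Hypothetical helper: reads "pbsnodes -a" style output on stdin and prints
# "NODE STATE" for every node. A line with a single field is taken as the
# node name; "state = X" lines supply the state.
node_states() {
    awk '
        NF == 1 { node = $1 }
        $1 == "state" && $2 == "=" { print node, $3 }
    '
}
```

For example, `pbsnodes -a | node_states` prints one line per node, and adding `| grep -v ' free$'` lists only the nodes that are not ready.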
Conclusion
This article described the procedure for installing the job scheduler OpenPBS on Gentoo Linux. Most of the dependencies are installed with Portage, and some (Tcl/Tk) are installed manually to provide the static library files. OpenPBS can then be built and installed by pointing the build at those library files.
References
- Altair PBS Works | Support and Documentation
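- ジョブスケジューラPBS ProでGPU計算クラスタを組みAIを効率的に学習させる方法(前編) (How to build a GPU compute cluster with the job scheduler PBS Pro and train AI efficiently, Part 1) – Qiita [Japanese]
- CentOS7でOpenPBSの設定 (Configuring OpenPBS on CentOS 7) – Qiita [Japanese]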
Note
This article is translated from a Japanese article on my website.