Install Guide (v2.9): Rocky 8.8/x86_64 + Warewulf + SLURM
Cluster Building Recipes
Legal Notice
Copyright © 2016-2022, OpenHPC, a Linux Foundation Collaborative Project. All rights reserved.
Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Contents
1 Introduction
  1.1 Target Audience
  1.2 Requirements/Assumptions
  1.3 Inputs
Appendices
  A Installation Template
  B Upgrading OpenHPC Packages
  C Integration Test Suite
  D Customization
    D.1 Adding local Lmod modules to OpenHPC hierarchy
    D.2 Rebuilding Packages from Source
  E Package Manifest
  F Package Signatures
1 Introduction
This guide presents a simple cluster installation procedure using components from the OpenHPC software
stack. OpenHPC represents an aggregation of a number of common ingredients required to deploy and
manage an HPC Linux* cluster including provisioning tools, resource management, I/O clients, develop-
ment tools, and a variety of scientific libraries. These packages have been pre-built with HPC integration
in mind while conforming to common Linux distribution standards. The documentation herein is intended
to be reasonably generic, but uses the underlying motivation of a small, 4-node stateless cluster installation
to define a step-by-step process. Several optional customizations are included and the intent is that these
collective instructions can be modified as needed for local site customizations.
Base Linux Edition: this edition of the guide highlights installation without the use of a companion con-
figuration management system and directly uses distro-provided package management tools for component
selection. The steps that follow also highlight specific changes to system configuration files that are required
as part of the cluster install process.
Unless specified otherwise, the examples presented are executed with elevated (root) privileges. The
examples also presume use of the BASH login shell, though the equivalent commands in other shells can
be substituted. In addition to specific command-line instructions called out in this guide, an alternate
convention is used to highlight potentially useful tips or optional configuration options. These tips are
highlighted via the following format:
Tip
Life is a tale told by an idiot, full of sound and fury signifying nothing. –Willy Shakes
1.2 Requirements/Assumptions
This installation recipe assumes the availability of a single head node master, and four compute nodes. The
master node serves as the overall system management server (SMS) and is provisioned with Rocky 8.8 and is
subsequently configured to provision the remaining compute nodes with Warewulf in a stateless configuration.
The terms master and SMS are used interchangeably in this guide. For power management, we assume that
the compute node baseboard management controllers (BMCs) are available via IPMI from the chosen master
host. For file systems, we assume that the chosen master server will host an NFS file system that is made
available to the compute nodes. Information is also provided for optionally mounting a parallel
file system; in this case, the parallel file system is assumed to already exist.
An outline of the physical architecture discussed is shown in Figure 1 and highlights the high-level
networking configuration. The master host requires at least two Ethernet interfaces with eth0 connected to
the local data center network and eth1 used to provision and manage the cluster backend (note that these
interface names are examples and may be different depending on local settings and OS conventions). Two
logical IP interfaces are expected to each compute node: the first is the standard Ethernet interface that
will be used for provisioning and resource management. The second is used to connect to each host’s BMC
and is used for power management and remote console access. Physical connectivity for these two logical
IP networks is often accommodated via separate cabling and switching infrastructure; however, an alternate
configuration can also be accommodated via the use of a shared NIC, which runs a packet filter to divert
management packets between the host and BMC.
In addition to the IP networking, there is an optional high-speed network (InfiniBand or Omni-Path
in this recipe) that is also connected to each of the hosts. This high speed network is used for application
message passing and optionally for parallel file system connectivity as well (e.g. to existing Lustre or BeeGFS
storage targets).
1.3 Inputs
As this recipe details installing a cluster starting from bare-metal, there is a requirement to define IP ad-
dresses and gather hardware MAC addresses in order to support a controlled provisioning process. These
values are necessarily unique to the hardware being used, and this document uses variable substitution
(${variable}) in the command-line examples that follow to highlight where local site inputs are required.
A summary of the required and optional variables used throughout this recipe is presented below. Note
that while the example definitions above correspond to a small 4-node compute subsystem, the compute
parameters are defined in array format to accommodate logical extension to larger node counts.
Prior to beginning the installation process of OpenHPC components, several additional considerations
are noted here for the SMS host configuration. First, the installation recipe herein assumes that the SMS
host name is resolvable locally. Depending on the manner in which you installed the BOS, there may be an
adequate entry already defined in /etc/hosts. If not, the following addition can be used to identify your
SMS host.
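A representative addition, assuming ${sms_name} and ${sms_ip} hold the locally chosen host name and internal IP, is:

# Add local hostname resolution for the SMS host (values are site-specific)
[sms]# echo ${sms_ip} ${sms_name} >> /etc/hosts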
While it is theoretically possible to enable SELinux on a cluster provisioned with Warewulf, doing so is
beyond the scope of this document. Even the use of permissive mode can be problematic and we therefore
recommend disabling SELinux on the master SMS host. If SELinux components are installed locally, the
selinuxenabled command can be used to determine if SELinux is currently enabled. If enabled, consult
the distro documentation for information on how to disable.
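For example, the following prints a message only when SELinux is currently enabled (selinuxenabled exits with status 0 in that case):

[sms]# selinuxenabled && echo "SELinux is currently enabled"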
Finally, provisioning services rely on DHCP, TFTP, and HTTP network protocols. Depending on the
local BOS configuration on the SMS host, default firewall rules may prohibit these services. Consequently,
this recipe assumes that the local firewall running on the SMS host is disabled. If installed, the default
firewall service should be stopped and disabled.
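A minimal sketch for disabling the firewalld service, if present, is:

# Stop and disable the default firewall service on the SMS host
[sms]# systemctl disable firewalld
[sms]# systemctl stop firewalld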
Tip
Many sites may find it useful or necessary to maintain a local copy of the OpenHPC repositories. To facilitate
this need, standalone tar archives are provided – one containing a repository of binary packages as well as any
available updates, and one containing a repository of source RPMs. The tar files also contain a simple bash
script to configure the package manager to use the local repository after download. To use, simply unpack
the tarball where you would like to host the local repository and execute the make_repo.sh script. Tar files
for this release can be found at http://repos.openhpc.community/dist/2.9
In addition to the OpenHPC package repository, the master host also requires access to the standard base
OS distro repositories in order to resolve necessary dependencies. For Rocky 8.8, the requirements are to
have access to the BaseOS, AppStream, Extras, PowerTools, and EPEL repositories for which mirrors are
freely available online:
The public EPEL repository will be enabled automatically upon installation of the ohpc-release package.
Note that this does depend on the Rocky Extras repository, which is shipped with Rocky and is typically
enabled by default. In contrast, the PowerTools repository is typically disabled in a standard install, but
can be enabled from EPEL as follows:
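A representative way to enable it, assuming the dnf-plugins-core tooling is available (the repository id may be spelled powertools or PowerTools depending on the point release), is:

# Enable the PowerTools repository
[sms]# yum -y install dnf-plugins-core
[sms]# yum config-manager --set-enabled powertools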
Tip
Many server BIOS configurations have PXE network booting configured as the primary option in the boot
order by default. If your compute nodes have a different device as the first in the sequence, the ipmitool
utility can be used to enable PXE.
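A representative invocation, assuming ${bmc_ipaddr} and ${bmc_username} are defined for the target node and the IPMI password is exported in the environment, is:

# Set PXE as the persistent boot device for a compute node BMC
[sms]# ipmitool -E -I lanplus -H ${bmc_ipaddr} -U ${bmc_username} chassis bootdev pxe options=persistent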
HPC systems rely on synchronized clocks throughout the system and the NTP protocol can be used to
facilitate this synchronization. To enable NTP services on the SMS host with a specific server ${ntp_server},
and allow this server to serve as a local time server for the cluster, issue the following:
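A minimal sketch using the chrony implementation shipped with the BOS is:

# Enable NTP services on the SMS host and point it at a site time server
[sms]# systemctl enable chronyd.service
[sms]# echo "server ${ntp_server}" >> /etc/chrony.conf
# Allow hosts on the internal network to synchronize against the SMS host
[sms]# echo "allow all" >> /etc/chrony.conf
[sms]# systemctl restart chronyd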
Tip
Note that the “allow all” option specified for the chrony time daemon allows all servers on the local network
to be able to synchronize with the SMS host. Alternatively, you can restrict access to fixed IP ranges and an
example config line allowing access to a local class B subnet is as follows:
allow 192.168.0.0/16
There are a wide variety of configuration options and plugins available for Slurm and the example config
file illustrated above targets a fairly basic installation. In particular, job completion data will be stored in a
text file (/var/log/slurm_jobcomp.log) that can be used to log simple accounting information. Sites that
desire more detailed information, or want to aggregate accounting data from multiple clusters, will likely
want to enable the database accounting back-end. This requires a number of additional local modifications
(on top of installing slurm-slurmdbd-ohpc), and users are advised to consult the online documentation for
more detailed information on setting up a database configuration for Slurm.
Tip
SLURM requires enumeration of the physical hardware characteristics for compute nodes under its control.
In particular, three configuration parameters combine to define consumable compute resources: Sockets,
CoresPerSocket, and ThreadsPerCore. The default configuration file provided via OpenHPC assumes the
nodes are named c1-c4 and are dual-socket, 8 cores per socket, and two threads per core for this 4-node
example. If this does not reflect your local hardware, please update the configuration file at
/etc/slurm/slurm.conf to match your node names and particular hardware. Be sure to run scontrol
reconfigure to notify SLURM of the changes. Note that the SLURM project provides an easy-to-use online
configuration tool, available on the SLURM website.
Other versions of this guide are available that describe installation of alternate resource management
systems, and they can be found in the docs-ohpc package.
Tip
InfiniBand networks require a subnet management service that can typically be run on either an
administrative node, or on the switch itself. The optimal placement and configuration of the subnet
manager is beyond the scope of this document, but Rocky 8.8 provides the opensm package should
you choose to run it on the master node.
With the InfiniBand drivers included, you can also enable (optional) IPoIB functionality which provides
a mechanism to send IP packets over the IB network. If you plan to mount a Lustre file system over
InfiniBand (see §3.8.4.6 for additional details), then having IPoIB enabled is a requirement for the Lustre
client. OpenHPC provides a template configuration file to aid in setting up an ib0 interface on the master
host. To use, copy the template provided and update the ${sms_ipoib} and ${ipoib_netmask} entries to
match local desired settings (alter ib0 naming as appropriate if system contains dual-ported or multiple
HCAs).
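A sketch of this process, assuming the template ships under /opt/ohpc/pub/examples/network/centos and uses the placeholder names substituted below (both are assumptions), is:

# Copy the IPoIB interface template and fill in local settings
[sms]# cp /opt/ohpc/pub/examples/network/centos/ifcfg-ib0 /etc/sysconfig/network-scripts
[sms]# perl -pi -e "s/master_ipoib/${sms_ipoib}/" /etc/sysconfig/network-scripts/ifcfg-ib0
[sms]# perl -pi -e "s/ipoib_netmask/${ipoib_netmask}/" /etc/sysconfig/network-scripts/ifcfg-ib0
# Bring the interface online
[sms]# ifup ib0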
Tip
Omni-Path networks require a subnet management service that can typically be run on either an
administrative node, or on the switch itself. The optimal placement and configuration of the subnet
manager is beyond the scope of this document, but Rocky 8.8 provides the opa-fm package should
you choose to run it on the master node.
Tip
By default, Warewulf is configured to provision over the eth1 interface and the steps below include updating
this setting to override it with a potentially alternatively-named interface specified by ${sms_eth_internal}.
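A sketch of the override, assuming the provisioning interface is defined via a device entry in /etc/warewulf/provision.conf, is:

# Override the default provisioning interface with the locally defined ${sms_eth_internal}
[sms]# perl -pi -e "s/device = eth1/device = ${sms_eth_internal}/" /etc/warewulf/provision.conf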
Tip
# Override default OS repository (optional) - set YUM_MIRROR variable to desired repo location
[sms]# export YUM_MIRROR=${BOS_MIRROR}
To access the remote repositories by hostname (and not IP addresses), the chroot environment needs to
be updated to enable DNS resolution. Assuming that the master host has a working DNS configuration in
place, the chroot environment can be updated with a copy of the configuration as follows:
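One way to do this is to copy the master host's resolver configuration into the chroot:

# Copy DNS resolution settings into the compute image
[sms]# cp -p /etc/resolv.conf $CHROOT/etc/resolv.conf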
Now, we can add additional required components to the compute instance, including resource manager
client, NTP, and development environment modules support.
# copy credential files into $CHROOT to ensure consistent uid/gids for slurm/munge at
# install. Note that these will be synchronized with future updates via the provisioning system.
[sms]# cp /etc/passwd /etc/group $CHROOT/etc
# Add Slurm client support meta-package and enable munge and slurmd
[sms]# yum -y --installroot=$CHROOT install ohpc-slurm-client
[sms]# chroot $CHROOT systemctl enable munge
[sms]# chroot $CHROOT systemctl enable slurmd
If planning to install the Intel® oneAPI compiler runtime (see §4.7), register the following additional path
(/opt/intel) to share with computes:
# (Optional) Setup NFS mount for /opt/intel if planning to install oneAPI packages
[sms]# mkdir /opt/intel
[sms]# echo "/opt/intel *(ro,no_subtree_check,fsid=12)" >> /etc/exports
[sms]# echo "${sms_ip}:/opt/intel /opt/intel nfs nfsvers=3,nodev 0 0" >> $CHROOT/etc/fstab
Details on the steps required for each of these customizations are discussed further in the following sections.
3.8.4.1 Enable InfiniBand drivers If your compute resources support InfiniBand, the following com-
mands add OFED and PSM support using base distro-provided drivers to the compute image.
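A sketch using the distro-provided package group (the group name shown is an assumption and may vary between point releases) is:

# Add InfiniBand support from the base distro into the compute image
[sms]# yum -y --installroot=$CHROOT groupinstall "Infiniband Support"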
3.8.4.2 Enable Omni-Path drivers If your compute resources support Omni-Path, the following com-
mands add OPA support using base distro-provided drivers to the compute image.
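A representative sketch, assuming the distro-provided opa-basic-tools and libpsm2 packages supply the needed support, is:

# Add Omni-Path tools and PSM2 support into the compute image
[sms]# yum -y --installroot=$CHROOT install opa-basic-tools libpsm2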
3.8.4.3 Increase locked memory limits In order to utilize InfiniBand or Omni-Path as the underlying
high speed interconnect, it is generally necessary to increase the locked memory settings for system users.
This can be accomplished by updating the /etc/security/limits.conf file and this should be performed
within the compute image and on all job submission hosts. In this recipe, jobs are submitted from the master
host, and the following commands can be used to update the maximum locked memory settings on both the
master host and the compute image:
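One illustrative approach is to append unlimited memlock entries to both copies of limits.conf:

# Update memlock settings within the compute image
[sms]# echo "* soft memlock unlimited" >> $CHROOT/etc/security/limits.conf
[sms]# echo "* hard memlock unlimited" >> $CHROOT/etc/security/limits.conf
# Update memlock settings on the master host
[sms]# echo "* soft memlock unlimited" >> /etc/security/limits.conf
[sms]# echo "* hard memlock unlimited" >> /etc/security/limits.conf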
3.8.4.4 Enable ssh control via resource manager An additional optional customization that is
recommended is to restrict ssh access on compute nodes to only allow access by users who have an active
job associated with the node. This can be enabled via the use of a pluggable authentication module (PAM)
provided as part of the Slurm package installs. To enable this feature within the compute image, issue the
following:
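A minimal sketch using the pam_slurm module is:

# Restrict ssh access on computes to users with an active job on the node
[sms]# echo "account    required     pam_slurm.so" >> $CHROOT/etc/pam.d/sshd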
3.8.4.5 Add BeeGFS To add optional support for mounting BeeGFS file systems, an additional external
yum repository provided by the BeeGFS project must be configured. In this recipe, it is assumed that
the file system is hosted by servers that are pre-existing and are not part of the install process. The
${sysmgmtd_host} should point to the server running the BeeGFS Management Service. Starting the client
service triggers a build of a kernel module, hence the kernel module development packages must be installed
first.
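A sketch of the client setup, assuming the BeeGFS repository file has already been placed under $CHROOT/etc/yum.repos.d and that package names follow the upstream convention, is:

# Install kernel development packages plus BeeGFS client packages into the compute image
[sms]# yum -y --installroot=$CHROOT install kernel-devel gcc
[sms]# yum -y --installroot=$CHROOT install beegfs-client beegfs-helperd beegfs-utils
# Point the client at the management host and enable services at boot
[sms]# perl -pi -e "s/^sysMgmtdHost.*/sysMgmtdHost = ${sysmgmtd_host}/" $CHROOT/etc/beegfs/beegfs-client.conf
[sms]# chroot $CHROOT systemctl enable beegfs-helperd beegfs-client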
3.8.4.6 Add Lustre client To add Lustre client support on the cluster, it is necessary to install the client
and associated modules on each host needing to access a Lustre file system. In this recipe, it is assumed
that the Lustre file system is hosted by servers that are pre-existing and are not part of the install process.
Outlining the variety of Lustre client mounting options is beyond the scope of this document, but the general
requirement is to add a mount entry for the desired file system that defines the management server (MGS)
and underlying network transport protocol. To add client mounts on both the master server and compute
image, the following commands can be used. Note that the Lustre file system to be mounted is identified
by the ${mgs_fs_name} variable. In this example, the file system is configured to be mounted locally as
/mnt/lustre.
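A sketch, assuming the OpenHPC-packaged lustre-client-ohpc RPM provides the client bits and using the suggested localflock mount option, is:

# Add Lustre client software to the master host and compute image
[sms]# yum -y install lustre-client-ohpc
[sms]# yum -y --installroot=$CHROOT install lustre-client-ohpc
# Add mount point and fstab entry to the compute image
[sms]# mkdir -p $CHROOT/mnt/lustre
[sms]# echo "${mgs_fs_name} /mnt/lustre lustre defaults,localflock,noauto,x-systemd.automount 0 0" >> $CHROOT/etc/fstab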
Tip
The suggested mount options shown for Lustre leverage the localflock option. This is a Lustre-specific
setting that enables client-local flock support. It is much faster than cluster-wide flock, but if you have an
application requiring cluster-wide, coherent file locks, use the standard flock attribute instead.
The default underlying network type used by Lustre is tcp. If your external Lustre file system is to be
mounted using a network type other than tcp, additional configuration files are necessary to identify the de-
sired network type. The example below illustrates creation of modprobe configuration files instructing Lustre
to use an InfiniBand network with the o2ib LNET driver attached to ib0. Note that these modifications
are made to both the master host and compute image.
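A representative pair of modprobe entries is:

# Instruct Lustre to use the o2ib LNET driver on ib0 (master host and compute image)
[sms]# echo "options lnet networks=o2ib(ib0)" >> /etc/modprobe.d/lustre.conf
[sms]# echo "options lnet networks=o2ib(ib0)" >> $CHROOT/etc/modprobe.d/lustre.conf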
With the Lustre configuration complete, the client can be mounted on the master host as follows:
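A representative mount sequence is:

# Mount the Lustre file system on the master host
[sms]# mkdir -p /mnt/lustre
[sms]# mount -t lustre -o localflock ${mgs_fs_name} /mnt/lustre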
3.8.4.7 Enable forwarding of system logs It is often desirable to consolidate system logging infor-
mation for the cluster in a central location, both to provide easy access to the data, and to reduce the impact
of storing data inside the stateless compute node’s memory footprint. The following commands highlight
the steps necessary to configure compute nodes to forward their logs to the SMS, and to allow the SMS to
accept these log requests.
# Disable most local logging on computes. Emergency and boot logs will remain on the compute nodes
[sms]# perl -pi -e "s/^\*\.info/\\#\*\.info/" $CHROOT/etc/rsyslog.conf
[sms]# perl -pi -e "s/^authpriv/\\#authpriv/" $CHROOT/etc/rsyslog.conf
[sms]# perl -pi -e "s/^mail/\\#mail/" $CHROOT/etc/rsyslog.conf
[sms]# perl -pi -e "s/^cron/\\#cron/" $CHROOT/etc/rsyslog.conf
[sms]# perl -pi -e "s/^uucp/\\#uucp/" $CHROOT/etc/rsyslog.conf
3.8.4.8 Add Nagios monitoring Nagios is an open source infrastructure network monitoring package
designed to watch servers, switches, and various services and offers user-defined alerting facilities for mon-
itoring various aspects of an HPC cluster. The core Nagios daemon and a variety of monitoring plugins
are provided by the underlying OS distro and the following commands can be used to install and configure
a Nagios server on the master node, and add the facility to run tests and gather metrics from provisioned
compute nodes. This simple configuration example is intended to be illustrative to walk through defining a
compute host group and enabling an ssh check for the computes. Users are encouraged to consult Nagios
documentation for more information and can install additional plugins as desired on login nodes, service
nodes, or compute hosts.
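A representative install sequence, assuming the distro/EPEL-provided nagios, nrpe, and ssh plugin packages, is:

# Install Nagios server components on the master and the NRPE client in the compute image
[sms]# yum -y install nagios nrpe nagios-plugins-ssh
[sms]# yum -y --installroot=$CHROOT install nrpe nagios-plugins-ssh
[sms]# chroot $CHROOT systemctl enable nrpe
[sms]# systemctl enable nagios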
# Copy example Nagios config file to define a compute group and ssh check
# (note: edit as desired to add all desired compute hosts)
[sms]# cp /opt/ohpc/pub/examples/nagios/compute.cfg /etc/nagios/objects
# Register the config file with nagios
[sms]# echo "cfg_file=/etc/nagios/objects/compute.cfg" >> /etc/nagios/nagios.cfg
3.8.4.9 Add ClusterShell ClusterShell is an event-based Python library to execute commands in par-
allel across cluster nodes. Installation and basic configuration defining three node groups (adm, compute,
and all) is as follows:
# Install ClusterShell
[sms]# yum -y install clustershell
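A sketch of the group definitions, assuming ${sms_name}, ${compute_prefix}, and ${num_computes} are defined as in §1.3, is:

# Define adm, compute, and all node groups for ClusterShell
[sms]# cd /etc/clustershell/groups.d
[sms]# mv local.cfg local.cfg.orig
[sms]# echo "adm: ${sms_name}" > local.cfg
[sms]# echo "compute: ${compute_prefix}[1-${num_computes}]" >> local.cfg
[sms]# echo "all: @adm,@compute" >> local.cfg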
3.8.4.10 Add genders genders is a static cluster configuration database or node typing database used
for cluster configuration management. Other tools and users can access the genders database in order to
make decisions about where an action, or even what action, is appropriate based on associated types or
"genders". Values may also be assigned to and retrieved from a gender to provide further granularity. The
following example highlights installation and configuration of two genders: compute and bmc.
# Install genders
[sms]# yum -y install genders-ohpc
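A representative /etc/genders setup using the compute node arrays from §1.3 is:

# Generate a sample genders file defining compute and bmc attributes
[sms]# echo -e "${sms_name}\tsms" > /etc/genders
[sms]# for ((i=0; i<$num_computes; i++)) ; do
          echo -e "${c_name[$i]}\tcompute,bmc=${c_bmc[$i]}"
       done >> /etc/genders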
3.8.4.11 Add Magpie Magpie contains a number of scripts to aid in running a variety of big data
software frameworks within HPC queuing environments. Examples include Hadoop, Spark, Hbase, Storm,
Pig, Mahout, Phoenix, Kafka, Zeppelin, and Zookeeper. Consult the online repository for more information
on using these scripts; basic installation is outlined as follows:
# Install magpie
[sms]# yum -y install magpie-ohpc
3.8.4.12 Add ConMan ConMan is a serial console management program designed to support a large
number of console devices and simultaneous users. It supports logging console device output and connecting
to compute node consoles via IPMI serial-over-lan. Installation and example configuration is outlined below.
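Installation of the ConMan package itself can be sketched as:

# Install conman to provide a front-end to compute consoles and log output
[sms]# yum -y install conman-ohpc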
# Configure conman for computes (note your IPMI password is required for console access)
[sms]# for ((i=0; i<$num_computes; i++)) ; do
echo -n 'CONSOLE name="'${c_name[$i]}'" dev="ipmi:'${c_bmc[$i]}'" '
echo 'ipmiopts="'U:${bmc_username},P:${IPMI_PASSWORD:-undefined},W:solpayloadsize'"'
done >> /etc/conman.conf
Note that an additional kernel boot option is typically necessary to enable serial console output. This option
is highlighted in §3.9.4 after compute nodes have been registered with the provisioning system.
3.8.4.13 Add NHC Resource managers often provide for a periodic "node health check" to be performed
on each compute node to verify that the node is working properly. Nodes which are determined to be
"unhealthy" can be marked as down or offline so as to prevent jobs from being scheduled or run on them.
This helps increase the reliability and throughput of a cluster by reducing preventable job failures due to
misconfiguration, hardware failure, etc. OpenHPC distributes NHC to fulfill this requirement.
In a typical scenario, the NHC driver script is run periodically on each compute node by the resource
manager client daemon. It loads its configuration file to determine which checks are to be run on the current
node (based on its hostname). Each matching check is run, and if a failure is encountered, NHC will exit
with an error message describing the problem. It can also be configured to mark nodes offline so that the
scheduler will not assign jobs to bad nodes, reducing the risk of system-induced job failures.
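A sketch of the installation and Slurm registration, assuming the nhc-ohpc package and the standard HealthCheckProgram hooks in slurm.conf, is:

# Install NHC on master and compute nodes
[sms]# yum -y install nhc-ohpc
[sms]# yum -y --installroot=$CHROOT install nhc-ohpc
# Register NHC as Slurm's periodic health check program (runs every five minutes)
[sms]# echo "HealthCheckProgram=/usr/sbin/nhc" >> /etc/slurm/slurm.conf
[sms]# echo "HealthCheckInterval=300" >> /etc/slurm/slurm.conf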
3.8.4.14 Add GEOPM The Global Extensible Open Power Manager (GEOPM) is a framework for
exploring power and energy optimizations targeting high performance computing. The GEOPM package
provides built-in features ranging from static management of power policy for each individual compute node,
to dynamic coordination of the power policy and performance across all compute nodes hosting an MPI
application on a portion of a distributed computing system. The dynamic coordination is implemented as a
hierarchical control system for scalable communication and decentralized control. The following commands
customize the provisioning environment to support GEOPM installation which is done in a later step in §4.4.
# Disable Intel pstate driver for compute nodes as it interferes with GEOPM's operation.
[sms]# export kargs="${kargs} intel_pstate=disable"
GEOPM uses the msr-safe kernel module to allow users read/write access to whitelisted model specific
registers (MSRs). An associated Slurm plugin ensures that MSRs modified within a user’s slurm job are
reset to their original state after job completion.
# Install msr-safe kernel module and SLURM plugin into compute image
[sms]# yum -y --installroot=$CHROOT install kmod-msr-safe-ohpc
[sms]# yum -y --installroot=$CHROOT install msr-safe-ohpc
[sms]# yum -y --installroot=$CHROOT install msr-safe-slurm-ohpc
For documentation on how to configure and use GEOPM, please see the geopm man page and tutorials
available online.
Similarly, to import the cryptographic key that is required by the munge authentication library to be available
on every host in the resource management pool, issue the following:
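For example, using Warewulf's file import facility:

# Register the munge key so it is provisioned to all compute nodes
[sms]# wwsh file import /etc/munge/munge.key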
Finally, to add optional support for controlling IPoIB interfaces (see §3.5), OpenHPC includes a template
file for Warewulf that can optionally be imported and used later to provision ib0 network settings.
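A sketch of the import, assuming the template ships under /opt/ohpc/pub/examples/network/centos, is:

# Import the IPoIB template and set its destination path on the computes
[sms]# wwsh file import /opt/ohpc/pub/examples/network/centos/ifcfg-ib0.ww
[sms]# wwsh -y file set ifcfg-ib0.ww --path=/etc/sysconfig/network-scripts/ifcfg-ib0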
# (Optional) Include drivers from kernel updates; needed if enabling additional kernel modules on computes
[sms]# export WW_CONF=/etc/warewulf/bootstrap.conf
[sms]# echo "drivers += updates/kernel/" >> $WW_CONF
# Optionally define IPoIB network settings (required if planning to mount Lustre/BeeGFS over IB)
[sms]# for ((i=0; i<$num_computes; i++)) ; do
wwsh -y node set ${c_name[$i]} -D ib0 --ipaddr=${c_ipoib[$i]} --netmask=${ipoib_netmask}
done
[sms]# wwsh -y provision set "${compute_regex}" --fileadd=ifcfg-ib0.ww
Tip
Warewulf includes a utility named wwnodescan to automatically register new compute nodes versus the
outlined node-addition approach which requires hardware MAC addresses to be gathered in advance. With
wwnodescan, nodes will be added to the Warewulf database in the order in which their DHCP requests are
received by the master, so care must be taken to boot nodes in the order one wishes to see preserved in the
Warewulf database. The IP address provided will be incremented after each node is found, and the utility
will exit after all specified nodes have been found. Example usage is highlighted below:
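A representative invocation (option names are assumptions and should be checked against wwnodescan --help) is:

# Scan for new compute nodes and register them starting at ${c_ip[0]}
[sms]# wwnodescan --netdev=${sms_eth_internal} --ipaddr=${c_ip[0]} --netmask=${internal_netmask} \
       --vnfs=rocky8.8 --bootstrap=`uname -r` --listen=${sms_eth_internal} ${c_name[0]}-${c_name[3]}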
# Increase per-user limit on the number of user namespaces that may be created
[sms]# echo "user.max_user_namespaces=15076" >> $CHROOT/etc/sysctl.conf
# rebuild VNFS
[sms]# wwvnfs --chroot $CHROOT
Tip
Typical Charliecloud workflows are based around Docker containers, but it is not strictly necessary to install
Docker itself on the HPC resource. A common pattern is to build the Docker container on a laptop or
VM and upload the result to the cluster for use with Charliecloud. More information can be found at
https://hpc.github.io/charliecloud/
If you chose to enable ConMan in §3.8.4.12, additional Warewulf configuration is needed as follows:
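A sketch enabling serial-over-LAN console parameters for the provisioned nodes (the console device name is hardware dependent and assumed here to be ttyS1) is:

# Define node kernel arguments to support SOL console
[sms]# wwsh -y provision set "${compute_regex}" --console=ttyS1,115200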
If any components have added boot-time kernel command-line arguments for the compute nodes, the
following command is required to store the configuration in Warewulf:
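For example, assuming accumulated arguments are held in ${kargs} as in §3.8.4.14:

# Store the kernel command-line arguments in the Warewulf data store
[sms]# wwsh -y provision set "${compute_regex}" --kargs="${kargs}"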
Enabling stateful nodes also requires additional site-specific, disk-related parameters in the Warewulf con-
figuration, and several example partitioning scripts are provided in the distribution.
Tip
Those provisioning compute nodes in UEFI mode will install a slightly different set of packages into the
VNFS. Warewulf also provides an example EFI filesystem layout.
Upon subsequent reboot of the modified nodes, Warewulf will partition and format the disk to host the
desired VNFS image. Once the image is installed to disk, Warewulf can be configured to use the nodes’ local
storage as the boot device.
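A sketch of the final switch to local boot (the flag value is an assumption; consult the Warewulf documentation) is:

# Configure nodes to boot from their freshly imaged local disk
[sms]# wwsh -y provision set "${compute_regex}" --bootlocal=EXIT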
Once kicked off, the boot process should take less than 5 minutes (depending on BIOS POST times) and
you can verify that the compute hosts are available via ssh, or via parallel ssh tools to multiple hosts. For
example, to run a command on the newly imaged compute hosts using pdsh, execute the following:
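For example:

# Run a command across the newly imaged compute hosts
[sms]# pdsh -w ${compute_prefix}[1-${num_computes}] uptime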
Tip
While the pxelinux.0 and lpxelinux.0 files that ship with Warewulf to enable network boot support a
wide range of hardware, some hosts may boot more reliably or faster using the BOS versions provided
via the syslinux-tftpboot package. If you encounter PXE issues, consider replacing the pxelinux.0 and
lpxelinux.0 files supplied with warewulf-provision-ohpc with versions from syslinux-tftpboot.
4.2 Compilers
OpenHPC presently packages the GNU compiler toolchain integrated with the underlying Lmod modules
system in a hierarchical fashion. The modules system will conditionally present compiler-dependent software
based on the toolchain currently loaded.
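A representative installation of the GNU toolchain packaging (the package name is assumed to follow the gnu12 convention used elsewhere in this guide) is:

# Install the GNU compiler toolchain packaged by OpenHPC
[sms]# yum -y install gnu12-compilers-ohpc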
Note that OpenHPC 2.x introduces the use of two related transport layers for the MPICH and OpenMPI
builds that support a variety of underlying fabrics: UCX (Unified Communication X) and OFI (OpenFabrics
Interfaces). In the case of OpenMPI, a monolithic build is provided which supports both transports and
end-users can customize their runtime preferences with environment variables. For MPICH, two separate
builds are provided and the example above highlighted installing the ofi variant. However, the packaging is
designed such that both versions can be installed simultaneously and users can switch between the two via
normal module command semantics. Alternatively, a site can choose to install the ucx variant instead as a
drop-in MPICH replacement:
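For example:

# Install the UCX build of MPICH for the GNU toolchain
[sms]# yum -y install mpich-ucx-gnu12-ohpc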
In the case where both MPICH variants are installed, two modules will be visible in the end-user envi-
ronment, and an example of this configuration is highlighted below.
-------------------- /opt/ohpc/pub/moduledeps/gnu12---------------------
mpich/3.4.3-ofi mpich/3.4.3-ucx (D)
If your system includes InfiniBand and you enabled underlying support in §3.5 and §3.8.4, an additional
MVAPICH2 family is available for use:
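For example:

# Install MVAPICH2 for the GNU toolchain (InfiniBand variant)
[sms]# yum -y install mvapich2-gnu12-ohpc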
Alternatively, if your system includes Intel® Omni-Path, use the (psm2) variant of MVAPICH2 instead:
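For example:

# Install the PSM2 (Omni-Path) variant of MVAPICH2 for the GNU toolchain
[sms]# yum -y install mvapich2-psm2-gnu12-ohpc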
Optionally, the GEOPM power management framework can be installed using the convenience meta-package
below. Note that GEOPM requires customization of the compute node environment to include an additional
kernel module as highlighted previously in §3.8.4.14:
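A sketch, assuming the convenience meta-package follows the ohpc-gnu12-* naming used by the other groupings, is:

# Install the GEOPM convenience meta-package (name assumed)
[sms]# yum -y install ohpc-gnu12-geopm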
Tip
If you want to change the default environment from the suggestion above, OpenHPC also provides the GNU
compiler toolchain with the MPICH and MVAPICH2 stacks:
• lmod-defaults-gnu12-mpich-ofi-ohpc
• lmod-defaults-gnu12-mpich-ucx-ohpc
• lmod-defaults-gnu12-mvapich2-ohpc
For example, libraries that do not require MPI as part of the build process adopt the following RPM
naming convention:
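The general form (illustrative) is:

package-<compiler_family>-ohpc-<package_version>-<release>.rpm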
Packages that do require MPI as part of the build expand upon this convention to additionally include the
MPI family name as follows:
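The corresponding form (illustrative) is:

package-<compiler_family>-<mpi_family>-ohpc-<package_version>-<release>.rpm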
To illustrate this further, the command below queries the locally configured repositories to identify all of
the available PETSc packages that were built with the GNU toolchain. The resulting output that is included
shows that pre-built versions are available for each of the supported MPI families presented in §4.3.
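A representative query is:

# List available PETSc builds for the GNU toolchain in the configured repositories
[sms]# yum search petsc-gnu12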
Tip
OpenHPC-provided 3rd party builds are configured to be installed into a common top-level repository so that
they can be easily exported to desired hosts within the cluster. This common top-level path (/opt/ohpc/pub)
was previously configured to be mounted on compute nodes in §3.8.3, so the packages will be immediately
available for use on the cluster after installation on the master host.
For convenience, OpenHPC provides package aliases for these 3rd party libraries and utilities that can
be used to install available libraries for use with the GNU compiler family toolchain. For parallel libraries,
aliases are grouped by MPI family toolchain so that administrators can choose a subset should they favor a
particular MPI stack. Please refer to Appendix E for a more detailed listing of all available packages in each
of these functional areas. To install all available package offerings within OpenHPC, issue the following:
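A representative subset (meta-package names assumed to follow the ohpc-gnu12-* convention) is:

# Install 3rd party library/tool meta-packages built with the GNU toolchain
[sms]# yum -y install ohpc-gnu12-serial-libs ohpc-gnu12-io-libs ohpc-gnu12-python-libs ohpc-gnu12-runtimes
# Install parallel library meta-packages for each desired MPI family
[sms]# yum -y install ohpc-gnu12-mpich-parallel-libs ohpc-gnu12-openmpi4-parallel-libs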
Tip
As noted in §3.8.3, the default installation path for OpenHPC (/opt/ohpc/pub) is exported over
NFS from the master to the compute nodes, but the Intel® oneAPI HPC Toolkit packages install to
a top-level path of /opt/intel. To make the Intel® compilers available to the compute nodes one
must add an additional NFS export for /opt/intel that is mounted on desired compute nodes.
To enable all 3rd party builds available in OpenHPC that are compatible with the Intel® oneAPI classic
compiler suite, issue the following:
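A representative subset (meta-package names assumed to follow the ohpc-intel-* convention) is:

# Install 3rd party library meta-packages built with the Intel compiler suite
[sms]# yum -y install ohpc-intel-serial-libs ohpc-intel-io-libs ohpc-intel-perf-tools
[sms]# yum -y install ohpc-intel-mpich-parallel-libs ohpc-intel-openmpi4-parallel-libs ohpc-intel-mvapich2-parallel-libs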
# Optionally, choose the Omni-Path enabled build for MVAPICH2. Otherwise, skip to retain IB variant
[sms]# yum -y install mvapich2-psm2-intel-ohpc
Warewulf installs a utility on the compute nodes to automatically synchronize known files from the
provisioning server at five minute intervals. In this recipe, recall that we previously registered credential files
with Warewulf (e.g. passwd, group, and shadow) so that these files would be propagated during compute
node imaging. However, with the addition of a new “test” user above, these files are now out of date and we
need to update the Warewulf database to incorporate the additions. This re-sync process can be accomplished
as follows:
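For example:

# Resync credential files registered with Warewulf
[sms]# wwsh file resync passwd shadow group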
Tip
After re-syncing to notify Warewulf of file modifications made on the master host, it should take approximately
5 minutes for the changes to propagate. However, you can also manually pull the changes from compute nodes
via the following:
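A representative pull, assuming pdsh and the default Warewulf client path, is:

# Force compute nodes to fetch updated files immediately
[sms]# pdsh -w ${compute_prefix}[1-${num_computes}] /warewulf/bin/wwgetfiles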
Tip
The following table provides approximate command equivalences between SLURM and OpenPBS:
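A few representative equivalences:

SLURM               OpenPBS            Purpose
sbatch [script]     qsub [script]      Submit a batch job script
squeue              qstat              List queued/running jobs
scancel [job id]    qdel [job id]      Cancel a queued or running job
sinfo               pbsnodes -a        Show node/partition status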
prun ./a.out
Tip
The use of the %j option in the example batch job script shown is a convenient way to track application output
on an individual job basis. The %j token is replaced with the Slurm job allocation number once assigned
(job #339 in this example).
Appendices
A Installation Template
This appendix highlights the availability of a companion installation script that is included with OpenHPC
documentation. This script, when combined with local site inputs, can be used to implement a starting
recipe for bare-metal system installation and configuration. This template script is used during validation
efforts to test cluster installations and is provided as a convenience for administrators as a starting point for
potential site customization.
Tip
Note that the template script provided is intended for use during initial installation and is not designed for
repeated execution. If modifications are required after using the script initially, we recommend running the
relevant subset of commands interactively.
The template script relies on the use of a simple text file to define local site variables that were outlined
in §1.3. By default, the template installation script attempts to use local variable settings sourced from
the /opt/ohpc/pub/doc/recipes/vanilla/input.local file, however, this choice can be overridden by
the use of the ${OHPC_INPUT_LOCAL} environment variable. The template install script is intended for
execution on the SMS master host and is installed as part of the docs-ohpc package into /opt/ohpc/pub/
doc/recipes/vanilla/recipe.sh. After enabling the OpenHPC repository and reviewing the guide for
additional information on the intent of the commands, the general starting approach for using this template
is as follows:
2. Copy the template installation script to a local working directory as a starting point for site customization:
[sms]# cp -p /opt/ohpc/pub/doc/recipes/rocky8/x86_64/warewulf/slurm/recipe.sh .
1. (Optional) Ensure repo metadata is current (on head node and in chroot location(s)). Package man-
agers will naturally do this on their own over time, but if you want to access updates immediately
after a new release, the following can be used to sync to the latest.
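For example:

# Expire cached repo metadata on the SMS host and within the compute chroot
[sms]# yum clean expire-cache
[sms]# yum --installroot=$CHROOT clean expire-cache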
4. Rebuild image(s)
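For example, to rebuild the VNFS after updating packages within the chroot:

[sms]# wwvnfs --chroot $CHROOT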
In the case where packages were upgraded within the chroot compute image, you will need to reboot the
compute nodes when convenient to enable the changes.
The RPM installation creates a user named ohpc-test to house the test suite and provide an isolated
environment for execution. Configuration of the test suite is done using standard GNU autotools semantics
and the BATS shell-testing framework is used to execute and log a number of individual unit tests. Some
tests require privileged execution, so a different combination of tests will be enabled depending on which user
executes the top-level configure script. Non-privileged tests requiring execution on one or more compute
nodes are submitted as jobs through the SLURM resource manager. The tests are further divided into
“short” and “long” run categories. The short run configuration is a subset of approximately 180 tests to
demonstrate basic functionality of key components (e.g. MPI stacks) and should complete in 10-20 minutes.
The long run (around 1000 tests) is comprehensive and can take an hour or more to complete.
Most components can be tested individually, but a default configuration is set up to enable collective
testing. To test an isolated component, use the configure option to disable all tests, then re-enable the
desired test to run. The --help option to configure will display all possible tests. By default, the test
suite will endeavor to run tests for multiple MPI stacks where applicable. To restrict tests to only a subset
of MPI families, use the --with-mpi-families option (e.g. --with-mpi-families="openmpi4"). Example
output is shown below (some output is omitted for the sake of brevity).
[sms]# su - ohpc-test
[test@sms ~]$ cd tests
[test@sms ~]$ ./configure --disable-all --enable-fftw
checking for a BSD-compatible install... /bin/install -c
checking whether build environment is sane... yes
...
---------------------------------------------- SUMMARY ---------------------------------------------
Many OpenHPC components exist in multiple flavors to support multiple compiler and MPI runtime
permutations, and the test suite takes this into account by iterating through these combinations by default.
If make check is executed from the top-level test directory, all configured compiler and MPI permutations
of a library will be exercised. The following highlights the execution of the FFTW related tests that were
enabled in the previous step.
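For example:

[test@sms tests]$ make check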
D Customization
D.1 Adding local Lmod modules to OpenHPC hierarchy
Locally installed applications can easily be integrated into OpenHPC systems by following the Lmod con-
vention laid out by the provided packages. Two sample module files are included in the examples-ohpc
package—one representing an application with no compiler or MPI runtime dependencies, and one depen-
dent on OpenMPI and the GNU toolchain. Simply copy these files to the prescribed locations, and the lmod
application should pick them up automatically.
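A sketch of the copy step, assuming the example module files live under /opt/ohpc/pub/examples and using illustrative application names and hierarchy paths, is:

# Install a compiler/MPI-independent example module into the local hierarchy
[sms]# mkdir -p /opt/ohpc/pub/modulefiles/example1
[sms]# cp /opt/ohpc/pub/examples/example.modulefile /opt/ohpc/pub/modulefiles/example1/1.0
# Install an OpenMPI/GNU-dependent example module into the matching moduledeps tree
[sms]# mkdir -p /opt/ohpc/pub/moduledeps/gnu12-openmpi4/example2
[sms]# cp /opt/ohpc/pub/examples/example-mpi-dependent.modulefile /opt/ohpc/pub/moduledeps/gnu12-openmpi4/example2/1.0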
D.2 Rebuilding Packages from Source
# Rebuild binary RPM. Note that additional directives can be specified to modify build
[test@sms ~/rpmbuild/SPECS]$ rpmbuild -bb --define "OHPC_CFLAGS '-O3 -mtune=native'" \
          --define "OHPC_CUSTOM_DELIM static" fftw.spec
--------------------------/opt/ohpc/pub/moduledeps/gnu9-openmpi4 ---------------------------
fftw/3.3.8-static fftw/3.3.8 (D)
E Package Manifest
This appendix provides a summary of available meta-package groupings and all of the individual RPM
packages that are available as part of this OpenHPC release. The meta-packages provide a mechanism to
group related collections of RPMs by functionality and provide a convenience mechanism for installation. A
list of the available meta-packages and a brief description is presented in Table 2.
What follows next in this Appendix is a series of tables that summarize the underlying RPM packages
available in this OpenHPC release. These packages are organized by groupings based on their general
functionality and each table provides information for the specific RPM name, version, brief summary, and
the web URL where additional information can be obtained for the component. Note that many of the 3rd
party community libraries that are pre-packaged with OpenHPC are built using multiple compiler and MPI
families. In these cases, the RPM package name includes delimiters identifying the development environment
for which each package build is targeted. Additional information on the OpenHPC package naming scheme
is presented in §4.6. The relevant package groupings and associated Table references are as follows:
F Package Signatures
All of the RPMs provided via the OpenHPC repository are signed with a GPG signature. By default, the
underlying package managers will verify these signatures during installation to ensure that packages have
not been altered. The RPMs can also be manually verified and the public signing key fingerprint for the
latest repository is shown below:
Fingerprint: 5392 744D 3C54 3ED5 7847 65E6 8A30 6019 DA565C6C
The following command can be used to verify an RPM once it has been downloaded locally by confirming
if the package is signed, and if so, indicating which key was used to sign it. The example below highlights
usage for a local copy of the docs-ohpc package and illustrates how the key ID matches the fingerprint
shown above.
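For example:

# Verify the signature of a locally downloaded docs-ohpc RPM
[sms]# rpm --checksig -v docs-ohpc-*.rpm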