
Project Report

on
Deploying HPC Cluster
On Containers

Submitted in partial fulfillment for the award of


Post Graduate Diploma in High Performance Computing
System Administration from C-DAC ACTS (Pune)

Guided by:

Mr. Ashutosh Das


Presented by:

Mr. Ravindra Singh (230340127048)

Mr. Vikas Kumar (230340127053)

Mr. Jitendra Kumar (230340127037)

Mr. Madhu Sen (230340127003)

Mr. Rajnikant Kundan (230340127046)

Centre of Development of Advanced Computing (C-DAC), Pune


CERTIFICATE

TO WHOMSOEVER IT MAY CONCERN

This is to certify that

Mr. Ravindra Singh

Mr. Vikas Kumar

Mr. Jitendra Kumar

Mr. Madhu Sen

Mr. Rajnikant Kundan

have successfully completed their project on

Deploying HPC Cluster


On Containers

Under the Guidance of Mr. Ashutosh Das

Project Guide                    Project Supervisor                    HOD, ACTS (PG-DHPCSA)

ACKNOWLEDGEMENT

This project, “Deploying HPC Cluster on Containers”, was a great learning experience for us, and we are submitting this work to the Advanced Computing Training School (C-DAC ACTS).

We are very glad to mention the name of Mr. Ashutosh Das for his valuable guidance on this project, which helped us overcome various obstacles and intricacies during the course of the project work.

We are highly grateful to the HPC technical team (ACTS Training Centre, C-DAC) for their valuable guidance and support whenever necessary while doing the Post Graduate Diploma in High Performance Computing System Administration (PG-DHPCSA) course through C-DAC ACTS, Pune.

Our most heartfelt thanks go to Ms. Swati Salunkhe (Course Coordinator, PG-DHPCSA), who gave all the required support and kind coordination, providing all the necessities such as the required hardware, internet facility and extra lab hours to complete the project and to get through the course up to the last day here at C-DAC ACTS, Pune.

From:

Mr. Ravindra Singh (230340127048)

Mr. Vikas Kumar (230340127053)

Mr. Jitendra Kumar (230340127037)

Mr. Madhu Sen (230340127003)

Mr. Rajnikant Kundan (230340127046)


TABLE OF CONTENTS

1. Introduction

2. Workflow

3. System Requirements

a. Software
b. Hardware
4. Setting up the Master

5. Installation of Warewulf

a. Setting up the Node


b. Provisioning of the Node
6. Installation of SLURM

7. Installation of Ganglia

8. Troubleshooting

9. References and Bibliography

10. Project Link

Introduction

This project is about the deployment of an HPC cluster on containers using Warewulf. Ganglia is used for monitoring, while SLURM manages resources and accounting. This HPC stack uses Rocky Linux 8.8 as an alternative to CentOS 7.9.

Workflow

System Requirements

• RAM: 16 GB
• HDD: 100 GB
• Processors: 4 cores
• Network adapters
o NAT
o Host Only

Software Requirements

• Rocky Linux (8.8)


• Warewulf
• SLURM
• Ganglia

Setting up the master

• Create a virtual machine for the master node

Step 1: Create a new virtual machine in VMware Workstation

Step 2: Click Next & choose `I will install the operating system later`

Step 3: Select Guest Operating System Type

Step 4: Give a name to your new VM

Step 5: Select the number of processors for the master

Step 6: Next step is to define the RAM for the new master machine

Step 7: Next step is to select Network Type

Step 8: Next step is to select the disk & its size

Step 9: Next step is to add another network adapter & change its type to Host-Only

Step 10: The final step is to confirm the configuration and click Finish

Installing OS on new VM

Step 1: Select the boot device, choose the Rocky Linux 8.8 ISO & click Open. Then start the virtual machine.

Step 2: After the VM has started, select the first option & press the Enter key

Step 3: Next step is to select the language & press the Continue button.

Step 4: Set up the root password and installation destination & click Begin Installation

Step 5: After the system has been installed, click Reboot System

Step 6: Next step is to accept the EULA

Step 7: Next step is to accept the agreement & click Finish Configuration

Step 8: In the next step, enter the username & password on the login screen

Step 9: Finally, we land on the Rocky Linux desktop environment

Warewulf Installation

Step 1: First, we need to disable SELinux and the firewall, and change the hostname
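For reference, a minimal sketch of these commands (the hostname "master" is an assumption; use whatever name the cluster should resolve):

    sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config   # persistent, takes effect on reboot
    setenforce 0                                                   # disable immediately for this session
    systemctl disable --now firewalld
    hostnamectl set-hostname master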

Step 2: Next step is to add the Warewulf repository

Step 3: Next step is to install Warewulf
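Assuming the repository added in Step 2 is enabled (the exact repo URL or release RPM depends on the Warewulf version in use), the install itself is a single package:

    dnf install -y warewulf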

Step 4: Next step is to edit the config file
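The file in question is /etc/warewulf/warewulf.conf. A sketch of the relevant fields, assuming the Host-Only network is 192.168.200.0/24 with the master on 192.168.200.1 (adjust to the actual second adapter):

    ipaddr: 192.168.200.1
    netmask: 255.255.255.0
    network: 192.168.200.0
    dhcp:
      enabled: true
      range start: 192.168.200.50
      range end: 192.168.200.99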

Step 5: Next step is to configure Warewulf

Step 6: Next we need to start the services required for provisioning
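A sketch of Steps 5 and 6, using the default service names on Rocky 8 (wwctl configure --all already writes the DHCP, TFTP and NFS configuration from warewulf.conf, so the systemctl line is largely a sanity check):

    wwctl configure --all
    systemctl enable --now warewulfd dhcpd tftp.socket nfs-server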

Step 7: Next step is to import a container from Docker Hub & set a password for it.
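A sketch of the import; the image reference and flags are assumptions and vary with the Warewulf version. The root password is set inside the container shell, and exiting the shell triggers a rebuild of the image:

    wwctl container import docker://ghcr.io/hpcng/warewulf-rockylinux:8 rocky-8 --setdefault
    wwctl container exec rocky-8 /bin/bash
    passwd        # run inside the container shell, then exit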

Step 8: Next step is to create a new node and test whether it boots.

Warning: The RAM given to the node VM must be greater than the container size

1. Click on Create New VM
2. Select “I will install the OS later”
3. Then choose the OS type Linux
4. Choose the number of processors
5. Give the RAM size for the node
6. Then choose the network mode (Host-Only)
7. Then give the disk size. (I've given 0.001 GB, in order to demonstrate that this process is stateless and doesn't require an HDD.)

Step 9: Add this node to Warewulf
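A sketch, assuming the node name node1, an IP address inside the DHCP range configured earlier, and the container imported above (flag syntax differs slightly between Warewulf releases):

    wwctl node add node1 --ipaddr 192.168.200.50 --discoverable true
    wwctl node set node1 --container rocky-8
    wwctl overlay build
    wwctl node list -a    # verify the node and its assigned container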

Step 10: Start the node

Installation of MUNGE

Step 1: Installing epel-release package

Step 2: Enabling PowerTools & installing the MUNGE packages
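A sketch of Steps 1 and 2 (on Rocky 8.8 the repository id is "powertools", all lowercase):

    dnf install -y epel-release
    dnf config-manager --set-enabled powertools
    dnf install -y munge munge-libs munge-devel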

Step 3: Installing the random number generator tools; this will help in generating the MUNGE key

Step 4: Starting the rngd service

Step 5: Create the MUNGE key
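A sketch of Steps 3 to 5; the generated key lands in /etc/munge/munge.key:

    dnf install -y rng-tools
    systemctl enable --now rngd
    /usr/sbin/create-munge-key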

Step 6: Copying this MUNGE key to the container

Step 7: Installing MUNGE on the container

Step 8: Copying the MUNGE key from the shared folder to the MUNGE path & changing its ownership. Finally, we have to rebuild the container.

Step 9: Checking the status of the munge service on the compute node & the master

Installation of SLURM

Step 1: First, we have to download the rpm-build & make packages

Step 2: Next step is to download the Slurm packages

Step 3: Next step is to download the dependencies for the Slurm packages
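A sketch of Steps 1 to 3; the Slurm version and the dependency list are assumptions (use whichever release the project targets and whatever rpmbuild reports as missing):

    dnf install -y rpm-build make gcc wget
    wget https://download.schedmd.com/slurm/slurm-23.02.5.tar.bz2
    dnf install -y munge-devel pam-devel readline-devel perl-ExtUtils-MakeMaker mariadb-devel python3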

Step 4: Next step is to build the RPMs from the tar file
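The build is a single rpmbuild invocation against the downloaded tarball (file name as assumed above); the resulting RPMs land under ~/rpmbuild/RPMS/x86_64/:

    rpmbuild -ta slurm-23.02.5.tar.bz2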

Step 5: Next step is to create the slurm user to operate the Slurm services
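A sketch; the numeric UID/GID is arbitrary (an assumption) but should be identical on the master and inside the container so that file ownership matches:

    groupadd -g 981 slurm
    useradd -u 981 -g slurm -s /sbin/nologin -M slurm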

Step 6: Next step is to check the rpmbuild packages after they have been built

Step 7: Next step is to install the rpmbuild packages
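A sketch of Steps 6 and 7:

    ls ~/rpmbuild/RPMS/x86_64/
    dnf localinstall -y ~/rpmbuild/RPMS/x86_64/slurm-*.rpm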

Step 8: Next step is to copy the Slurm packages from the master to the container

Step 9: Next step is to make the directories, change ownership and give permissions to the slurm user

Step 10: Next step is to install Slurm on the container

Step 11: Next step is to edit the config file

Line no 11: give a name to this Slurm cluster
Line no 12: give the hostname of the slurmctld machine
Line no 92: comment this line
Line no 93: add the node1 information, which can be obtained using “slurmd -C”
Save & quit the config file
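A minimal sketch of the edited lines in /etc/slurm/slurm.conf; the cluster name, hostnames and node specification below are assumptions, and the NodeName line should be copied verbatim from the output of "slurmd -C" on the booted node:

    ClusterName=hpc-cluster
    SlurmctldHost=master
    #NodeName=linux[1-32] CPUs=1 State=UNKNOWN     # default example line, commented out
    NodeName=node1 CPUs=2 Boards=1 SocketsPerBoard=2 CoresPerSocket=1 ThreadsPerCore=1 RealMemory=1800 State=UNKNOWN
    PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP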

Note: to get this Slurm configuration from the node, the node has to be in a booted state. Otherwise, this command will show the master machine's configuration.

Step 12: Next step is to change ownership to slurm and restart the service

Step 13: Next step is to create the spool & log directories for slurmd on the container

Step 14: Next step is to reconfigure Warewulf
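A sketch of Steps 12 to 14; the directory paths follow typical slurm.conf settings and the container name is an assumption:

    mkdir -p /var/spool/slurmctld && chown slurm:slurm /var/spool/slurmctld
    systemctl enable --now slurmctld
    wwctl container exec rocky-8 mkdir -p /var/spool/slurmd /var/log/slurm
    wwctl container exec rocky-8 chown -R slurm:slurm /var/spool/slurmd /var/log/slurm
    wwctl configure --all
    wwctl container build rocky-8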

Step 15: Next step is to reboot the container & check all services

Step 16: Next step is to confirm whether the node has been added

Note: The node is visible to the Slurm controller
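Typical checks on the master (the node name is an assumption):

    sinfo
    scontrol show node node1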

Installation of Ganglia

Step 1: First, we have to download the Ganglia packages
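A sketch, assuming the Ganglia packages are available from EPEL or another enabled repository (on some EL8 systems they have to be built from source):

    dnf install -y ganglia ganglia-gmetad ganglia-gmond ganglia-web httpd php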

Step 2: Next step is to edit the gmetad config file

Line no 44: change the cluster name

Step 4: Next step is to edit the gmond config file

Line no 30: give the cluster name
Line no 31: give the hostname of the master machine
Line no 50: give the IP address of the master
Line no 57: comment this line
Line no 59: comment this line
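The referenced lines sit in the cluster and udp_send_channel/udp_recv_channel blocks of /etc/ganglia/gmond.conf; exact line numbers depend on the packaged default file. A sketch with assumed values (cluster name "hpc-cluster", master hostname "master", master IP 192.168.200.1):

    cluster {
      name = "hpc-cluster"        # cluster name
      owner = "master"            # master hostname, per the step above
    }
    udp_send_channel {
      host = 192.168.200.1        # send metrics to the master by unicast
      port = 8649
      # mcast_join = 239.2.11.71  # multicast lines commented out
    }
    udp_recv_channel {
      port = 8649
      # mcast_join = 239.2.11.71
      # bind = 239.2.11.71
    }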

Step 7: Next step is to start the services
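A sketch (httpd serves the Ganglia web front end):

    systemctl enable --now gmetad gmond httpd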

Step 8: Next step is to install Ganglia on the container

Step 9: Next step is to edit the gmond file on the container

Line no 30: give the cluster name
Line no 31: give the hostname of the master machine
Line no 50: give the IP address of the master
Line no 57: comment this line
Line no 59: comment this line

Step 10: Next step is to add the gmond service to the bashrc file

Step 11: Next step is to reboot the node & check the status of the gmond service

Step 12: Next step is to open a browser on the master & check the Ganglia cluster output at https://localhost/ganglia

Cluster Output:

Node Result:

Master Result:

Troubleshooting

Error 1: hub_port_status failed (err = 110)

Solution: Remove the USB controller from the VM settings

Error 2: Failed to set Locale, defaulting to C.UTF-8

Solution: Install glibc-all-langpacks to resolve this issue
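A sketch; the chosen locale is only an example:

    dnf install -y glibc-all-langpacks
    localectl set-locale LANG=en_US.UTF-8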

References & Bibliography

1. Warewulf Documentation

https://warewulf.org/docs/development/

2. Slurm Documentation

https://slurm.schedmd.com/documentation.html

Project Link

Github: https://github.com/ravi30flash/HPC-project/tree/master
