QoS

Slurm Training’15

Salvador Martin & Jordi Blasco (HPCNow!)


Agenda

1 QoS Configuration
    Job Scheduling Priority
    Job Preemption
    Job Limits
    Other QOS features
2 Questions

QoS Configuration

Quality of Service
QOSs are defined in the Slurm database using the sacctmgr utility.
The quality of service associated with a job affects the job in
the following ways:
• Job Scheduling Priority
• Job Preemption
• Job Limits
• Other QOS Options
Jobs request a QOS using the --qos= option to the sbatch, salloc,
and srun commands.
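
As a minimal sketch, a batch script can request a QOS with an #SBATCH directive, which has the same effect as passing --qos= on the command line. The QOS name NameOfQoS, the resource requests, and the report_qos helper are placeholders for illustration. SLURM_JOB_QOS is assumed to be exported by the installed Slurm version, so the script falls back to "unknown" when the variable is not set.

```shell
#!/bin/bash
#SBATCH --qos=NameOfQoS        # hypothetical QOS name; it must exist and be permitted for the user
#SBATCH --job-name=qos-demo    # illustrative job name
#SBATCH --ntasks=1
#SBATCH --time=00:10:00

# Report which QOS the job was given. SLURM_JOB_QOS is an assumption about the
# Slurm version in use; outside a job the variable is unset.
report_qos() {
    echo "Running under QOS: ${SLURM_JOB_QOS:-unknown}"
}

report_qos
```

Saved as job.sh (a hypothetical file name), the script would be submitted with sbatch job.sh; the command-line form of the same request is sbatch --qos=NameOfQoS job.sh.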

Job Scheduling Priority


Job scheduling priority is made up of a number of factors as
described in the priority/multifactor plugin. Each QOS is defined in
the SLURM database and includes an associated priority. Jobs that
request and are permitted a QOS will incorporate the priority
associated with that QOS in the job’s priority calculation.
• To enable the QOS priority component of the multi-factor
priority calculation, the PriorityWeightQOS configuration
parameter must be defined in the slurm.conf file and assigned
an integer value greater than zero.
• A job’s QOS only affects its scheduling priority when the
multi-factor plugin is loaded.
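
To make the QOS component concrete, here is a small worked calculation. It assumes that the multifactor plugin normalizes each QOS priority against the highest priority of any QOS and multiplies the result by PriorityWeightQOS. The weight 1000 and the priorities 100 and 200 are illustrative values, and qos_priority_component is a hypothetical helper name.

```shell
#!/bin/bash
# Relevant slurm.conf settings for this sketch (illustrative values):
#   PriorityType=priority/multifactor
#   PriorityWeightQOS=1000

# qos_priority_component WEIGHT QOS_PRIORITY MAX_QOS_PRIORITY
# Prints the QOS contribution to the job priority, truncated to an integer.
qos_priority_component() {
    awk -v w="$1" -v p="$2" -v max="$3" \
        'BEGIN { printf "%d\n", w * (p / max) }'
}

qos_priority_component 1000 100 200   # QOS with priority 100 when the highest is 200
qos_priority_component 1000 200 200   # the highest-priority QOS receives the full weight
```

With PriorityWeightQOS left at zero the product is always zero, which is why the parameter must be set to a positive integer for the QOS to matter.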

Job Preemption based on QoS


The preemption method is determined by the PreemptType
configuration parameter defined in slurm.conf. When the
PreemptType is set to preempt/qos, a queued job’s QOS will be
used to determine whether it can preempt a running job.

Section 5 of this training will cover Job Preemption in detail.

Job Limits
Each QoS is assigned a set of limits which will be applied to the
job. The limits mirror the limits imposed by the
user/account/cluster/partition association defined in the Slurm
database and described in the Resource Limits section. When limits
for a QoS have been defined, they will take precedence over the
association’s limits.
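
The precedence rule can be sketched as a small lookup: if the QOS defines a limit, it wins; otherwise the association's limit applies. This is a simplified model with a hypothetical helper name (effective_limit), not a description of how slurmctld is implemented. An empty string stands for "limit not set".

```shell
#!/bin/bash
# effective_limit QOS_LIMIT ASSOC_LIMIT
# Prints the limit that would apply to a job under this simplified model.
effective_limit() {
    local qos_limit="$1" assoc_limit="$2"
    if [ -n "$qos_limit" ]; then
        echo "$qos_limit"          # a limit defined on the QOS takes precedence
    elif [ -n "$assoc_limit" ]; then
        echo "$assoc_limit"        # otherwise fall back to the association
    else
        echo "unlimited"           # neither level defines the limit
    fi
}

effective_limit 100 500   # QOS limit set: prints 100
effective_limit "" 500    # no QOS limit: prints 500
effective_limit "" ""     # neither set: prints unlimited
```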

QoS Job Limits

• GrpCpus The total count of CPUs able to be used at any given
time from jobs running from this QOS.
• GrpCPUMins A hard limit of CPU minutes to be used by jobs
running from this QOS. If this limit is reached, all jobs running
in this group will be killed, and no new jobs will be allowed to
run.
• GrpCPURunMins Maximum number of CPU minutes all jobs
running with this QOS can run at the same time. This takes
into consideration the time limit of running jobs. If the limit is
reached, no new jobs are started until other jobs finish to allow
time to free up.
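
The GrpCPURunMins check is the least intuitive of these limits, so a worked sketch may help. It assumes that each running job is charged its CPU count multiplied by its remaining time limit in minutes, and that a new job starts only if the combined total stays within the limit. The helper names, the CPUS:MINUTES notation, and all numbers are illustrative.

```shell
#!/bin/bash
# cpu_run_mins "CPUS:REMAINING_MINUTES" ...
# Sums CPUs * remaining minutes over the given jobs.
cpu_run_mins() {
    local total=0 job cpus mins
    for job in "$@"; do
        cpus="${job%%:*}"          # text before the colon
        mins="${job##*:}"          # text after the colon
        total=$(( total + cpus * mins ))
    done
    echo "$total"
}

# can_start LIMIT NEW_JOB RUNNING_JOB...
# Succeeds if the new job fits under the limit together with the running jobs.
can_start() {
    local limit="$1"; shift
    [ "$(cpu_run_mins "$@")" -le "$limit" ]
}

limit=10000                       # illustrative GrpCPURunMins value
running=("16:300" "8:500")        # 4800 + 4000 = 8800 CPU-minutes committed
cpu_run_mins "${running[@]}"      # prints 8800

if can_start "$limit" "4:400" "${running[@]}"; then   # 8800 + 1600 = 10400
    echo "new job starts"
else
    echo "new job pends"
fi
```

In this model the 4-CPU job would have to wait until the running jobs have consumed enough of their time limits to bring the total back under 10000.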

QoS Job Limits

• GrpJobs The total number of jobs able to run at any given
time from this QOS.
• GrpMemory The total amount of memory (MB) able to be
used at any given time from jobs running from this QOS.
• GrpNodes The total count of nodes able to be used at any
given time from jobs running from this QOS. If this limit is
reached, new jobs will be queued but only allowed to run after
resources have been relinquished from this group [1].

[1] Each job’s node allocation is counted separately (i.e. if a single node has
resources allocated to two jobs, this is counted as two allocated nodes).
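
The per-job counting rule for GrpNodes can be illustrated with a short sketch. The node names and the grp_nodes_in_use helper are hypothetical; each argument is the comma-separated node list of one running job.

```shell
#!/bin/bash
# grp_nodes_in_use NODELIST_OF_JOB...
# Counts nodes job by job, so a node shared by two jobs is counted twice.
grp_nodes_in_use() {
    local total=0 job nodes
    for job in "$@"; do
        IFS=',' read -r -a nodes <<< "$job"   # split one job's node list
        total=$(( total + ${#nodes[@]} ))
    done
    echo "$total"
}

# Two jobs sharing node02: 2 nodes + 1 node = 3 allocated nodes.
grp_nodes_in_use "node01,node02" "node02"
```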
QoS Job Limits

• GrpSubmitJobs The total number of jobs able to be
submitted to the system at any given time from this QOS. If
this limit is reached, new submission requests will be denied
until previous jobs complete from this group.
• GrpWall The maximum wall clock time that jobs running in
this QOS can accumulate in aggregate. If this limit is reached,
submission requests will be denied and the running jobs will be
killed.

QoS Job Limits per Job

• MaxCpusPerJob The maximum size in CPUs any given job
can have from this QOS.
• MaxCPUMinsPerJob Maximum number of CPU*minutes
any job with this QOS can run.
• MaxNodesPerJob The maximum size in nodes any given job
can have from this QOS.
• MaxWallDurationPerJob The maximum wall clock time any
job submitted to this QOS can run for.
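
The four per-job limits can be checked together, as in the following sketch. It assumes a job's CPU-minutes are estimated as its CPU count multiplied by its requested wall time in minutes. The limit values and the check_job helper are illustrative, not Slurm defaults.

```shell
#!/bin/bash
# Illustrative per-job limits for one QOS.
MAX_CPUS_PER_JOB=64
MAX_NODES_PER_JOB=4
MAX_WALL_MINUTES=1440          # MaxWallDurationPerJob expressed in minutes
MAX_CPU_MINS_PER_JOB=50000

# check_job CPUS NODES WALL_MINUTES
# Prints "ok" or the names of the limits the request would exceed.
check_job() {
    local cpus="$1" nodes="$2" wall="$3" violations=()
    if [ "$cpus" -gt "$MAX_CPUS_PER_JOB" ]; then violations+=("MaxCpusPerJob"); fi
    if [ "$nodes" -gt "$MAX_NODES_PER_JOB" ]; then violations+=("MaxNodesPerJob"); fi
    if [ "$wall" -gt "$MAX_WALL_MINUTES" ]; then violations+=("MaxWallDurationPerJob"); fi
    if [ $(( cpus * wall )) -gt "$MAX_CPU_MINS_PER_JOB" ]; then violations+=("MaxCPUMinsPerJob"); fi
    if [ "${#violations[@]}" -eq 0 ]; then
        echo "ok"
    else
        echo "exceeds: ${violations[*]}"
    fi
}

check_job 32 2 600     # 32 * 600 = 19200 CPU-minutes: prints ok
check_job 64 4 1440    # 64 * 1440 = 92160 CPU-minutes: prints exceeds: MaxCPUMinsPerJob
```

The second request stays within every individual size limit and is still over the CPU-minutes budget, which is the case MaxCPUMinsPerJob is meant to catch.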

QoS Job Limits per User

• MaxCpusPerUser Maximum number of CPUs any user with
this QOS can be allocated.
• MaxJobsPerUser Maximum number of jobs a user can run
with this QOS.
• MaxNodesPerUser Maximum number of nodes that can be
allocated to any user with this QOS.
• MaxSubmitJobsPerUser Maximum number of jobs (pending
or running) any user with this QOS can have. As the name
implies, if this limit is reached, new jobs will be denied at
submission until previous jobs from this user complete.

Other QOS Options


Flags: used by slurmctld to override or enforce certain
characteristics. Valid options are:
• DenyOnLimit If set, jobs using this QOS will be rejected at
submission time if they do not conform to the QOS ’Max’
limits. By default, jobs that go over these limits will pend until
they conform.
• EnforceUsageThreshold If set, and the QOS also has a
UsageThreshold, any jobs submitted with this QOS that fall
below the UsageThreshold will be held until their Fairshare
Usage goes above the Threshold.
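
The effect of DenyOnLimit on a job that exceeds a 'Max' limit can be sketched as a three-way decision. The submit_outcome helper is hypothetical and models only a single CPU limit; the flag list is passed as a comma-separated string.

```shell
#!/bin/bash
# submit_outcome REQUESTED_CPUS MAX_CPUS_PER_JOB FLAGS
# Prints what happens to the submission under this simplified model.
submit_outcome() {
    local requested="$1" max="$2" flags="$3"
    if [ "$requested" -le "$max" ]; then
        echo "accepted"                               # within the limit
    elif [[ ",$flags," == *",DenyOnLimit,"* ]]; then
        echo "rejected"                               # refused at submission time
    else
        echo "pending"                                # queued until it conforms
    fi
}

submit_outcome 128 64 ""              # default behaviour: prints pending
submit_outcome 128 64 "DenyOnLimit"   # prints rejected
submit_outcome 32 64 "DenyOnLimit"    # prints accepted
```

On a real cluster the flag would typically be set with a command along the lines of sacctmgr modify qos NameOfQoS set Flags=DenyOnLimit.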

Other QOS Options

• NoReserve If this flag is set and backfill scheduling is used,
jobs using this QOS will not reserve resources in the backfill
schedule’s map of resources allocated through time.
• PartitionMaxNodes If set, jobs using this QOS will be able to
override the requested partition’s MaxNodes limit.
• PartitionMinNodes If set, jobs using this QOS will be able to
override the requested partition’s MinNodes limit.
• PartitionTimeLimit If set, jobs using this QOS will be able to
override the requested partition’s TimeLimit.

Other QOS Options

• RequiresReservation If set, jobs using this QOS must
designate a reservation when submitting a job.
• GraceTime Preemption grace time to be extended to a job
which has been selected for preemption.
• UsageFactor Usage factor when running with this QOS.
• UsageThreshold A float representing the lowest fairshare of
an association allowable to run a job [2].

[2] If an association falls below this threshold and has pending jobs or submits
new jobs, those jobs will be held until the usage goes back above the threshold.
Use sshare to see current shares on the system.
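
UsageFactor is easier to picture with numbers. The sketch below assumes that the usage recorded for a job is its raw usage multiplied by the UsageFactor of its QOS, so a factor of 2 charges double and a factor of 0 charges nothing. The charged_usage helper and the figures are illustrative.

```shell
#!/bin/bash
# charged_usage RAW_CPU_SECONDS USAGE_FACTOR
# Prints the usage charged to the association, rounded to an integer.
charged_usage() {
    awk -v raw="$1" -v f="$2" 'BEGIN { printf "%.0f\n", raw * f }'
}

charged_usage 3600 2     # one CPU-hour at factor 2: prints 7200
charged_usage 3600 0.5   # the same job at factor 0.5: prints 1800
charged_usage 3600 0     # factor 0 adds no usage: prints 0
```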
Some examples

Add a new QOS

sacctmgr add qos NameOfQoS MaxCpusPerUser=100

Add a QOS to a user account

sacctmgr modify user name=user01 set qos+=NameOfQoS defaultqos=NameOfQoS

Set QOS priority

sacctmgr modify qos NameOfQoS set priority=100

Set Max CPU minutes limit (60 minutes * 24 hours)

sacctmgr modify qos NameOfQoS set GrpCPUMins=1440

Set Max CPUs per group

sacctmgr modify qos NameOfQoS set GrpCpus=1000

Set Max Jobs per group

sacctmgr modify qos NameOfQoS set GrpJobs=1000

Set Max Nodes per group

sacctmgr modify qos NameOfQoS set GrpNodes=1000

Set Max Submit Jobs per group

sacctmgr modify qos NameOfQoS set GrpSubmitJobs=1000
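
The commands above can be collected into a small script that prints them for review before anything is changed. This is a sketch under stated assumptions: DRY_RUN, run, and setup_qos are hypothetical names introduced here, the limit values are the ones from the examples, and -i is sacctmgr's option to commit without asking for confirmation.

```shell
#!/bin/bash
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run
# them against a real Slurm database.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"          # show what would be executed
    else
        "$@"               # execute the command as given
    fi
}

# setup_qos QOS_NAME USER_NAME
# Creates a QOS, sets its priority, grants it to a user, and lists the result.
setup_qos() {
    local qos="$1" user="$2"
    run sacctmgr -i add qos "$qos" MaxCpusPerUser=100
    run sacctmgr -i modify qos "$qos" set priority=100
    run sacctmgr -i modify user name="$user" set qos+="$qos" defaultqos="$qos"
    run sacctmgr show qos where name="$qos" format=Name,Priority
}

setup_qos NameOfQoS user01
```

Printing first makes it easy to check names and values, since a typo in a QOS name would otherwise be committed immediately when -i is used.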

QoS Summary

• The QOSs and their associated limits are defined in the
Slurm database using the sacctmgr utility.
• The QOS will only influence job scheduling priority when the
multi-factor priority plugin is loaded and a non-zero
"PriorityWeightQOS" has been defined in the slurm.conf file.
• The QOS will only determine job preemption when the
"PreemptType" is defined as "preempt/qos" in the slurm.conf
file.
• Limits defined for a QOS (and described above) will override
the limits of the user/account/cluster/partition association.

Questions
