NVIDIA MIG User Guide
The new Multi-Instance GPU (MIG) feature allows GPUs (starting with NVIDIA Ampere
architecture) to be securely partitioned into up to seven separate GPU Instances for
CUDA applications, providing multiple users with separate GPU resources for optimal
GPU utilization. This feature is particularly beneficial for workloads that do not fully
saturate the GPU's compute capacity, where users may therefore want to run different
workloads in parallel to maximize utilization.
For Cloud Service Providers (CSPs), who have multi-tenant use cases, MIG ensures one
client cannot impact the work or scheduling of other clients, in addition to providing
enhanced isolation for customers.
With MIG, each instance's processors have separate and isolated paths through the
entire memory system - the on-chip crossbar ports, L2 cache banks, memory controllers,
and DRAM address busses are all assigned uniquely to an individual instance. This
ensures that an individual user's workload can run with predictable throughput and
latency, with the same L2 cache allocation and DRAM bandwidth, even if other tasks
are thrashing their own caches or saturating their DRAM interfaces. MIG can partition
available GPU compute resources (including streaming multiprocessors or SMs, and
GPU engines such as copy engines or decoders), to provide a defined quality of service
(QoS) with fault isolation for different clients such as VMs, containers or processes. MIG
enables multiple GPU Instances to run in parallel on a single, physical NVIDIA Ampere
GPU.
With MIG, users will be able to see and schedule jobs on their new virtual GPU instances
as if they were physical GPUs. MIG works with Linux operating systems and supports
containers using Docker Engine, with additional support for Kubernetes and for virtual
machines using hypervisors such as Red Hat Virtualization and VMware vSphere.
MIG supports deployment on bare metal (including containers), with GPU pass-through
virtualization to Linux guests, and with NVIDIA vGPU on supported hypervisors; the
virtualized configurations are described below.
The purpose of this document is to introduce the concepts behind MIG, deployment
considerations and provide examples of MIG management to demonstrate how users
can run CUDA applications on MIG supported GPUs.
MIG is supported on GPUs starting with the NVIDIA Ampere generation (that is, GPUs with
compute capability >= 8.0), including the A30, A100, H100, and H200 products covered in
this guide.
Additionally, MIG is supported on systems that include the supported products above
such as DGX, DGX Station and HGX.
‣ Under Linux guests on supported hypervisors, when MIG-supported GPUs are in GPU
pass-through, the same workflows, tools and profiles available on bare-metal can be
used.
‣ MIG allows multiple vGPUs (and thereby VMs) to run in parallel on a single MIG-
supported GPU, while preserving the isolation guarantees that vGPU provides. To
configure a GPU for use with vGPU VMs, refer to the corresponding chapter in the NVIDIA
vGPU Software User Guide. Refer also to the technical brief for more information on GPU
partitioning with vGPU.
5.1. Terminology
This section introduces some terminology used to describe the concepts behind MIG.
Streaming Multiprocessor
A streaming multiprocessor (SM) executes compute instructions on the GPU.
GPU Context
A GPU context is analogous to a CPU process. It encapsulates all the resources
necessary to execute operations on the GPU, including a distinct address space, memory
allocations, etc. A GPU context has the following properties:
‣ Fault isolation
‣ Individually scheduled
‣ Distinct address space
GPU Engine
A GPU engine is what executes work on the GPU. The most commonly used engine is
the Compute/Graphics engine that executes the compute instructions. Other engines
include the copy engine (CE) that is responsible for performing DMAs, NVDEC for video
decoding, NVENC for encoding, etc. Each engine can be scheduled independently and
execute work for different GPU contexts.
GPU SM Slice
A GPU SM slice is the smallest fraction of the SMs on the GPU. A GPU SM slice is roughly
one seventh of the total number of SMs available in the GPU when configured in MIG
mode.
GPU Slice
A GPU slice is the smallest fraction of the GPU that combines a single GPU memory slice
and a single GPU SM slice.
GPU Instance
A GPU Instance (GI) is a combination of GPU slices and GPU engines (DMAs, NVDECs,
etc.). Anything within a GPU instance always shares all the GPU memory slices and other
GPU engines, but its SM slices can be further subdivided into compute instances (CI).
A GPU instance provides memory QoS. Each GPU slice includes dedicated GPU memory
resources which limit both the available capacity and bandwidth, and provide memory
QoS. Each GPU memory slice gets 1/8 of the total GPU memory resources and each GPU
SM slice gets 1/7 of the total number of SMs.
Compute Instance
A GPU instance can be subdivided into multiple compute instances. A Compute Instance
(CI) contains a subset of the parent GPU instance's SM slices and other GPU engines
(DMAs, NVDECs, etc.). The CIs share memory and engines.
5.2. Partitioning
Using the concepts introduced above, this section provides an overview of how the user
can create various partitions on the GPU. For illustration purposes, the document will use
the A100-40GB as an example, but the process is similar for other GPUs that support
MIG.
GPU Instance
Partitioning of the GPU happens using memory slices, so the A100-40GB GPU can be
thought of as having 8x5GB memory slices and 7 SM slices as shown in the diagram below.
As explained above, creating a GPU Instance (GI) requires combining some number
of memory slices with some number of compute slices. In the diagram below, a 5GB
memory slice is combined with 1 compute slice to create a 1g.5gb GI profile:
Similarly, 4x5GB memory slices can be combined with 4x1 compute slices to create the
4g.20gb GI profile:
Compute Instance
The compute slices of a GPU Instance can be further subdivided into multiple Compute
Instances (CI), with the CIs sharing the engines and memory of the parent GI, but each CI
has dedicated SM resources.
Using the same 4g.20gb example above, a CI may be created to consume only the first
compute slice as shown below:
In this case, 4 different CIs can be created by choosing any of the compute slices. Two
compute slices can also be combined to create a 2c.4g.20gb profile:
In this example, 3 compute slices can also be combined to create a 3c.4g.20gb profile
or all 4 can be combined to create a 4c.4g.20gb profile. When all 4 compute slices are
combined, the profile is simply referred to as the 4g.20gb profile.
Refer to the sections on the canonical naming scheme and the CUDA device
terminology.
Profile Placement
The number of slices that a GI can be created with is not arbitrary. The NVIDIA driver
APIs provide a number of “GPU Instance Profiles” and users can create GIs by specifying
one of these profiles.
On a given GPU, multiple GIs can be created from a mix and match of these profiles, so
long as enough slices are available to satisfy the request.
Note:
The table below shows the profile names on the A100-SXM4-40GB product. For A100-
SXM4-80GB, the profile names will change according to the memory proportion - for
example, 1g.10gb, 2g.20gb, 3g.40gb, 4g.40gb, 7g.80gb respectively.
For a list of all supported combinations of profiles on MIG-enabled GPUs, refer to the
section on supported profiles.
Profile Name | Fraction of Memory | Fraction of SMs | Hardware Units | L2 Cache Size | Copy Engines | Number of Instances Available
MIG 4g.20gb | 4/8 | 4/7 | 2 NVDECs, 0 JPEG, 0 OFA | 4/8 | 4 | 1
MIG 7g.40gb | Full | 7/7 | 5 NVDECs, 1 JPEG, 1 OFA | Full | 7 | 1
The diagram below shows a pictorial representation of how to build all valid combinations
of GPU instances.
In this diagram, a valid combination can be built by starting with an instance profile on
the left and combining it with other instance profiles as you move to the right, such that
no two profiles overlap vertically. For a list of all supported combinations and placements
of profiles on A100 and A30, refer to the section on supported profiles.
Note that prior to NVIDIA driver release R510, the combination of a (4 memory, 4
compute) and a (4 memory, 3 compute) profile was not supported. This restriction no
longer applies on newer drivers.
Note that the diagram represents the physical layout of where the GPU Instances will
exist once they are instantiated on the GPU. As GPU Instances are created and destroyed
at different locations, fragmentation can occur, and the physical position of one GPU
Instance will play a role in which other GPU Instances can be instantiated next to it.
Lastly, MIG is the newest form of concurrency offered by NVIDIA GPUs, addressing some
of the limitations of the other CUDA technologies for running parallel work.
Note:
Also refer to the section on device nodes and nvidia-capabilities for how MIG devices are exposed.
The /proc mechanism for system-level interfaces is deprecated as of 450.51.06 and it
is recommended to use the /dev based system-level interface for controlling access
mechanisms of MIG devices through cgroups. This functionality is available starting
with 450.80.02+ drivers.
MIG allows multiple vGPUs (and thereby VMs) to run in parallel on a single A100, while
preserving the isolation guarantees that vGPU provides. For more information on GPU
partitioning using vGPU and MIG, refer to the technical brief.
‣ Setting MIG mode on the A100/A30 requires a GPU reset (and thus super-user
privileges). Once the GPU is in MIG mode, instance management is then dynamic.
Note that the setting is on a per-GPU basis.
‣ On NVIDIA Ampere GPUs, similar to ECC mode, the MIG mode setting is persistent across
reboots until the user toggles the setting explicitly.
‣ All daemons holding handles on driver modules need to be stopped before MIG
enablement.
‣ This is true for systems such as DGX which may be running system health monitoring
services such as nvsm or GPU health monitoring or telemetry services such as
DCGM.
‣ Toggling MIG mode requires the CAP_SYS_ADMIN capability. Other MIG management,
such as creating and destroying instances, requires superuser by default, but can be
delegated to non-privileged users by adjusting permissions to MIG capabilities in /proc/.
By default, a MIG device consists of a single “GPU Instance” and a single “Compute
Instance”. The table below highlights a naming convention to refer to a MIG device by
its GPU Instance's compute slice count and its total memory in GB (rather than just its
memory slice count).
When only a single CI is created (that consumes the entire compute capacity of the GI),
then the CI sizing is implied in the device name.
Note:
The description below shows the profile names on the A100-SXM4-40GB product. For
A100-SXM4-80GB, the profile names will change according to the memory proportion - for
example, 1g.10gb, 2g.20gb, 3g.40gb, 4g.40gb, 7g.80gb respectively.
Each GI can be further sub-divided into multiple CIs as required by users depending on
their workloads. The table below highlights what the name of a MIG device would look
like in this case. The example shown is for subdividing a 3g.20gb device into a set of sub-
devices with different Compute Instance slice counts.
$ ls -l /proc/driver/nvidia-caps/
-r--r--r-- 1 root root 0 Nov 21 21:22 mig-minors
-r--r--r-- 1 root root 0 Nov 21 21:22 nvlink-minors
-r--r--r-- 1 root root 0 Nov 21 21:22 sys-minors
The corresponding device nodes (in mig-minors) are created under /dev/nvidia-caps.
Refer to the chapter on device nodes and capabilities for more information.
1. With drivers >= R470 (470.42.01+), each MIG device is assigned a GPU UUID starting
with MIG-<UUID>.
2. With drivers < R470 (e.g. R450 and R460), each MIG device is enumerated by
specifying the CI and the corresponding parent GI. The format follows this
convention: MIG-<GPU-UUID>/<GPU instance ID>/<compute instance ID>.
Note:
With the R470 NVIDIA datacenter drivers (470.42.01+), the example below shows how
MIG devices are assigned GPU UUIDs in an 8-GPU system with each GPU configured
differently.
$ nvidia-smi -L
This section provides an overview of the supported profiles and possible placements of
the MIG profiles on supported GPUs.
The table below shows the supported profiles on the A30-24GB product.
Profile Name | Fraction of Memory | Fraction of SMs | Hardware Units | L2 Cache Size | Copy Engines | Number of Instances Available
MIG 2g.12gb+me | 2/4 | 2/4 | 2 NVDECs, 1 JPEG, 1 OFA | 2/4 | 2 | 1 (A single 2g profile can include media extensions)
MIG 4g.24gb | Full | 4/4 | 4 NVDECs, 1 JPEG, 1 OFA | Full | 4 | 1
Note:
The 1g.6gb+me profile is only available starting with R470 drivers.
The 2g.12gb+me profile is only available starting with R525 drivers.
The table below shows the supported profiles on the A100-SXM4-40GB product. For
A100-SXM4-80GB, the profile names will change according to the memory proportion
- for example, 1g.10gb, 1g.10gb+me, 1g.20gb, 2g.20gb, 3g.40gb, 4g.40gb, 7g.80gb
respectively.
Note:
The 1g.5gb+me profile is only available starting with R470 drivers.
The 1g.10gb profile is only available starting with R525 drivers.
The table below shows the supported profiles on the H100 80GB product (PCIe and
SXM5).
Profile Name | Fraction of Memory | Fraction of SMs | Hardware Units | L2 Cache Size | Copy Engines | Number of Instances Available
MIG 7g.80gb | Full | 7/7 | 7 NVDECs, 7 JPEG, 1 OFA | Full | 8 | 1
The table below shows the supported profiles on the H100 94GB product (PCIe and
SXM5).
Profile Name | Fraction of Memory | Fraction of SMs | Hardware Units | L2 Cache Size | Copy Engines | Number of Instances Available
MIG 1g.11gb | 1/8 | 1/7 | 1 NVDEC, 1 JPEG, 0 OFA | 1/8 | 1 | 7
MIG 1g.11gb+me | 1/8 | 1/7 | 1 NVDEC, 1 JPEG, 1 OFA | 1/8 | 1 | 1 (A single 1g profile can include media extensions)
MIG 1g.22gb | 1/4 | 1/7 | 1 NVDEC, 1 JPEG, 0 OFA | 1/8 | 1 | 4
MIG 2g.22gb | 2/8 | 2/7 | 2 NVDECs, 2 JPEG, 0 OFA | 2/8 | 2 | 3
MIG 3g.44gb | 4/8 | 3/7 | 3 NVDECs, 3 JPEG, 0 OFA | 4/8 | 3 | 2
MIG 4g.44gb | 4/8 | 4/7 | 4 NVDECs, 4 JPEG, 0 OFA | 4/8 | 4 | 1
MIG 7g.88gb | Full | 7/7 | 7 NVDECs, 7 JPEG, 1 OFA | Full | 8 | 1
The table below shows the supported profiles on the H100 96GB product (H100 on
GH200).
Profile Name | Fraction of Memory | Fraction of SMs | Hardware Units | L2 Cache Size | Copy Engines | Number of Instances Available
MIG 1g.12gb | 1/8 | 1/7 | 1 NVDEC, 1 JPEG, 0 OFA | 1/8 | 1 | 7
MIG 1g.12gb+me | 1/8 | 1/7 | 1 NVDEC, 1 JPEG, 1 OFA | 1/8 | 1 | 1 (A single 1g profile can include media extensions)
MIG 1g.24gb | 1/4 | 1/7 | 1 NVDEC, 1 JPEG, 0 OFA | 1/8 | 1 | 4
MIG 2g.24gb | 2/8 | 2/7 | 2 NVDECs, 2 JPEG, 0 OFA | 2/8 | 2 | 3
MIG 3g.48gb | 4/8 | 3/7 | 3 NVDECs, 3 JPEG, 0 OFA | 4/8 | 3 | 2
MIG 4g.48gb | 4/8 | 4/7 | 4 NVDECs, 4 JPEG, 0 OFA | 4/8 | 4 | 1
MIG 7g.96gb | Full | 7/7 | 7 NVDECs, 7 JPEG, 1 OFA | Full | 8 | 1
The table below shows the supported profiles on the H200 141GB product.
Profile Name | Fraction of Memory | Fraction of SMs | Hardware Units | L2 Cache Size | Copy Engines | Number of Instances Available
MIG 7g.141gb | Full | 7/7 | 7 NVDECs, 7 JPEG, 1 OFA | Full | 8 | 1
9.1. Prerequisites
The following prerequisites and minimum software versions are recommended when
using supported GPUs in MIG mode.
‣ If using H100, then CUDA 12 and NVIDIA driver R525 (>= 525.53) or later
‣ If using A100/A30, then CUDA 11 and NVIDIA driver R450 (>= 450.80.02) or later
‣ Linux operating system distributions supported by CUDA
‣ If running containers or using Kubernetes, then:
‣ NVIDIA Container Toolkit (nvidia-docker2): v2.5.0 or later
‣ NVIDIA K8s Device Plugin: v0.7.0 or later
‣ NVIDIA gpu-feature-discovery: v0.2.0 or later
MIG can be managed programmatically using NVIDIA Management Library (NVML) APIs
or its command-line interface, nvidia-smi. Note that for brevity, some of the nvidia-smi
output in the following examples may be cropped to showcase the relevant sections
of interest.
For more information on the MIG commands, see the nvidia-smi man page or nvidia-smi
mig --help. For information on the MIG management APIs, see the NVML header
(nvml.h) included in the CUDA Toolkit packages (cuda-nvml-dev-*; installed under /usr/
local/cuda/include/nvml.h). For automated tooling support with configuring MIG, refer
to the NVIDIA MIG Partition Editor (or mig-parted) tool.
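As an illustrative sketch of such tooling (the configuration file name and configuration label here are assumptions, not taken from this guide), applying a named MIG configuration with the mig-parted CLI might look like:
# Hypothetical example: apply a named configuration from a mig-parted config file
$ sudo nvidia-mig-parted apply -f config.yaml -c all-1g.5gb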
$ nvidia-smi -i 0
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02 Driver Version: 450.80.02 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 A100-SXM4-40GB Off | 00000000:36:00.0 Off | 0 |
| N/A 29C P0 62W / 400W | 0MiB / 40537MiB | 6% Default |
| | | Disabled |
+-------------------------------+----------------------+----------------------+
MIG mode can be enabled on a per-GPU basis with the following command: nvidia-smi
-i <GPU IDs> -mig 1. The GPUs can be selected using comma-separated GPU indexes,
PCI bus IDs or UUIDs. If no GPU ID is specified, then MIG mode is applied to all the GPUs
on the system.
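For example, the following sketch (assuming GPU index 0 and the mig.mode query fields available in recent drivers) enables MIG mode on the first GPU and then checks the current and pending mode:
# Enable MIG mode on GPU 0 and verify the mode
$ sudo nvidia-smi -i 0 -mig 1
$ nvidia-smi -i 0 --query-gpu=mig.mode.current,mig.mode.pending --format=csv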
When MIG is enabled on the GPU, depending on the GPU product, the driver will attempt
to reset the GPU so that MIG mode can take effect.
Note:
If you are using MIG inside a VM with NVIDIA Ampere GPUs (A100 or A30) in passthrough,
then you may need to reboot the VM to allow the GPU to be in MIG mode as in some
cases, GPU reset is not allowed via the hypervisor for security reasons. This can be seen in
the following example:
In this specific DGX example, you would have to stop the nvsm and dcgm services, enable
MIG mode on the desired GPU and then restore the monitoring services:
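A minimal sketch of that sequence, assuming the services are managed by systemd under the unit names nvsm and dcgm (actual unit names can differ between systems):
# Stop monitoring services, enable MIG on the desired GPU, then restart the services
$ sudo systemctl stop nvsm dcgm
$ sudo nvidia-smi -i <GPU ID> -mig 1
$ sudo systemctl start nvsm dcgm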
The examples shown in the document use super-user privileges. As described in the
Device Nodes section, granting read access to mig/config capabilities allows non-root
users to manage instances once the GPU has been configured into MIG mode. The
default file permissions on the mig/config file are shown below.
$ ls -l /proc/driver/nvidia/capabilities/*
/proc/driver/nvidia/capabilities/mig:
total 0
-r-------- 1 root root 0 May 24 16:10 config
-r--r--r-- 1 root root 0 May 24 16:10 monitor
List the possible placements available using the following command. The syntax of the
placement is {<index>}:<GPU Slice Count> and shows the placement of the instances
on the GPU. The placement index shown indicates how the profiles are mapped on the
GPU as shown in the supported profiles tables.
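A sketch of the relevant commands, assuming the standard nvidia-smi mig options (-lgip lists the available GPU instance profiles and -lgipp lists their possible placements; output omitted here):
$ sudo nvidia-smi mig -lgip
$ sudo nvidia-smi mig -lgipp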
The command shows that the user can create two instances of type 3g.20gb (profile ID
9) or seven instances of 1g.5gb (profile ID 19).
Note:
Without creating GPU instances (and corresponding compute instances), CUDA workloads
cannot be run on the GPU. In other words, simply enabling MIG mode on the GPU is not
sufficient. Also note that the created MIG devices are not persistent across system
reboots. Thus, the user or system administrator needs to recreate the desired MIG
configurations if the GPU or system is reset. For automated tooling support for this
purpose, refer to the NVIDIA MIG Partition Editor (or mig-parted) tool, including creating a
systemd service that could recreate the MIG geometry at system startup.
The following example shows how the user can create GPU instances (and corresponding
compute instances). In this example, the user can create two GPU instances (of type
3g.20gb), with each GPU instance having half of the available compute and memory
capacity. In this example, we purposefully use profile ID and short profile name to
showcase how either option can be used:
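A sketch of such a command, assuming profile ID 9 corresponds to 3g.20gb as listed above; the -C flag also creates the default compute instance in each GPU instance:
$ sudo nvidia-smi mig -cgi 9,3g.20gb -C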
Now verify that the GIs and corresponding CIs are created:
$ nvidia-smi
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| 0 1 0 0 | 11MiB / 20224MiB | 42 0 | 3 0 2 0 0 |
+------------------+----------------------+-----------+-----------------------+
| 0 2 0 1 | 11MiB / 20096MiB | 42 0 | 3 0 2 0 0 |
+------------------+----------------------+-----------+-----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Instance Geometry
As described in the section on Partitioning, the NVIDIA driver APIs provide a number of
available GPU Instance profiles that can be chosen by the user.
If a mixed geometry of the profiles is specified by the user, then the NVIDIA driver
chooses the placement of the various profiles. This can be seen in the following
examples.
Example 1: Creation of a 4-2-1 geometry. After the instances are created, the placement
of the profiles can be observed:
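A sketch of this example, assuming the A100-40GB profile IDs 5 (4g.20gb), 14 (2g.10gb), and 19 (1g.5gb):
# Create a 4-2-1 geometry and then list the resulting GPU instances and their placements
$ sudo nvidia-smi mig -cgi 5,14,19 -C
$ sudo nvidia-smi mig -lgi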
Note:
Due to a known issue with the APIs, the profile ID 9 (3g.20gb) must be specified first in
the requested order. Not doing so will result in an error.
Specify the correct order for the 3g.20gb profile. The remaining combinations of the
profiles do not have this requirement.
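For example, when combining 3g.20gb with smaller profiles, a working ordering would be the following sketch (profile IDs assumed as above):
# Specify the 3g.20gb profile (ID 9) before the smaller profiles
$ sudo nvidia-smi mig -cgi 9,19,14 -C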
$ nvidia-smi -L
GPU 0: A100-SXM4-40GB (UUID: GPU-e86cb44c-6756-fd30-cd4a-1e6da3caf9b0)
MIG 3g.20gb Device 0: (UUID: MIG-c7384736-a75d-5afc-978f-d2f1294409fd)
MIG 3g.20gb Device 1: (UUID: MIG-a28ad590-3fda-56dd-84fc-0a0b96edc58d)
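The two CUDA applications can then be launched, one per MIG device, by setting CUDA_VISIBLE_DEVICES to the corresponding MIG UUID (a sketch using the UUIDs listed above and the BlackScholes CUDA sample):
$ CUDA_VISIBLE_DEVICES=MIG-c7384736-a75d-5afc-978f-d2f1294409fd ./BlackScholes &
$ CUDA_VISIBLE_DEVICES=MIG-a28ad590-3fda-56dd-84fc-0a0b96edc58d ./BlackScholes &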
Now verify the two CUDA applications are running on two separate GPU instances:
$ nvidia-smi
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| 0 1 0 0 | 268MiB / 20224MiB | 42 0 | 3 0 2 0 0 |
+------------------+----------------------+-----------+-----------------------+
| 0 2 0 1 | 268MiB / 20096MiB | 42 0 | 3 0 2 0 0 |
+------------------+----------------------+-----------+-----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 1 0 58866 C ./BlackScholes 253MiB |
| 0 2 0 58856 C ./BlackScholes 253MiB |
+-----------------------------------------------------------------------------+
$ nvidia-smi
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| 0 1 0 0 | 268MiB / 20096MiB | 42 0 | 3 0 2 0 0 |
| | 4MiB / 32767MiB | | |
+------------------+----------------------+-----------+-----------------------+
| 0 2 0 1 | 268MiB / 20096MiB | 42 0 | 3 0 2 0 0 |
| | 4MiB / 32767MiB | | |
+------------------+----------------------+-----------+-----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 1 0 6217 C ...inux/release/BlackScholes 253MiB |
| 0 2 0 6223 C ...inux/release/BlackScholes 253MiB |
+-----------------------------------------------------------------------------+
For monitoring MIG devices on MIG-capable GPUs such as the A100, including attribution
of GPU metrics (utilization and other profiling metrics), it is recommended to use
NVIDIA DCGM v2.0.13 or later. See the Profiling Metrics section in the DCGM User
Guide for more details on getting started.
Create 3 CIs, each of type 1c compute capacity (profile ID 0) on the first GI.
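A sketch of the corresponding command, assuming GPU instance ID 1 as enumerated in the listings above:
# Create three compute instances of profile ID 0 (1c) on GPU instance 1
$ sudo nvidia-smi mig -cci 0,0,0 -gi 1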
And the GIs and CIs created on the A100 are now enumerated by the driver:
$ nvidia-smi
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| 0 1 0 0 | 11MiB / 20224MiB | 14 0 | 3 0 2 0 0 |
+------------------+ +-----------+-----------------------+
| 0 1 1 1 | | 14 0 | 3 0 2 0 0 |
+------------------+ +-----------+-----------------------+
| 0 1 2 2 | | 14 0 | 3 0 2 0 0 |
+------------------+----------------------+-----------+-----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
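Three CUDA applications can now be launched in parallel, one per compute instance, by targeting each MIG device through CUDA_VISIBLE_DEVICES. The following is a sketch for pre-R470 drivers, where MIG devices are enumerated as MIG-<GPU-UUID>/<GPU instance ID>/<compute instance ID>:
$ CUDA_VISIBLE_DEVICES=MIG-<GPU-UUID>/1/0 ./BlackScholes &
$ CUDA_VISIBLE_DEVICES=MIG-<GPU-UUID>/1/1 ./BlackScholes &
$ CUDA_VISIBLE_DEVICES=MIG-<GPU-UUID>/1/2 ./BlackScholes &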
$ nvidia-smi
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| 0 1 0 0 | 476MiB / 20224MiB | 14 0 | 3 0 2 0 0 |
+------------------+ +-----------+-----------------------+
| 0 1 1 1 | | 14 0 | 3 0 2 0 0 |
+------------------+ +-----------+-----------------------+
| 0 1 2 2 | | 14 0 | 3 0 2 0 0 |
+------------------+----------------------+-----------+-----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 1 0 59785 C ./BlackScholes 153MiB |
| 0 1 1 59796 C ./BlackScholes 153MiB |
| 0 1 2 59885 C ./BlackScholes 153MiB |
+-----------------------------------------------------------------------------+
Note:
If the intention is to destroy all the CIs and GIs, then this can be accomplished with the
following commands:
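A sketch of those commands; note that compute instances must be destroyed before their parent GPU instances:
$ sudo nvidia-smi mig -dci
$ sudo nvidia-smi mig -dgi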
It can be verified that the MIG devices have now been torn down on the GPU:
$ nvidia-smi
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| No MIG devices found |
+-----------------------------------------------------------------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Note:
On Ampere GPUs (A100 or A30), NVML (and nvidia-smi) does not support attribution of
utilization metrics to MIG devices. From the previous example, the utilization is displayed
as N/A when running CUDA programs:
$ nvidia-smi
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| 0 1 0 0 | 268MiB / 20096MiB | 42 0 | 3 0 2 0 0 |
| | 4MiB / 32767MiB | | |
+------------------+----------------------+-----------+-----------------------+
| 0 2 0 1 | 268MiB / 20096MiB | 42 0 | 3 0 2 0 0 |
| | 4MiB / 32767MiB | | |
+------------------+----------------------+-----------+-----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 1 0 6217 C ...inux/release/BlackScholes 253MiB |
| 0 2 0 6223 C ...inux/release/BlackScholes 253MiB |
+-----------------------------------------------------------------------------+
Workflow
In summary, the workflow for running CUDA applications with MPS on top of MIG is as
follows: configure the desired MIG geometry, start a dedicated MPS control daemon for
each MIG device (setting CUDA_MPS_PIPE_DIRECTORY and CUDA_VISIBLE_DEVICES for that
device), and then launch applications against the corresponding MPS server. This is shown
in the example below.
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.73.01 Driver Version: 460.73.01 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 A100-PCIE-40GB On | 00000000:65:00.0 Off | On |
| N/A 37C P0 66W / 250W | 581MiB / 40536MiB | N/A Default |
| | | Enabled |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| 0 1 0 0 | 290MiB / 20096MiB | 42 0 | 3 0 2 0 0 |
| | 8MiB / 32767MiB | | |
+------------------+----------------------+-----------+-----------------------+
| 0 2 0 1 | 290MiB / 20096MiB | 42 0 | 3 0 2 0 0 |
| | 8MiB / 32767MiB | | |
+------------------+----------------------+-----------+-----------------------+
# Set a unique pipe directory for this MIG device and export it for the client below
export CUDA_MPS_PIPE_DIRECTORY=/tmp/<MIG_UUID>
mkdir -p $CUDA_MPS_PIPE_DIRECTORY

# Start an MPS control daemon bound to the MIG device
CUDA_VISIBLE_DEVICES=<MIG_UUID> \
CUDA_MPS_PIPE_DIRECTORY=/tmp/<MIG_UUID> \
nvidia-cuda-mps-control -d

# Launch the application on the same MIG device (it inherits CUDA_MPS_PIPE_DIRECTORY)
CUDA_VISIBLE_DEVICES=<MIG_UUID> \
my-cuda-app
A Complete Example
We now provide a script below that attempts to run the BlackScholes sample from before
on the two MIG devices created on the GPU:
#!/usr/bin/env bash
GPU_UUID=GPU-63feeb45-94c6-b9cb-78ea-98e9b7a5be6b
for i in MIG-$GPU_UUID/1/0 MIG-$GPU_UUID/2/0; do
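   # NOTE: the remainder of this loop body is a hedged reconstruction of the truncated
   # script; the per-device pipe/log directory layout and the ./bin/BlackScholes client
   # path are assumptions based on the output shown below.
   export CUDA_MPS_PIPE_DIRECTORY=/tmp/$i/pipe
   export CUDA_MPS_LOG_DIRECTORY=/tmp/$i/log
   mkdir -p $CUDA_MPS_PIPE_DIRECTORY $CUDA_MPS_LOG_DIRECTORY

   # Start an MPS control daemon for this MIG device
   CUDA_VISIBLE_DEVICES=$i \
   CUDA_MPS_PIPE_DIRECTORY=$CUDA_MPS_PIPE_DIRECTORY \
   CUDA_MPS_LOG_DIRECTORY=$CUDA_MPS_LOG_DIRECTORY \
   nvidia-cuda-mps-control -d

   # Launch the application as an MPS client on the same MIG device
   CUDA_VISIBLE_DEVICES=$i \
   CUDA_MPS_PIPE_DIRECTORY=$CUDA_MPS_PIPE_DIRECTORY \
   ./bin/BlackScholes &
done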
When running this script, we can observe the two MPS servers, one on each MIG device,
and the corresponding CUDA programs started as MPS clients when using nvidia-smi:
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 1 0 46781 M+C ./bin/BlackScholes 251MiB |
| 0 1 0 46784 C nvidia-cuda-mps-server 29MiB |
| 0 2 0 46797 M+C ./bin/BlackScholes 251MiB |
| 0 2 0 46798 C nvidia-cuda-mps-server 29MiB |
+-----------------------------------------------------------------------------+
$ curl https://get.docker.com | sh \
&& sudo systemctl start docker \
&& sudo systemctl enable docker
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| 0 1 0 0 | 11MiB / 20224MiB | 42 0 | 3 0 2 0 0 |
+------------------+----------------------+-----------+-----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
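A sketch of launching a container on this MIG device, assuming the NVIDIA Container Toolkit is installed and using the --gpus device syntax; the MIG device UUID and the image tag are placeholders, and the container's training command is omitted:
$ sudo docker run --rm --gpus device=MIG-<MIG_UUID> \
    nvcr.io/nvidia/pytorch:<xx.xx>-py3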
=============
== PyTorch ==
=============
Container image Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
NVIDIA Deep Learning Profiler (dlprof) Copyright (c) 2020, NVIDIA CORPORATION. All
rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.
Done!
Train Epoch: 1 [0/60000 (0%)] Loss: 2.320747
Train Epoch: 1 [640/60000 (1%)] Loss: 1.278727
Currently, the NVIDIA kernel driver exposes its interfaces through a few system-wide
device nodes. Each physical GPU is represented by its own device node - e.g. nvidia0,
nvidia1 etc. This is shown below for a 2-GPU system.
/dev
├── nvidiactl
├── nvidia-modeset
├── nvidia-uvm
├── nvidia-uvm-tools
├── nvidia-nvswitchctl
├── nvidia0
└── nvidia1
$ cat /proc/driver/nvidia/capabilities/mig/config
DeviceFileMinor: 1
DeviceFileMode: 256
DeviceFileModify: 1
The combination of the device major for nvidia-caps and the value of DeviceFileMinor
in this file indicate that the mig-config capability (which allows a user to create and
destroy MIG devices) is controlled by the device node with a major:minor of 238:1. As
such, one will need to use cgroups to grant a process read access to this device in order
to configure MIG devices.
$ ls -l /dev/nvidia-caps
total 0
cr-------- 1 root root 508, 1 Nov 21 17:16 nvidia-cap1
cr--r--r-- 1 root root 508, 2 Nov 21 17:16 nvidia-cap2
...
$ nvidia-modprobe \
-f /proc/driver/nvidia/capabilities/mig/config \
-f /proc/driver/nvidia/capabilities/mig/monitor
$ ls -l /dev/nvidia-caps
total 0
cr-------- 1 root root 508, 1 Nov 21 17:16 nvidia-cap1
cr--r--r-- 1 root root 508, 2 Nov 21 17:16 nvidia-cap2
nvidia-modprobe looks at the DeviceFileMode in each capability file and creates the
device node with the permissions indicated (for example, owner read-only from the value
of 256, i.e. octal 0400, in our mig-config example).
Programs such as nvidia-smi will automatically invoke nvidia-modprobe (when
available) to create these device nodes on your behalf. In other scenarios it is not
necessarily required to use nvidia-modprobe to create these device nodes, but it does
make the process simpler.
If you actually want to prevent nvidia-modprobe from ever creating a particular device
node on your behalf, you can do the following:
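A sketch of that sequence, based on the DeviceFileModify field shown above (root privileges are required to modify the capability file):
# Give the owner write permission on the capability file, then disable automatic node management
$ sudo chmod u+w /proc/driver/nvidia/capabilities/mig/config
$ echo "DeviceFileModify: 0" | sudo tee /proc/driver/nvidia/capabilities/mig/config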
You will then be responsible for managing creation of the device node referenced by /
proc/driver/nvidia/capabilities/mig/config going forward. If you want to change
that in the future, simply reset it to a value of "DeviceFileModify: 1" with the same
command sequence.
This is important in the context of containers because we may want to give a container
access to a certain capability even if it doesn't exist in the /proc hierarchy yet.
For example, granting a container the mig-config capability implies that we should also
grant it capabilities to access all possible GIs and CIs that could be created for any GPU
on the system. Otherwise, the container will have no way of working with those GIs and
CIs once they have actually been created.
One final thing to note about /dev based capabilities is that the minor numbers for all
possible capabilities are predetermined and can be queried under various files of the
form:
/proc/driver/nvidia-caps/*-minors
$ cat /proc/driver/nvidia-caps/mig-minors
config 1
monitor 2
gpu0/gi0/access 3
gpu0/gi0/ci0/access 4
gpu0/gi0/ci1/access 5
gpu0/gi0/ci2/access 6
...
gpu31/gi14/ci6/access 4321
gpu31/gi14/ci7/access 4322
Note:
The NVML device numbering (e.g. through nvidia-smi) is not the device minor number.
+-----------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG|
| | | ECC| |
|==================+======================+===========+=======================|
| 0 1 0 0 | 19MiB / 40192MiB | 14 0 | 3 0 3 0 3 |
| | 0MiB / 65535MiB | | |
+------------------+ +-----------+-----------------------+
| 0 1 1 1 | | 14 0 | 3 0 3 0 3 |
| | | | |
+------------------+ +-----------+-----------------------+
| 0 1 2 2 | | 14 0 | 3 0 3 0 3 |
| | | | |
+------------------+----------------------+-----------+-----------------------+
The capability required to configure MIG on a GPU is represented in the /proc hierarchy as follows:
/proc/driver/nvidia/capabilities
└── mig
└── config
Likewise, the capabilities required to run workloads on a MIG device once it has been
created are represented as follows (namely as access to the GPU Instance and Compute
Instance that comprise the MIG device):
/proc/driver/nvidia/capabilities
└── gpu0
└── mig
├── gi0
│ ├── access
│ └── ci0
│ └── access
├── gi1
│ ├── access
│ └── ci0
│ └── access
└── gi2
├── access
└── ci0
└── access
And the corresponding file system layout is shown below with read permissions:
$ ls -l /proc/driver/nvidia/capabilities/gpu0/mig/gi*
/proc/driver/nvidia/capabilities/gpu0/mig/gi1:
total 0
-r--r--r-- 1 root root 0 May 24 17:38 access
dr-xr-xr-x 2 root root 0 May 24 17:38 ci0
/proc/driver/nvidia/capabilities/gpu0/mig/gi2:
total 0
-r--r--r-- 1 root root 0 May 24 17:38 access
dr-xr-xr-x 2 root root 0 May 24 17:38 ci0
For a CUDA process to be able to run on top of MIG, it needs access to both the Compute
Instance capability and its parent GPU Instance capability.
As an example, having read access to the following paths would allow one to run
workloads on the MIG device represented by <gpu0, gi0, ci0>:
/proc/driver/nvidia/capabilities/gpu0/mig/gi0/access
/proc/driver/nvidia/capabilities/gpu0/mig/gi0/ci0/access
Note that there is no access file representing a capability to run workloads on gpu0 (only
on gi0 and ci0, which sit underneath gpu0). This is because the traditional mechanism of
using cgroups to control access to top-level GPU devices (and any required meta devices)
is still required. As shown earlier in the document, the cgroups mechanism applies to:
/dev/nvidia0
/dev/nvidiactl
/dev/nvidia-uvm
...
NVIDIA reserves the right to make corrections, modifications, enhancements, improvements, and any other changes to this document, at any time without notice.
Customer should obtain the latest relevant information before placing orders and should verify that such information is current and complete.
NVIDIA products are sold subject to the NVIDIA standard terms and conditions of sale supplied at the time of order acknowledgement, unless otherwise agreed in an individual sales agreement signed by
authorized representatives of NVIDIA and customer (“Terms of Sale”). NVIDIA hereby expressly objects to applying any customer general terms and conditions with regards to the purchase of the NVIDIA
product referenced in this document. No contractual obligations are formed either directly or indirectly by this document.
NVIDIA products are not designed, authorized, or warranted to be suitable for use in medical, military, aircraft, space, or life support equipment, nor in applications where failure or malfunction of the NVIDIA
product can reasonably be expected to result in personal injury, death, or property or environmental damage. NVIDIA accepts no liability for inclusion and/or use of NVIDIA products in such equipment or
applications and therefore such inclusion and/or use is at customer’s own risk.
NVIDIA makes no representation or warranty that products based on this document will be suitable for any specified use. Testing of all parameters of each product is not necessarily performed by NVIDIA. It
is customer’s sole responsibility to evaluate and determine the applicability of any information contained in this document, ensure the product is suitable and fit for the application planned by customer, and
perform the necessary testing for the application in order to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect the quality and reliability of the NVIDIA
product and may result in additional or different conditions and/or requirements beyond those contained in this document. NVIDIA accepts no liability related to any default, damage, costs, or problem which
may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document or (ii) customer product designs.
No license, either expressed or implied, is granted under any NVIDIA patent right, copyright, or other NVIDIA intellectual property right under this document. Information published by NVIDIA regarding third-
party products or services does not constitute a license from NVIDIA to use such products or services or a warranty or endorsement thereof. Use of such information may require a license from a third party
under the patents or other intellectual property rights of the third party, or a license from NVIDIA under the patents or other intellectual property rights of NVIDIA.
Reproduction of information in this document is permissible only if approved in advance by NVIDIA in writing, reproduced without alteration and in full compliance with all applicable export laws and regulations,
and accompanied by all associated conditions, limitations, and notices.
THIS DOCUMENT AND ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE
BEING PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES
OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. TO THE EXTENT NOT PROHIBITED BY LAW, IN NO EVENT WILL NVIDIA BE LIABLE FOR ANY DAMAGES, INCLUDING
WITHOUT LIMITATION ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT OF ANY
USE OF THIS DOCUMENT, EVEN IF NVIDIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. Notwithstanding any damages that customer might incur for any reason whatsoever, NVIDIA’s
aggregate and cumulative liability towards customer for the products described herein shall be limited in accordance with the Terms of Sale for the product.
Trademarks
NVIDIA and the NVIDIA logo are trademarks and/or registered trademarks of NVIDIA Corporation in the United States and other countries. Other company and product names may be trademarks of the
respective companies with which they are associated.
Copyright
© 2020-2024 NVIDIA Corporation & affiliates. All rights reserved.