GPU Partitioning#

Prerequisites#

Before starting this guide, you must complete:

  1. Getting Started with Virtualization

  2. Host Configuration


Supported partitioning types:

  • Spatial Partitioning: Instinct MI300X, MI325X and MI350X/MI355X

  • Temporal Partitioning: Radeon PRO V710

Spatial Partitioning#

In a non-monolithic GPU architecture, multiple chiplets are integrated to form a cohesive unit. The arrangement of these chiplets is essential for understanding the overall architecture and its capabilities.

For programming simplicity, these distinct elements are presented to the programmer as a single logical device. However, for performance-critical applications, it may be beneficial for programmers to give up the convenience of this single-pool view. Instead, they can target kernels and memory allocations at the device’s individual components.

Spatial partitioning lets programmers change this logical view of the device, primarily by exposing the discrete architectural elements separately. MxGPU provides memory partitioning modes, which change the view of memory, and compute partitioning modes, which change the view of compute.

To facilitate targeted resource management, GPUs support various partitioning modes that alter the logical view of the device. These modes can be categorized into two primary types:

  • Compute Partitioning

  • Memory Partitioning

Compute Partitioning#

Compute partitioning is the logical division of compute chiplets into distinct devices within the software stack. In the default mode, all compute chiplets are viewed as a single logical compute element. In a partitioned mode, each compute chiplet appears as a separate logical GPU, which allows explicit scheduling and resource allocation for each compute element. MxGPU currently supports Dynamic Compute Partitioning.

Dynamic Compute Partitioning#

Dynamic Compute Partitioning allows for the division of GPU compute resources within a single VF into multiple partitions, configurable as:

  • 1 Partition (SPX)

  • 2 Partitions (DPX)

  • 4 Partitions (QPX)

  • 8 Partitions (CPX)
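As a rough illustration of what these modes mean for the hardware, the sketch below maps each mode to the number of XCCs behind one logical partition. It assumes the 8-XCC device used in the example later in this guide and an even split of XCCs across partitions; the constant and function names are hypothetical and not part of any AMD tool.

```python
# Number of XCCs on the device. 8 matches the SPX/CPX example in this guide;
# treat it as an assumption for other devices.
TOTAL_XCCS = 8

# Compute partition mode -> number of logical partitions, as listed above.
PARTITIONS_PER_MODE = {"SPX": 1, "DPX": 2, "QPX": 4, "CPX": 8}


def xccs_per_partition(mode: str, total_xccs: int = TOTAL_XCCS) -> int:
    """Return how many XCCs back each logical partition in the given mode."""
    partitions = PARTITIONS_PER_MODE[mode.upper()]
    if total_xccs % partitions != 0:
        raise ValueError(f"{total_xccs} XCCs cannot be split evenly into {partitions}")
    return total_xccs // partitions


for mode, partitions in PARTITIONS_PER_MODE.items():
    print(f"{mode}: {partitions} partition(s) x {xccs_per_partition(mode)} XCC(s)")
```

Under these assumptions, SPX keeps all XCCs in one partition while CPX gives each XCC its own partition.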

To dynamically switch the compute partitioning mode, you can use the AMD SMI tool with the following command:

amd-smi set --accelerator-partition=<profile_index>

This command will only work if there are no guest VMs running. It sets the accelerator partition to a mode based on the specified profile_index.

You can retrieve the available profile_index numbers by executing the following command:

amd-smi partition --accelerator

Make sure to check the output of this command to select the appropriate profile_index for your needs.
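When these steps are scripted, it can help to build the command lines in one place and preview them before running anything. The following is a minimal sketch that uses only the two amd-smi invocations shown above; the helper names, the dry-run default, and the example index 0 are assumptions made for illustration, and the index must come from the actual `amd-smi partition --accelerator` output.

```python
import shlex
import subprocess


def list_accelerator_partitions_cmd() -> list[str]:
    """Command that lists the available accelerator partition profiles."""
    return ["amd-smi", "partition", "--accelerator"]


def set_accelerator_partition_cmd(profile_index: int) -> list[str]:
    """Command that switches to the profile with the given index."""
    if profile_index < 0:
        # Assumption: profile indexes are non-negative integers.
        raise ValueError("profile_index must be non-negative")
    return ["amd-smi", "set", f"--accelerator-partition={profile_index}"]


def run(cmd: list[str], dry_run: bool = True) -> None:
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


# Preview only. Nothing is executed while dry_run keeps its default value.
run(list_accelerator_partitions_cmd())
run(set_accelerator_partition_cmd(0))  # 0 is a placeholder index
```

Calling `run(..., dry_run=False)` would execute the command on the host, which, as noted above, only succeeds while no guest VMs are running.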

Dynamic Compute Partitioning Example#

The following summarizes how the dynamic compute partition modes behave with a single VF:

  • SPX - all 8 XCCs work together

  • CPX - each XCC works independently

[Figure: Dynamic Compute Partition Mode - 1 VF configuration]

Memory Partitioning#

Memory partitioning modes, known as Non-Uniform Memory Access (NUMA) Per Socket (NPS), change the number of NUMA domains that a device exposes, effectively altering the accessible memory space for compute units. This alteration affects the number of High-Bandwidth Memory (HBM) stacks accessible to a compute unit. Importantly, the number of memory partitions must be less than or equal to the number of compute partitions. For example, certain memory partitioning modes may only be enabled when specific compute partitioning modes are active, allowing for optimized memory access based on the architecture’s capabilities.

This method divides the GPU memory into partitions based on the NPS modes, which can be set as:

  • NPS1

  • NPS2

  • NPS4

  • NPS8

To switch memory partition modes, you can again use the AMD SMI tool with the following command:

amd-smi set --memory-partition=<memory_partition_setting>

This command will only work if there are no guest VMs running. It sets the memory partition to one of the following options: NPS1, NPS2, NPS4, or NPS8.

To view the available memory partition capabilities, you can run the following command:

amd-smi partition --memory

This will display the supported memory partition settings for your GPU and current compute partitioning mode.

In principle, any combination of compute and memory partition modes that the hardware allows is possible, provided the number of memory partitions is less than or equal to the number of compute partitions. In practice, some combinations are restricted for simplicity.
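The rule on partition counts can be written as a small check. The sketch below assumes that NPSn exposes n memory partitions and uses the compute partition counts listed earlier; the function name is hypothetical. Passing this check is necessary but not sufficient, because the per-model tables later in this guide restrict the combinations further.

```python
# Partition counts per mode. The NPS mapping (NPSn -> n memory partitions)
# is an assumption based on the mode names.
COMPUTE_PARTITIONS = {"SPX": 1, "DPX": 2, "QPX": 4, "CPX": 8}
MEMORY_PARTITIONS = {"NPS1": 1, "NPS2": 2, "NPS4": 4, "NPS8": 8}


def satisfies_partition_rule(compute_mode: str, memory_mode: str) -> bool:
    """True when memory partitions <= compute partitions."""
    return MEMORY_PARTITIONS[memory_mode] <= COMPUTE_PARTITIONS[compute_mode]


# Enumerate every combination that passes the rule.
allowed = [
    (compute, memory)
    for compute in COMPUTE_PARTITIONS
    for memory in MEMORY_PARTITIONS
    if satisfies_partition_rule(compute, memory)
]

for compute, memory in allowed:
    print(f"{compute} + {memory}")
```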

Temporal Partitioning (AMD Radeon PRO V710)#

As a monolithic GPU, the AMD Radeon PRO V710 uses a fundamentally different approach to partitioning. It implements temporal partitioning through its Auto Scheduler, which time-slices the full GPU among active Virtual Functions (VFs).

Unlike spatial partitioning, where separate hardware blocks are allocated to each workload, temporal partitioning on the V710 shares all hardware resources among VFs. Logical isolation is maintained even though the execution units are shared.

VF Support for Temporal Partitioning#

Temporal partitioning supports 1 to 12 VFs. Although the theoretical maximum is 31 VFs, the supported range is capped at 12 to preserve per-VF performance.

To set the number of VFs, pass the vf_num parameter to modprobe at driver load time:

modprobe gim vf_num=<number_of_vfs>

In this command, <number_of_vfs> indicates the number of VFs you want to set up.

If you do not specify the vf_num parameter, the default value of 1 will be used.
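For provisioning scripts, the supported range can be validated before the driver is loaded. This is a minimal sketch that only builds the command shown above; the function and constant names are hypothetical, and the 1 to 12 range and the default of 1 are taken from this guide.

```python
MIN_VFS = 1      # lower bound of the supported range in this guide
MAX_VFS = 12     # upper bound of the supported range in this guide
DEFAULT_VFS = 1  # value used when vf_num is not specified


def gim_modprobe_cmd(vf_num: int = DEFAULT_VFS) -> list[str]:
    """Build the modprobe command for the requested number of VFs."""
    if not MIN_VFS <= vf_num <= MAX_VFS:
        raise ValueError(f"vf_num must be between {MIN_VFS} and {MAX_VFS}, got {vf_num}")
    return ["modprobe", "gim", f"vf_num={vf_num}"]


# Print the command instead of running it; loading the driver needs root.
print(" ".join(gim_modprobe_cmd(4)))
```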

Scheduling Modes#

The V710’s Auto Scheduler supports multiple scheduling modes. The primary ones are:

  • Solid Mode: Equal time slices allocated to all VFs

  • Liquid Mode: Dynamic time slice adjustment based on each VF’s workload requirements
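To make the difference between the two modes concrete, the sketch below divides one scheduling period among VFs. It is a toy model, not the Auto Scheduler's algorithm: this guide does not describe how Liquid Mode adjusts slices, so proportional sharing by demand is an assumed stand-in, and the period length and demand values are hypothetical.

```python
def solid_slices(num_vfs: int, period_ms: float) -> list[float]:
    """Solid Mode model: every VF gets the same share of the period."""
    return [period_ms / num_vfs] * num_vfs


def liquid_slices(demands: list[float], period_ms: float) -> list[float]:
    """Liquid Mode model (assumed): shares proportional to each VF's demand."""
    total = sum(demands)
    if total == 0:
        # No VF has work, so fall back to equal shares.
        return solid_slices(len(demands), period_ms)
    return [period_ms * demand / total for demand in demands]


# Two VFs sharing a hypothetical 16 ms period; VF0 has three times VF1's demand.
print(solid_slices(2, 16.0))
print(liquid_slices([3.0, 1.0], 16.0))
```

In this model, equal slices leave a busy VF waiting while an idle VF holds its share, which is the situation dynamic adjustment is meant to address.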

Configuring the Scheduling Mode#

Set the scheduling mode at driver load time using the sch_policy module parameter. Provide the index of the desired scheduling mode:

modprobe gim sch_policy=<mode_index>

Note: If sch_policy is not specified, the default value 1 is used, which corresponds to Solid Mode.

All available scheduling modes and their corresponding indexes can be listed with:

modinfo gim
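The parameter descriptions appear on the `parm:` lines of the modinfo output. A minimal sketch for extracting them is shown below. The sample text is a placeholder that only mimics the general `parm: name:description` layout of modinfo and is not real output from the gim module; the function name is hypothetical.

```python
import re

# Matches modinfo lines of the form "parm:   <name>:<description>".
PARM_LINE = re.compile(r"^parm:\s+(?P<name>\w+):(?P<desc>.*)$")


def parse_parms(modinfo_output: str) -> dict[str, str]:
    """Map each module parameter name to its description text."""
    parms = {}
    for line in modinfo_output.splitlines():
        match = PARM_LINE.match(line)
        if match:
            parms[match.group("name")] = match.group("desc").strip()
    return parms


# Placeholder input, NOT actual gim output. On a real host the text would
# come from running "modinfo gim".
SAMPLE = """\
parm:           vf_num:<description placeholder>
parm:           sch_policy:<description placeholder>
"""

for name, description in parse_parms(SAMPLE).items():
    print(f"{name}: {description}")
```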

Performance Characteristics#

Key differences from spatial partitioning:

Aspect        | Temporal Partitioning     | Spatial Partitioning
--------------|---------------------------|--------------------------------------
Division type | Time-based                | Hardware-based
Execution     | One workload at a time    | Multiple workloads run simultaneously
Isolation     | Strong temporal isolation | Strong spatial isolation
Overhead      | Context-switching         | Resource fragmentation


Partitioning support per GPU model#

AMD Instinct MI210X Architecture#

AMD Instinct MI210X offers no partitioning support.

AMD Instinct MI300X Architecture#

Spatial partitioning:

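Static Compute Partitioning (VF number) | Dynamic Compute Partitioning | Supported Memory Partitioning (Memory NPS)
----------------------------------------|------------------------------|-------------------------------------------
1                                       | SPX                          | 1
1                                       | CPX                          | 1, 4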

AMD Instinct MI325X Architecture#

Spatial partitioning:

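Static Compute Partitioning (VF number) | Dynamic Compute Partitioning | Supported Memory Partitioning (Memory NPS)
----------------------------------------|------------------------------|-------------------------------------------
1                                       | SPX                          | 1
1                                       | CPX                          | 1, 4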

AMD Instinct MI350X/MI355X Architecture#

Spatial partitioning:

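Static Compute Partitioning (VF number) | Dynamic Compute Partitioning | Supported Memory Partitioning (Memory NPS)
----------------------------------------|------------------------------|-------------------------------------------
1                                       | SPX                          | 1
1                                       | DPX                          | 2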
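The three spatial partitioning tables above can be encoded as a lookup so that a script can reject unsupported combinations before calling amd-smi. This is a sketch based on one reading of those tables (memory NPS value n is written as NPSn, and all rows apply to a single VF); the dictionary layout and function name are hypothetical.

```python
# Model -> compute partition mode -> supported memory partition modes,
# transcribed from the tables above for the 1-VF configuration.
SUPPORTED = {
    "MI300X": {"SPX": ["NPS1"], "CPX": ["NPS1", "NPS4"]},
    "MI325X": {"SPX": ["NPS1"], "CPX": ["NPS1", "NPS4"]},
    "MI350X/MI355X": {"SPX": ["NPS1"], "DPX": ["NPS2"]},
}


def is_supported(model: str, compute_mode: str, memory_mode: str) -> bool:
    """True when the combination appears in the tables for the given model."""
    return memory_mode in SUPPORTED.get(model, {}).get(compute_mode, [])


print(is_supported("MI300X", "CPX", "NPS4"))
print(is_supported("MI300X", "SPX", "NPS4"))
```

Models that are missing from the dictionary are reported as unsupported for every combination.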

AMD Radeon PRO V710#

The AMD Radeon PRO V710 uses temporal partitioning exclusively.


Next Steps#

After completing GPU partitioning, continue with one of the following:

  • Virtual Machine Setup - Configure your VMs to use the partitioned VFs

  • XGMI Configuration - See supported XGMI configurations, then proceed to VM Setup