AMD GPU Driver (amdgpu) 30.10 release notes

These release notes summarize notable changes since the previous AMD GPU Driver release.

Release highlights

The following are notable new features and improvements in AMD GPU Driver 30.10.

Operating system and hardware support changes

AMD GPU Driver 30.10 adds support for AMD Instinct MI355X and MI350X accelerators.

AMD GPU Driver 30.10 also introduces support for the following operating systems:

  • Rocky 9

  • Ubuntu 24.04.3

AMD GPU Driver 30.10 marks end-of-support (EOS) for Ubuntu 24.04.2. For compatibility among the AMD GPU Driver, ROCm, GPUs, and operating systems, see the Compatibility matrix.

Partitioning

The AMD GPU Driver 30.10 adds the following memory and compute partitioning support:

  • NPS1 + SPX partitioning for AMD Instinct MI355X and MI350X: The NPS1 memory partitioning mode exposes the entire memory to all compute dies (XCDs), allowing full access across the GPU. In Single Partition Compute Mode (SPX), workgroups are distributed round-robin across all XCDs. There is no explicit control over which XCD executes a given kernel, which keeps this mode simple and general-purpose. This feature requires PLDM bundle (firmware) 01.25.13.04.

  • NPS2 + DPX partitioning for AMD Instinct MI355X and MI350X: NPS2 splits the GPU’s memory into two NUMA domains. Dual Partition Compute Mode (DPX) divides the GPU’s compute resources into two partitions, each with 4 XCDs (out of 8 total), 8 DMA engines, and 2 VCN decoder groups. This feature requires PLDM bundle (firmware) 01.25.13.04.
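As a rough illustration of the two compute modes above, the following C sketch models SPX round-robin placement and the DPX split of 8 XCDs into two groups of 4. It is a toy model under stated assumptions, not driver code: the function names are hypothetical, the exact hardware dispatch order is not specified in these notes, and the contiguous XCD numbering per DPX partition is assumed purely for illustration.

```c
#include <assert.h>

/* Toy model only. The XCD counts come from the notes above; every
 * function name here is hypothetical and unrelated to driver code. */
#define TOTAL_XCDS     8u
#define DPX_PARTITIONS 2u

/* SPX: a single partition spans every XCD, and workgroups are placed
 * round-robin, so the target XCD is the workgroup index modulo 8. */
static unsigned spx_xcd_for_workgroup(unsigned workgroup_id) {
    return workgroup_id % TOTAL_XCDS;
}

/* DPX: the 8 XCDs are divided evenly between two partitions. */
static unsigned dpx_xcds_per_partition(void) {
    return TOTAL_XCDS / DPX_PARTITIONS;
}

/* First XCD index owned by a DPX partition.
 * Assumption: each partition owns a contiguous range of XCD indices. */
static unsigned dpx_first_xcd(unsigned partition) {
    assert(partition < DPX_PARTITIONS);
    return partition * dpx_xcds_per_partition();
}

/* DPX partition that owns a given XCD, under the same assumption. */
static unsigned dpx_partition_for_xcd(unsigned xcd) {
    assert(xcd < TOTAL_XCDS);
    return xcd / dpx_xcds_per_partition();
}
```

In this model, `spx_xcd_for_workgroup(8)` wraps back to XCD 0, and `dpx_partition_for_xcd(5)` returns 1 because XCDs 4 through 7 are assumed to belong to the second partition.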

GPU resiliency

The following GPU resiliency feature is supported in the AMD GPU Driver 30.10 for AMD Instinct MI300X, MI350X, and MI355X:

  • SDMA engine reset enables recovery from SDMA-related faults without requiring a full GPU reset, improving system stability and fault tolerance.

Program counter (PC) sampling

AMD GPU Driver 30.10 adds support for stochastic (hardware-based) and host-trap PC sampling. PC sampling is a GPU profiling technique for analyzing kernel execution performance.

  • Stochastic PC sampling: This method randomly triggers wave traps across compute units to capture program counter (PC) snapshots. The randomness in wave selection gives broader statistical coverage of kernel execution behavior. This feature is supported on AMD Instinct MI300 series and MI350 series GPUs (including MI300A, MI300X, MI325X, MI350X, and MI355X).

  • Host-trap PC sampling: This method allows controlled, device-wide profiling. It periodically selects active wave slots across compute units and triggers a trap handler to capture the program counter (PC), producing a histogram of sampled instructions. This feature is supported on AMD Instinct MI200 series GPUs (including MI210, MI250, and MI250X) and on MI300 series and MI350 series GPUs (including MI300A, MI300X, MI325X, MI350X, and MI355X).
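To make the "histogram of sampled instructions" concrete, here is a minimal C sketch of the aggregation step only: it takes PC values that a sampler has already captured and counts them per instruction. It does not perform any sampling and does not use the ROCm APIs. The type and function names, the fixed instruction size, and the bucket count are all hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: number of consecutive instructions tracked. */
#define NUM_BUCKETS 8

/* Histogram of sampled program counters, one bucket per instruction.
 * Assumption: fixed-size instructions, which keeps the indexing simple. */
typedef struct {
    uint64_t base_pc;               /* address of the first tracked instruction */
    uint32_t instr_size;            /* bytes per instruction */
    uint64_t counts[NUM_BUCKETS];   /* samples observed per instruction */
    uint64_t dropped;               /* samples outside the tracked range */
} pc_histogram_t;

static void pc_histogram_init(pc_histogram_t *h, uint64_t base_pc,
                              uint32_t instr_size) {
    assert(h != NULL && instr_size > 0);
    h->base_pc = base_pc;
    h->instr_size = instr_size;
    h->dropped = 0;
    for (size_t i = 0; i < NUM_BUCKETS; ++i) {
        h->counts[i] = 0;
    }
}

/* Record one sampled PC; out-of-range samples are counted as dropped. */
static void pc_histogram_record(pc_histogram_t *h, uint64_t pc) {
    if (pc < h->base_pc) {
        h->dropped++;
        return;
    }
    uint64_t index = (pc - h->base_pc) / h->instr_size;
    if (index >= NUM_BUCKETS) {
        h->dropped++;
        return;
    }
    h->counts[index]++;
}

/* Index of the most frequently sampled instruction (lowest index on ties). */
static size_t pc_histogram_hottest(const pc_histogram_t *h) {
    size_t best = 0;
    for (size_t i = 1; i < NUM_BUCKETS; ++i) {
        if (h->counts[i] > h->counts[best]) {
            best = i;
        }
    }
    return best;
}
```

A profiler front end would call `pc_histogram_record` once per delivered sample and then read `counts` to see which instructions the waves were most often executing when sampled.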

PC sampling can be accessed either through ROCprofiler or directly through the ROCm Runtime vendor extension APIs, which are declared in the hsa_ven_amd_pc_sampling.h header as follows:

hsa_status_t hsa_ven_amd_pcs_create(hsa_agent_t agent, hsa_ven_amd_pcs_method_kind_t method,
                                    hsa_ven_amd_pcs_units_t units, size_t interval, size_t latency,
                                    size_t buffer_size,
                                    hsa_ven_amd_pcs_data_ready_callback_t data_ready_callback,
                                    void* client_callback_data, hsa_ven_amd_pcs_t* pc_sampling);
hsa_status_t hsa_ven_amd_pcs_destroy(hsa_ven_amd_pcs_t pc_sampling);
hsa_status_t hsa_ven_amd_pcs_start(hsa_ven_amd_pcs_t pc_sampling);
hsa_status_t hsa_ven_amd_pcs_stop(hsa_ven_amd_pcs_t pc_sampling);
hsa_status_t hsa_ven_amd_pcs_flush(hsa_ven_amd_pcs_t pc_sampling);

Known issues

When the bad memory page threshold is exceeded, Out-Of-Band Common Platform Error Records (CPERs) are not declared. This issue affects all AMD Instinct MI350 series and MI300 series GPUs and will be fixed in a future AMD GPU Driver release.

Resolved issues

An issue with restoring a CRIU checkpoint for workloads on AMD Instinct MI series GPUs has been resolved.