Kernel Parameters#
This section describes the GRUB and kernel parameter settings common to all AMD Instinct GPU models.
Overview#
Configuring the correct kernel command-line parameters is essential for stable operation and optimal performance of AMD Instinct™-based systems. These parameters are set using GRUB (GNU Grand Unified Bootloader), which passes them to the Linux kernel at boot, where they control how the kernel initializes hardware during system startup. This section outlines the required and recommended kernel boot parameters to append in the GRUB configuration file (typically /etc/default/grub) on AMD Instinct-based servers.
GRUB Configuration Steps#
Open /etc/default/grub with root privileges.
Locate the line starting with GRUB_CMDLINE_LINUX.
Append all required and recommended parameters to this line.
Save the file and regenerate the GRUB configuration to apply the changes (Debian- and Ubuntu-based systems):
sudo update-grub
For RHEL-based systems, use the grubby tool:
sudo grubby --update-kernel=ALL --args="pci=realloc=off"
Reboot the system for changes to take effect.
Verify the active kernel parameters:
cat /proc/cmdline
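The edit and verification steps above can be sketched as two small shell helpers. This is a minimal sketch, not an official AMD tool: the function names append_grub_param and missing_params are hypothetical, and the sketch assumes the GRUB_CMDLINE_LINUX value sits on a single double-quoted line. It edits only the line that begins exactly with GRUB_CMDLINE_LINUX= and leaves GRUB_CMDLINE_LINUX_DEFAULT untouched.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of GRUB or ROCm.

# append_grub_param FILE PARAM
# Adds PARAM to the GRUB_CMDLINE_LINUX="..." line of FILE unless it is
# already present as a whole token. Assumes a single-line, double-quoted
# value. Returns 1 if FILE contains no such line.
append_grub_param() {
    file=$1
    param=$2
    found=1
    tmp=$(mktemp) || return 1
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            'GRUB_CMDLINE_LINUX="'*'"')
                found=0
                value=${line#GRUB_CMDLINE_LINUX=\"}   # drop the prefix
                value=${value%\"}                     # drop the closing quote
                case " $value " in
                    *" $param "*) ;;                       # already present
                    *) value="${value:+$value }$param" ;;  # append once
                esac
                line="GRUB_CMDLINE_LINUX=\"$value\""
                ;;
        esac
        printf '%s\n' "$line"
    done < "$file" > "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file" || { rm -f "$tmp"; return 1; }
    rm -f "$tmp"
    return "$found"
}

# missing_params CMDLINE PARAM...
# Prints each PARAM that does not appear as a whole token in CMDLINE.
missing_params() {
    cmdline=$1
    shift
    for p in "$@"; do
        case " $cmdline " in
            *" $p "*) ;;
            *) printf '%s\n' "$p" ;;
        esac
    done
}
```

As a usage example, running append_grub_param /etc/default/grub pci=realloc=off as root, regenerating the GRUB configuration, and rebooting should leave missing_params "$(cat /proc/cmdline)" pci=realloc=off with no output. Working on a copy of /etc/default/grub first is a sensible precaution.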
Kernel Parameters#
Required Kernel Parameters#
Parameter |
Comments |
---|---|
|
Disables the automatic reallocation of PCI resources so that Linux can unambiguously detect all GPUs in an Instinct™-based system. It is used when Single Root I/O Virtualization (SR-IOV) Base Address Registers (BARs) have not been allocated by the BIOS, and it can help avoid issues with certain hardware configurations. |
|
The |
|
The |
|
Required only on systems with Intel host CPUs; not needed on systems with AMD CPUs. |
|
The NUMA balancing feature allows the OS to scan memory and attempt to migrate pages to a DIMM that is logically closer to the cores accessing them. The scanning adds overhead because the OS is only estimating NUMA allocations, although the feature may be useful if NUMA access locality is not ideal. |
|
For some system configurations, the amdgpu driver needs to be blacklisted to avoid cases where the DCGPU is not ready when the driver loads, or where the system BIOS settings have not been set optimally. If this parameter is used, the amdgpu driver must be loaded after booting for the system to be functional. Alternatively, if the AMD DCGPU is configured with the recommended system-optimized BIOS settings, it might be possible to stop blacklisting the driver. However, blacklisting the driver is considered the safest option, since the AMD DCGPU may not be ready during system boot if a firmware update is in progress. |
Note
If modprobe.blacklist=amdgpu is used, the amdgpu module must be loaded after booting:
sudo modprobe amdgpu
For deployment, adding a boot-time task (for example, a systemd service) that loads the amdgpu driver immediately after boot is recommended.
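A boot-time task of this kind could call a small script like the one below. This is a minimal sketch under stated assumptions: the helper names module_loaded and ensure_amdgpu are hypothetical, the script reads the standard /proc/modules listing (module name in the first column), and modprobe must run with root privileges.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# module_loaded NAME [MODULES_FILE]
# Succeeds if NAME appears in the first column of the module list.
# MODULES_FILE defaults to /proc/modules; it is a parameter so the
# function can be exercised against a sample file.
module_loaded() {
    name=$1
    modules_file=${2:-/proc/modules}
    while read -r mod _rest; do
        [ "$mod" = "$name" ] && return 0
    done < "$modules_file"
    return 1
}

# ensure_amdgpu
# Loads the amdgpu module only if it is not already loaded.
# Requires root; the exit status is that of modprobe.
ensure_amdgpu() {
    module_loaded amdgpu && return 0
    modprobe amdgpu
}
```

Calling ensure_amdgpu from the boot-time task keeps the step idempotent: it does nothing when the driver is already present, and otherwise runs the same modprobe amdgpu command shown in the note.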
Optional Kernel Parameters#
Parameter |
Comments |
---|---|
|
Disables KASLR (Kernel Address Space Layout Randomization), which reduces boot time and increases cache locality. |
|
Disables random memory allocation, which increases performance but can reduce system security. Not recommended for multi-user systems without virtualization safety restrictions. |
|
Disables many security features, which in some cases gives a dramatic increase in performance. |
|
Disables ASPM (Active State Power Management), at the cost of slightly more power usage from the PCIe bus. Strongly recommended. |
|
Limits the CPU to the active state for better responsiveness. |
|
Disables predictable system device naming. This decreases boot time and gives a minimal performance increase. |
|
Reduces the number of messages logged to dmesg to only the critical messages. |
|
Allows applications to automatically allocate huge (2 MB) memory pages without application changes. This allows for faster memory allocation and easier memory management at the cost of increased RAM usage. This feature may reduce performance if RAM is constrained. |
|
The kernel stops periodically checking TSC accuracy, reducing background activity. |
|
Disables the NMI watchdog, which detects hard lockups. Disabling it benefits high-performance and low-latency workloads. |
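Some of the optional settings above can be checked at runtime after a reboot. The sketch below reads the transparent huge page mode from sysfs, where the kernel lists the available modes and marks the active one in square brackets (for example, always [madvise] never). The helper name thp_mode is hypothetical, and the sysfs path is assumed to exist, which depends on the kernel having transparent huge page support.

```shell
#!/bin/sh
# Sketch only: the helper name is hypothetical.

# thp_mode [FILE]
# Prints the active transparent huge page mode, i.e. the word the kernel
# encloses in square brackets. FILE defaults to the standard sysfs path.
# Returns 1 if the file cannot be read or holds no bracketed entry.
thp_mode() {
    file=${1:-/sys/kernel/mm/transparent_hugepage/enabled}
    line=
    read -r line < "$file" || [ -n "$line" ] || return 1
    mode=${line#*\[}                  # drop everything up to the first "["
    [ "$mode" != "$line" ] || return 1
    printf '%s\n' "${mode%%\]*}"      # drop the "]" and what follows it
}
```

On kernels that expose it, cat /proc/sys/kernel/nmi_watchdog offers a similar check for the NMI watchdog and prints 0 when the watchdog is disabled.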