Standalone Debian Package Install#
System Requirements#
Before installing the AMD GPU Metrics Exporter, you need to install the “AMDGPU” driver from the Radeon repository. Please ensure that your system meets the following requirements:
Operating System: Ubuntu 22.04 or Ubuntu 24.04
ROCm Version: 6.4.1 (the required version is specific to each .deb package)
Each Debian package release of the Standalone Metrics Exporter depends on a specific version of ROCm and the AMDGPU driver. See the table below for the supported combinations:
| Metrics Exporter Debian Version | ROCm Version | AMDGPU Driver Version |
|---|---|---|
| amdgpu-exporter-1.2.0 | ROCm 6.3.x | 6.10.5 |
| amdgpu-exporter-1.3.1 | ROCm 6.4.x | 6.12.12 |
| amdgpu-exporter-1.4.0.1 | ROCm 7.0.x | 6.14.x |
| amdgpu-exporter-1.4.2 | ROCm 7.1.x | 6.16.6 |
| amdgpu-exporter-1.5.0 | ROCm 7.2.x | 6.16.13 |
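The table can also be encoded as a small lookup for use in provisioning scripts. The following is a minimal sketch; the function name `exporter_requirements` is hypothetical, and the values are copied from the table rather than queried from any AMD tool.

```shell
# Print the ROCm series and AMDGPU driver version listed for an exporter release.
# $1: exporter version without the "amdgpu-exporter-" prefix, for example 1.4.2
exporter_requirements() {
    case "$1" in
        1.2.0)   echo "ROCm 6.3.x, amdgpu 6.10.5" ;;
        1.3.1)   echo "ROCm 6.4.x, amdgpu 6.12.12" ;;
        1.4.0.1) echo "ROCm 7.0.x, amdgpu 6.14.x" ;;
        1.4.2)   echo "ROCm 7.1.x, amdgpu 6.16.6" ;;
        1.5.0)   echo "ROCm 7.2.x, amdgpu 6.16.13" ;;
        *)       echo "unknown exporter version: $1" >&2; return 1 ;;
    esac
}
```

For example, `exporter_requirements 1.3.1` prints the ROCm 6.4.x row. An unlisted version returns a non-zero status so a script can stop early.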
Installation#
Step 1: Install System Prerequisites#
Install Linux headers and modules:
sudo apt update
sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
Add user to required groups:
sudo usermod -a -G render,video $LOGNAME
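Group changes made with `usermod` only take effect in new login sessions, so it can help to check membership after logging back in. The sketch below is one way to do that; the helper name `missing_gpu_groups` is hypothetical.

```shell
# Print each required group that is absent from a space-separated group list.
# $1: group list, for example the output of: id -nG "$LOGNAME"
missing_gpu_groups() {
    for required in render video; do
        case " $1 " in
            *" $required "*) ;;          # already a member, print nothing
            *) echo "$required" ;;       # not found in the list
        esac
    done
}
```

Running `missing_gpu_groups "$(id -nG "$LOGNAME")"` prints nothing when the user belongs to both groups.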
Step 2: Install AMDGPU Driver#
Note
For the most up-to-date information on installing DKMS drivers, see the ROCm Install Quick Start page. The instructions below are current as of ROCm 7.0.rc1.
Download the driver installer package from the Radeon repository (repo.radeon.com) for your operating system. For example, to get the ROCm 7.0.0 drivers for Ubuntu 22.04, run the following commands:
wget https://repo.radeon.com/amdgpu-install/7.0/ubuntu/jammy/amdgpu-install_7.0.70000-1_all.deb
sudo apt install ./amdgpu-install_7.0.70000-1_all.deb
sudo apt update
Note that the URL above differs depending on the driver version you are installing and the operating system release you are using.
Install the driver:
sudo apt install amdgpu-dkms
sudo reboot
Load the driver module:
sudo modprobe amdgpu
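To confirm the module is loaded, one option is to look for its entry under `/sys/module`, where the kernel lists loaded modules. This is a minimal sketch; the function name and its optional directory argument (used only to make the check testable) are assumptions.

```shell
# Succeed if an amdgpu entry exists in the sysfs module directory.
# $1: module directory to inspect (defaults to /sys/module)
amdgpu_module_present() {
    [ -d "${1:-/sys/module}/amdgpu" ]
}
```

Usage: `amdgpu_module_present && echo "amdgpu is loaded"`. Running `lsmod | grep amdgpu` gives an equivalent check.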
Step 3: Install the APT Prerequisites for Metrics Exporter#
Update the package list and install necessary tools, keyrings and keys:
# Install necessary tools
sudo apt update
sudo apt install vim wget gpg
# Create the keyrings directory with the appropriate permissions
sudo mkdir --parents --mode=0755 /etc/apt/keyrings
# Download the ROCm GPG key and add it to the keyrings
wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null
Edit the sources list to add the Device Metrics Exporter repository. Add only the line that matches your release: jammy for Ubuntu 22.04, noble for Ubuntu 24.04.
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/device-metrics-exporter/apt/1.4.0 jammy main
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/device-metrics-exporter/apt/1.4.0 noble main
Update the package list again:
sudo apt update
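The repository entry can also be generated rather than typed by hand. The sketch below builds the same line shown in this step from a codename and a repository version. The function name `exporter_repo_line` and the file name `amdgpu-exporter.list` are hypothetical; any file under `/etc/apt/sources.list.d/` ending in `.list` would serve.

```shell
# Build the APT source entry for the Device Metrics Exporter repository.
# $1: Ubuntu codename (jammy or noble)
# $2: repository version (defaults to 1.4.0, as used in this step)
exporter_repo_line() {
    case "$1" in
        jammy|noble) ;;
        *) echo "unsupported codename: $1" >&2; return 1 ;;
    esac
    echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/device-metrics-exporter/apt/${2:-1.4.0} $1 main"
}

# Example (not run here): read the codename from os-release and write the entry.
#   . /etc/os-release
#   exporter_repo_line "$VERSION_CODENAME" | sudo tee /etc/apt/sources.list.d/amdgpu-exporter.list
```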
Step 4: Install Metrics Exporter#
Install the Device Metrics Exporter:
sudo apt install amdgpu-exporter
Enable and start services:
sudo systemctl enable amd-metrics-exporter.service
sudo systemctl start amd-metrics-exporter.service
Check service status:
sudo systemctl status amd-metrics-exporter.service
Note
Before performing GPU driver unload/upgrade or partition operations, services must be stopped. See the Service Management for Driver and Partition Operations section for detailed instructions.
Metrics Exporter Default Settings#
Metrics endpoint: http://localhost:5000/metrics
Configuration file: /etc/metrics/config.json
GPU Agent socket: /var/run/gpuagent.sock (Unix Domain Socket)
The Exporter HTTP port is configurable via the ServerPort field in the configuration file.
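A quick way to check the endpoint is to request it with `curl`. The helper below is a small sketch that builds the URL from a port so the same command works after `ServerPort` is changed; the function name `metrics_url` is hypothetical.

```shell
# Build the metrics URL for a given port (5000 is the documented default).
# $1: HTTP port configured through ServerPort
metrics_url() {
    echo "http://localhost:${1:-5000}/metrics"
}

# Example (requires curl and a running exporter):
#   curl -fsS "$(metrics_url)" | head -n 20
```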
Metrics Exporter Custom Configuration#
Changing configuration config.json#
If you need to customize ports or settings:
Edit the amd-metrics-exporter service file:
sudo vi /lib/systemd/system/amd-metrics-exporter.service
Update the ExecStart line to read in the config.json file:
ExecStart=/usr/local/bin/amd-metrics-exporter -amd-metrics-config /etc/metrics/config.json
Reload systemd:
sudo systemctl daemon-reload
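As an alternative to editing the packaged unit file, systemd supports drop-in overrides, which are not overwritten when the package is upgraded. This is a different technique from the steps above, shown as a sketch. The function name and the `override.conf` file name are assumptions; the empty `ExecStart=` line is how systemd clears the original command before a new one is set.

```shell
# Print a drop-in that replaces ExecStart for amd-metrics-exporter.service.
# $1: configuration file path (defaults to /etc/metrics/config.json)
exporter_dropin() {
    cat <<EOF
[Service]
ExecStart=
ExecStart=/usr/local/bin/amd-metrics-exporter -amd-metrics-config ${1:-/etc/metrics/config.json}
EOF
}

# Example (not run here):
#   sudo mkdir -p /etc/systemd/system/amd-metrics-exporter.service.d
#   exporter_dropin | sudo tee /etc/systemd/system/amd-metrics-exporter.service.d/override.conf
#   sudo systemctl daemon-reload
#   sudo systemctl restart amd-metrics-exporter.service
```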
Custom Socket Configuration - Change GPU Agent Socket Path (Advanced)#
By default, GPU Agent uses Unix Domain Socket at /var/run/gpuagent.sock for communication with the metrics exporter.
To change the socket path:
Edit the GPU Agent service file:
sudo vi /lib/systemd/system/gpuagent.service
Update ExecStart with custom socket path:
ExecStart=/usr/local/bin/gpuagent -s /path/to/custom.sock
Edit the Metrics Exporter service file:
sudo vi /lib/systemd/system/amd-metrics-exporter.service
Update ExecStart to use the same socket path:
ExecStart=/usr/local/bin/amd-metrics-exporter -s /path/to/custom.sock
Reload systemd so it picks up the edited unit files, then restart both services:
sudo systemctl daemon-reload
sudo systemctl restart gpuagent.service
sudo systemctl restart amd-metrics-exporter.service
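After the restart, it is worth confirming that the GPU Agent actually created its socket at the expected path. A minimal sketch, using the shell's `-S` file test; the function name `gpuagent_socket_ready` is hypothetical.

```shell
# Succeed only if the path exists and is a Unix domain socket.
# $1: socket path (defaults to the documented /var/run/gpuagent.sock)
gpuagent_socket_ready() {
    [ -S "${1:-/var/run/gpuagent.sock}" ]
}
```

Usage: `gpuagent_socket_ready /path/to/custom.sock || echo "socket not found"`. A regular file at that path fails the check, which helps catch a mistyped path.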
Change Metrics Exporter Port#
Edit the configuration file:
sudo vi /etc/metrics/config.json
Update ServerPort to your desired port
Restart Metrics Exporter service:
sudo systemctl restart amd-metrics-exporter.service
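Before writing a new `ServerPort` value, a script can check that it is a usable TCP port number. The sketch below only validates the number; it does not edit `config.json`, and the function name `valid_port` is hypothetical.

```shell
# Succeed if $1 is a whole number in the TCP port range 1-65535.
valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}
```

Usage: `valid_port 5001 && echo "ok"`. Ports below 1024 usually require elevated privileges, so a higher port is the safer choice.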
Stop Metrics Exporter#
To stop the Metrics Exporter and GPU Agent services, run:
sudo systemctl stop amd-metrics-exporter.service
sudo systemctl stop gpuagent.service
sudo systemctl daemon-reload
Confirm Metrics Exporter is Running#
To confirm that the Metrics Exporter and GPU Agent services are running, check the status of each:
systemctl status amd-metrics-exporter.service
systemctl status gpuagent.service
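For scripted checks, `systemctl is-active --quiet` returns an exit status instead of the full status report. The following sketch lists whichever of the two services is not active; the function name `inactive_exporter_services` is hypothetical.

```shell
# Print the name of each service that systemd does not report as active.
inactive_exporter_services() {
    for svc in amd-metrics-exporter.service gpuagent.service; do
        systemctl is-active --quiet "$svc" || echo "$svc"
    done
}
```

Empty output means both services are active.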
Service Management for Driver and Partition Operations#
The GPU Metrics Exporter and GPU Agent services must be stopped before performing the following operations:
GPU driver unload or upgrade
GPU partition configuration changes
Required Steps:
Stop Services: See Stop Metrics Exporter section for instructions on stopping both services
Perform driver upgrade or partition operations
Restart Services: Use the enable and start commands from Step 4: Install Metrics Exporter
Verify Services: See Confirm Metrics Exporter is Running section to verify both services are running correctly
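The required steps can be wrapped in one helper so the services are always stopped before the operation runs. This is a sketch under stated assumptions: the function name `with_exporter_stopped` is hypothetical, and the choice to leave the services stopped when the operation fails is a design decision made here, not something the steps above prescribe.

```shell
# Stop both services, run the given command, then enable and start the exporter
# using the Step 4 commands. If the command fails, the services stay stopped.
# $@: the driver or partition operation to run
with_exporter_stopped() {
    sudo systemctl stop amd-metrics-exporter.service || return 1
    sudo systemctl stop gpuagent.service || return 1
    "$@" || { echo "operation failed; services left stopped" >&2; return 1; }
    sudo systemctl enable amd-metrics-exporter.service &&
        sudo systemctl start amd-metrics-exporter.service
}
```

Usage: `with_exporter_stopped sudo modprobe -r amdgpu`, where the command after the function name is only an example operation. Verify both services afterwards as described in the Confirm Metrics Exporter is Running section.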
Removing Metrics Exporter and other components#
To remove this application, undo the installation steps in reverse order:
Uninstall the Metrics Exporter:
Ensure the .deb package is removed:
sudo dpkg -r amdgpu-exporter
sudo apt-get purge amdgpu-exporter
(Optional) If you would also like to uninstall the AMDGPU Driver:
Uninstall any associated DKMS packages:
sudo dpkg -r amdgpu-install
Unload the driver module:
sudo modprobe -r amdgpu
(Optional) If you would also like to remove the system prerequisites that were installed:
Remove Linux header and module packages:
sudo apt remove "linux-headers-$(uname -r)"
sudo apt remove "linux-modules-extra-$(uname -r)"
Remove the user from groups:
sudo gpasswd -d $LOGNAME render
sudo gpasswd -d $LOGNAME video