Standalone Debian Package Install#

System Requirements#

Before installing the AMD GPU Metrics Exporter, you need to install the following:

  • Operating System: Ubuntu 22.04 or Ubuntu 24.04

  • ROCm Version: 6.4.1 (specific to each .deb pkg)

Each Debian package release of the Standalone Metrics Exporter is dependent on a specific version of the ROCm amdgpu driver. Please see table below for more information:

Metrics Exporter Debian Version

ROCm Version

AMDGPU Driver Version

amdgpu-exporter-1.2.0

ROCm 6.3.x

6.10.5

amdgpu-exporter-1.3.1

ROCm 6.4.x

6.12.12

Installation#

Step 1: Install System Prerequisites#

  1. Install Linux headers and modules:

    sudo apt update
    sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
    
  2. Add user to required groups:

    sudo usermod -a -G render,video $LOGNAME
    

Step 2: Install AMDGPU Driver#

Note

For the most up-to-date information on installing dkms drivers please see the ROCm Install Quick Start page. The below instructions are the most current instructions as of ROCm 7.0.rc1.

  1. Download the driver from the Radeon repository (repo.radeon.com) for your operating system. For example if you want to get the latest ROCm 7.0.rc1 drivers for Ubuntu 22.04 you would run the following command:

    sudo mkdir --parents --mode=0755 /etc/apt/keyrings
    wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null
    echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/30.10_rc1/ubuntu jammy main" sudo tee /etc/apt/sources.list.d/amdgpu.list
    sudo apt update
    

    Please note that the above url will be different depending on what version of the drivers you will be installing and type of Operating System you are using.

  2. Install the driver:

    sudo apt install amdgpu-dkms
    sudo reboot
    
  3. Load the driver module:

    sudo modprobe amdgpu
    

Step 3: Install the APT Prerequisites for Metrics Exporter#

  1. Update the package list and install necessary tools, keyrings and keys:

    # Install necessary tools
    sudo apt update
    sudo apt install vim wget gpg
    
    # Create the keyrings directory with the appropriate permissions:
    sudo mkdir --parents --mode=0755 /etc/apt/keyrings
    
    # Download the ROCm GPG key and add it to the keyrings:
    wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null
    
  2. Edit the sources list to add the Device Metrics Exporter repository:

    deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/device-metrics-exporter/apt/1.4.0 jammy main
    
    deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/device-metrics-exporter/apt/1.4.0 noble main
    
  3. Update the package list again:

    sudo apt update
    

Step 4: Install Metrics Exporter#

  1. Install the Device Metrics Exporter:

    sudo apt install amdgpu-exporter
    
  2. Enable and start services:

    sudo systemctl enable amd-metrics-exporter.service
    sudo systemctl start amd-metrics-exporter.service
    
  3. Check service status:

    sudo systemctl status amd-metrics-exporter.service
    

Metrics Exporter Default Settings#

  • Metrics endpoint: http://localhost:5000/metrics

  • Configuration file: /etc/metrics/config.json

  • GPU Agent port (default): 50061

The Exporter HTTP port is configurable via the ServerPort field in the configuration file.

Metrics Exporter Custom Configuration#

Changing configuration config.json#

If you need to customize ports or settings:

  1. Edit the amd-metrics-exporter service file:

    sudo vi /lib/systemd/system/amd-metrics-exporter.service
    
  2. Update the ExecStart line to read in the config.json file:

    ExecStart=/usr/local/bin/amd-metrics-exporter -amd-metrics-config /etc/metrics/config.json
    
  3. Reload systemd:

    sudo systemctl daemon-reload
    

Custom Port Configuration - Change GPU Agent Port#

  1. Edit the GPU Agent service file:

    sudo vi /lib/systemd/system/gpuagent.service
    
  2. Update ExecStart with desired port:

    ExecStart=/usr/local/bin/gpuagent -p <port_number>
    

Change Metrics Exporter Port#

  1. Edit the configuration file:

    sudo vi /etc/metrics/config.json
    
  2. Update ServerPort to your desired port.

Stop Metrics Exporter#

To stop the Metrics Exporter service, run:
sudo systemctl stop amd-metrics-exporter.service
sudo systemctl stop gpuagent.service

Confirm Metrics Exporter is Running#

To confirm that the Metrics Exporter is running and accessible, you can use the following command:

systemctl status amd-metrics-exporter.service
systemctl status gpuagent.service

Removing Metrics Exporter and other components#

To remove this application, follow these commands in reverse order:

  1. Uninstall the Metrics Exporter:

    • Ensure the .deb package is removed:

      sudo dpkg -r amdgpu-exporter
      sudo apt-get purge amdgpu-exporter
      
  2. (Optional) If you would also like to uninstall the AMDGPU Driver:

    • Uninstall any associated DKMS packages:

      sudo dpkg -r amdgpu-install
      
    • Unload the driver module:

      sudo modprobe -r amdgpu
      
  3. (Optional) If you would also like to remove the system prerequisites that were installed:

    • Remove Linux header and module packages:

      sudo apt remove linux-headers-$(uname -r)
      sudo apt remove linux-modules-extra-$(uname -r)
      
    • Remove the user from groups:

      sudo gpasswd -d $LOGNAME render
      sudo gpasswd -d $LOGNAME video