Troubleshooting Device Metrics Exporter#
This topic provides an overview of troubleshooting options for Device Metrics Exporter.
Logs#
You can view the container logs by executing the following command:
Docker deployment#
docker logs device-metrics-exporter
K8s deployment#
kubectl logs -n <namespace> <exporter-container-on-node>
Debian deployment#
sudo journalctl -xu amd-metrics-exporter
logs are collected in file /var/run/exporter.log
Common Issues#
This section describes common issues with AMD Device Metrics Exporter
Port conflicts:
Verify port 5000 is available
Configure an alternate port through the configuration file
Device access:
Ensure proper permissions on
/dev/dri
and/dev/kfd
Verify ROCm is properly installed
Metric collection issues:
Check GPU driver status
Verify ROCm version compatibility