Troubleshooting#
This guide provides steps to diagnose and resolve common issues with the AMD Network Operator.
Checking Operator Status#
To check the status of the AMD Network Operator:
kubectl get pods -n kube-amd-network
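When the namespace holds many pods, it can help to list only those that are not running. The sketch below wraps the command above in a small helper; the function name `list_unhealthy_pods` is hypothetical, and the namespace defaults to the one used in this guide. The phase filter is a coarse check: a pod whose container is crash-looping can still report the `Running` phase, so the READY and RESTARTS columns of the full listing remain worth checking.

```shell
# Hypothetical helper: list pods in a namespace that are not in the Running phase.
# The namespace defaults to kube-amd-network (the one used in this guide).
list_unhealthy_pods() {
    kubectl get pods -n "${1:-kube-amd-network}" \
        --field-selector=status.phase!=Running -o wide
}

# Usage:
#   list_unhealthy_pods
#   list_unhealthy_pods <namespace>
```

For any pod that shows up here, `kubectl describe pod <pod-name> -n kube-amd-network` prints its recent events.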
Collecting Logs#
To collect logs from the AMD Network Operator:
kubectl logs -n kube-amd-network <pod-name>
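To save logs from every pod in the namespace rather than one pod at a time, the command above can be run in a loop. This is a minimal sketch, not part of the operator tooling: the function name `collect_operator_logs` and the default output directory `./operator-logs` are assumptions made for illustration.

```shell
# Hypothetical helper: write the logs of every pod in a namespace
# to <outdir>/<pod-name>.log.
collect_operator_logs() {
    ns="${1:-kube-amd-network}"       # namespace used in this guide
    outdir="${2:-./operator-logs}"    # assumed output directory
    mkdir -p "$outdir" || return 1
    # "-o name" prints one entry per pod in the form "pod/<pod-name>".
    kubectl get pods -n "$ns" -o name | while read -r pod; do
        name="${pod#pod/}"
        # --all-containers covers multi-container pods;
        # --prefix tags each log line with its pod and container.
        kubectl logs -n "$ns" "$name" --all-containers --prefix \
            > "$outdir/$name.log" 2>&1 < /dev/null
    done
}

# Usage:
#   collect_operator_logs
#   collect_operator_logs kube-amd-network /tmp/amd-logs
```

If a container has restarted, adding `--previous` to `kubectl logs` returns the logs of the prior instance.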
Using Techsupport-dump Tool#
The techsupport-dump tool collects system state and logs for debugging purposes. It can be run from any node in the cluster, including control plane nodes.
./tools/techsupport_dump.sh [-w] [-o yaml/json] [-k kubeconfig] <node-name/all>
Options:
-w: wide option
-o yaml/json: output format (default: json)
-k kubeconfig: path to kubeconfig (default: ~/.kube/config)
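The usage line accepts either a single node name or `all`. To collect dumps for a chosen subset of nodes, the script can be called once per node, as sketched below. The function name `dump_nodes` and the node names in the usage comment are hypothetical; only the flags listed above are used, and the script is assumed to be run from the repository root.

```shell
# Hypothetical helper: run the techsupport dump once per named node,
# requesting YAML output instead of the default JSON.
dump_nodes() {
    for node in "$@"; do
        ./tools/techsupport_dump.sh -o yaml "$node" \
            || echo "dump failed for $node" >&2
    done
}

# Usage (node names are placeholders):
#   dump_nodes worker-1 worker-2
```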
TechSupport Collects:#
Kubernetes resources from the network-operator, nfd, and kmm namespaces, including:
    Pods
    DaemonSets
    Deployments
    ConfigMaps
    NetworkConfig resources
Pod logs from components such as:
    Node Feature Discovery (NFD)
    Kernel Module Management (KMM)
    Network Operator (Data Plane, Metrics Exporter, CNI plugins)
System-level diagnostics:
    lsmod output (loaded kernel modules)
    dmesg output (kernel ring buffer)
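For a quick look at the same built-in resource types without running the full dump, they can be listed directly with `kubectl`. This is a rough sketch: the function name `list_dump_resources` is hypothetical, and the namespaces in the usage comment are the ones named above, which may differ in a given installation. NetworkConfig is a custom resource and is left out because its exact resource name is not stated here.

```shell
# Hypothetical helper: list the built-in resource types that the dump covers,
# one namespace at a time.
list_dump_resources() {
    for ns in "$@"; do
        echo "== namespace: $ns =="
        kubectl get pods,daemonsets,deployments,configmaps -n "$ns"
    done
}

# Usage:
#   list_dump_resources network-operator nfd kmm
```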