Kube-RBAC-Proxy with Metrics Exporter
The kube-rbac-proxy sidecar container is used to secure the metrics endpoint by enforcing Role-Based Access Control (RBAC) or Static Authorization based on the Kubernetes authentication model.
By enabling kube-rbac-proxy, you ensure that only authorized users (or authorized client certificates) can access the /metrics endpoint.
Configure Kube-RBAC-Proxy
To enable and configure the kube-rbac-proxy sidecar container, add the rbacConfig section to the Metrics Exporter configuration in the DeviceConfig CR. Here’s a quick overview of the settings for kube-rbac-proxy:
enable: Set to true to enable the kube-rbac-proxy sidecar container.
image: Specify the image for the kube-rbac-proxy container. If not specified, the default image is used.
secret: Kubernetes Secret containing the server TLS certificate (tls.crt) and key (tls.key).
disableHttps: If set to true, the HTTPS protection for the metrics endpoint is disabled. By default, this is false, and HTTPS is enabled for secure communication.
clientCAConfigMap: Kubernetes ConfigMap containing the client CA certificate (ca.crt) for mutual TLS validation.
staticAuthorization.enable: Enables static authorization mode based on the client certificate Common Name (CN).
staticAuthorization.clientName: The expected Common Name (CN) to authorize when static authorization is enabled.
It is mandatory to provide a valid TLS server certificate (via a Secret) if HTTPS is enabled.
Setting Up TLS Certificates
If you want to provide custom TLS certificates, create a Kubernetes Secret containing the TLS certificate (tls.crt) and private key (tls.key), and reference this Secret in the rbacConfig.secret.name field.
kubectl create secret tls my-tls-secret --cert=path/to/cert.crt --key=path/to/cert.key -n kube-amd-gpu
To enable mTLS, you must also create a ConfigMap containing the client CA certificate:
kubectl create configmap my-client-ca --from-file=ca.crt=path/to/ca.crt -n kube-amd-gpu
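The two commands above assume the certificate files already exist. For a test environment, they could be produced with openssl along the lines of the sketch below. This is only an illustration: the certs/ directory, the file names, the host name metrics-exporter.example, and the CA name are assumptions, and the client CN prometheus-client is simply the value used in the static authorization example on this page. Production certificates would normally come from your own PKI.

```shell
# Sketch: throwaway certificates for testing only (not for production).
# Directory, file names, host name, and CA name are illustrative assumptions.
mkdir -p certs

# 1. Self-signed server certificate and key (for the rbacConfig.secret Secret).
#    The subjectAltName should match the address clients use to reach the endpoint.
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout certs/tls.key -out certs/tls.crt \
  -days 365 -subj "/CN=metrics-exporter.example" \
  -addext "subjectAltName=DNS:metrics-exporter.example"

# 2. Self-signed client CA (its ca.crt goes into the clientCAConfigMap).
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout certs/ca.key -out certs/ca.crt \
  -days 365 -subj "/CN=example-client-ca"

# 3. Client key and certificate signing request; the CN identifies the client.
openssl req -new -newkey rsa:2048 -nodes \
  -keyout certs/client.key -out certs/client.csr \
  -subj "/CN=prometheus-client"

# 4. Sign the client certificate with the client CA.
openssl x509 -req -in certs/client.csr \
  -CA certs/ca.crt -CAkey certs/ca.key -CAcreateserial \
  -out certs/client.crt -days 365
```

With these files, the commands above would use --cert=certs/tls.crt --key=certs/tls.key for the Secret and --from-file=ca.crt=certs/ca.crt for the ConfigMap.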
DeviceConfig Configuration Examples
Token-Based Authorization:
metricsExporter:
  rbacConfig:
    enable: true  # Enable the kube-rbac-proxy sidecar
    image: "quay.io/brancz/kube-rbac-proxy:v0.18.1"  # Image for the kube-rbac-proxy sidecar container
    secret:
      name: "my-tls-secret"  # Secret containing the TLS certificate and key
    disableHttps: false  # Set to true if you want to disable HTTPS (not recommended)
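With token-based authorization, the identity behind the bearer token also needs RBAC permission to read the endpoint. The sketch below writes a manifest that grants this to a ServiceAccount. It assumes that the proxy authorizes requests against the non-resource URL /metrics with the get verb; the ServiceAccount, ClusterRole, and ClusterRoleBinding names (all metrics-reader) and the manifest file name are hypothetical.

```shell
# Sketch: RBAC objects that allow a ServiceAccount to GET /metrics.
# All object names and the file name are illustrative assumptions.
cat > metrics-reader-rbac.yaml <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-reader
  namespace: kube-amd-gpu
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metrics-reader
rules:
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: metrics-reader
subjects:
  - kind: ServiceAccount
    name: metrics-reader
    namespace: kube-amd-gpu
EOF

# Review the file, then apply it to the cluster:
# kubectl apply -f metrics-reader-rbac.yaml
```

For certificate-based RBAC, the binding subject would typically be a User whose name equals the client certificate CN rather than a ServiceAccount.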
mTLS with Certificate-Based RBAC Authorization:
metricsExporter:
  rbacConfig:
    enable: true  # Enable the kube-rbac-proxy sidecar
    image: "quay.io/brancz/kube-rbac-proxy:v0.18.1"  # Image for the kube-rbac-proxy sidecar container
    secret:
      name: "my-tls-secret"  # Secret containing the TLS certificate and key
    clientCAConfigMap:
      name: "my-client-ca"  # ConfigMap containing the CA certificate that issued the client certificate
mTLS with Static Authorization:
metricsExporter:
  rbacConfig:
    enable: true  # Enable the kube-rbac-proxy sidecar
    image: "quay.io/brancz/kube-rbac-proxy:v0.18.1"  # Image for the kube-rbac-proxy sidecar container
    secret:
      name: "my-tls-secret"  # Secret containing the TLS certificate and key
    clientCAConfigMap:
      name: "my-client-ca"  # ConfigMap containing the CA certificate that issued the client certificate
    staticAuthorization:
      enable: true  # Enable static authorization based on client certificate CN
      clientName: "prometheus-client"  # The exact CN value that must appear in the client certificate to grant access
Accessing Metrics
For a complete guide on how to access the metrics securely (including the generation of tokens, certificates, applying RBAC roles, and accessing the metrics inside and outside the cluster), please refer to the example scenarios README in the repository.
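As a rough orientation before reading the README, the two access styles could look like the shell helpers sketched below. Everything here is an assumption rather than the documented procedure: the function names, the environment variables (METRICS_URL, SERVER_CA_FILE, CLIENT_CERT, CLIENT_KEY, SERVICE_ACCOUNT), and the default ServiceAccount name metrics-reader are hypothetical, and the endpoint address depends on how the exporter is exposed in your cluster.

```shell
# Sketch: helpers for querying the secured metrics endpoint with curl.
# Caller-supplied variables (all hypothetical names):
#   METRICS_URL     full https://.../metrics address of the exporter
#   SERVER_CA_FILE  CA (or self-signed server certificate) the client should trust
#   CLIENT_CERT     client certificate for mTLS
#   CLIENT_KEY      private key for the client certificate
#   SERVICE_ACCOUNT ServiceAccount to request a token for (default: metrics-reader)

fetch_metrics_with_token() {
  # Request a short-lived token for the ServiceAccount and send it as a bearer token.
  token="$(kubectl create token "${SERVICE_ACCOUNT:-metrics-reader}" -n kube-amd-gpu)" || return 1
  curl --silent --show-error --fail \
    --cacert "${SERVER_CA_FILE:?set SERVER_CA_FILE}" \
    -H "Authorization: Bearer ${token}" \
    "${METRICS_URL:?set METRICS_URL}"
}

fetch_metrics_with_mtls() {
  # Present a client certificate; the proxy validates it against the client CA.
  curl --silent --show-error --fail \
    --cacert "${SERVER_CA_FILE:?set SERVER_CA_FILE}" \
    --cert "${CLIENT_CERT:?set CLIENT_CERT}" \
    --key "${CLIENT_KEY:?set CLIENT_KEY}" \
    "${METRICS_URL:?set METRICS_URL}"
}
```

After exporting the variables, calling fetch_metrics_with_token or fetch_metrics_with_mtls prints the metrics on success and returns a non-zero status if the request is rejected.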
Conclusion
Kube-rbac-proxy provides several options to secure your GPU metrics endpoint. You can choose token-based authorization for easy integration, mutual TLS for stronger security, or static authorization for performance-critical scenarios. Once one of the configurations above is applied to the DeviceConfig CR, metrics from your AMD GPU cluster are served by the Metrics Exporter behind kube-rbac-proxy.
For more detailed configuration guidance, refer to the example scenarios README.