Developer Guide#
This guide provides information for developers who want to contribute to or modify the AMD GPU Operator.
Warning
This project is not yet ready to accept commits from external developers.
Prerequisites#
Go v1.20 (Go v1.21 and v1.22 are not currently usable due to open issues)
Docker
Kubernetes cluster (v1.29.0+) or OpenShift (4.16+)
kubectl or oc CLI tool configured to access your cluster
Access to the rocm/gpu-kernel-module-manager image (available on Docker Hub)
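The minimum versions listed above can be checked from a script before starting. The following is a minimal sketch and not part of the repository: `version_ge` is a hypothetical helper that compares dotted versions in plain POSIX shell. It assumes the version strings contain only numeric fields (an optional leading `v` is stripped), so suffixes such as `-rc.1` are not handled.

```shell
# version_ge A B: succeed when dotted version A is greater than or equal to B.
# Hypothetical helper; handles numeric fields only (e.g. 1.29.0, v1.30.2).
version_ge() {
    a="${1#v}"
    b="${2#v}"
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading field of each version; a missing field counts as 0.
        x="${a%%.*}"
        y="${b%%.*}"
        x="${x:-0}"
        y="${y:-0}"
        if [ "$x" -gt "$y" ]; then return 0; fi
        if [ "$x" -lt "$y" ]; then return 1; fi
        # Drop the field just compared, or empty the string if it was the last.
        case "$a" in *.*) a="${a#*.}" ;; *) a="" ;; esac
        case "$b" in *.*) b="${b#*.}" ;; *) b="" ;; esac
    done
    return 0
}
```

For example, `version_ge v1.30.2 1.29.0` succeeds and `version_ge 1.28.5 1.29.0` fails. The version string itself has to be supplied by the caller, for instance copied from the output of `kubectl version`.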
Development Environment Setup#
Install Helm:
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
For alternative installation methods, refer to the Helm Official Website.
Install Helmify:
Download the released binary from the Helmify GitHub release page, unpack it, and move it to a directory on your PATH.
Clone the repository:
git clone https://github.com/ROCm/gpu-operator.git
cd gpu-operator
(Optional) Set up a local Docker registry. If you want to build and host container images locally, you can set up a local Docker registry:
docker run -d -p 5000:5000 --name registry registry:latest
Modify the registry-related variables in the Makefile:
DOCKER_REGISTRY: Set to localhost:5000 for local development, or your preferred registry
IMAGE_NAME: Set to rocm/gpu-operator
IMAGE_TAG: Set as needed (e.g., v1.0.0 or latest)
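To see which image reference a given combination of these variables points at, the three values can be joined in the conventional Docker layout, REGISTRY/NAME:TAG. This is a sketch only: `image_ref` is a hypothetical helper, and the assumption that the Makefile joins the variables in exactly this way has not been checked against the repository.

```shell
# image_ref prints REGISTRY/NAME:TAG built from the three variables above.
# Assumption: the Makefile joins them in the conventional Docker layout.
# Defaults mirror the local-development values suggested in this guide.
image_ref() {
    printf '%s/%s:%s\n' \
        "${DOCKER_REGISTRY:-localhost:5000}" \
        "${IMAGE_NAME:-rocm/gpu-operator}" \
        "${IMAGE_TAG:-latest}"
}
```

With none of the variables set, `image_ref` prints `localhost:5000/rocm/gpu-operator:latest`. GNU Make also accepts variable assignments on the command line, and these take precedence over ordinary assignments in a Makefile, so an invocation such as `make docker-build DOCKER_REGISTRY=localhost:5000 IMAGE_TAG=latest` may work as an alternative to editing the file. Whether this project's Makefile honors such overrides is an assumption.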
Compile the project:
make
This generates the basic YAML files for the CRD and builds the controller images, the Helm charts, and the OpenShift OLM bundle.
Build and push the AMD GPU Operator image:
make docker-build
make docker-push
Note: If you’re using a remote registry that requires authentication, ensure you’ve logged in using docker login before pushing.
Generate Helm charts:
For vanilla Kubernetes:
make helm
For OpenShift:
OPENSHIFT=1 make helm
Running Tests#
The e2e tests require a Kubernetes cluster. Prepare the cluster before running them, and configure the kubeconfig file at ~/.kube/config so that kubectl and helm can access the cluster. The e2e test cases deploy the Operator to your cluster and run against it.
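The requirements described above can be checked before invoking the e2e targets. The following is a minimal sketch in POSIX shell. `missing_tools` and `kubeconfig_ok` are hypothetical helper names, not part of the repository, and the check only confirms that the kubeconfig file exists and is non-empty, not that the cluster is reachable.

```shell
# missing_tools NAME...: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# kubeconfig_ok [PATH]: succeed when the kubeconfig file exists and is
# non-empty. Defaults to ~/.kube/config, the location named in this guide.
kubeconfig_ok() {
    [ -s "${1:-$HOME/.kube/config}" ]
}
```

Running `missing_tools kubectl helm` prints nothing when both tools are installed, and `kubeconfig_ok || echo "kubeconfig not found" >&2` reports a missing or empty configuration file.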
To run the e2e tests:
make e2e
To run e2e tests with a specific Helm chart:
make e2e GPU_OPERATOR_CHART="path to helm chart"
To run only the e2e tests:
make -C tests/e2e # run e2e tests only
Creating a Pull Request#
Fork the repository on GitHub.
Create a new branch for your changes.
Make your changes and commit them with clear, descriptive commit messages.
Push your changes to your fork.
Create a pull request against the main repository.
Please ensure your code follows our coding standards and includes appropriate tests.
Build Documentation Website Locally#
Install the mkdocs utilities
python3 -m pip install mkdocs
Build the website
cd docs
python3 -m mkdocs build
Serve the website locally
python3 -m mkdocs serve --dev-addr localhost:2345
The local docs website updates dynamically as changes are made to the Markdown docs.