Docker Swarm Integration#
Purpose#
Docker Swarm integration allows orchestrated GPU workloads to be deployed across multiple nodes by advertising each node's GPUs, identified by their UUIDs, through Docker's generic resources framework.
Docker Daemon Configuration for Swarm#
Configure each swarm node’s Docker daemon with GPU resources in /etc/docker/daemon.json:
{
  "runtimes": {
    "amd": {
      "path": "amd-container-runtime",
      "runtimeArgs": []
    }
  },
  "node-generic-resources": [
    "AMD_GPU=0x378041e1ada6015",
    "AMD_GPU=0xef39dad16afb86ad",
    "GPU_COMPUTE=0x583de6f2d99dc333"
  ]
}
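The UUID values above are examples; each node reports its own. Assuming rocm-smi from the ROCm stack is installed on the node, one way to list the unique ID of each GPU for use in node-generic-resources is:

# Print the unique ID of every GPU on this node (your IDs will differ)
rocm-smi --showuniqueid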
After updating the configuration, restart the Docker daemon:
sudo systemctl restart docker
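To confirm the daemon picked up the changes, you can check that the amd runtime is registered and, on a manager node, that the node now advertises the generic resources. This is a quick sketch using standard Docker commands; the exact output formatting varies by Docker version:

# Confirm the amd runtime is registered with the daemon
docker info --format '{{json .Runtimes}}'

# Confirm the generic resources are advertised on this swarm node (manager only)
docker node inspect self --format '{{json .Description.Resources.GenericResources}}'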
Deploy GPU Enabled Services#
Deploy services with specific GPU requirements using a Compose file:
# docker-compose.yml for Swarm deployment
version: '3.8'
services:
  rocm-service:
    image: rocm/dev-ubuntu-24.04
    command: rocm-smi
    runtime: amd
    deploy:
      replicas: 1
      resources:
        reservations:
          generic_resources:
            - discrete_resource_spec:
                kind: 'AMD_GPU'  # Matches daemon.json key
                value: 1
Deploy the service:
docker stack deploy -c docker-compose.yml rocm-stack
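Once the stack is deployed, standard Swarm commands can be used to check that the service was scheduled onto a node advertising a matching AMD_GPU resource and to read the rocm-smi output. The stack and service names below follow the deploy command above:

# List services in the stack and their replica counts
docker stack services rocm-stack

# Show which node each task was scheduled on and its current state
docker service ps rocm-stack_rocm-service

# View the rocm-smi output captured from the task
docker service logs rocm-stack_rocm-service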