Acceptance criteria

If the system under test (SUT) meets all of the following criteria, it is ready and should be accepted.

Table 7 Summary of basic system health checks

Test

Command

Pass/Fail criteria

Check OS distribution

cat /etc/os-release
  • Pass: OS version listed in compatibility matrix

  • Fail: Otherwise

Check kernel boot arguments

cat /proc/cmdline
  • Pass: Contains pci=realloc=off, iommu=pt, and either amd_iommu=on or intel_iommu=on

  • Fail: Otherwise

Check for driver errors

sudo dmesg -T | grep amdgpu | grep -i error
  • Pass: No output

  • Fail: Errors reported

Check available memory

lsmem | grep "Total online memory"
  • Pass: 1.5T or more

  • Fail: Less than 1.5T

Check GPU presence

lspci | grep MI300X
  • Pass: All 8 GPUs found

  • Fail: Otherwise

Check GPU link speed and width

sudo lspci -d 1002:74a1 -vvv | grep -e DevSta -e LnkSta
  • Pass: Speed 32GT/s, width x16, no FatalErr+

  • Fail: Otherwise

Monitor utilization metrics

amd-smi monitor -putm
  • Pass: Idle metrics as specified

  • Fail: Otherwise

Check system kernel logs for errors

sudo dmesg -T | grep -Ei 'error|warn|fail|exception'
  • Pass: No output

  • Fail: Otherwise
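
The link speed and width check in Table 7 can be scripted rather than read by eye. The following is a minimal sketch, not a vendor-provided tool: `check_links` is a hypothetical helper name, and it assumes that `lspci -vvv` prints `LnkSta:` lines containing `Speed 32GT/s` and `Width x16` and `DevSta:` lines containing `FatalErr+` or `FatalErr-`, as the pass criterion implies.

```shell
# check_links EXPECTED: hypothetical helper; reads `lspci -vvv` output on stdin.
# Prints PASS when exactly EXPECTED links are reported, all of them at
# 32GT/s x16, and no device reports a fatal error. Prints FAIL otherwise.
check_links() (
    expected="$1"
    total=0
    good=0
    fatal=0
    while IFS= read -r line; do
        case "$line" in
            *LnkSta:*)
                # "LnkSta2:" lines do not match because of the colon position.
                total=$((total + 1))
                case "$line" in
                    *"Speed 32GT/s"*"Width x16"*) good=$((good + 1)) ;;
                esac
                ;;
            *DevSta:*)
                # Require a leading space so that "NonFatalErr+" is not
                # mistaken for "FatalErr+".
                case "$line" in
                    *" FatalErr+"*) fatal=$((fatal + 1)) ;;
                esac
                ;;
        esac
    done
    if [ "$total" -eq "$expected" ] && [ "$good" -eq "$expected" ] \
        && [ "$fatal" -eq 0 ]; then
        echo PASS
    else
        echo FAIL
    fi
)
```

On an 8-GPU system the helper could be fed directly from the command in the table: `sudo lspci -d 1002:74a1 -vvv | check_links 8`. Note that a plain `grep FatalErr+` also matches `NonFatalErr+`, which is why the sketch checks for the leading space.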

Table 8 Summary of system validation tests

Test

Command

Pass/Fail criteria

Compute/GPU properties

rvs -c ${RVS_CONF}/gpup_single.conf
  • Pass: All GPUs listed with no errors

  • Fail: Missing GPUs or errors

GPU stress test (GST)

rvs -c ${RVS_CONF}/MI300X/gst_single.conf
  • Pass: met: TRUE in logs

  • Fail: Target GFLOP/s not met

Input energy delay product (IET)

rvs -c ${RVS_CONF}/MI300X/iet_single.conf
  • Pass: met: TRUE for all actions

  • Fail: Otherwise

Memory test (MEM)

rvs -c ${RVS_CONF}/mem.conf -l mem.txt
  • Pass: All tests passed; bandwidth ~2TB/s

  • Fail: Any test failed or low bandwidth

PCIe bandwidth benchmark (PEBB)

rvs -c ${RVS_CONF}/MI300X/pebb_single.conf
  • Pass: All distances and bandwidths displayed

  • Fail: Missing data

PCIe qualification tool (PEQT)

rvs -c ${RVS_CONF}/peqt_single.conf
  • Pass: All actions true

  • Fail: Otherwise

P2P benchmark and qualification tool (PBQT)

rvs -c ${RVS_CONF}/pbqt_single.conf
  • Pass: peers:true lines and non-zero throughput

  • Fail: Otherwise
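
Several rows in Table 8 pass only when every result line reports `met: TRUE`. The following minimal sketch shows one way to check that automatically. `all_targets_met` is a hypothetical helper name, and the only thing it assumes about the RVS log format is that each result line contains the text `met:` followed by the outcome.

```shell
# all_targets_met: hypothetical helper; reads RVS log text on stdin.
# Prints PASS when at least one result line is present and every line that
# contains "met:" reports "met: TRUE". Prints FAIL otherwise.
all_targets_met() (
    met=0
    missed=0
    while IFS= read -r line; do
        case "$line" in
            *"met: TRUE"*) met=$((met + 1)) ;;
            # Any other "met:" line counts as a missed target.
            *"met:"*) missed=$((missed + 1)) ;;
        esac
    done
    if [ "$met" -gt 0 ] && [ "$missed" -eq 0 ]; then
        echo PASS
    else
        echo FAIL
    fi
)
```

Assuming RVS writes its results to standard output, the helper could be used as `rvs -c ${RVS_CONF}/MI300X/gst_single.conf | all_targets_met`. An empty log is treated as a failure so that a test that never ran is not mistaken for a pass.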

Table 9 Summary of performance benchmarking tests

Test

Command

Pass/Fail criteria

TransferBench all-to-all

TransferBench a2a
  • Pass: Greater than or equal to 32.9 GB/s

  • Fail: Otherwise

TransferBench peer-to-peer

TransferBench p2p
  • UniDir pass: Greater than or equal to 33.9 GB/s

  • BiDir pass: Greater than or equal to 43.9 GB/s

  • Fail: Otherwise

TransferBench tests 1 to 6

TransferBench example.cfg
  • Test 1 pass: Greater than or equal to 47.1 GB/s

  • Test 2 pass: Greater than or equal to 48.4 GB/s

  • Test 3 pass: Greater than or equal to 31.9 GB/s (0 to 1) and 38.9 GB/s (1 to 0)

  • Test 4 pass: Greater than or equal to 1264 GB/s

  • Test 5 pass: N/A for GPU validation

  • Test 6 pass: Greater than or equal to 48.6 GB/s

  • Fail: Otherwise

RCCL Allreduce

build/all_reduce_perf -b 8 -e 8G -f 2 -g 8
  • Pass: Greater than or equal to 304 GB/s

  • Fail: Otherwise

rocBLAS FP32 benchmark

rocblas-bench -f gemm \
  -r s -m 4000 \
  --lda 4000 --ldb 4000 --ldc 4000 \
  --transposeA N --transposeB T
  • Pass: Greater than or equal to 94100 GFLOPS (94.1 TFLOPS)

  • Fail: Otherwise

rocBLAS BF16 benchmark

rocblas-bench -f gemm_strided_batched_ex \
  --transposeA N --transposeB T \
  -m 1024 -n 2048 -k 512 \
  --a_type h --lda 1024 --stride_a 4096 \
  --b_type h --ldb 2048 --stride_b 4096 \
  --c_type s --ldc 1024 --stride_c 2097152 \
  --d_type s --ldd 1024 --stride_d 2097152 \
  --compute_type s \
  --alpha 1.1 --beta 1 \
  --batch_count 5
  • Pass: Greater than or equal to 130600 GFLOPS (130.6 TFLOPS)

  • Fail: Otherwise

rocBLAS INT8 benchmark

rocblas-bench -f gemm_strided_batched_ex \
  --transposeA N --transposeB T \
  -m 1024 -n 2048 -k 512 \
  --a_type i8_r --lda 1024 --stride_a 4096 \
  --b_type i8_r --ldb 2048 --stride_b 4096 \
  --c_type i32_r --ldc 1024 --stride_c 2097152 \
  --d_type i32_r --ldd 1024 --stride_d 2097152 \
  --compute_type i32_r \
  --alpha 1.1 --beta 1 \
  --batch_count 5
  • Pass: Greater than or equal to 162700 GFLOPS as reported by rocblas-bench (162.7 TOPS)

  • Fail: Otherwise

BabelStream

mpiexec -n 8 wrapper.sh
  • Copy pass: Greater than or equal to 4,177,285 MB/s

  • Mul pass: Greater than or equal to 4,067,069 MB/s

  • Add pass: Greater than or equal to 3,920,853 MB/s

  • Triad pass: Greater than or equal to 3,885,301 MB/s

  • Dot pass: Greater than or equal to 3,660,781 MB/s

  • Fail: Otherwise
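
Every pass criterion in Table 9 is a "greater than or equal to" comparison against a fixed minimum, so the final decision can be scripted once the measured value has been extracted from the benchmark output. The following is a minimal sketch. `meets_threshold` is a hypothetical helper name, and extracting the measured value from each tool's output is left out because the output formats are not described here.

```shell
# meets_threshold MEASURED MINIMUM: hypothetical helper.
# Prints PASS when MEASURED >= MINIMUM and FAIL otherwise. awk performs the
# comparison because POSIX shell arithmetic handles integers only.
# Both values must be plain numbers: strip thousands separators first
# (for example, pass 4177285 rather than 4,177,285).
meets_threshold() {
    if awk -v measured="$1" -v minimum="$2" \
        'BEGIN { exit !((measured + 0) >= (minimum + 0)) }'; then
        echo PASS
    else
        echo FAIL
    fi
}
```

For example, with an illustrative (not measured) RCCL Allreduce result of 310.2 GB/s, `meets_threshold 310.2 304` prints `PASS`, while `meets_threshold 303.9 304` prints `FAIL`.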