---
myst:
    html_meta:
        "description": "AMD Instinct MI100 acceptance criteria — prerequisites, health checks, system validation, and performance benchmarks for CDNA PCIe GPU platforms."
        "keywords": "MI100, AMD Instinct, CDNA, ROCm, PCIe, Infinity Fabric, system acceptance, validation, benchmarks, HBM2"
---

# AMD Instinct MI100

The AMD Instinct™ MI100 is a data center GPU accelerator in a PCIe® add-in card form factor. This document provides MI100-specific prerequisites, health checks, validation steps, and performance acceptance criteria.

## Overview

The AMD Instinct MI100 introduces the first-generation CDNA architecture in a standard full-height, full-length, dual-slot PCIe® add-in card aimed at HPC and accelerated computing workloads. Each MI100 provides 120 compute units with Matrix Core technology, 32 GB of HBM2 memory at up to 1.2 TB/s, and AMD Infinity Fabric™ link support for direct GPU-to-GPU connectivity in 2- and 4-GPU hive configurations. The card is passively cooled with a 300 W TDP and supports PCIe® Gen4 host connectivity.

The MI100 is built on the CDNA architecture (gfx908). Its Infinity Fabric™ topology supports at most 4 GPUs per hive, so the validation reference configuration for this document is a single 4-GPU MI100 hive with Infinity Fabric™ bridges providing direct GPU-to-GPU connectivity between all peers. Larger deployments are common (for example, dual-socket servers with two 4-GPU hives, for 8 MI100s in total). In those systems, cross-hive traffic traverses the host PCIe fabric, and the per-hive criteria below apply to each hive independently.

- **[MI100 Product Page](https://www.amd.com/en/products/accelerators/instinct/mi100.html)**
- **[MI100 Product Brief](https://www.amd.com/content/dam/amd/en/documents/instinct-business-docs/product-briefs/instinct-mi100-brochure.pdf)**
- **[MI100 Microarchitecture](https://instinct.docs.amd.com/latest/gpu-arch/mi100.html)**

## System requirements

### Operating system support

For the most up-to-date information on supported operating systems and distributions, refer to the official ROCm documentation:

[ROCm System Requirements - Supported Distributions](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-distributions)

```{note}
The [ROCm documentation](https://rocm.docs.amd.com) is the single source of truth for supported versions, distribution compatibility, and required dependencies for the ROCm toolkit.
```

For BIOS, NUMA, and OS-level tuning that applies to all AMD Instinct hosts, see [BIOS settings](../common/bios-settings.md) and [OS tuning](../common/os-tuning.md).

### GPU identification

All MI100 GPUs (PCI vendor:device `1002:738c`) should appear in `lspci` output:

```bash
sudo lspci -d 1002:738c
```

Expected output example (4-GPU MI100 hive):

```text
1d:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Arcturus GL-XL [Instinct MI100] (rev 01)
20:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Arcturus GL-XL [Instinct MI100] (rev 01)
23:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Arcturus GL-XL [Instinct MI100] (rev 01)
26:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Arcturus GL-XL [Instinct MI100] (rev 01)
```
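
The device count can also be checked in a script. The following is a minimal sketch, assuming `lspci` prints one line per device that begins with its PCI address; `count_pci_devices` and `check_gpu_count` are illustrative helper names, not part of ROCm.

```bash
# Sketch: count GPU entries in lspci output read from stdin.
# count_pci_devices and check_gpu_count are hypothetical helper names.

count_pci_devices() {
  # A device line starts with a PCI address such as "1d:00.0" or "0000:1d:00.0".
  # grep -c exits non-zero when the count is 0, so mask that exit status.
  grep -cE '^([0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7] ' || true
}

check_gpu_count() {
  # $1: number of GPUs expected (4 for the single-hive reference configuration).
  local expected="$1" found
  found="$(count_pci_devices)"
  if [ "$found" -eq "$expected" ]; then
    echo "PASS: found $found GPU(s)"
  else
    echo "FAIL: found $found GPU(s), expected $expected"
    return 1
  fi
}
```

For the reference configuration, this could be run as `sudo lspci -d 1002:738c | check_gpu_count 4`. Because the `-d` filter already restricts the output to MI100 devices, the sketch counts device lines rather than matching the marketing name, which may be absent on hosts with an outdated PCI ID database.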

## Acceptance criteria

The MI100 system acceptance process validates that the platform is correctly configured, stable, and performing to expectations. Follow the sequence: Prerequisites → Basic Health Checks → System Validation → Performance Benchmarks.

### System acceptance process

1. **[Prerequisites validation](#prerequisites-validation)** - Ensure all system requirements and dependencies are met
2. **[Basic health checks](#basic-health-checks)** - Verify hardware detection and basic system health
3. **[System validation](#system-validation)** - Conduct comprehensive stress testing and qualification
4. **[Performance benchmarks](#performance-benchmarks)** - Validate compute, memory, and interconnect performance

The system is accepted when all criteria below are successfully validated.

### Prerequisites validation

Ensure all system requirements are met before proceeding with validation. See the [Prerequisites documentation](../common/prerequisites.md) and [System setup](../common/system-setup.md) for more details.

- ✅ Supported operating system version installed
- ✅ Compatible ROCm version installed
- ✅ BIOS configured per [BIOS settings](../common/bios-settings.md), with MI100-specific values per platform vendor
- ✅ Required kernel parameters present: `pci=realloc=off`, `pci=bfsort`, `iommu=pt`, and `amd_iommu=on` (or `intel_iommu=on` on Intel hosts) — see [Kernel Parameters](../common/kernel-parameters.md)
- ✅ At least 256 GB of system memory available
- ✅ Latest applicable firmware applied consistently across nodes
- ✅ ROCm Validation Suite (RVS) installed
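
The kernel parameter item in the checklist above lends itself to a scripted check. The following is a minimal sketch, assuming each parameter appears as its own whitespace-separated token on the kernel command line; `check_cmdline` is a hypothetical helper name.

```bash
# Sketch: verify the required kernel boot arguments in a command-line string.
# check_cmdline is a hypothetical helper name, not part of ROCm.

check_cmdline() {
  # $1: kernel command line, for example the contents of /proc/cmdline.
  local cmdline="$1" missing=0 param
  for param in pci=realloc=off pci=bfsort iommu=pt; do
    # Pad with spaces so only whole tokens match.
    case " $cmdline " in
      *" $param "*) ;;
      *) echo "MISSING: $param"; missing=1 ;;
    esac
  done
  # Either vendor IOMMU flag satisfies the requirement.
  case " $cmdline " in
    *" amd_iommu=on "*|*" intel_iommu=on "*) ;;
    *) echo "MISSING: amd_iommu=on or intel_iommu=on"; missing=1 ;;
  esac
  [ "$missing" -eq 0 ] && echo "PASS: all required kernel parameters present"
  return "$missing"
}
```

On a running host this could be called as `check_cmdline "$(cat /proc/cmdline)"`. The sketch does not recognize combined forms such as a single comma-separated `pci=` option, so a host configured that way would need the match adjusted.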

### Basic health checks

These checks ensure fundamental system health and proper GPU detection. For detailed procedures, see [Health Checks](../common/health-checks.md).

| Test | Command | Pass/Fail criteria |
|------|---------|-------------------|
| [Check OS distribution](../common/health-checks.md#check-os-distribution) | `cat /etc/os-release` | **Pass**: OS version listed in compatibility matrix<br>**Fail**: Otherwise |
| [Check kernel boot arguments](../common/health-checks.md#check-kernel-boot-arguments) | `cat /proc/cmdline` | **Pass**: Contains `pci=realloc=off`, `pci=bfsort`, `iommu=pt`, and `amd_iommu=on` or `intel_iommu=on`<br>**Fail**: Otherwise |
| [Check for driver errors](../common/health-checks.md#check-for-driver-errors) | `sudo dmesg -T \| grep amdgpu \| grep -i error` | **Pass**: No output<br>**Fail**: Errors reported |
| [Check available memory](../common/health-checks.md#check-for-available-system-memory) | `lsmem \| grep "Total online memory"` | **Pass**: ≥ 256G<br>**Fail**: Less than 256G |
| [Check GPU presence](../common/health-checks.md#check-gpu-presence) | `sudo lspci -d 1002:738c` | **Pass**: 4 MI100 GPUs found (per hive)<br>**Fail**: Otherwise |
| [Check GPU link speed and width](../common/health-checks.md#check-gpu-pcie-bus-link-speed-and-width) | `sudo lspci -d 1002:738c -vvv \| grep -e DevSta -e LnkSta` | **Pass**: Speed 16GT/s, width `x16`, no `FatalErr+`<br>**Fail**: Otherwise |
| [Monitor utilization metrics](../common/health-checks.md#monitor-utilization-metrics) | `amd-smi monitor -putm` | **Pass**: Idle metrics within the ranges given in the linked procedure<br>**Fail**: Otherwise |
| [Check system kernel logs for errors](../common/health-checks.md#check-system-kernel-logs) | `sudo dmesg -T \| grep -iE 'error\|warn\|fail\|exception'` | **Pass**: No output<br>**Fail**: Any matching lines |
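
The link speed and width criterion from the table above can be evaluated automatically. The following is a minimal sketch, assuming the `DevSta` and `LnkSta` lines look like those printed by `lspci -vvv`; `check_pcie_links` is a hypothetical helper name. It matches `FatalErr+` only as a whole token, so a `NonFatalErr+` flag is not counted as a fatal error.

```bash
# Sketch: evaluate DevSta/LnkSta lines from `lspci -vvv` read on stdin.
# check_pcie_links is a hypothetical helper name, not part of ROCm.

check_pcie_links() {
  awk '
    # "LnkSta:" with the colon skips the LnkSta2 lines that grep also passes.
    /LnkSta:/ {
      links++
      if ($0 !~ /Speed 16GT\/s/ || $0 !~ /Width x16/) { bad++; print "FAIL: " $0 }
    }
    # Require a leading blank so "NonFatalErr+" does not match.
    /(^|[ \t])FatalErr[+]/ { bad++; print "FAIL: " $0 }
    END {
      if (links == 0) { print "FAIL: no LnkSta lines found"; exit 1 }
      if (bad > 0) exit 1
      print "PASS: " links " link(s) at 16GT/s x16, no fatal errors"
    }
  '
}
```

A possible invocation is `sudo lspci -d 1002:738c -vvv | grep -e DevSta -e LnkSta | check_pcie_links`, which exits non-zero when any link is degraded or any device reports a fatal error.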

### System validation

Comprehensive validation ensures system stability under load. For detailed procedures, see [System Validation](../common/system-validation.md).

| Test | Command | Pass/Fail criteria |
|------|---------|-------------------|
| [Compute/GPU properties](../common/system-validation.md#gpu-properties) | `rvs -c ${RVS_CONF}/gpup_single.conf` | **Pass**: All GPUs listed with no errors<br>**Fail**: Missing GPUs or errors |
| [GPU stress test (GST)](../common/system-validation.md#gpu-stress-test) | `rvs -c ${RVS_CONF}/MI100/gst_single.conf` | **Pass**: `met: TRUE` in logs<br>**Fail**: Target GFLOP/s not met |
| [Input energy delay product (IET)](../common/system-validation.md#input-energy-delay-product) | `rvs -c ${RVS_CONF}/MI100/iet_single.conf` | **Pass**: `met: TRUE` for all actions<br>**Fail**: Otherwise |
| [Memory test (MEM)](../common/system-validation.md#mem) | `rvs -c ${RVS_CONF}/mem.conf -l mem.txt` | **Pass**: All tests passed; bandwidth ≥ 800 GB/s per GPU<br>**Fail**: Any test failed or low bandwidth |
| [PCIe bandwidth benchmark (PEBB)](../common/system-validation.md#pcie-bandwidth-benchmark) | `rvs -c ${RVS_CONF}/MI100/pebb_single.conf` | **Pass**: All distances and bandwidths displayed<br>**Fail**: Missing data |
| [PCIe qualification tool (PEQT)](../common/system-validation.md#pcie-qualification-tool) | `rvs -c ${RVS_CONF}/peqt_single.conf` | **Pass**: All actions true<br>**Fail**: Otherwise |
| [P2P benchmark and qualification tool (PBQT)](../common/system-validation.md#p2p-benchmark-and-qualification-tool) | `rvs -c ${RVS_CONF}/pbqt_single.conf` | **Pass**: `peers:true` lines and non-zero throughput<br>**Fail**: Otherwise |

```{note}
The reference configuration for this document is a single 4-GPU MI100 hive with AMD Infinity Fabric™ bridges installed, so intra-hive PBQT and TransferBench numbers reflect xGMI throughput. On systems without bridges, P2P traffic traverses the host PCIe fabric and the P2P thresholds in this document will not be met.
```
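
For the RVS tests whose pass criterion is `met: TRUE`, the output can be scanned in a script. The following is a minimal sketch; `check_rvs_met` is a hypothetical helper name, and the `met: FALSE` string used to detect failures is an assumption about the RVS output format that should be confirmed against real logs.

```bash
# Sketch: scan RVS output on stdin for "met:" results.
# check_rvs_met is a hypothetical helper name, not part of RVS.
# Assumption: failed targets are reported as "met: FALSE".

check_rvs_met() {
  awk '
    /met: TRUE/  { ok++ }
    /met: FALSE/ { bad++ }
    END {
      if (bad > 0) { print "FAIL: " bad " result(s) reported met: FALSE"; exit 1 }
      if (ok == 0) { print "FAIL: no met: TRUE results found"; exit 1 }
      print "PASS: " ok " result(s) reported met: TRUE"
    }
  '
}
```

For example, `rvs -c ${RVS_CONF}/MI100/gst_single.conf | check_rvs_met` would fail both when a target is missed and when the run produces no results at all.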

### Performance benchmarks

Performance validation ensures the system meets MI100 specifications. For detailed procedures, see [Performance Benchmarking](../common/system-validation.md#performance-benchmarking).

:::{card} Command: `TransferBench a2a`
[TransferBench all-to-all](../common/system-validation.md#transferbench)
^^^
**Pass:** ≥ 270 GB/s aggregate
+++
**Fail:** otherwise
:::

:::{card} Command: `TransferBench p2p`
[TransferBench peer-to-peer](../common/system-validation.md#transferbench)
^^^

| Test | Pass criteria |
|------|--------------|
| UniDir | ≥ 30 GB/s |
| BiDir | ≥ 57 GB/s |

+++
**Fail:** otherwise
:::

:::{card} Command: `build/all_reduce_perf -b 8 -e 8G -f 2 -g 4`
[RCCL Allreduce](../common/system-validation.md#rccl-allreduce)
^^^
**Pass:** ≥ 72 GB/s busbw (peak, at 8 GiB message size)
+++
**Fail:** otherwise
:::

:::{card} Command: `rocblas-bench` (see code block below)
[rocBLAS FP32](../common/system-validation.md#rocblas-gemm-benchmarks)
^^^

```bash
rocblas-bench -f gemm \
  -r s -m 4000 -n 4000 -k 4000 \
  --lda 4000 --ldb 4000 --ldc 4000 \
  --transposeA N --transposeB T
```

**Pass:** ≥ 28 TFLOPS per GPU
+++
**Fail:** otherwise
:::

:::{card} Command: `mpiexec -n 4 wrapper.sh`
[BabelStream](../common/system-validation.md#babelstream)
^^^

| Kernel | Threshold (MB/s) |
|--------|-----------------|
| Copy  | ≥ 940,000 |
| Mul   | ≥ 940,000 |
| Add   | ≥ 910,000 |
| Triad | ≥ 910,000 |
| Dot   | ≥ 950,000 |

+++
**Fail:** otherwise
:::
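
The BabelStream thresholds above can be applied to benchmark output in a script. The following is a minimal sketch, assuming each result line starts with the kernel name followed by the bandwidth in MB/s, as in BabelStream's default summary table; `check_babelstream` is a hypothetical helper name. When several ranks print results, every matching line is checked.

```bash
# Sketch: compare BabelStream results on stdin with the MI100 thresholds.
# check_babelstream is a hypothetical helper name, not part of BabelStream.
# Assumption: result lines look like "Copy  <MBytes/sec>  <min>  <max>  <avg>".

check_babelstream() {
  awk '
    BEGIN {
      # Per-kernel minimum bandwidth in MB/s, taken from the table above.
      min["Copy"] = 940000; min["Mul"]   = 940000
      min["Add"]  = 910000; min["Triad"] = 910000
      min["Dot"]  = 950000
    }
    # Only lines whose first field is a known kernel and second is a number.
    ($1 in min) && ($2 + 0 > 0) {
      seen[$1] = 1
      if ($2 + 0 < min[$1]) {
        bad++
        printf "FAIL: %s %.0f MB/s < %d MB/s\n", $1, $2, min[$1]
      }
    }
    END {
      n = split("Copy Mul Add Triad Dot", k, " ")
      for (i = 1; i <= n; i++)
        if (!(k[i] in seen)) { bad++; print "FAIL: " k[i] " not found" }
      if (bad > 0) exit 1
      print "PASS: all BabelStream kernels meet MI100 thresholds"
    }
  '
}
```

A possible invocation is `mpiexec -n 4 wrapper.sh | check_babelstream`. A kernel that is missing from the output is treated as a failure rather than silently skipped.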
