---
myst:
    html_meta:
        "description": "AMD Instinct MI250 acceptance criteria — prerequisites, health checks, system validation, and performance benchmarks for CDNA 2 OAM GPU platforms."
        "keywords": "MI250, AMD Instinct, CDNA 2, OAM, ROCm, xGMI, Infinity Fabric, system acceptance, validation, benchmarks, HBM2e"
---

# AMD Instinct MI250 / MI250X

The AMD Instinct™ MI250 is a data-center OAM-form-factor GPU. This document provides MI250-specific prerequisites, health checks, validation steps, and performance acceptance criteria. It also applies to the AMD Instinct™ MI250X, which shares the same CDNA 2 (gfx90a) OAM platform and acceptance criteria; MI250X-specific differences are noted inline.

## Overview

The AMD Instinct MI250 brings the second-generation CDNA architecture to an OCP Accelerator Module (OAM) form factor purpose-built for HPC and large-scale AI training. Each MI250 packages two Graphics Compute Dies (GCDs) in a single OAM. Each GCD provides 104 CUs (110 on the MI250X) with Matrix Core technology and 64 GB of HBM2e memory at up to 1.6 TB/s, for a combined 128 GB and 3.2 TB/s per OAM. The two GCDs on an OAM are linked by a high-bandwidth on-package AMD Infinity Fabric™ interconnect, and each OAM exposes additional xGMI ports for direct GPU-to-GPU connectivity across a 4-OAM all-to-all mesh. A typical qualified configuration hosts 4 MI250 OAMs (8 GCDs total) per node.

ROCm tools enumerate each GCD as an independent GPU (architecture target gfx90a), so a 4-OAM node reports 8 devices. GCDs are connected to each other through AMD Infinity Fabric™ (xGMI). On the MI250 platforms covered here, each GCD attaches to the host CPUs over a PCIe Gen 4 x16 link, which is the link verified in [Basic health checks](#basic-health-checks).

The MI250X is the higher-performance variant of the same CDNA 2 (gfx90a) OAM platform and is validated using the criteria in this document. It powers large supercomputers such as Frontier and LUMI. MI250X reference deployments commonly use an 8-OAM (16-GCD) node topology; scale the per-node GCD counts in the commands below accordingly (for example, `-g 16` for RCCL and `mpiexec -n 16` for BabelStream on an 8-OAM node). MI250X also shares the MI250 PCI vendor:device ID (`1002:740c`).
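
The scaling rule above is simple arithmetic: two GCDs per OAM. The following is a minimal sketch; the helper name `gcds_per_node` is a hypothetical convenience, not part of ROCm or any AMD tool.

```bash
# Hypothetical helper: each MI250/MI250X OAM exposes two GCDs,
# so the per-node GCD count is twice the OAM count.
gcds_per_node() {
  echo $(( $1 * 2 ))
}
```

For example, `gcds_per_node 8` gives the value to pass as `-g` to `all_reduce_perf` or as `-n` to `mpiexec` on an 8-OAM node.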

- **[MI250 Product Page](https://www.amd.com/en/products/accelerators/instinct/mi200/mi250.html)**
- **[MI250X Product Page](https://www.amd.com/en/products/accelerators/instinct/mi200/mi250x.html)**
- **[MI200 Series Microarchitecture](https://instinct.docs.amd.com/latest/gpu-arch/mi250.html)**
- **[MI200 Series Data Sheet](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instinct-mi200-datasheet.pdf)**

## System requirements

### Operating system support

For the most up-to-date information on supported operating systems and distributions, refer to the official ROCm documentation:

[ROCm System Requirements - Supported Distributions](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-distributions)

```{note}
[ROCm docs](https://rocm.docs.amd.com) is the single source of truth for supported versions, distribution compatibility, and required dependencies for the ROCm toolkit.
```

For BIOS, NUMA, and OS-level tuning that applies to all AMD Instinct hosts, see [BIOS settings](../common/bios-settings.md) and [OS tuning](../common/os-tuning.md).

### GPU identification

All MI250 GCDs (PCI vendor:device `1002:740c`) should appear in `lspci` output. On a fully populated 4-OAM MI250 platform you should see 8 GCD entries (2 per OAM):

```bash
sudo lspci -d 1002:740c
```

Expected output example:

```text
0000:11:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
0000:14:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
0000:32:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
0000:35:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
0000:8e:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
0000:93:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
0000:ae:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
0000:b3:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
```

The 8 GCDs are paired by OAM (e.g. `11`+`14` are the two GCDs on one OAM, `32`+`35` on the next, and so on). Same-OAM GCD pairs are connected by a high-bandwidth on-package link, while cross-OAM connectivity uses external xGMI ports in a 4-OAM all-to-all mesh.
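
The count check can be scripted. The following is a minimal sketch, assuming one line of `lspci` output per GCD; the helper names `count_gcds` and `check_gcd_count` are hypothetical, and the expected count defaults to the 8 GCDs of a 4-OAM node.

```bash
# Hypothetical helper: count GCD entries in `lspci -d 1002:740c` output
# read from stdin. Each GCD is one PCI function, so each non-empty line
# is counted as one GCD.
count_gcds() {
  grep -c '[^[:space:]]' || true
}

# Hypothetical helper: succeed only when the observed GCD count matches
# the expected count (first argument, default 8).
check_gcd_count() {
  expected="${1:-8}"
  found="$(count_gcds)"
  echo "found ${found} GCDs, expected ${expected}"
  [ "${found}" -eq "${expected}" ]
}
```

A possible invocation is `sudo lspci -d 1002:740c | check_gcd_count 8`; pass `16` on an 8-OAM node.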

## Acceptance criteria

The MI250 system acceptance process validates that the platform is correctly configured, stable, and performing to expectations. Follow the sequence: Prerequisites → Basic Health Checks → System Validation → Performance Benchmarks.

### System acceptance process

1. **[Prerequisites validation](#prerequisites-validation)** - Ensure all system requirements and dependencies are met
2. **[Basic health checks](#basic-health-checks)** - Verify hardware detection and basic system health
3. **[System validation](#system-validation)** - Conduct comprehensive stress testing and qualification
4. **[Performance benchmarks](#performance-benchmarks)** - Validate compute, memory, and interconnect performance

The system is accepted when all criteria below are successfully validated.

### Prerequisites validation

Ensure all system requirements are met before proceeding with validation. See the [Prerequisites documentation](../common/prerequisites.md) and [System setup](../common/system-setup.md) for more details.

- ✅ Supported operating system version installed
- ✅ Compatible ROCm version installed (verify: `cat /opt/rocm/.info/version`); see the [ROCm System Requirements](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html) for the current supported version matrix
- ✅ BIOS configured per [BIOS settings](../common/bios-settings.md), with MI250-specific values per platform vendor
- ✅ Required kernel parameters present: `pci=realloc=off iommu=pt`
- ✅ Minimum 1 TB of system memory available (reported as `1T` by `lsmem`)
- ✅ Latest applicable firmware applied consistently across nodes
- ✅ ROCm Validation Suite (RVS) installed
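
The kernel parameter item in the checklist above can be verified with a whole-word match, which avoids false positives such as `amd_iommu=pt` satisfying a search for `iommu=pt`. This is a minimal sketch; the helper name `has_boot_args` is hypothetical.

```bash
# Hypothetical helper: verify that every required parameter appears as a
# whole word in a kernel command line string.
# Usage: has_boot_args "<cmdline>" <param> [<param> ...]
has_boot_args() {
  cmdline="$1"; shift
  missing=0
  for arg in "$@"; do
    case " ${cmdline} " in
      *" ${arg} "*) ;;  # parameter present as a separate word
      *) echo "missing kernel parameter: ${arg}"; missing=1 ;;
    esac
  done
  return "${missing}"
}
```

On a live system this could be called as `has_boot_args "$(cat /proc/cmdline)" pci=realloc=off iommu=pt`.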

### Basic health checks

These checks ensure fundamental system health and proper GPU detection. For detailed procedures, see [Health Checks](../common/health-checks.md).

| Test | Command | Pass/Fail criteria |
|------|---------|-------------------|
| [Check OS distribution](../common/health-checks.md#check-os-distribution) | `cat /etc/os-release` | **Pass**: OS version listed in compatibility matrix<br>**Fail**: Otherwise |
| [Check kernel boot arguments](../common/health-checks.md#check-kernel-boot-arguments) | `cat /proc/cmdline` | **Pass**: Contains `pci=realloc=off iommu=pt`<br>**Fail**: Otherwise |
| [Check for driver errors](../common/health-checks.md#check-for-driver-errors) | `sudo dmesg -T \| grep amdgpu \| grep -i error` | **Pass**: No output<br>**Fail**: Errors reported |
| [Check available memory](../common/health-checks.md#check-for-available-system-memory) | `lsmem \| grep "Total online memory"` | **Pass**: ≥ 1T<br>**Fail**: Less than 1T |
| [Check GPU presence](../common/health-checks.md#check-gpu-presence) | `sudo lspci -d 1002:740c` | **Pass**: 8 MI250 GCDs found<br>**Fail**: Otherwise |
| [Check GPU link speed and width](../common/health-checks.md#check-gpu-pcie-bus-link-speed-and-width) | `sudo lspci -d 1002:740c -vvv \| grep -e DevSta -e LnkSta` | **Pass**: Speed PCIe Gen 4 (16 GT/s), width `x16`, no `FatalErr+`<br>**Fail**: Otherwise |
| [Monitor utilization metrics](../common/health-checks.md#monitor-utilization-metrics) | `amd-smi monitor -putm` | **Pass**: Idle metrics as specified<br>**Fail**: Otherwise |
| [Check system kernel logs for errors](../common/health-checks.md#check-system-kernel-logs) | `sudo dmesg -T \| grep -Ei 'error\|warn\|fail\|exception'` | **Pass**: No output<br>**Fail**: Otherwise |
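
The link speed and width criterion can be checked mechanically rather than by eye. The sketch below assumes the `LnkSta:` and `DevSta:` line layout printed by `lspci -vvv`; the helper name `check_pcie_links` is hypothetical. It ignores `LnkSta2:` lines and does not treat `NonFatalErr+` as a fatal error.

```bash
# Hypothetical helper: read `lspci -vvv | grep -e DevSta -e LnkSta` output
# on stdin and fail on any link below 16 GT/s x16 or any FatalErr+ flag.
check_pcie_links() {
  awk '
    /LnkSta:/ {
      links++
      if ($0 !~ /Speed 16GT\/s/ || $0 !~ /Width x16/) {
        print "degraded link: " $0; bad = 1
      }
    }
    # Require whitespace before FatalErr so NonFatalErr+ does not match.
    /DevSta:/ && /(^|[ \t])FatalErr\+/ {
      print "fatal error flagged: " $0; bad = 1
    }
    END {
      if (links == 0) { print "no LnkSta lines found"; bad = 1 }
      exit (bad ? 1 : 0)
    }
  '
}
```

A possible invocation is `sudo lspci -d 1002:740c -vvv | grep -e DevSta -e LnkSta | check_pcie_links`, which prints nothing and returns 0 when every GCD link passes.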

### System validation

Comprehensive validation ensures system stability under load. For detailed procedures, see [System Validation](../common/system-validation.md).

| Test | Command | Pass/Fail criteria |
|------|---------|-------------------|
| [Compute/GPU properties](../common/system-validation.md#gpu-properties) | `rvs -c ${RVS_CONF}/gpup_single.conf` | **Pass**: All GCDs listed with no errors<br>**Fail**: Missing GCDs or errors |
| [GPU stress test (GST)](../common/system-validation.md#gpu-stress-test) | `rvs -c ${RVS_CONF}/MI250/gst_single.conf` | **Pass**: `met: TRUE` in logs<br>**Fail**: Target GFLOP/s not met |
| [Input energy delay product (IET)](../common/system-validation.md#input-energy-delay-product) | `rvs -c ${RVS_CONF}/MI250/iet_single.conf` | **Pass**: `met: TRUE` for all actions<br>**Fail**: Otherwise |
| [Memory test (MEM)](../common/system-validation.md#mem) | `rvs -c ${RVS_CONF}/mem.conf -l mem.txt` | **Pass**: All tests passed; bandwidth ≥ 1050 GB/s per GCD<br>**Fail**: Any test failed or low bandwidth |
| [PCIe bandwidth benchmark (PEBB)](../common/system-validation.md#pcie-bandwidth-benchmark) | `rvs -c ${RVS_CONF}/MI250/pebb_single.conf` | **Pass**: All distances and bandwidths displayed<br>**Fail**: Missing data |
| [PCIe qualification tool (PEQT)](../common/system-validation.md#pcie-qualification-tool) | `rvs -c ${RVS_CONF}/peqt_single.conf` | **Pass**: All actions true<br>**Fail**: Otherwise |
| [P2P benchmark and qualification tool (PBQT)](../common/system-validation.md#p2p-benchmark-and-qualification-tool) | `rvs -c ${RVS_CONF}/pbqt_single.conf` | **Pass**: `peers:true` lines and non-zero throughput across all xGMI peers<br>**Fail**: Otherwise |
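
For the tests whose pass criterion is `met: TRUE`, a captured log can be scanned automatically. This is a sketch under two assumptions: that the string `met: TRUE` appears literally in the RVS output as described above, and that unmet targets are reported as `met: FALSE`. The helper name `check_rvs_met` is hypothetical.

```bash
# Hypothetical helper: read an RVS log on stdin and report PASS only when
# at least one "met: TRUE" line is present and no "met: FALSE" line is.
# Assumption: unmet targets are logged with the literal text "met: FALSE".
check_rvs_met() {
  log="$(cat)"
  if printf '%s\n' "${log}" | grep -q 'met: FALSE'; then
    echo "FAIL: at least one target not met"
    return 1
  fi
  if ! printf '%s\n' "${log}" | grep -q 'met: TRUE'; then
    echo "FAIL: no 'met: TRUE' line found"
    return 1
  fi
  echo "PASS"
}
```

For example, `rvs -c ${RVS_CONF}/MI250/gst_single.conf | check_rvs_met` would print `PASS` or a failure reason. Confirm the exact log strings against your RVS version before relying on this.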

### Performance benchmarks

Performance validation ensures the system meets MI250 specifications. For detailed procedures, see [Performance Benchmarking](../common/system-validation.md#performance-benchmarking).

:::{card} Command: `TransferBench a2a`
[TransferBench all-to-all](../common/system-validation.md#transferbench)
^^^
**Pass:** ≥ 800 GB/s aggregate
+++
**Fail:** otherwise
:::

:::{card} Command: `TransferBench p2p`
[TransferBench peer-to-peer](../common/system-validation.md#transferbench)
^^^

| Test | Pass criteria |
|------|--------------|
| UniDir | ≥ 30 GB/s |
| BiDir | ≥ 55 GB/s |

+++
**Fail:** otherwise
:::

:::{card} Command: `build/all_reduce_perf -b 8 -e 8G -f 2 -g 8`
[RCCL Allreduce](../common/system-validation.md#rccl-allreduce)
^^^
**Pass:** ≥ 125 GB/s busbw (peak, at 8 GiB message size)
+++
**Fail:** otherwise
:::
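
The peak bus bandwidth can be extracted from the benchmark output instead of read manually. The sketch below assumes the 13-column result rows used by `rccl-tests` (out-of-place `busbw` in column 8, in-place `busbw` in column 12); the helper name `peak_busbw` is hypothetical.

```bash
# Hypothetical helper: read all_reduce_perf output on stdin and print the
# highest busbw (GB/s) seen in either the out-of-place or in-place columns.
# Header and summary lines start with '#', so only numeric rows are parsed.
peak_busbw() {
  awk '$1 ~ /^[0-9]+$/ && NF >= 12 {
         if ($8 + 0 > peak)  peak = $8 + 0
         if ($12 + 0 > peak) peak = $12 + 0
       }
       END { printf "%.2f\n", peak + 0 }'
}
```

Compare the printed value against the 125 GB/s threshold for an 8-GCD node.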

:::{card} Command: `rocblas-bench` (see code block below)
[rocBLAS FP32](../common/system-validation.md#rocblas-gemm-benchmarks)
^^^

```bash
rocblas-bench -f gemm \
  -r s -m 4000 -n 4000 -k 4000 \
  --lda 4000 --ldb 4000 --ldc 4000 \
  --transposeA N --transposeB T
```

**Pass:** ≥ 30 TFLOPS per GCD
+++
**Fail:** otherwise
:::
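
To relate a measured GEMM time to the TFLOPS threshold, recall that a GEMM of size m x n x k performs 2·m·n·k floating-point operations. The sketch below converts a time in microseconds (the unit `rocblas-bench` reports alongside its GFLOPS figure) to TFLOPS; the helper names `gemm_tflops` and `meets_tflops` are hypothetical.

```bash
# Hypothetical helper: TFLOPS for an m x n x k GEMM that took `us` microseconds.
# FLOP count is 2*m*n*k; FLOP per microsecond divided by 1e6 gives TFLOPS.
# Usage: gemm_tflops <m> <n> <k> <us>
gemm_tflops() {
  awk -v m="$1" -v n="$2" -v k="$3" -v us="$4" \
    'BEGIN { printf "%.2f\n", (2 * m * n * k) / (us * 1e6) }'
}

# Hypothetical helper: succeed when <value> is at least <minimum>.
meets_tflops() {
  awk -v v="$1" -v min="$2" 'BEGIN { exit !(v + 0 >= min + 0) }'
}
```

For the 4000 x 4000 x 4000 case above, a run time of 4000 µs corresponds to 32 TFLOPS, so any time at or below roughly 4266 µs meets the 30 TFLOPS criterion.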

:::{card} Command: `mpiexec -n 8 wrapper.sh`
[BabelStream](../common/system-validation.md#babelstream)
^^^

| Kernel | Threshold (MB/s) |
|--------|-----------------|
| Copy  | ≥ 1,200,000 |
| Mul   | ≥ 1,200,000 |
| Add   | ≥ 1,100,000 |
| Triad | ≥ 1,100,000 |
| Dot   | ≥ 1,200,000 |

+++
**Fail:** otherwise
:::
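
The BabelStream thresholds can be applied to captured output with a short filter. This sketch assumes BabelStream's standard result table, where each row starts with the kernel name followed by the bandwidth in MB/s; the helper name `check_babelstream` is hypothetical, and the minimums are copied from the table above.

```bash
# Hypothetical helper: read BabelStream output on stdin and fail when any
# kernel result is below its threshold or when a kernel has no result.
check_babelstream() {
  awk '
    BEGIN {
      min["Copy"] = 1200000; min["Mul"] = 1200000; min["Add"] = 1100000
      min["Triad"] = 1100000; min["Dot"] = 1200000
    }
    # Result rows: kernel name in column 1, MB/s in column 2.
    ($1 in min) && ($2 ~ /^[0-9.]+$/) {
      seen[$1]++
      if ($2 + 0 < min[$1]) {
        printf "FAIL %s: %s MB/s < %d\n", $1, $2, min[$1]; bad = 1
      }
    }
    END {
      for (k in min) if (!(k in seen)) {
        printf "FAIL %s: no result found\n", k; bad = 1
      }
      exit (bad ? 1 : 0)
    }
  '
}
```

With `mpiexec -n 8 wrapper.sh | check_babelstream`, every rank's rows are checked, so a single slow GCD is enough to fail the run.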
