GPU Monitoring

AMD SMI: GPU Monitoring

Functions

amdsmi_status_t amdsmi_get_gpu_activity (amdsmi_processor_handle processor_handle, amdsmi_engine_usage_t *info)
 Returns the current usage of the GPU engines (GFX, MM and MEM). Each usage is reported as a percentage from 0-100%.
 
amdsmi_status_t amdsmi_get_power_info (amdsmi_processor_handle processor_handle, uint32_t sensor_ind, amdsmi_power_info_t *info)
 Returns the current power and voltage of the GPU.
 
amdsmi_status_t amdsmi_set_power_cap (amdsmi_processor_handle processor_handle, uint32_t sensor_ind, uint64_t cap)
 Sets the GPU power cap.
 
amdsmi_status_t amdsmi_is_gpu_power_management_enabled (amdsmi_processor_handle processor_handle, bool *enabled)
 Returns whether power management is enabled.
 
amdsmi_status_t amdsmi_get_clock_info (amdsmi_processor_handle processor_handle, amdsmi_clk_type_t clk_type, amdsmi_clk_info_t *info)
 Returns the measurements of the clocks in the GPU for the GFX and multimedia engines and memory. This call reports averages over 1 s, in MHz. For clk_type AMDSMI_CLK_TYPE_GFX, cur_clk can be larger than max_clk in some cases; this is expected, due to the decoupled nature of the master and slave oscillators in the DFLL. clk_locked is supported only for AMDSMI_CLK_TYPE_GFX.
 
amdsmi_status_t amdsmi_get_temp_metric (amdsmi_processor_handle processor_handle, amdsmi_temperature_type_t sensor_type, amdsmi_temperature_metric_t metric, int64_t *temperature)
 Returns temperature measurements of the GPU. The results are in °C.
 
amdsmi_status_t amdsmi_get_gpu_cache_info (amdsmi_processor_handle processor_handle, amdsmi_gpu_cache_info_t *info)
 Returns GPU cache info.
 
amdsmi_status_t amdsmi_get_gpu_metrics (amdsmi_processor_handle processor_handle, uint32_t *metrics_size, amdsmi_metric_t *metrics)
 Returns metrics information.
 
amdsmi_status_t amdsmi_get_soc_pstate (amdsmi_processor_handle processor_handle, amdsmi_dpm_policy_t *policy)
 Returns the soc pstate policy for the processor.
 
amdsmi_status_t amdsmi_set_soc_pstate (amdsmi_processor_handle processor_handle, uint32_t policy_id)
 Sets the soc pstate policy for the processor.
 

Function Documentation

◆ amdsmi_get_gpu_activity()

amdsmi_status_t amdsmi_get_gpu_activity ( amdsmi_processor_handle  processor_handle,
amdsmi_engine_usage_t *  info
)

Returns the current usage of the GPU engines (GFX, MM and MEM). Each usage is reported as a percentage from 0-100%.

Parameters
[in] processor_handle: PF of the processor to query.
[out] info: Reference to the GPU engine usage structure. Must be allocated by the user.
Returns
amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
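
The functions on this page share one calling convention: a status code as the return value and a caller-allocated output structure. The sketch below illustrates that convention without linking against the library. `demo_status_t`, `demo_engine_usage_t`, its field names, and the stub `demo_get_gpu_activity()` are hypothetical stand-ins, not the real amdsmi definitions, and the values the stub writes are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; NOT the real amdsmi types or constants. */
typedef int demo_status_t;
#define DEMO_STATUS_SUCCESS 0
#define DEMO_STATUS_INVAL   1

typedef struct {
    uint32_t gfx_pct; /* GFX engine usage, 0-100 */
    uint32_t mm_pct;  /* multimedia engine usage, 0-100 */
    uint32_t mem_pct; /* memory engine usage, 0-100 */
} demo_engine_usage_t;

/* Stub shaped like amdsmi_get_gpu_activity(): the caller owns `info`. */
static demo_status_t demo_get_gpu_activity(const void *handle,
                                           demo_engine_usage_t *info)
{
    if (handle == NULL || info == NULL)
        return DEMO_STATUS_INVAL;
    info->gfx_pct = 42; /* made-up sample values */
    info->mm_pct  = 0;
    info->mem_pct = 17;
    return DEMO_STATUS_SUCCESS;
}

/* Returns the busiest engine's usage in percent, or -1 if the query failed. */
static int demo_peak_usage(const void *handle)
{
    demo_engine_usage_t usage = {0, 0, 0}; /* allocated by the caller */

    if (demo_get_gpu_activity(handle, &usage) != DEMO_STATUS_SUCCESS)
        return -1; /* any non-zero status is treated as failure */

    uint32_t peak = usage.gfx_pct;
    if (usage.mm_pct > peak)
        peak = usage.mm_pct;
    if (usage.mem_pct > peak)
        peak = usage.mem_pct;

    assert(peak <= 100); /* documented range is 0-100% */
    return (int)peak;
}
```

Checking the status before reading the output structure matters because the contents of the structure are only meaningful after a successful call.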

◆ amdsmi_get_power_info()

amdsmi_status_t amdsmi_get_power_info ( amdsmi_processor_handle  processor_handle,
uint32_t  sensor_ind,
amdsmi_power_info_t *  info
)

Returns the current power and voltage of the GPU.

Note
The amdsmi_power_info_t::socket_power metric can, in rare cases, spike above the socket power limit.
Parameters
[in] processor_handle: PF of the processor to query.
[in] sensor_ind: A 0-based sensor index. Normally, this will be 0. If a processor has more than one sensor, it could be greater than 0. Parameter sensor_ind is unused on the host platform.
[out] info: Reference to the GPU power structure. Must be allocated by the user.
Returns
amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure

◆ amdsmi_set_power_cap()

amdsmi_status_t amdsmi_set_power_cap ( amdsmi_processor_handle  processor_handle,
uint32_t  sensor_ind,
uint64_t  cap 
)

Sets GPU power cap.

Platform: host, gpu_bm_linux

Set the power cap to the provided value cap. cap must be between the minimum (min_power_cap) and maximum (max_power_cap) power cap values, which can be obtained from amdsmi_power_cap_info_t.

Parameters
[in] processor_handle: Processor handle.
[in] sensor_ind: A 0-based sensor index. Normally, this will be 0. If a processor has more than one sensor, it could be greater than 0. Parameter sensor_ind is unused on the host platform.
[in] cap: Value representing the power cap to set.
Returns
amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
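
Because cap has to lie between min_power_cap and max_power_cap, it can help to validate or clamp a requested value before calling amdsmi_set_power_cap(). The following is a minimal sketch of that check. `demo_power_cap_range_t` is a hypothetical stand-in that carries only the two range fields named above, not the real amdsmi_power_cap_info_t, and treating both bounds as inclusive is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the two range fields named in the text. */
typedef struct {
    uint64_t min_power_cap;
    uint64_t max_power_cap;
} demo_power_cap_range_t;

/* True if `cap` lies within [min_power_cap, max_power_cap] (bounds assumed inclusive). */
static bool demo_power_cap_is_valid(const demo_power_cap_range_t *range,
                                    uint64_t cap)
{
    assert(range != NULL);
    if (range->min_power_cap > range->max_power_cap)
        return false; /* inconsistent range: reject everything */
    return cap >= range->min_power_cap && cap <= range->max_power_cap;
}

/* Returns `requested` pulled into the allowed range. */
static uint64_t demo_power_cap_clamp(const demo_power_cap_range_t *range,
                                     uint64_t requested)
{
    assert(range != NULL);
    if (requested < range->min_power_cap)
        return range->min_power_cap;
    if (requested > range->max_power_cap)
        return range->max_power_cap;
    return requested;
}
```

Rejecting an out-of-range value is usually the safer default; clamping silently changes what the caller asked for, so it fits best where the adjusted value is reported back.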

◆ amdsmi_is_gpu_power_management_enabled()

amdsmi_status_t amdsmi_is_gpu_power_management_enabled ( amdsmi_processor_handle  processor_handle,
bool *  enabled 
)

Returns whether power management is enabled.

Parameters
[in] processor_handle: PF of the processor to query.
[out] enabled: Reference to a bool. Must be allocated by the user.
Returns
amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure

◆ amdsmi_get_clock_info()

amdsmi_status_t amdsmi_get_clock_info ( amdsmi_processor_handle  processor_handle,
amdsmi_clk_type_t  clk_type,
amdsmi_clk_info_t *  info
)

Returns the measurements of the clocks in the GPU for the GFX and multimedia engines and memory. This call reports averages over 1 s, in MHz. For clk_type AMDSMI_CLK_TYPE_GFX, cur_clk can be larger than max_clk in some cases; this is expected, due to the decoupled nature of the master and slave oscillators in the DFLL. clk_locked is supported only for AMDSMI_CLK_TYPE_GFX.

Parameters
[in] processor_handle: PF of the processor to query.
[in] clk_type: Enum representing the clock type to query.
[out] info: Reference to the GPU clock structure. Must be allocated by the user.
Returns
amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
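
Since cur_clk can exceed max_clk for the GFX clock, code that derives values from the two should not assume cur_clk <= max_clk. The sketch below shows one way to handle that with unsigned arithmetic. The helper names are hypothetical, and passing the clocks as plain `uint32_t` MHz values (rather than reading them from amdsmi_clk_info_t) is an assumption made to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Headroom in MHz below max_clk; 0 when cur_clk is at or above max_clk.
 * A plain `max_clk - cur_clk` would wrap around for unsigned operands. */
static uint32_t demo_clock_headroom_mhz(uint32_t cur_clk, uint32_t max_clk)
{
    uint32_t headroom = (cur_clk >= max_clk) ? 0u : max_clk - cur_clk;
    assert(headroom <= max_clk); /* holds in both branches */
    return headroom;
}

/* cur_clk as a percentage of max_clk. The result may exceed 100 and
 * saturates at UINT32_MAX; returns 0 when max_clk is 0. */
static uint32_t demo_clock_percent_of_max(uint32_t cur_clk, uint32_t max_clk)
{
    if (max_clk == 0)
        return 0; /* avoid division by zero */
    uint64_t pct = ((uint64_t)cur_clk * 100u) / max_clk;
    return (pct > UINT32_MAX) ? UINT32_MAX : (uint32_t)pct;
}
```

For instance, a current clock of 2100 MHz against a maximum of 2000 MHz yields a headroom of 0 and a percentage of 105 rather than a wrapped-around value.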

◆ amdsmi_get_temp_metric()

amdsmi_status_t amdsmi_get_temp_metric ( amdsmi_processor_handle  processor_handle,
amdsmi_temperature_type_t  sensor_type,
amdsmi_temperature_metric_t  metric,
int64_t *  temperature 
)

Returns temperature measurements of the GPU. The results are in °C.

Parameters
[in] processor_handle: PF of the processor to query.
[in] sensor_type: Enum representing the GPU sensor to query.
[in] metric: Enum representing the temperature metric to query (min, max, etc.).
[out] temperature: Reference to the current, limit, or shutdown temperature measured, depending on the metric parameter. The current temperature is obtained with metric=AMDSMI_TEMP_CURRENT; the limit temperature is obtained with metric=AMDSMI_TEMP_CRITICAL. Must be allocated by the user.
Returns
amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure

◆ amdsmi_get_gpu_cache_info()

amdsmi_status_t amdsmi_get_gpu_cache_info ( amdsmi_processor_handle  processor_handle,
amdsmi_gpu_cache_info_t *  info
)

Returns GPU cache info.

Parameters
[in] processor_handle: PF of the processor to query.
[out] info: Reference to the cache info structure. Must be allocated by the user.
Returns
amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure

◆ amdsmi_get_gpu_metrics()

amdsmi_status_t amdsmi_get_gpu_metrics ( amdsmi_processor_handle  processor_handle,
uint32_t *  metrics_size,
amdsmi_metric_t *  metrics
)

Returns metrics information.

Parameters
[in] processor_handle: PF of the processor to query.
[in,out] metrics_size: As input, the size of the provided buffer. As output, the number of metrics in the buffer. Must be allocated by the user.
[out] metrics: Reference to the list of metrics returned by the library. The buffer must be allocated by the user.
Returns
amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
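
The in/out role of metrics_size is easy to get wrong, so here is a minimal sketch of the pattern against a mock. `demo_metric_t`, its `val` field, and `demo_get_metrics()` are hypothetical stand-ins, not the real amdsmi definitions. The sketch assumes the input size is counted in elements and that the mock truncates when the buffer is too small; the text above does not state how the real library behaves in either respect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in; NOT the real amdsmi_metric_t. */
typedef struct {
    uint64_t val;
} demo_metric_t;

#define DEMO_AVAILABLE 3u /* number of metrics the mock can report */
#define DEMO_CAPACITY  8u /* size of the caller's buffer, in elements */

/* Mock following the documented convention: *metrics_size is the buffer
 * capacity on input and the number of entries written on output.
 * Truncating to the capacity is an assumption of this mock. */
static int demo_get_metrics(uint32_t *metrics_size, demo_metric_t *metrics)
{
    if (metrics_size == NULL || metrics == NULL)
        return 1;
    uint32_t n = (*metrics_size < DEMO_AVAILABLE) ? *metrics_size
                                                  : DEMO_AVAILABLE;
    for (uint32_t i = 0; i < n; ++i)
        metrics[i].val = 10u * (i + 1u); /* made-up sample values */
    *metrics_size = n;
    return 0;
}

/* Sums the values of all returned metrics; returns non-zero on failure. */
static int demo_sum_metrics(uint64_t *sum_out)
{
    demo_metric_t buf[DEMO_CAPACITY]; /* buffer allocated by the caller */
    uint32_t count = DEMO_CAPACITY;   /* in: capacity of buf */

    if (sum_out == NULL || demo_get_metrics(&count, buf) != 0)
        return 1;

    assert(count <= DEMO_CAPACITY);   /* out: entries actually written */
    uint64_t sum = 0;
    for (uint32_t i = 0; i < count; ++i) /* iterate over count, not capacity */
        sum += buf[i].val;
    *sum_out = sum;
    return 0;
}
```

The key point is that the loop runs over the count written back by the call, not over the capacity passed in, so unwritten buffer entries are never read.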

◆ amdsmi_get_soc_pstate()

amdsmi_status_t amdsmi_get_soc_pstate ( amdsmi_processor_handle  processor_handle,
amdsmi_dpm_policy_t *  policy
)

Returns the soc pstate policy for the processor.

Platform: gpu_bm_linux, guest_1vf, host

Given a processor handle processor_handle, this function will write current soc pstate policy settings to policy. All the processors at the same socket will have the same policy.

Parameters
[in] processor_handle: A processor handle.
[out] policy: The soc pstate policy for this processor. If this parameter is nullptr, this function will return AMDSMI_STATUS_INVAL.
Returns
amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure

◆ amdsmi_set_soc_pstate()

amdsmi_status_t amdsmi_set_soc_pstate ( amdsmi_processor_handle  processor_handle,
uint32_t  policy_id 
)

Sets the soc pstate policy for the processor.

Platform: gpu_bm_linux, guest_1vf, host

Given a processor handle processor_handle and a soc pstate policy policy_id, this function will set the soc pstate policy for this processor. All the processors at the same socket will be set to the same policy.

Parameters
[in] processor_handle: A processor handle.
[in] policy_id: The soc pstate policy id to set. The id is the id in amdsmi_dpm_policy_entry_t, which can be obtained by calling amdsmi_get_soc_pstate().
Returns
amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
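
Because policy_id must be one of the ids reported by amdsmi_get_soc_pstate(), a caller may want to check a candidate id against the reported entries before setting it. The sketch below shows such a lookup. `demo_policy_entry_t`, `demo_policy_t`, the field names `num_supported` and `policies`, and the limit `DEMO_MAX_POLICIES` are hypothetical stand-ins chosen for illustration, not the real layout of amdsmi_dpm_policy_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_POLICIES 8u /* hypothetical array capacity */

/* Hypothetical stand-ins; NOT the real amdsmi policy structures. */
typedef struct {
    uint32_t policy_id;
} demo_policy_entry_t;

typedef struct {
    uint32_t num_supported;                          /* valid entries in policies[] */
    demo_policy_entry_t policies[DEMO_MAX_POLICIES]; /* reported policy entries */
} demo_policy_t;

/* True if `policy_id` appears among the reported policy entries. */
static bool demo_policy_is_supported(const demo_policy_t *policy,
                                     uint32_t policy_id)
{
    assert(policy != NULL);

    /* Never read past the array, even if num_supported is bogus. */
    uint32_t n = policy->num_supported;
    if (n > DEMO_MAX_POLICIES)
        n = DEMO_MAX_POLICIES;

    for (uint32_t i = 0; i < n; ++i) {
        if (policy->policies[i].policy_id == policy_id)
            return true;
    }
    return false;
}
```

Note that the ids are looked up by value rather than used as array indexes, since nothing above says that policy ids are contiguous or start at zero.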