Event Monitoring
Functions
amdsmi_status_t | amdsmi_event_create (amdsmi_processor_handle *processor_list, uint32_t num_devices, uint64_t event_types, amdsmi_event_set *set)
Allocates and registers a new event set that monitors different types of issues on GPUs running virtualization software. The caller passes an array of the GPUs to monitor together with the selected event flags.
amdsmi_status_t | amdsmi_event_read (amdsmi_event_set set, int64_t timeout_usec, amdsmi_event_entry_t *event)
Waits up to the given timeout for one event matching the event set and copies it into the user-provided notifier storage.
amdsmi_status_t | amdsmi_event_destroy (amdsmi_event_set set)
Destroys and frees an event set.
Function Documentation
◆ amdsmi_event_create()
amdsmi_status_t amdsmi_event_create (amdsmi_processor_handle *processor_list, uint32_t num_devices, uint64_t event_types, amdsmi_event_set *set)
Allocates and registers a new event set that monitors different types of issues on GPUs running virtualization software. The caller must pass an array of the GPUs to monitor together with the selected event flags.
- Parameters
-
[in] processor_list Processor handles of the GPUs to monitor for events.
[in] num_devices Number of processors in the list.
[in] event_types Bitmask of the event types that the event set will monitor on these GPUs. Bit layout (bit index from 0):
| 63 62 61 60    | 59 .......... 0          |
| event severity | event category bit field |
There are 5 event severities, each with a macro to set it:
0b0000 High severity - AMDSMI_MASK_HIGH_ERROR_SEVERITY_ONLY
0b0001 Med severity - AMDSMI_MASK_INCLUDE_MED_ERROR_SEVERITY
0b0010 Low severity - AMDSMI_MASK_INCLUDE_LOW_ERROR_SEVERITY
0b0100 Warn severity - AMDSMI_MASK_INCLUDE_WARN_SEVERITY
0b1000 Info severity - AMDSMI_MASK_INCLUDE_INFO_SEVERITY
The AMDSMI_MASK_INCLUDE_CATEGORY macro sets the category to monitor. It takes a value of the AMDSMI_EVENT_CATEGORY enum as its input parameter.
[out] set Reference to the pointer to the event set created by the library. The library allocates the event set.
- Returns
- amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
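The event_types layout described for amdsmi_event_create (severity in bits 63..60, category bit field in bits 59..0) can be illustrated with a small stand-alone sketch. It does not use the library's macros, whose exact definitions are not shown on this page. Every sketch_ and SKETCH_ name below is hypothetical, and the mapping of one category index to one bit is an assumption made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; the library's own macros may be
 * defined differently. */
#define SKETCH_SEVERITY_SHIFT 60u
#define SKETCH_CATEGORY_FIELD ((UINT64_C(1) << SKETCH_SEVERITY_SHIFT) - 1u)

/* Severity codes as listed in the event_types description. */
enum sketch_severity {
    SKETCH_SEV_HIGH_ONLY = 0x0, /* 0b0000 */
    SKETCH_SEV_MED       = 0x1, /* 0b0001 */
    SKETCH_SEV_LOW       = 0x2, /* 0b0010 */
    SKETCH_SEV_WARN      = 0x4, /* 0b0100 */
    SKETCH_SEV_INFO      = 0x8  /* 0b1000 */
};

/* Set the bit for one category index (0..59) in the category bit field.
 * Assumption: each category occupies the bit at its own index. */
static uint64_t sketch_include_category(uint64_t mask, unsigned category_index)
{
    assert(category_index < SKETCH_SEVERITY_SHIFT);
    return mask | (UINT64_C(1) << category_index);
}

/* Replace the severity nibble (bits 63..60) and keep the category bits. */
static uint64_t sketch_set_severity(uint64_t mask, enum sketch_severity severity)
{
    return (mask & SKETCH_CATEGORY_FIELD) |
           ((uint64_t)severity << SKETCH_SEVERITY_SHIFT);
}

/* Extract the severity nibble from a mask. */
static unsigned sketch_severity_of(uint64_t mask)
{
    return (unsigned)(mask >> SKETCH_SEVERITY_SHIFT);
}
```

Under these assumptions, including category index 3 and then setting the Med severity code yields a mask with bit 3 and bit 60 set. In real code, the library macros named above should be used to build the value passed as event_types.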
◆ amdsmi_event_read()
amdsmi_status_t amdsmi_event_read (amdsmi_event_set set, int64_t timeout_usec, amdsmi_event_entry_t *event)
Waits up to the given timeout for one event matching the event set and copies it into the user-provided notifier storage.
- Note
- If timeout_usec is negative, the call blocks forever. If timeout_usec is zero, the call returns immediately. A positive timeout given in microseconds is converted to milliseconds. The minimum timeout is 1000 us; a smaller positive value is raised to 1000 us. The conversion truncates to a whole number of milliseconds (e.g. 1500 us -> 1 ms, 2600 us -> 2 ms).
- The event entry contains a 64-bit timestamp, fields for the category of the error, the sub-code and flags associated with the error, the VF and GPU handles that originated the error, and a 256-byte text buffer with a human-readable description of the error.
- Parameters
-
[in] set Event set to read from. Use the same set variable that was passed to the amdsmi_event_create call.
[in] timeout_usec Timeout in microseconds to wait for an event.
[out] event Reference to the user-allocated event notifier.
- Returns
- amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
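The timeout handling described in the note for amdsmi_event_read can be modelled with a short stand-alone sketch. The function name sketch_effective_timeout_ms is hypothetical and is not part of the library, and the return value -1 for "block forever" is a convention chosen only for this sketch. It shows how a caller might predict the effective wait, not how the library implements it.

```c
#include <stdint.h>

/* Hypothetical helper: models the documented timeout rule.
 * Returns -1 for "block forever" (a sketch-only sentinel), 0 for
 * "return immediately", otherwise the effective timeout in ms. */
static int64_t sketch_effective_timeout_ms(int64_t timeout_usec)
{
    if (timeout_usec < 0) {
        return -1;              /* negative: block forever */
    }
    if (timeout_usec == 0) {
        return 0;               /* zero: return immediately */
    }
    if (timeout_usec < 1000) {
        timeout_usec = 1000;    /* minimum timeout is 1000 us */
    }
    return timeout_usec / 1000; /* integer division truncates to whole ms */
}
```

With this model, 1500 us and 1999 us both map to 1 ms, so a caller that needs at least 2 ms of waiting should pass 2000 us or more.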
◆ amdsmi_event_destroy()
amdsmi_status_t amdsmi_event_destroy (amdsmi_event_set set)
Destroys and frees an event set.
- Parameters
-
[in] set Event set to destroy.
- Returns
- amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
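The three calls are meant to be used as a create, read, destroy sequence. The sketch below illustrates that order with stand-in stubs so that it compiles without the library. All sketch_ types and functions are hypothetical simplifications: the stubs omit the processor list, the stub event entry holds only a timestamp, and status 0 plays the role of AMDSMI_STATUS_SUCCESS. It is a pattern sketch under those assumptions, not the library's behavior.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in types (hypothetical, simplified). */
typedef int sketch_status_t;                 /* 0 stands in for success */
struct sketch_event_set_impl { int pending; };
typedef struct sketch_event_set_impl *sketch_event_set;
typedef struct { uint64_t timestamp; } sketch_event_entry_t;

/* Stub: allocates the set and pretends three events are queued. */
static sketch_status_t sketch_event_create(uint64_t event_types,
                                           sketch_event_set *set)
{
    (void)event_types;
    *set = malloc(sizeof **set);
    if (*set == NULL) {
        return 1;
    }
    (*set)->pending = 3;
    return 0;
}

/* Stub: hands out one queued event, or a non-zero status when none remain. */
static sketch_status_t sketch_event_read(sketch_event_set set,
                                         int64_t timeout_usec,
                                         sketch_event_entry_t *event)
{
    (void)timeout_usec;
    if (set->pending == 0) {
        return 1;
    }
    set->pending--;
    event->timestamp = (uint64_t)(3 - set->pending);
    return 0;
}

/* Stub: frees the set. */
static sketch_status_t sketch_event_destroy(sketch_event_set set)
{
    free(set);
    return 0;
}

/* Create the set, read until the first non-zero status, then destroy.
 * Returns the number of events read, or -1 if creation failed. */
static int sketch_drain_events(uint64_t event_types)
{
    sketch_event_set set = NULL;
    sketch_event_entry_t event;
    int count = 0;

    if (sketch_event_create(event_types, &set) != 0) {
        return -1;
    }
    while (sketch_event_read(set, 1000000, &event) == 0) {
        count++;
    }
    sketch_event_destroy(set); /* always pair destroy with a successful create */
    return count;
}
```

The point of the pattern is that the same set value flows from create to every read and finally to destroy, and that destroy runs on every path after a successful create.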