All work-items in a sub-group executing the kernel on a processor must execute this function before any are allowed to continue execution beyond the sub-group barrier.
void sub_group_barrier(cl_mem_fence_flags flags)
void sub_group_barrier(cl_mem_fence_flags flags, memory_scope scope)
All work-items in a sub-group executing the kernel on a processor must execute this function before any are allowed to continue execution beyond the sub-group barrier. This function must be encountered by all work-items in a sub-group executing the kernel. These rules apply to ND-ranges implemented with uniform and non-uniform work-groups.
If sub_group_barrier is inside a conditional statement, then all work-items within the sub-group must enter the conditional if any work-item in the sub-group enters the conditional statement and executes the sub_group_barrier.
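As a hedged illustration of the conditional rule above, the following OpenCL C sketch keeps the divergent work inside the branch and places the barrier after it, where every work-item in the sub-group reaches it. The kernel name clamp_then_sync and its arguments are hypothetical. The #ifndef __OPENCL_VERSION__ section is a host-only shim, assumed here so the sketch can also be checked with an ordinary C compiler; it is not part of OpenCL. On OpenCL 2.0 devices the cl_khr_subgroups extension may need to be enabled.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shim (an assumption for checking, not OpenCL): models a
   single work-item so a plain C compiler accepts the kernel below. */
#include <stddef.h>
#define __kernel
#define __global
#define CLK_GLOBAL_MEM_FENCE 2
static size_t get_global_id(unsigned int dim) { (void)dim; return 0; }
static void sub_group_barrier(int flags) { (void)flags; }
#endif

/* Hypothetical kernel: clamps each element to `limit`. */
__kernel void clamp_then_sync(__global float *data, float limit)
{
    size_t gid = get_global_id(0);

    /* The condition depends on per-work-item data, so it may diverge
       within a sub-group. A sub_group_barrier inside this branch would
       break the rule: only some work-items would execute it. */
    if (data[gid] > limit) {
        data[gid] = limit;
    }

    /* Outside the branch, every work-item in the sub-group executes
       the barrier, regardless of which path it took. */
    sub_group_barrier(CLK_GLOBAL_MEM_FENCE);
}
```

A barrier inside a branch is still valid when the condition is guaranteed to be the same for every work-item in the sub-group, for example when it tests only a kernel argument.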
If sub_group_barrier is inside a loop, all work-items within the sub-group must execute the sub_group_barrier for each iteration of the loop before any are allowed to continue execution beyond the sub_group_barrier.
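The loop rule can be sketched as follows, under the assumption that the trip count comes from a kernel argument and is therefore identical for every work-item in the sub-group. The kernel name repeat_step and its arguments are hypothetical, and the #ifndef __OPENCL_VERSION__ section is a host-only shim for checking the sketch with an ordinary C compiler, not part of OpenCL.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shim (an assumption for checking, not OpenCL): models a
   single work-item so a plain C compiler accepts the kernel below. */
#include <stddef.h>
#define __kernel
#define __global
#define CLK_GLOBAL_MEM_FENCE 2
static size_t get_global_id(unsigned int dim) { (void)dim; return 0; }
static void sub_group_barrier(int flags) { (void)flags; }
#endif

/* Hypothetical kernel: adds 1.0f to each element `steps` times. */
__kernel void repeat_step(__global float *data, int steps)
{
    size_t gid = get_global_id(0);

    /* `steps` is a kernel argument, so all work-items in the sub-group
       run the same number of iterations and reach the barrier the same
       number of times. */
    for (int i = 0; i < steps; ++i) {
        data[gid] += 1.0f;
        sub_group_barrier(CLK_GLOBAL_MEM_FENCE);
    }
}
```

A loop whose bound depends on per-work-item data, such as a value read from data[gid], would not satisfy the rule, because work-items could leave the loop after different numbers of iterations.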
The sub_group_barrier function also queues a memory fence (reads and writes) to ensure correct ordering of memory operations to local or global memory.
The flags argument specifies the memory address space and can be set to a combination of the following values.
CLK_LOCAL_MEM_FENCE - The sub_group_barrier function will either flush any variables stored in local memory or queue a memory fence to ensure correct ordering of memory operations to local memory.
CLK_GLOBAL_MEM_FENCE - The sub_group_barrier function will queue a memory fence to ensure correct ordering of memory operations to global memory. This can be useful when work-items, for example, write to buffer objects and then want to read the updated data from these buffer objects.
CLK_IMAGE_MEM_FENCE - The sub_group_barrier function will queue a memory fence to ensure correct ordering of memory operations to image objects. This can be useful when work-items, for example, write to image objects and then want to read the updated data from these image objects.
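To make the role of the flags argument concrete, the sketch below has each work-item stage a value in local memory, synchronize with CLK_LOCAL_MEM_FENCE, and then read the value staged by the next work-item in its sub-group. The kernel name rotate_in_sub_group, its arguments, and the layout of tmp are hypothetical; tmp is assumed to hold at least get_num_sub_groups() * get_max_sub_group_size() elements. The #ifndef __OPENCL_VERSION__ section is a host-only shim that models a sub-group of one work-item so the sketch can be checked with an ordinary C compiler; it is not part of OpenCL.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shim (an assumption for checking, not OpenCL): models a
   sub-group that contains exactly one work-item. */
#include <stddef.h>
#define __kernel
#define __global
#define __local
#define CLK_LOCAL_MEM_FENCE 1
static size_t get_global_id(unsigned int dim) { (void)dim; return 0; }
static unsigned int get_sub_group_size(void) { return 1; }
static unsigned int get_max_sub_group_size(void) { return 1; }
static unsigned int get_sub_group_id(void) { return 0; }
static unsigned int get_sub_group_local_id(void) { return 0; }
static void sub_group_barrier(int flags) { (void)flags; }
#endif

/* Hypothetical kernel: each work-item outputs the input value of the
   next work-item in its sub-group, wrapping around at the end. */
__kernel void rotate_in_sub_group(__global const float *in,
                                  __global float *out,
                                  __local float *tmp)
{
    size_t gid = get_global_id(0);
    unsigned int size = get_sub_group_size();
    unsigned int me = get_sub_group_local_id();
    /* Each sub-group uses its own slice of tmp. */
    unsigned int base = get_sub_group_id() * get_max_sub_group_size();

    tmp[base + me] = in[gid];

    /* The staged values live in local memory, so the local fence flag
       is the one that orders the write above before the read below. */
    sub_group_barrier(CLK_LOCAL_MEM_FENCE);

    out[gid] = tmp[base + (me + 1) % size];
}
```

Because flags is a bit-field, values can be combined with bitwise OR, for example CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE when writes to both local and global memory must be ordered across the barrier.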