async_work_group_copy

Perform an asynchronous copy between global and local memory.

event_t async_work_group_copy ( __local gentype *dst,
  const __global gentype *src,
  size_t num_gentypes,
  event_t event)
event_t async_work_group_copy ( __global gentype *dst,
  const __local gentype *src,
  size_t num_gentypes,
  event_t event)

Description

async_work_group_copy performs an asynchronous copy of num_gentypes elements of type gentype from src to dst. The copy is performed by all work-items in a work-group, so this built-in function must be encountered by all work-items in a work-group executing the kernel, and with the same argument values; otherwise the results are undefined. This rule applies to ND-ranges implemented with uniform and non-uniform work-groups.

Returns an event object that can be used by wait_group_events to wait for the async copy to finish. The event argument can also be used to associate the async_work_group_copy with a previous async copy, allowing one event to be shared by multiple async copies; otherwise event should be zero.

If the event argument is non-zero, the event object supplied in the event argument is returned.
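
As an illustration of the event rules above, the following is a minimal sketch rather than a definitive usage pattern. The kernel name add_tiles and all parameter names are hypothetical. It assumes a one-dimensional ND-range with uniform work-groups, no global work offset, and local buffers that each hold get_local_size(0) floats.

```c
// Hypothetical kernel: add_tiles and its parameter names are illustrative only.
// Assumes a 1-D ND-range, uniform work-groups, no global work offset, and that
// tile_a and tile_b each have room for get_local_size(0) floats.
__kernel void add_tiles(__global const float *a,
                        __global const float *b,
                        __global float *out,
                        __local float *tile_a,
                        __local float *tile_b)
{
    const size_t lid  = get_local_id(0);
    const size_t n    = get_local_size(0);
    const size_t base = get_group_id(0) * n;

    // Every work-item in the group reaches these calls with identical arguments.
    // Passing 0 as the event requests a new event object.
    event_t ev = async_work_group_copy(tile_a, a + base, n, 0);

    // Passing ev associates the second copy with the same event;
    // because ev is non-zero, the function returns ev itself.
    ev = async_work_group_copy(tile_b, b + base, n, ev);

    // A single wait covers both copies.
    wait_group_events(1, &ev);

    out[base + lid] = tile_a[lid] + tile_b[lid];
}
```

Sharing one event keeps the wait to a single call; using two separate events and passing both to wait_group_events would be an equally valid choice.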

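This function does not perform any implicit synchronization of source data, such as using a barrier before performing the copy. If work-items write the source data, the kernel itself must synchronize before the copy is issued.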
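
The following sketch shows one way to handle the missing implicit synchronization when copying from local to global memory. The kernel name square_to_global and its parameters are hypothetical, and the same assumptions apply as for any simple tiled kernel: a one-dimensional ND-range, uniform work-groups, no global work offset, and a local buffer of get_local_size(0) floats.

```c
// Hypothetical kernel: square_to_global and its parameter names are illustrative only.
// Assumes a 1-D ND-range, uniform work-groups, no global work offset, and that
// tile has room for get_local_size(0) floats.
__kernel void square_to_global(__global const float *in,
                               __global float *out,
                               __local float *tile)
{
    const size_t lid  = get_local_id(0);
    const size_t n    = get_local_size(0);
    const size_t base = get_group_id(0) * n;

    // Each work-item fills one element of the local source buffer.
    const float x = in[base + lid];
    tile[lid] = x * x;

    // The copy does not synchronize the source data, so make every
    // work-item's write to tile complete and visible before issuing it.
    barrier(CLK_LOCAL_MEM_FENCE);

    event_t ev = async_work_group_copy(out + base, tile, n, 0);

    // The kernel must not exit while the copy is still outstanding.
    wait_group_events(1, &ev);
}
```

Without the barrier, the copy could read elements of tile that other work-items have not yet written.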

The generic type name gentype indicates the built-in data types char, char{2|3|4|8|16}, uchar, uchar{2|3|4|8|16}, short, short{2|3|4|8|16}, ushort, ushort{2|3|4|8|16}, int, int{2|3|4|8|16}, uint, uint{2|3|4|8|16}, long, long{2|3|4|8|16}, ulong, ulong{2|3|4|8|16}, float, float{2|3|4|8|16}, or double, double{2|3|4|8|16} as the type for the arguments unless otherwise stated.

When extended by the cl_khr_fp16 extension, the generic type gentype is extended to include half, half2, half3, half4, half8, and half16.

The kernel must wait for the completion of all async copies using the wait_group_events built-in function before exiting; otherwise the behavior is undefined.

Notes

async_work_group_copy and async_work_group_strided_copy for 3-component vector types behave as async_work_group_copy and async_work_group_strided_copy respectively for 4-component vector types.
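
A short sketch of what the note implies for a float3 copy follows. The kernel name sum_components and its parameters are hypothetical. It relies on the OpenCL C rule that a 3-component vector has the size and alignment of the corresponding 4-component vector, so each float3 element counted by num_gentypes occupies the storage of a float4. The usual assumptions hold: a one-dimensional ND-range, uniform work-groups, no global work offset, and a local buffer of get_local_size(0) elements.

```c
// Hypothetical kernel: sum_components and its parameter names are illustrative only.
// Assumes a 1-D ND-range, uniform work-groups, no global work offset, and that
// tile has room for get_local_size(0) float3 elements.
__kernel void sum_components(__global const float3 *points,
                             __global float *sums,
                             __local float3 *tile)
{
    const size_t lid  = get_local_id(0);
    const size_t n    = get_local_size(0);
    const size_t base = get_group_id(0) * n;

    // n counts float3 elements; each one is stored with the size and
    // alignment of a float4, so the copy moves n * sizeof(float4) bytes.
    event_t ev = async_work_group_copy(tile, points + base, n, 0);
    wait_group_events(1, &ev);

    const float3 p = tile[lid];
    sums[base + lid] = p.x + p.y + p.z;
}
```

On the host side, the buffer behind points would therefore need a 16-byte stride per element, which is what the cl_float3 host type provides.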

Specification

OpenCL Specification

Also see

Async Copy and Prefetch Functions

Copyright © 2007-2013 The Khronos Group Inc. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and/or associated documentation files (the "Materials"), to deal in the Materials without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Materials, and to permit persons to whom the Materials are furnished to do so, subject to the condition that this copyright notice and permission notice shall be included in all copies or substantial portions of the Materials.