async_work_group_copy performs an 
          async copy of num_gentypes gentype elements from
          src to dst. The async copy is performed by all
          work-items in a work-group and this built-in function must therefore be encountered
          by all work-items in a work-group executing the kernel with the same argument values;
          otherwise the results are undefined. This rule applies to ND-ranges
          implemented with uniform and non-uniform work-groups.
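
The requirement that every work-item reach the call with the same argument values can be illustrated with a minimal sketch. The kernel name scale_tile, its parameters, and the one-dimensional ND-range are assumptions made for this example; the host is assumed to size the local buffer tile to at least one float per work-item.

```c
// Hypothetical kernel: each work-group stages one tile of `in` in local
// memory, then every work-item scales one element of that tile.
// Assumes a one-dimensional ND-range.
__kernel void scale_tile(__global const float *in,
                         __global float *out,
                         __local float *tile,
                         const float factor)
{
    const size_t lid  = get_local_id(0);
    const size_t n    = get_local_size(0);       // size of this work-group
    const size_t base = get_global_id(0) - lid;  // first element of this group

    // Every work-item reaches this call with identical arguments:
    // tile, base and n depend only on the work-group, not the work-item.
    event_t ev = async_work_group_copy(tile, in + base, n, 0);

    // Wait until the copy has finished before reading the tile.
    wait_group_events(1, &ev);

    out[base + lid] = tile[lid] * factor;
}
```

Deriving base from get_global_id(0) rather than from the group index keeps the offset correct when the last work-group of a non-uniform ND-range is smaller than the others.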
        
          Returns an event object that can be used by
          wait_group_events to
          wait for the async copy to finish. The event argument can also be
          used to associate the async_work_group_copy with a previous
          async copy, allowing an event to be shared by multiple async copies; otherwise
          event should be zero.
        
          If the event argument is non-zero, the event object supplied in
          the event argument is returned.
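
Sharing one event between several copies might look like the following sketch. The kernel name add_tiles and its parameters are hypothetical, a one-dimensional ND-range is assumed, and both local buffers are assumed to hold at least one float per work-item.

```c
// Hypothetical kernel: stage tiles of two inputs with a shared event,
// then add them element-wise. Assumes a one-dimensional ND-range.
__kernel void add_tiles(__global const float *a,
                        __global const float *b,
                        __global float *out,
                        __local float *tile_a,
                        __local float *tile_b)
{
    const size_t lid  = get_local_id(0);
    const size_t n    = get_local_size(0);
    const size_t base = get_global_id(0) - lid;

    // First copy: the event argument is zero, so an event object is returned.
    event_t ev = async_work_group_copy(tile_a, a + base, n, 0);

    // Second copy: passing ev associates this copy with the first one.
    // The value returned is the event that was supplied.
    ev = async_work_group_copy(tile_b, b + base, n, ev);

    // A single wait covers both copies.
    wait_group_events(1, &ev);

    out[base + lid] = tile_a[lid] + tile_b[lid];
}
```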
        
This function does not perform any implicit synchronization of source data, such as using a barrier, before performing the copy.
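
When the source of the copy is local memory written by the work-items themselves, the kernel has to provide that synchronization explicitly. The following is a minimal sketch under stated assumptions: the kernel name square_to_global and its parameters are hypothetical, the ND-range is one-dimensional, and tile holds at least one float per work-item.

```c
// Hypothetical kernel: compute results into local memory, then copy
// them to global memory as a group. Assumes a one-dimensional ND-range.
__kernel void square_to_global(__global const float *in,
                               __global float *out,
                               __local float *tile)
{
    const size_t lid  = get_local_id(0);
    const size_t n    = get_local_size(0);
    const size_t base = get_global_id(0) - lid;

    // Each work-item writes one element of the local source buffer.
    const float x = in[base + lid];
    tile[lid] = x * x;

    // The async copy does not synchronize its source, so make sure all
    // work-items have finished writing the tile before it is copied.
    barrier(CLK_LOCAL_MEM_FENCE);

    event_t ev = async_work_group_copy(out + base, tile, n, 0);

    // Wait for the copy to complete before the kernel exits.
    wait_group_events(1, &ev);
}
```
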
The generic type name gentype indicates the built-in data types char, char{2|3|4|8|16}, uchar, uchar{2|3|4|8|16}, short, short{2|3|4|8|16}, ushort, ushort{2|3|4|8|16}, int, int{2|3|4|8|16}, uint, uint{2|3|4|8|16}, long, long{2|3|4|8|16}, ulong, ulong{2|3|4|8|16}, float, float{2|3|4|8|16}, or double, double{2|3|4|8|16} as the type for the arguments unless otherwise stated.
          When extended by the
          cl_khr_fp16 extension,
          the generic type gentype is extended to
          include half, half2, half3, half4,
          half8, and half16.
        
            The kernel must wait for the completion of all async copies using the 
            wait_group_events built-in function before 
            exiting; otherwise the behavior is undefined.
        
        async_work_group_copy and
        async_work_group_strided_copy for 3-component
        vector types behave as async_work_group_copy and
        async_work_group_strided_copy respectively for 4-component vector
        types.
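
A practical consequence is that a buffer of 3-component vectors is laid out with the same element size as the corresponding 4-component type. The sketch below assumes a hypothetical kernel copy_points, a one-dimensional ND-range, and buffers that the host has allocated with four floats of storage per element.

```c
// Hypothetical kernel: round-trip float3 elements through local memory.
// Assumes a one-dimensional ND-range.
__kernel void copy_points(__global const float3 *src,
                          __global float3 *dst,
                          __local float3 *tile)
{
    const size_t lid  = get_local_id(0);
    const size_t n    = get_local_size(0);
    const size_t base = get_global_id(0) - lid;

    // n counts float3 elements, but each element occupies the same
    // storage as a float4, so the copy behaves like a float4 copy.
    event_t ev = async_work_group_copy(tile, src + base, n, 0);
    wait_group_events(1, &ev);

    dst[base + lid] = tile[lid];
}
```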
      
 Copyright © 2007-2013 The Khronos Group Inc. 
Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and/or associated documentation files (the
"Materials"), to deal in the Materials without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Materials, and to
permit persons to whom the Materials are furnished to do so, subject to
the condition that this copyright notice and permission notice shall be included
in all copies or substantial portions of the Materials.