VisionWorks Toolkit Reference | December 18, 2015 | 1.2 Release
This tutorial demonstrates VisionWorks array processing with user CUDA code.
As an example, it implements the AXPY operation (generalized vector addition) using the CUBLAS Library.
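For readers unfamiliar with the operation, AXPY updates a vector in place as y = alpha * x + y. The following is a minimal CPU sketch in plain C of that arithmetic only; it is not the CUBLAS call used by the tutorial, and the name `axpy_reference` is hypothetical. For single-precision data, CUBLAS provides the same operation on the GPU as `cublasSaxpy`.

```c
#include <stddef.h>

/* Hypothetical CPU reference for AXPY (not part of CUBLAS or VisionWorks).
 * Computes y[i] = alpha * x[i] + y[i] for every i in [0, n),
 * with both vectors stored contiguously. */
void axpy_reference(size_t n, float alpha, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = alpha * x[i] + y[i];
    }
}
```

A reference like this can be useful for checking GPU results on small inputs.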
VisionWorks array objects provide CUDA pointers with a simple 1D memory layout, similar to memory allocated by the cudaMalloc
function. Array elements are located in contiguous memory segments without any gaps between them.
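The contiguous layout described above means that element i starts exactly i * (element size) bytes after the base pointer. A small sketch in plain C of that address arithmetic follows; the helper names `element_at` and `is_densely_packed` are hypothetical and are not part of VisionWorks or CUDA.

```c
#include <stddef.h>

/* Hypothetical helper: address of element `index`, given the base pointer
 * and the distance in bytes between consecutive elements. */
void *element_at(void *base, size_t stride, size_t index)
{
    return (unsigned char *)base + index * stride;
}

/* Hypothetical helper: returns 1 when the elements are stored without gaps,
 * that is, when the byte stride equals the element size. */
int is_densely_packed(size_t stride, size_t item_size)
{
    return stride == item_size;
}
```

With a gap-free layout, stepping by the stride and ordinary indexing (`&data[i]`) yield the same address, which is why such a pointer can be handed to code written for plain 1D buffers.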
Determine the number of elements in the input array.
Map the array object into the CUDA address space. The vxAccessArrayRange function requires a range for access; in this sample, the whole array range [0, num_items) is used.
Set the pointer to NULL before calling the vxAccessArrayRange function; otherwise, the function will work in COPY mode, assuming that the pointer refers to a pre-allocated buffer.
The vxAccessArrayRange function also returns the stride in bytes between array elements. In VisionWorks, this stride is always equal to the size of the element.
After you get the mapped pointer, you can use it in CUDA kernels and CUDA libraries in the same way as a plain CUDA pointer allocated by the cudaMalloc function.
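The difference between passing a NULL pointer and passing a pre-allocated buffer can be illustrated with a small mock in plain C. This is only a sketch of the behavior described above: `mock_array` and `mock_access_range` are hypothetical stand-ins, not the real VisionWorks types or the real vxAccessArrayRange signature, and the mock works on host memory only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an array object; NOT a VisionWorks type. */
typedef struct {
    float  items[4];
    size_t num_items;
} mock_array;

/* Hypothetical mock of the two access modes; NOT the real vxAccessArrayRange.
 * - *ptr == NULL: map mode, *ptr is set to the object's own storage.
 * - *ptr != NULL: copy mode, the items are copied into the caller's buffer.
 * In both cases *stride receives the distance in bytes between elements,
 * which here equals the element size. */
void mock_access_range(mock_array *arr, size_t *stride, void **ptr)
{
    *stride = sizeof(arr->items[0]);
    if (*ptr == NULL) {
        *ptr = arr->items;
    } else {
        memcpy(*ptr, arr->items, arr->num_items * sizeof(arr->items[0]));
    }
}
```

In the mock, an uninitialized pointer would be treated as a destination buffer and written to, which shows why the pointer should be set to NULL explicitly when a mapping is wanted.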
Unmap the array object.
Before unmapping, call cudaStreamSynchronize or cudaDeviceSynchronize to be sure that all custom CUDA kernels have finished processing.