This tutorial shows how to import existing CUDA memory that holds an image in NV12 format into the VisionWorks framework and how to process the imported image with VisionWorks primitives.
Allocate CUDA memory for the NV12 image using the cudaMallocPitch function. An NV12 image consists of 2 planes:
- The first plane (Y) holds \( W \times H \) elements of 8 bits each.
- The second plane holds interleaved chroma (CbCr) samples: \( \frac{W}{2} \times \frac{H}{2} \) elements of 16 bits each.
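The plane sizes described above can be checked with a small host-side calculation. This is a minimal sketch, not VisionWorks code: `PlaneShape`, `nv12_luma_plane`, `nv12_chroma_plane`, and `packed_bytes` are hypothetical helper names, and the sketch assumes even image dimensions and ignores any row padding that the allocator may add.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type (not part of VisionWorks or CUDA).
struct PlaneShape {
    std::size_t width;       // elements per row
    std::size_t height;      // number of rows
    std::size_t elem_bytes;  // bytes per element
};

// Y plane: one 8-bit luma sample per pixel.
inline PlaneShape nv12_luma_plane(std::size_t width, std::size_t height) {
    return PlaneShape{width, height, 1};
}

// Chroma plane: one 16-bit interleaved chroma pair per 2x2 block of pixels.
inline PlaneShape nv12_chroma_plane(std::size_t width, std::size_t height) {
    assert(width % 2 == 0 && height % 2 == 0);  // assumption: even dimensions
    return PlaneShape{width / 2, height / 2, 2};
}

// Bytes of pixel data in a tightly packed plane (no row padding).
inline std::size_t packed_bytes(const PlaneShape& p) {
    return p.width * p.height * p.elem_bytes;
}
```

For a 640 x 480 image this gives 307200 bytes of luma and 153600 bytes of chroma, which is 12 bits per pixel on average.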
int width = 640;
int height = 480;
size_t y_pitch = 0;
void *y_dev_ptr = NULL;
cudaMallocPitch(&y_dev_ptr, &y_pitch, width * sizeof(vx_uint8), height);
size_t crcb_pitch = 0;
void *crcb_dev_ptr = NULL;
cudaMallocPitch(&crcb_dev_ptr, &crcb_pitch, width / 2 * sizeof(vx_uint16), height / 2);
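cudaMallocPitch may pad each row, so the returned pitch (in bytes) can be larger than the requested row width. The sketch below illustrates how an element is located in such a pitched allocation. It is host-only arithmetic with hypothetical helper names (`pitched_offset`, `round_up`); the real pitch is chosen by the CUDA runtime, and the alignment value used in the usage note is only an assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of element (x, y) from the base pointer of a pitched 2D
// allocation: rows are pitch_bytes apart, elements are elem_bytes apart.
inline std::size_t pitched_offset(std::size_t x, std::size_t y,
                                  std::size_t pitch_bytes,
                                  std::size_t elem_bytes) {
    assert(x * elem_bytes < pitch_bytes);  // element must lie inside its row
    return y * pitch_bytes + x * elem_bytes;
}

// Hypothetical model of row padding: round the row size up to a multiple of
// `alignment`. The actual pitch is whatever cudaMallocPitch reports.
inline std::size_t round_up(std::size_t row_bytes, std::size_t alignment) {
    assert(alignment != 0);
    return (row_bytes + alignment - 1) / alignment * alignment;
}
```

If the runtime happened to align rows to 512 bytes, a 640-byte Y row would occupy a 1024-byte pitch, and the sample at column 3 of row 2 would sit 2051 bytes past `y_dev_ptr`.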
Create an array of pointers for each image plane:
void *img_ptrs[] = {
y_dev_ptr,
crcb_dev_ptr
};
Create an array of vx_imagepatch_addressing_t structures that describe the memory layout of each plane. Fill in the dimension (dim_x, dim_y) and stride (stride_x, stride_y) fields of each structure.
Note:
- Currently, VisionWorks does not support memory layouts with gaps between pixels (that is, vx_imagepatch_addressing_t::stride_x should always be equal to the pixel size).
- The current implementation does not support image dimensions larger than INT32_MAX.
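vx_imagepatch_addressing_t img_addrs[2];
img_addrs[0].dim_x = width;
img_addrs[0].dim_y = height;
img_addrs[0].stride_x = sizeof(vx_uint8);          // bytes between pixels
img_addrs[0].stride_y = (vx_int32)y_pitch;         // bytes between rows
img_addrs[1].dim_x = width / 2;
img_addrs[1].dim_y = height / 2;
img_addrs[1].stride_x = sizeof(vx_uint16);         // bytes between pixels
img_addrs[1].stride_y = (vx_int32)crcb_pitch;      // bytes between rows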
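The two restrictions in the note (no gaps between pixels, nothing larger than INT32_MAX) can be checked before the addressing structures are handed to the framework. The following is a minimal sketch under stated assumptions: `PlaneAddressing` is a hypothetical stand-in that mirrors only the four vx_imagepatch_addressing_t fields discussed here, and `describe_plane` is an illustrative helper, not a VisionWorks function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the dimension and stride fields of
// vx_imagepatch_addressing_t (the real structure has more fields).
struct PlaneAddressing {
    std::uint32_t dim_x;
    std::uint32_t dim_y;
    std::int32_t stride_x;  // bytes between neighboring pixels in a row
    std::int32_t stride_y;  // bytes between rows (the allocation pitch)
};

inline bool fits_int32(std::size_t v) {
    return v <= static_cast<std::size_t>(
                    std::numeric_limits<std::int32_t>::max());
}

// Fills `out` for one plane. Returns false if a value exceeds INT32_MAX or
// if a row of `width` elements does not fit into `pitch_bytes`.
inline bool describe_plane(std::size_t width, std::size_t height,
                           std::size_t elem_bytes, std::size_t pitch_bytes,
                           PlaneAddressing* out) {
    if (out == nullptr || elem_bytes == 0) return false;
    if (!fits_int32(width) || !fits_int32(height) ||
        !fits_int32(elem_bytes) || !fits_int32(pitch_bytes)) return false;
    if (width > pitch_bytes / elem_bytes) return false;  // row too long
    out->dim_x = static_cast<std::uint32_t>(width);
    out->dim_y = static_cast<std::uint32_t>(height);
    out->stride_x = static_cast<std::int32_t>(elem_bytes);  // no pixel gaps
    out->stride_y = static_cast<std::int32_t>(pitch_bytes);
    return true;
}
```

Setting stride_x directly from the element size makes the "no gaps between pixels" rule hold by construction, while stride_y carries whatever pitch the allocator reported.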
Create a VisionWorks image object using pre-allocated CUDA memory:
The imported image object can be used with VisionWorks primitives in the same way as plain image objects:
To update the imported image or read data from it, you must use the access/commit methods. They notify the framework about the access so that it can synchronize its buffers appropriately. Working with the imported memory without the access/commit methods results in undefined behavior.
void* ptr = NULL;
cudaMemset2D(y_dev_ptr, y_pitch, 0, width * sizeof(vx_uint8), height);
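Because every access must be matched by a commit, one way to keep the pairing from being forgotten is a small scope guard. The sketch below is a generic illustration, not part of VisionWorks: `ScopedAccess` is a hypothetical class, and the two callables are assumed to wrap whatever access and commit calls the application uses.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical scope guard: runs `access` on construction and, if that
// succeeded, runs `commit` when the guard goes out of scope.
class ScopedAccess {
public:
    ScopedAccess(const std::function<bool()>& access,
                 std::function<void()> commit)
        : commit_(std::move(commit)), active_(access()) {}

    ~ScopedAccess() {
        if (active_) commit_();  // commit only what was actually accessed
    }

    ScopedAccess(const ScopedAccess&) = delete;
    ScopedAccess& operator=(const ScopedAccess&) = delete;

    // True if the access step reported success.
    bool ok() const { return active_; }

private:
    std::function<void()> commit_;
    bool active_;
};
```

Any direct work on the device memory, such as the cudaMemset2D call above, would then be placed inside the guard's scope and run only when `ok()` returns true.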
The Full Code for This Tutorial
{
int width = 640;
int height = 480;
size_t y_pitch = 0;
void *y_dev_ptr = NULL;
cudaMallocPitch(&y_dev_ptr, &y_pitch, width * sizeof(vx_uint8), height);
size_t crcb_pitch = 0;
void *crcb_dev_ptr = NULL;
cudaMallocPitch(&crcb_dev_ptr, &crcb_pitch, width / 2 * sizeof(vx_uint16), height / 2);
void *img_ptrs[] = {
y_dev_ptr,
crcb_dev_ptr
};
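vx_imagepatch_addressing_t img_addrs[2];
img_addrs[0].dim_x = width;
img_addrs[0].dim_y = height;
img_addrs[0].stride_x = sizeof(vx_uint8);          // bytes between pixels
img_addrs[0].stride_y = (vx_int32)y_pitch;         // bytes between rows
img_addrs[1].dim_x = width / 2;
img_addrs[1].dim_y = height / 2;
img_addrs[1].stride_x = sizeof(vx_uint16);         // bytes between pixels
img_addrs[1].stride_y = (vx_int32)crcb_pitch;      // bytes between rows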
void* ptr = NULL;
cudaMemset2D(y_dev_ptr, y_pitch, 0, width * sizeof(vx_uint8), height);
}