

Published: Saturday, 27 February 2010

CUDA (Compute Unified Device Architecture) is NVIDIA's framework for programming its graphics processing units (GPUs). Thanks to the massively parallel architecture of graphics cards, some computer programs can be accelerated considerably. A suitable algorithm should be splittable into independent data units that are all processed with the same operations.


I do not want to describe the architecture itself here. Instead, I want to give an example of how CUDA can be used for image processing.
Many image processing algorithms can be adapted to CUDA easily, because each output pixel is computed by an algorithm that only reads input values from a small neighbourhood around that pixel. Filters are a typical example.
Averaging filters and simple edge detection filters (e.g. the Sobel filter) are as easy to adapt to CUDA as morphological filters (dilation, erosion) or thresholding. Even more complex algorithms, such as the Canny edge detector, can be ported to CUDA.
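As a minimal sketch of such a per-pixel operation, a thresholding kernel could look like this (the kernel and parameter names are my own choices for illustration, not from any particular library):

```cuda
#include <cstdint>

// Each thread processes exactly one pixel: the image is treated as a flat
// array of 8-bit grey values, and every pixel is compared against the
// threshold independently of all other pixels.
__global__ void thresholdKernel(const uint8_t* in, uint8_t* out,
                                int width, int height, uint8_t thresh)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height)   // guard threads outside the image
        return;

    int idx = y * width + x;
    out[idx] = (in[idx] >= thresh) ? 255 : 0;
}
```

Such a kernel would typically be launched with a two-dimensional grid, e.g. `dim3 block(16, 16); dim3 grid((width + 15) / 16, (height + 15) / 16);`, so that one thread is created per pixel.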

An interesting option is to do as much of the image preprocessing on the graphics board as possible, so that either only a final result is sent back to the main program (object is good: yes/no), or the CUDA part of the program delivers a small set of partial results (e.g. "five objects found in the image; further analysis of the bottom-right part of the image is necessary").
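To sketch this idea, a kernel can reduce the whole image to a single number on the device, so that only one integer has to be copied back instead of a full result image. A simple (hypothetical) example counts the pixels above a threshold with an atomic add:

```cuda
#include <cstdint>

// Reduce a whole image to one number on the GPU: count how many pixels
// exceed a threshold. Only this single counter is copied back to the
// host, instead of a complete processed image.
__global__ void countBrightPixels(const uint8_t* in, int numPixels,
                                  uint8_t thresh, unsigned int* count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < numPixels && in[i] >= thresh)
        atomicAdd(count, 1u);   // only the "hit" threads are serialized
}
```

For large images a block-wise reduction in shared memory would be faster than a global atomic per pixel; the `atomicAdd` version merely keeps the sketch short.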

When considering whether it is worthwhile to port an algorithm to CUDA, you should analyze how much data parallelism your task offers.
The GPU has no direct access to the PC's main memory, so the input data has to be copied into the GPU's memory first. If the result data is large (a whole processed image instead of a yes/no answer), this "copy delay" occurs twice. So the question is: does the performance gain in the computation compensate for the time spent copying the data?
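The two copies can be seen in the typical host-side structure of a CUDA program (a sketch; `myKernel` stands in for any of the filters above and is commented out here):

```cuda
#include <cstdint>
#include <cuda_runtime.h>

// Host-side pattern behind the "copy delay": the input is copied to the
// GPU, the kernel runs, and the (possibly large) result is copied back.
void runFilter(const uint8_t* hostIn, uint8_t* hostOut,
               int width, int height)
{
    size_t bytes = (size_t)width * height;  // 8-bit greyscale image
    uint8_t *devIn, *devOut;
    cudaMalloc(&devIn,  bytes);
    cudaMalloc(&devOut, bytes);

    // 1st copy: host RAM -> GPU memory
    cudaMemcpy(devIn, hostIn, bytes, cudaMemcpyHostToDevice);

    // computation on the device (any per-pixel kernel would go here)
    // myKernel<<<grid, block>>>(devIn, devOut, width, height);

    // 2nd copy: GPU memory -> host RAM; this is the transfer that
    // shrinks to almost nothing if only a yes/no result is needed
    cudaMemcpy(hostOut, devOut, bytes, cudaMemcpyDeviceToHost);

    cudaFree(devIn);
    cudaFree(devOut);
}
```

Timing the two `cudaMemcpy` calls separately from the kernel is a quick way to see whether the transfer or the computation dominates for a given image size.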

