Convolution Kernel Mask Operation
Many powerful image-processing techniques rely on multipixel operations with convolution kernel masks, in which each output pixel is computed from the contributions of a number of adjoining input pixels. These types of operations are commonly referred to as convolution or spatial convolution. This interactive tutorial explores how a convolution operation is performed on a digital image.
The tutorial initializes with an ensemble of random pixel brightness values appearing in the Input Image window. The numerical brightness value of each pixel in the Input Image is displayed in the Digital Image with Kernel Overlay window. In the latter window, the currently selected kernel mask (indicated in red) is superimposed over an ensemble of pixels, where the central pixel lying under the convolution mask is highlighted in blue. The Output Image window displays the pixels that result from the convolution operation. Those pixels lying along the border of the Output Image window are not altered by the convolution operation, but are copied directly from the input image pixel ensemble. The pixels displayed in the Output Image window that result from the convolution operation are distinguished by a slight reddish coloration. To operate the tutorial, select a convolution kernel from the Choose A Kernel pull-down menu, and use the Manual or Auto buttons to advance the tutorial to the next step in the convolution process. When the Auto button is selected, the Auto Speed slider sets the rate at which the tutorial advances to the next step of the convolution operation. Clicking the Reset button will randomize the input pixel ensemble and will allow the user to restart the convolution operation.
In the simplest form, a two-dimensional convolution operation on a digital image utilizes a box convolution kernel. Convolution kernels typically feature an odd number of rows and columns in the form of a square, with a 3 × 3 pixel mask (convolution kernel) being the most common form, but 5 × 5 and 7 × 7 kernels are also frequently employed. The convolution operation is performed individually on each pixel of the original input image, and involves three sequential operations, which are presented in Figure 1. The operation begins when the convolution kernel is overlaid on the original image in such a manner that the center pixel of the mask is matched with the single pixel location to be convolved from the input image. This pixel is referred to as the target pixel (monitor the Position Kernel Mask step in the tutorial).
Next, each pixel value in the original (often termed the source) image that lies under the mask is multiplied by the corresponding value in the overlying mask (the Multiply Kernel with Image step in the tutorial). In the third step, the sum of the products from the second step is computed (the Sum Products step in the tutorial). Finally, the gray level value of the target pixel in the output image is replaced by that sum (the Store Output Value step in the tutorial). To perform a convolution on an entire image, this operation must be repeated for each pixel in the original image, always reading from the unmodified input values so that earlier results do not contaminate later ones. Typically, the output gray level value is scaled to match the storage range, although this step is not illustrated in the tutorial.
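The positioning, multiplication, summation, and storage steps described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the tutorial's own implementation: the function names convolve_pixel and convolve_image, the sample 4 × 4 image, and the all-ones kernel are assumptions chosen for the example. Border pixels are copied unchanged from the input, as in the tutorial's Output Image window, and the mask is applied exactly as overlaid (without flipping), which gives the same result as a mathematical convolution whenever the kernel is symmetric.

```python
def convolve_pixel(image, kernel, row, col):
    """Center the kernel mask on (row, col) and return the sum of products."""
    half = len(kernel) // 2
    total = 0
    for kr, kernel_row in enumerate(kernel):
        for kc, weight in enumerate(kernel_row):
            # Multiply each mask value by the source pixel lying beneath it.
            total += weight * image[row + kr - half][col + kc - half]
    return total


def convolve_image(image, kernel):
    """Repeat the per-pixel operation; border pixels are copied unchanged."""
    half = len(kernel) // 2
    rows, cols = len(image), len(image[0])
    # Start from a copy so every sum reads only unmodified input values.
    output = [list(r) for r in image]
    for row in range(half, rows - half):
        for col in range(half, cols - half):
            output[row][col] = convolve_pixel(image, kernel, row, col)
    return output


# Hypothetical sample data for illustration only.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
result = convolve_image(image, box)
print(result[1][1])  # sum of the 3 x 3 block around pixel (1, 1): 54
```

The values stored here are raw sums; no scaling to the storage range has been applied.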
In general, the numerical values utilized in convolution kernels tend to be integers, paired with a divisor that varies with the desired operation. Because kernel values can be negative, many convolution operations produce negative results, and an offset is often added to restore a positive value. The smoothing convolution kernel included in the tutorial has a value of unity for each cell in the matrix, with a divisor value of 9 and an offset of zero. For 8-bit grayscale images, divisors and offsets are often chosen so that all processed values following the convolution fall between 0 and 255. Many of the popular software packages allow users to specify their own convolution kernels in order to fine-tune the type of information that is extracted for a particular application.
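As a rough sketch of how a divisor and an offset bring the raw sum of products back into the 8-bit range, consider the following. The helper name scale_output, the rounding to the nearest integer, the final clamp to 0 through 255, and the sample neighborhood are all assumptions made for illustration; individual software packages may round or clip differently.

```python
def scale_output(sum_of_products, divisor=1, offset=0, low=0, high=255):
    """Divide the raw sum, add the offset, and clamp to the storage range.

    Rounding and clamping are assumed conventions, not taken from the tutorial.
    """
    value = round(sum_of_products / divisor) + offset
    return max(low, min(high, value))


# Smoothing kernel from the text: unity in every cell, divisor 9, offset 0.
smoothing_kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

# Hypothetical 3 x 3 neighborhood of 8-bit gray levels.
neighborhood = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

raw = sum(
    weight * pixel
    for kernel_row, pixel_row in zip(smoothing_kernel, neighborhood)
    for weight, pixel in zip(kernel_row, pixel_row)
)
smoothed = scale_output(raw, divisor=9, offset=0)
print(raw, smoothed)  # 450 50

# A negative raw sum is shifted by an offset and then clamped into range.
print(scale_output(-100, divisor=1, offset=128))  # 28
print(scale_output(-300, divisor=1, offset=128))  # 0
```

With a divisor of 9, the all-ones kernel simply returns the mean of the nine pixels under the mask, which is why it acts as a smoothing filter.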
Convolution kernels are useful for a wide variety of digital image processing operations, including smoothing of noisy images (spatial averaging) and sharpening of images by edge enhancement with Laplacian, sharpening, or gradient kernels. In addition, local contrast can be adjusted through the use of maximum, minimum, or median filters; these operate on the same pixel neighborhood but select a ranked value rather than computing a weighted sum, so they are not convolutions in the strict sense. Images can also be transformed from the spatial to the frequency domain (in effect, performing a Fourier transformation) with convolution kernels. The total number of convolution kernels developed for image processing is enormous, but several filters enjoy widespread application among many of the popular image processing software packages.
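To make the difference between these filter families concrete, the sketch below applies three 3 × 3 kernels and a median filter to a single neighborhood. The kernel values are commonly cited textbook examples, not values taken from the tutorial, and the names KERNELS, apply_kernel, and median_of are hypothetical. The median filter ranks the nine pixels instead of summing weighted products, so it needs no kernel at all.

```python
from statistics import median

# Illustrative (kernel, divisor) pairs; values are assumptions, not from the tutorial.
KERNELS = {
    "smooth": ([[1, 1, 1], [1, 1, 1], [1, 1, 1]], 9),
    "sharpen": ([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], 1),
    "laplacian": ([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], 1),
}


def apply_kernel(neighborhood, kernel, divisor):
    """Weighted sum of the neighborhood under the kernel, divided by the divisor."""
    total = sum(
        weight * pixel
        for kernel_row, pixel_row in zip(kernel, neighborhood)
        for weight, pixel in zip(kernel_row, pixel_row)
    )
    return total / divisor


def median_of(neighborhood):
    """Rank filter: return the middle value of the pixels in the neighborhood."""
    return median(pixel for pixel_row in neighborhood for pixel in pixel_row)


# Hypothetical neighborhoods: a uniform region and one with a single bright pixel.
flat = [[50, 50, 50], [50, 50, 50], [50, 50, 50]]
spike = [[50, 50, 50], [50, 250, 50], [50, 50, 50]]

for name, (kernel, divisor) in KERNELS.items():
    print(name, apply_kernel(flat, kernel, divisor), apply_kernel(spike, kernel, divisor))
print("median", median_of(flat), median_of(spike))
```

On the uniform region the Laplacian kernel returns zero and the sharpening kernel leaves the value unchanged, because their weights sum to 0 and 1 respectively. On the region with the bright pixel, averaging spreads the outlier into the result while the median discards it.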