Saliency map

Patrik Polatsek


A saliency model predicts which parts of an image attract visual attention. The output of such a model is a saliency map: a topographic representation of saliency in which high values mark visually dominant locations.

The aim of the project is to implement Itti’s saliency model. It is a hierarchical, biologically inspired, bottom-up model based on three features: intensity, color and orientation. The resulting saliency map is created by hierarchically decomposing the features and combining them into a single map. Attended locations are then found using a winner-take-all neural network.

The process

First, the features are extracted from an input image.

Intensity is obtained by converting the image to grayscale.

cvtColor( input, intensity, CV_BGR2GRAY );
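To make the conversion concrete, here is a minimal per-pixel sketch in plain C++ without OpenCV. The names `Bgr` and `toIntensity` are hypothetical. The weights are the standard luma coefficients that OpenCV documents for its BGR-to-gray conversion; Itti's paper instead describes intensity as the plain average (r + g + b) / 3, so treat the weighting as an implementation detail.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pixel type: 8-bit channels in OpenCV's B, G, R order.
struct Bgr { std::uint8_t b, g, r; };

// Weighted sum of the three channels, rounded to the nearest 8-bit value.
inline std::uint8_t toIntensity(const Bgr& p) {
    double y = 0.299 * p.r + 0.587 * p.g + 0.114 * p.b;
    assert(y >= 0.0 && y < 255.5);  // weights sum to 1, so y stays in range
    return static_cast<std::uint8_t>(y + 0.5);
}
```

Green contributes most to the result and blue least, so a pure green pixel appears much brighter than a pure blue one.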

For color extraction, the image is converted to a red-green-blue-yellow color space.

R = bgr[2] - ( bgr[1] + bgr[0] ) / 2;
G = bgr[1] - ( bgr[2] + bgr[0] ) / 2;
B = bgr[0] - ( bgr[2] + bgr[1] ) / 2;
Y = ( bgr[2] + bgr[1] ) / 2 - abs( bgr[2] - bgr[1] ) / 2 - bgr[0];
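The four formulas above can be checked on single pixels with a small sketch in plain C++. `Rgby` and `toRgby` are hypothetical names, and the clamping of negative responses to zero is an assumption taken from the description of Itti's model rather than from the project code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical container for the four broadly tuned color channels.
struct Rgby { float r, g, b, y; };

// Computes the four channels for one pixel from its blue, green and red values.
// Negative responses are clamped to zero (assumption, see the text above).
inline Rgby toRgby(float blue, float green, float red) {
    assert(blue >= 0 && green >= 0 && red >= 0);
    Rgby out;
    out.r = std::max(0.0f, red   - (green + blue) / 2);
    out.g = std::max(0.0f, green - (red + blue) / 2);
    out.b = std::max(0.0f, blue  - (red + green) / 2);
    out.y = std::max(0.0f, (red + green) / 2 - std::fabs(red - green) / 2 - blue);
    return out;
}
```

With these definitions a white or gray pixel gives zero in all four channels, while a pure yellow pixel (red and green both at maximum, no blue) gives the maximum yellow response.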

Information about local orientation is extracted using Gabor filters at four angles.

Mat kernel = getGaborKernel( Size(11, 11), 2.5, degreeToRadian(theta), 2.5, 0.5 );
filter2D( input, im, -1, kernel );
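To show what such a kernel contains, the following sketch builds a real-valued Gabor kernel from the textbook formula (a Gaussian envelope multiplied by a cosine carrier) in plain C++. The function `gaborKernel` and its `half` parameter are hypothetical, and the result is not guaranteed to match `getGaborKernel` element for element, for example in row ordering or default phase.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Builds a (2*half+1) x (2*half+1) Gabor kernel, stored row-major.
// sigma: width of the Gaussian envelope, theta: orientation in radians,
// lambda: wavelength of the cosine carrier, gamma: spatial aspect ratio,
// psi: phase offset of the carrier.
std::vector<double> gaborKernel(int half, double sigma, double theta,
                                double lambda, double gamma, double psi) {
    assert(half >= 0 && sigma > 0 && lambda > 0 && gamma > 0);
    const double pi = std::acos(-1.0);
    const int n = 2 * half + 1;
    std::vector<double> k(n * n);
    for (int y = -half; y <= half; ++y) {
        for (int x = -half; x <= half; ++x) {
            // Rotate the coordinates into the filter's own frame.
            double xr =  x * std::cos(theta) + y * std::sin(theta);
            double yr = -x * std::sin(theta) + y * std::cos(theta);
            double envelope = std::exp(-(xr * xr + gamma * gamma * yr * yr)
                                       / (2 * sigma * sigma));
            double carrier = std::cos(2 * pi * xr / lambda + psi);
            k[(y + half) * n + (x + half)] = envelope * carrier;
        }
    }
    return k;
}
```

Calling `gaborKernel(5, 2.5, theta, 2.5, 0.5, 0.0)` gives an 11 x 11 kernel with the same size, sigma, wavelength and aspect ratio as in the snippet above. Itti's paper uses the orientations 0°, 45°, 90° and 135°, which would correspond to one kernel per value of `theta`.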

The next phase builds a Gaussian pyramid for each feature channel.

buildPyramid( channel, pyramid, levels);
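As a rough illustration of what a Gaussian pyramid is, the sketch below blurs an image with a 5 x 5 binomial kernel and keeps every second pixel, repeating this for each level. It is plain C++ with a hypothetical `Image` type and hypothetical function names, and it replicates border pixels, which is an assumption and may differ from OpenCV's border handling.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical minimal grayscale image: row-major floats.
struct Image {
    int rows = 0, cols = 0;
    std::vector<float> data;
    Image(int r, int c, float v = 0.0f) : rows(r), cols(c), data(r * c, v) {}
    float at(int r, int c) const {
        // Replicate the border for out-of-range coordinates.
        r = std::max(0, std::min(r, rows - 1));
        c = std::max(0, std::min(c, cols - 1));
        return data[r * cols + c];
    }
};

// One pyramid step: 5x5 binomial blur, then keep every second pixel.
Image pyrDownSketch(const Image& src) {
    assert(src.rows > 0 && src.cols > 0);
    static const float w[5] = {1 / 16.f, 4 / 16.f, 6 / 16.f, 4 / 16.f, 1 / 16.f};
    Image dst((src.rows + 1) / 2, (src.cols + 1) / 2);
    for (int r = 0; r < dst.rows; ++r)
        for (int c = 0; c < dst.cols; ++c) {
            float sum = 0.0f;
            for (int dr = -2; dr <= 2; ++dr)
                for (int dc = -2; dc <= 2; ++dc)
                    sum += w[dr + 2] * w[dc + 2] * src.at(2 * r + dr, 2 * c + dc);
            dst.data[r * dst.cols + c] = sum;
        }
    return dst;
}

// Returns levels + 1 images: level 0 is the input, each next level is half the size.
std::vector<Image> buildPyramidSketch(const Image& src, int levels) {
    assert(levels >= 0);
    std::vector<Image> pyramid{src};
    for (int i = 0; i < levels; ++i)
        pyramid.push_back(pyrDownSketch(pyramid.back()));
    return pyramid;
}
```

For a 16 x 16 input and three levels, the sketch produces images of size 16, 8, 4 and 2 pixels per side. Higher levels therefore hold progressively coarser versions of the same feature channel.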

The center-surround organization of the receptive fields of ganglion neurons is implemented as a difference of Gaussians between a finer (center) and a coarser (surround) scale of a pyramid. Each resulting difference image is called a feature map.

for (int i : centerScale) {
	pyr_c = pyramid[i];                               // center: finer scale
	for (int j : surroundScale) {
		Mat diff;
		// bring the coarser surround level up to the center's size
		resize(pyramid[i + j], pyr_s, pyr_c.size());
		absdiff(pyr_c, pyr_s, diff);                  // one feature map
	}
}
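The across-scale difference can also be sketched without OpenCV. The code below is a simplified stand-in: `Image`, `resizeNearest`, `centerSurround` and `scalePairs` are hypothetical names, and nearest-neighbor upsampling is an assumption chosen for brevity, not necessarily the interpolation that `resize` uses in the project.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical minimal grayscale image: row-major floats.
struct Image {
    int rows, cols;
    std::vector<float> data;
};

// Nearest-neighbor resize of src to rows x cols (a stand-in for cv::resize).
Image resizeNearest(const Image& src, int rows, int cols) {
    assert(src.rows > 0 && src.cols > 0 && rows > 0 && cols > 0);
    Image dst{rows, cols, std::vector<float>(rows * cols)};
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            dst.data[r * cols + c] =
                src.data[(r * src.rows / rows) * src.cols + (c * src.cols / cols)];
    return dst;
}

// Across-scale difference: |center - upsampled surround|, at the center's size.
Image centerSurround(const Image& center, const Image& surround) {
    Image s = resizeNearest(surround, center.rows, center.cols);
    Image diff{center.rows, center.cols, std::vector<float>(center.data.size())};
    for (std::size_t k = 0; k < diff.data.size(); ++k)
        diff.data[k] = std::fabs(center.data[k] - s.data[k]);
    return diff;
}

// Lists (center, surround) pyramid levels, where surround = center + delta.
std::vector<std::pair<int, int>> scalePairs(const std::vector<int>& centers,
                                            const std::vector<int>& deltas) {
    std::vector<std::pair<int, int>> pairs;
    for (int c : centers)
        for (int d : deltas)
            pairs.emplace_back(c, c + d);
    return pairs;
}
```

Itti's paper uses center levels 2, 3 and 4 with deltas 3 and 4, which `scalePairs({2, 3, 4}, {3, 4})` expands into six center-surround pairs per feature. Whether the project uses exactly these values is not stated here.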

The model creates three conspicuity maps, one each for intensity, color and orientation, by combining the corresponding feature maps.
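One simple way to combine feature maps is sketched below in plain C++: each map is rescaled to the range [0, 1] and the maps are summed point by point. This is a deliberate simplification. Itti's model uses a normalization operator that additionally promotes maps with a few strong peaks, which is not reproduced here, and the sketch assumes all maps were already resized to a common size. `Map`, `rescale` and `combine` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Map = std::vector<float>;  // hypothetical flat map; all maps share one size

// Rescales a map linearly to [0, 1]; a constant map becomes all zeros.
Map rescale(const Map& m) {
    assert(!m.empty());
    auto mm = std::minmax_element(m.begin(), m.end());
    float lo = *mm.first, range = *mm.second - *mm.first;
    Map out(m.size(), 0.0f);
    if (range > 0.0f)
        for (std::size_t i = 0; i < m.size(); ++i)
            out[i] = (m[i] - lo) / range;
    return out;
}

// Sums the rescaled feature maps point by point into one combined map.
Map combine(const std::vector<Map>& featureMaps) {
    assert(!featureMaps.empty());
    Map sum(featureMaps.front().size(), 0.0f);
    for (const Map& fm : featureMaps) {
        assert(fm.size() == sum.size());
        Map r = rescale(fm);
        for (std::size_t i = 0; i < sum.size(); ++i)
            sum[i] += r[i];
    }
    return sum;
}
```

Rescaling first keeps a feature map with a large numeric range from dominating the others purely because of its units.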

The final saliency map is the mean of the three conspicuity maps.

Mat saliencyMap = maps[0] / maps.size() + maps[1] / maps.size() + maps[2] / maps.size();
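Once a saliency map is available, the search for attended locations mentioned in the introduction can be approximated very roughly as follows. This sketch is not a neural network: it replaces the winner-take-all dynamics with a plain maximum search and models inhibition of return by zeroing a square neighborhood around each winner. The function `attend` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns up to `count` attended (row, col) locations from a row-major saliency
// map, most salient first. After each pick, a square neighborhood of the given
// radius is zeroed so that the next pick moves elsewhere.
std::vector<std::pair<int, int>> attend(std::vector<float> saliency, int rows,
                                        int cols, int count, int radius) {
    assert(rows > 0 && cols > 0 && radius >= 0);
    assert(saliency.size() == static_cast<std::size_t>(rows) * cols);
    std::vector<std::pair<int, int>> fixations;
    for (int n = 0; n < count; ++n) {
        // Winner: the location with the highest remaining saliency.
        auto it = std::max_element(saliency.begin(), saliency.end());
        if (*it <= 0.0f) break;  // nothing salient is left
        int idx = static_cast<int>(it - saliency.begin());
        int wr = idx / cols, wc = idx % cols;
        fixations.emplace_back(wr, wc);
        // Inhibition of return: suppress the winner's neighborhood.
        for (int r = std::max(0, wr - radius); r <= std::min(rows - 1, wr + radius); ++r)
            for (int c = std::max(0, wc - radius); c <= std::min(cols - 1, wc + radius); ++c)
                saliency[r * cols + c] = 0.0f;
    }
    return fixations;
}
```

The map is passed by value so the caller's saliency map is left unchanged. A larger `radius` spreads successive fixations further apart.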

Figure: Basic structure of the saliency model