Saliency map – Vision & Graphics Group

Patrik Polatsek

Introduction

A saliency model predicts what attracts attention. The output of such a model is a saliency map: a topographic representation of saliency that marks the visually dominant locations in a scene.

The aim of the project is to implement Itti's saliency model. It is a hierarchical, biologically inspired, bottom-up model based on three features: intensity, color and orientation. The resulting saliency map is created by hierarchically decomposing the features and combining them into a single map. Attended locations are then found using a winner-take-all neural network.

The process

First, the features are extracted from an input image.

Intensity is obtained by converting the image to grayscale.

[c language="c++"]
// CV_BGR2GRAY is the legacy constant name; newer OpenCV releases spell it COLOR_BGR2GRAY.
cvtColor( input, intensity, CV_BGR2GRAY );
[/c]

For color extraction, the image is converted to a red-green-blue-yellow (RGBY) color space.

[c language="c++"]
R = bgr[2] - ( bgr[1] + bgr[0] ) / 2;
G = bgr[1] - ( bgr[2] + bgr[0] ) / 2;
B = bgr[0] - ( bgr[2] + bgr[1] ) / 2;
Y = ( bgr[2] + bgr[1] ) / 2 - abs( bgr[2] - bgr[1] ) / 2 - bgr[0];
[/c]
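The four formulas can be checked on a single pixel without OpenCV. The following is a minimal sketch in plain C++, not the project's code: the `Opponent` struct and the `toRGBY` function are hypothetical names, channel values are assumed to be floats, and negative results are left as they are.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the four color channels of one pixel.
struct Opponent {
    float R, G, B, Y;
};

// Applies the formulas above to one pixel given in OpenCV's B, G, R order.
// Results may be negative; this sketch does not clamp them.
Opponent toRGBY(float b, float g, float r) {
    assert(b >= 0.0f && g >= 0.0f && r >= 0.0f);
    Opponent o;
    o.R = r - (g + b) / 2.0f;
    o.G = g - (r + b) / 2.0f;
    o.B = b - (r + g) / 2.0f;
    o.Y = (r + g) / 2.0f - std::fabs(r - g) / 2.0f - b;
    return o;
}
```

For a pure red pixel this gives R = 255 and Y = 0, while a pixel with full red and green gives Y = 255, so the yellow channel responds only where red and green are both present.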

Information about local orientation is extracted using Gabor filters at four orientations (0°, 45°, 90° and 135° in Itti's model).

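[c language="c++"]
// Arguments: kernel size, sigma, orientation theta (radians), wavelength lambda, aspect ratio gamma.
Mat kernel = getGaborKernel( Size(11, 11), 2.5, degreeToRadian(theta), 2.5, 0.5 );
// A depth of -1 keeps the output at the same depth as the input.
filter2D( input, im, -1, kernel );
[/c]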
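To make the role of each parameter concrete, here is a minimal sketch in plain C++ that samples the standard real-valued Gabor function on a grid and builds one kernel per orientation. It is not a re-implementation of `getGaborKernel`: the function names are hypothetical, `degreeToRadian` is a stand-in for the helper used in the call above, the four angles are assumed, and the phase offset `psi` is set to 0 here, whereas the OpenCV call leaves it at its library default.

```cpp
#include <cassert>
#include <cmath>
#include <initializer_list>
#include <vector>

using Kernel = std::vector<std::vector<double>>;

// Converts degrees to radians (stand-in for the helper used in the post).
double degreeToRadian(double degrees) {
    return degrees * std::acos(-1.0) / 180.0;
}

// Samples a Gabor function on a ksize x ksize grid centred on zero.
// sigma: envelope width, theta: orientation, lambda: wavelength,
// gamma: spatial aspect ratio, psi: phase offset.
Kernel gaborKernel(int ksize, double sigma, double theta,
                   double lambda, double gamma, double psi) {
    assert(ksize > 0 && ksize % 2 == 1);
    const double pi = std::acos(-1.0);
    const int half = ksize / 2;
    Kernel kernel(ksize, std::vector<double>(ksize));
    for (int y = -half; y <= half; ++y) {
        for (int x = -half; x <= half; ++x) {
            // Rotate the coordinates into the filter's orientation.
            const double xr = x * std::cos(theta) + y * std::sin(theta);
            const double yr = -x * std::sin(theta) + y * std::cos(theta);
            const double envelope =
                std::exp(-(xr * xr + gamma * gamma * yr * yr) / (2.0 * sigma * sigma));
            const double carrier = std::cos(2.0 * pi * xr / lambda + psi);
            kernel[y + half][x + half] = envelope * carrier;
        }
    }
    return kernel;
}

// Builds one kernel per orientation; the angle list is an assumption.
std::vector<Kernel> gaborBank() {
    std::vector<Kernel> bank;
    for (double angle : {0.0, 45.0, 90.0, 135.0}) {
        bank.push_back(gaborKernel(11, 2.5, degreeToRadian(angle), 2.5, 0.5, 0.0));
    }
    return bank;
}
```

Each kernel in the bank would then be convolved with the intensity image, giving one orientation channel per angle.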

The next phase builds a Gaussian pyramid for each feature channel.

[c language="c++"]
buildPyramid( channel, pyramid, levels );
[/c]
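What a Gaussian pyramid does can be illustrated on a one-dimensional signal: each level is a blurred copy of the previous one with every second sample dropped. The sketch below is plain C++ under stated assumptions, not OpenCV's implementation: `reduceOnce` and `buildPyramid1D` are hypothetical names, the blur uses the binomial kernel [1 4 6 4 1] / 16, and borders are handled by clamping indices.

```cpp
#include <cassert>
#include <vector>

// Blurs a 1-D signal with [1 4 6 4 1] / 16 and keeps every second sample.
// Out-of-range indices are clamped to the nearest valid sample (an assumption).
std::vector<float> reduceOnce(const std::vector<float>& signal) {
    assert(!signal.empty());
    static const float weights[5] = {1.0f, 4.0f, 6.0f, 4.0f, 1.0f};
    const int n = static_cast<int>(signal.size());
    std::vector<float> reduced;
    for (int center = 0; center < n; center += 2) {
        float sum = 0.0f;
        for (int k = -2; k <= 2; ++k) {
            int idx = center + k;
            if (idx < 0) idx = 0;
            if (idx >= n) idx = n - 1;
            sum += weights[k + 2] * signal[idx];
        }
        reduced.push_back(sum / 16.0f);
    }
    return reduced;
}

// Level 0 is the input; each further level is a reduced copy of the previous one,
// so the result holds levels + 1 signals.
std::vector<std::vector<float>> buildPyramid1D(const std::vector<float>& signal, int levels) {
    assert(levels >= 0);
    std::vector<std::vector<float>> pyramid;
    pyramid.push_back(signal);
    for (int level = 1; level <= levels; ++level) {
        pyramid.push_back(reduceOnce(pyramid.back()));
    }
    return pyramid;
}
```

A 16-sample signal with three reduction steps yields levels of 16, 8, 4 and 2 samples, and a constant signal stays constant at every level because the kernel weights sum to one.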

The center-surround organization of the receptive fields of ganglion neurons is implemented as a difference of Gaussians between finer and coarser scales of a pyramid. Each such difference is called a feature map.

[c language="c++"]
for (int i : centerScale)
{
    // Center: a finer pyramid level.
    pyr_c = pyramid[i];
    for (int j : surroundScale)
    {
        // Surround: the level j steps coarser, resized to the center's size.
        Mat diff;
        resize(pyramid[i + j], pyr_s, pyr_c.size());
        absdiff(pyr_c, pyr_s, diff);
        differencies.push_back(diff);
    }
}
[/c]
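The nested loop visits every combination of a center level and a surround offset. The post does not list the contents of `centerScale` and `surroundScale`, so the sketch below assumes the values from Itti's original model, center levels {2, 3, 4} and offsets {3, 4}, and enumerates the resulting level pairs in plain C++. The function names `scalePairs` and `requiredLevels` are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Lists every (center, surround) pyramid level pair visited by the nested loop,
// where the surround level is the center level plus an offset.
std::vector<std::pair<int, int>> scalePairs(const std::vector<int>& centerScale,
                                            const std::vector<int>& surroundScale) {
    std::vector<std::pair<int, int>> pairs;
    for (int c : centerScale) {
        for (int delta : surroundScale) {
            assert(delta > 0);  // the surround must be coarser than the center
            pairs.emplace_back(c, c + delta);
        }
    }
    return pairs;
}

// Smallest number of pyramid levels (counting level 0) that covers all pairs.
int requiredLevels(const std::vector<std::pair<int, int>>& pairs) {
    int deepest = 0;
    for (const auto& p : pairs) {
        if (p.second > deepest) deepest = p.second;
    }
    return deepest + 1;
}
```

With the assumed values, `scalePairs({2, 3, 4}, {3, 4})` produces six pairs, from (2, 5) to (4, 8), so each channel yields six feature maps and the pyramid needs nine levels (0 through 8).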

The model creates three conspicuity maps, one each for intensity, color and orientation, by combining the corresponding feature maps.

The final saliency map is the mean of the three conspicuity maps.

[c language="c++"]
Mat saliencyMap = maps[0] / maps.size() + maps[1] / maps.size() + maps[2] / maps.size();
[/c]
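The same averaging can be written for any number of maps. This is a minimal sketch in plain C++ in which each map is a flat `std::vector<float>` of equal length. The name `meanMap` is hypothetical, and the sketch applies no normalization to the maps before averaging.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Averages equally sized maps element by element.
std::vector<float> meanMap(const std::vector<std::vector<float>>& maps) {
    assert(!maps.empty());
    std::vector<float> result(maps[0].size(), 0.0f);
    const float count = static_cast<float>(maps.size());
    for (const auto& m : maps) {
        assert(m.size() == result.size());
        for (std::size_t i = 0; i < m.size(); ++i) {
            // Dividing each term first mirrors the expression used in the post.
            result[i] += m[i] / count;
        }
    }
    return result;
}
```

A location that is strong in only one of the three maps ends up with a third of that strength in the result, which is why the relative scaling of the maps matters before they are combined.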