Patrik Polatsek
Introduction
A saliency model predicts which parts of a scene attract visual attention. The output of such a model is a saliency map: a topographic representation of saliency that highlights visually dominant locations.
The aim of the project is to implement Itti’s saliency model. It is a hierarchical, biologically inspired, bottom-up model based on three features: intensity, color and orientation. The resulting saliency map is created by hierarchically decomposing the features and combining them into a single map. Attended locations are then found using a winner-take-all neural network.
The process
First, the features are extracted from an input image.
Intensity is obtained by converting the image to grayscale.
cvtColor( input, intensity, CV_BGR2GRAY );
For color extraction, the image is converted to a red-green-blue-yellow color space.
R = bgr[2] - ( bgr[1] + bgr[0] ) / 2;
G = bgr[1] - ( bgr[2] + bgr[0] ) / 2;
B = bgr[0] - ( bgr[2] + bgr[1] ) / 2;
Y = ( bgr[2] + bgr[1] ) / 2 - abs( bgr[2] - bgr[1] ) / 2 - bgr[0];
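To make the four formulas above concrete, the following sketch evaluates them for a single pixel using only the standard library. The struct `Opponent` and the function `broadlyTuned` are hypothetical names introduced for illustration. Two things are assumptions rather than part of the project code: the arithmetic is done in floating point, which avoids the saturation that 8-bit arithmetic would apply to intermediate sums, and negative responses are clamped to zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Broadly tuned color responses for one pixel (hypothetical helper type).
struct Opponent { float R, G, B, Y; };

// b, g, r are channel values in [0, 255].
// Assumption: negative responses are clamped to zero.
Opponent broadlyTuned(float b, float g, float r) {
    assert(b >= 0 && g >= 0 && r >= 0);
    Opponent o;
    o.R = std::max(0.0f, r - (g + b) / 2);
    o.G = std::max(0.0f, g - (r + b) / 2);
    o.B = std::max(0.0f, b - (r + g) / 2);
    o.Y = std::max(0.0f, (r + g) / 2 - std::abs(r - g) / 2 - b);
    return o;
}
```

For a pure red pixel only `R` responds, for a pure yellow pixel `Y` responds most strongly, and for any gray pixel all four responses are zero.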
Information about local orientation is extracted using Gabor filters at four angles.
Mat kernel = getGaborKernel( Size(11, 11), 2.5, degreeToRadian(theta), 2.5, 0.5 );
filter2D( input, im, -1, kernel );
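As a rough illustration of what each kernel entry represents, the sketch below evaluates the textbook real-valued Gabor function at one offset from the kernel center. The function `gaborValue` is a hypothetical helper. Its parameter names mirror the `getGaborKernel` arguments, but it is not claimed to reproduce the library output entry for entry. The call above does not pass a phase offset, so the library default applies there; here `psi` is an explicit argument.

```cpp
#include <cassert>
#include <cmath>

// Value of a real Gabor function at offset (x, y) from the kernel center.
// Hypothetical helper; parameter names follow the getGaborKernel arguments.
double gaborValue(double x, double y, double sigma, double theta,
                  double lambd, double gamma, double psi) {
    assert(sigma > 0 && lambd > 0 && gamma > 0);
    const double pi = std::acos(-1.0);
    // Rotate the coordinates so that xr runs across the stripes of the filter.
    double xr =  x * std::cos(theta) + y * std::sin(theta);
    double yr = -x * std::sin(theta) + y * std::cos(theta);
    // Gaussian envelope; gamma < 1 elongates it along yr.
    double envelope = std::exp(-(xr * xr + gamma * gamma * yr * yr)
                               / (2 * sigma * sigma));
    // Cosine carrier with wavelength lambd and phase offset psi.
    double carrier = std::cos(2 * pi * xr / lambd + psi);
    return envelope * carrier;
}
```

Rotating `theta` rotates the stripes of the carrier, which is why filtering with several angles yields one orientation response per angle.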
The next phase builds a Gaussian pyramid for each feature channel.
buildPyramid( channel, pyramid, levels);
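The idea behind a Gaussian pyramid can be sketched on a one-dimensional signal: each level is a blurred copy of the previous one with every second sample dropped. The functions `pyrDown1D` and `buildPyramid1D` are hypothetical and use only the standard library. The 5-tap binomial kernel and the replicated border are simplifying assumptions, not a description of the library's exact behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One pyramid step on a 1-D signal: blur with [1 4 6 4 1] / 16,
// then keep every second sample.
std::vector<double> pyrDown1D(const std::vector<double>& in) {
    assert(!in.empty());
    static const double w[5] = {1, 4, 6, 4, 1};
    const int n = static_cast<int>(in.size());
    std::vector<double> out;
    for (int i = 0; i < n; i += 2) {
        double acc = 0.0;
        for (int k = -2; k <= 2; ++k) {
            // Assumption: samples outside the signal repeat the border value.
            int j = std::min(std::max(i + k, 0), n - 1);
            acc += w[k + 2] * in[static_cast<std::size_t>(j)];
        }
        out.push_back(acc / 16.0);
    }
    return out;
}

// Level 0 is the input; each further level halves the resolution.
std::vector<std::vector<double>> buildPyramid1D(const std::vector<double>& signal,
                                                int levels) {
    std::vector<std::vector<double>> pyramid{signal};
    for (int l = 0; l < levels && pyramid.back().size() > 1; ++l)
        pyramid.push_back(pyrDown1D(pyramid.back()));
    return pyramid;
}
```

A 16-sample signal with `levels = 3` yields four levels of 16, 8, 4 and 2 samples, and a constant signal stays constant at every level because the kernel weights sum to one.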
The center-surround organization of the receptive fields of ganglion neurons is implemented as a difference between a finer (center) scale and a coarser (surround) scale of a Gaussian pyramid. Each such difference is called a feature map.
for (int i : centerScale) {
    pyr_c = pyramid[i];
    for (int j : surroundScale) {
        Mat diff;
        // Upsample the coarser surround level to the size of the center level.
        resize(pyramid[i + j], pyr_s, pyr_c.size());
        absdiff(pyr_c, pyr_s, diff);
        differencies.push_back(diff);
    }
}
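The loop above indexes the surround level as `pyramid[i + j]`, so each entry of `surroundScale` acts as an offset from the center level. The sketch below enumerates the resulting level pairs. The function `scalePairs` is hypothetical. The example values, centers {2, 3, 4} and offsets {3, 4}, are those given in the original description of Itti's model; whether this project uses the same values is not stated here.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Enumerate (center, surround) pyramid levels, with surround = center + delta.
// pyramidLevels is the number of levels available in the pyramid.
std::vector<std::pair<int, int>> scalePairs(const std::vector<int>& centerScale,
                                            const std::vector<int>& surroundScale,
                                            int pyramidLevels) {
    std::vector<std::pair<int, int>> pairs;
    for (int c : centerScale)
        for (int delta : surroundScale) {
            assert(c + delta < pyramidLevels);  // the surround level must exist
            pairs.emplace_back(c, c + delta);
        }
    return pairs;
}
```

With those example values and a nine-level pyramid (levels 0 to 8), the function returns six pairs, from (2, 5) to (4, 8), so each feature channel would contribute six feature maps.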
The model combines the feature maps into three conspicuity maps, one each for intensity, color and orientation.
The final saliency map is the mean of the conspicuity maps.
Mat saliencyMap = maps[0] / maps.size() + maps[1] / maps.size() + maps[2] / maps.size();
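The introduction states that attended locations are found with a winner-take-all network. A minimal way to sketch the effect, under stated assumptions, is to repeatedly pick the global maximum of the saliency map and then suppress its neighborhood so the next pick lands elsewhere. The function `attendedLocations`, the `Map` alias, the plain argmax standing in for the network dynamics, and the square suppression window are all assumptions made for illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Map = std::vector<std::vector<double>>;  // row-major saliency map

// Return the first `count` attended locations as (row, col), most salient first.
// After each pick, a square neighborhood of the given radius is set to zero
// so that the next maximum is found elsewhere.
std::vector<std::pair<int, int>> attendedLocations(Map saliency, int count, int radius) {
    assert(!saliency.empty() && !saliency[0].empty());
    const int rows = static_cast<int>(saliency.size());
    const int cols = static_cast<int>(saliency[0].size());
    std::vector<std::pair<int, int>> result;
    for (int n = 0; n < count; ++n) {
        // Winner-take-all step: the global maximum wins.
        int br = 0, bc = 0;
        for (int r = 0; r < rows; ++r)
            for (int c = 0; c < cols; ++c)
                if (saliency[r][c] > saliency[br][bc]) { br = r; bc = c; }
        result.emplace_back(br, bc);
        // Suppress the winner and its neighbors.
        for (int r = br - radius; r <= br + radius; ++r)
            for (int c = bc - radius; c <= bc + radius; ++c)
                if (r >= 0 && r < rows && c >= 0 && c < cols)
                    saliency[r][c] = 0.0;
    }
    return result;
}
```

The map is passed by value, so the caller's saliency map is left unchanged while the local copy is progressively suppressed.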