Posted on 3. June 2024 (updated 21. September 2024) by Igor Janos

Football Playing Field Registration Using Distance Maps

Introduction

In football, the playing field registration problem can be framed as estimating the homography between the playing field plane and the visible area of the image. Existing approaches rely on identifying significant keypoints in the image, such as the corners or intersections of the playing field lines, which are then used for an initial estimate of the homography. Finding the exact location of such keypoints can be very challenging, so the initial homography estimates are often not accurate enough and need to be refined iteratively. We propose distance maps, a technique that uses the distance transform to predict the location of imaginary lines spread evenly over the playing field. Intersections of these lines can be used to find more accurate homography estimates. We also introduce the Calib360 playing field registration dataset, which provides a sufficient amount of training data with accurate ground truth labels.
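To make the last step more concrete, the following is a minimal sketch of how a homography can be recovered from four point correspondences, such as four line intersections whose positions are known both on the field model and in the image. It is not the code from our repository: the function names are hypothetical, the solver is plain Gaussian elimination, and the 105 m by 68 m pitch size used in the example is an assumption for illustration only. A practical pipeline would typically use many more intersections and a robust least-squares fit.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def homography_from_points(src, dst):
    """Estimate a 3x3 homography (with h33 fixed to 1) from 4 point pairs."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), rearranged to be linear in h
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h21 x + h22 y + h23) / (h31 x + h32 y + 1)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def project(h, point):
    """Map a field point to image coordinates using homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return u, v


if __name__ == "__main__":
    # Assumed example: pitch corners in metres and made-up pixel positions.
    field = [(0.0, 0.0), (105.0, 0.0), (105.0, 68.0), (0.0, 68.0)]
    image = [(100.0, 600.0), (1180.0, 620.0), (900.0, 200.0), (300.0, 190.0)]
    H = homography_from_points(field, image)
    print(project(H, (52.5, 34.0)))  # image position of the centre spot
```

With exactly four correspondences the system has a unique solution whenever no three points are collinear, which is why evenly spread line intersections are convenient inputs.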

The source code of our method can be found at – https://github.com/IgorJanos/stuFieldReg-official

The official repository of the Calib360 dataset can be found at – https://github.com/IgorJanos/stuCalib360

The best pre-trained models can be downloaded at – https://vggnas.fiit.stuba.sk/download/janos/fieldreg/fieldreg-experiments.tar.gz

You can find our original paper at – … URL …

Dataset

We propose the Calib360 dataset, which contains 300,000 training images with accurate homography annotations. We constructed it from equirectangular panoramas, in which we annotate the positions of known keypoints of the football playing field model.

Next, from the annotated keypoints we extract the position and orientation of the panoramic camera with respect to the playing field. Once the camera parameters are extracted, we can generate an unlimited number of different views of the playing field by adjusting the viewing angles and the field of view. These generated images are well suited for training neural networks because they all conform to the pinhole camera model and come with accurate annotations.
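The view generation step can be illustrated with a small sketch that maps one pixel of a virtual pinhole view to the corresponding position in an equirectangular panorama. This is a generic gnomonic mapping and not the actual Calib360 generator: the function name, the axis conventions (x right, y down, z forward), and the meaning of yaw and pitch are assumptions chosen for illustration.

```python
import math


def pinhole_pixel_to_panorama(px, py, view_w, view_h, fov_deg,
                              yaw_deg, pitch_deg, pano_w, pano_h):
    """Return panorama coordinates (u, v) seen by pixel (px, py) of a pinhole view.

    Assumed conventions: fov_deg is the horizontal field of view, positive yaw
    turns the camera to the right, positive pitch tilts it upwards.
    """
    # Focal length in pixels derived from the horizontal field of view.
    f = 0.5 * view_w / math.tan(math.radians(fov_deg) / 2.0)

    # Ray through the pixel in camera coordinates (x right, y down, z forward).
    x = px - (view_w - 1) / 2.0
    y = py - (view_h - 1) / 2.0
    z = f

    # Tilt the ray by the pitch angle (rotation about the x axis).
    p = math.radians(pitch_deg)
    y1 = y * math.cos(p) - z * math.sin(p)
    z1 = y * math.sin(p) + z * math.cos(p)

    # Turn the ray by the yaw angle (rotation about the vertical axis).
    t = math.radians(yaw_deg)
    x2 = x * math.cos(t) + z1 * math.sin(t)
    z2 = -x * math.sin(t) + z1 * math.cos(t)

    # Convert the ray direction to longitude and latitude.
    lon = math.atan2(x2, z2)                    # in (-pi, pi]
    lat = math.atan2(-y1, math.hypot(x2, z2))   # positive is up

    # Equirectangular layout: longitude spans the width, latitude the height.
    u = (lon / (2.0 * math.pi) + 0.5) * pano_w
    v = (0.5 - lat / math.pi) * pano_h
    return u, v
```

Evaluating this mapping for every pixel of the output view and sampling the panorama at the resulting coordinates yields an image that follows the pinhole model, with the viewing angles and field of view as free parameters.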

Results

Estimated playing field registration by our method (first column), by the method of Nie et al. (2021, second column), by the method of Chu et al. (2022, third column), and by the method of Theiner et al. (2023, fourth column).

Posted on 25. January 2024 (updated 3. June 2024) by Igor Janos

Improving Radial Lens Distortion Correction Using Multi-task Learning

Introduction

Sports image analysis is one of the domains that has benefited significantly from advances in computer vision. With the growing adoption of machine learning methods, new products and tools have emerged that make sports image analysis both accurate and accessible for professional athletes, amateur and junior athletes, and fans all around the world. Sports images serve as crucial visual records, aiding post-game or live analysis and decision-making. However, they are very often affected by radial distortion, which hinders their accurate interpretation and use.

We propose a deep-learning, regression-based method for rectifying images that contain radial lens distortion. It operates on single, independent images. Our method has been trained on a football domain dataset and has been fine-tuned specifically for images containing small radial distortion, which is more difficult to rectify accurately. Compared to other deep-learning radial distortion correction methods, we introduce the following contributions:

A secondary learning task that learns useful distortion features on random combinations of sample pairs, prevents overfitting, and encourages the feature extractor to learn more general distortion features.
A penalty term that encourages better accuracy on low-distortion images.

The official GitHub repository for our method can be found at – https://github.com/IgorJanos/stuImprovingRadial-official

The best trained model and training configuration can be downloaded here – improving-radial-best.tar.gz

The original paper can be found here – https://doi.org/10.1016/j.patrec.2024.05.008

Results

We performed a quantitative evaluation of our proposed method on the validation subset of the Football360 dataset and compared it with 3 other methods. Our method outperformed all other evaluated methods on all metrics by a significant margin.

[table id=6 /]

We performed a qualitative evaluation of our method on the World Cup 2014 dataset. The World Cup 2014 images were not used in the training of any of the evaluated methods.

[table id=5 /]

Posted on 24. October 2022 (updated 3. June 2024) by Igor Janos

Football360 – Introducing a New Dataset for Camera Calibration in Sports Domain

Introduction

The aim of this dataset is to aid the development and evaluation of computer vision algorithms, primarily the radial distortion correction algorithms, in the sports domain.

This dataset contains 268 panorama images and was created using the PANONO panoramic camera in 3 football arenas in Slovakia. Each arena was covered from numerous locations on all levels of the stands and from broadcast camera platforms. The images capture regular football games, pitch maintenance, low or otherwise challenging lighting conditions, and both day and night situations.

[table id=2 /]

You can download the raw images from the following link – football360-raw.tar.gz (7.7 GB)

There is also an official GitHub repository available at – https://github.com/IgorJanos/stuFootball360

The original paper can be found here – https://www.scitepress.org/PublishedPapers/2023/116812/116812.pdf

Raw Images

Raw images are stored as 16384×8192 JPG files. They are the direct output of the PANONO stitching service.

Exported Sets

There are four exported datasets available for direct download and use. Each dataset contains a collection of images (data) distorted with known distortion coefficients (labels).

[table id=3 /]
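As a rough illustration of what "distorted with known distortion coefficients" means, the sketch below applies a two-coefficient polynomial radial model to normalized image coordinates and inverts it numerically. The exact distortion model and coefficient ranges used to export the sets are documented in the repository. The polynomial form, the function names, and the coefficients `k1` and `k2` here are assumptions for illustration.

```python
def distort_point(x, y, k1, k2=0.0):
    """Apply an assumed polynomial radial model to a normalized point.

    The point (x, y) is measured from the distortion centre. The model scales
    the radius r by (1 + k1 * r^2 + k2 * r^4).
    """
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * scale, y * scale


def undistort_point(xd, yd, k1, k2=0.0, iterations=20):
    """Invert distort_point by fixed-point iteration.

    This converges for the moderate distortions typical of broadcast lenses;
    it is not guaranteed to converge for extreme coefficients.
    """
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / scale, yd / scale
    return x, y


if __name__ == "__main__":
    # Made-up coefficients: a negative k1 gives barrel distortion.
    xd, yd = distort_point(0.5, 0.3, k1=-0.2, k2=0.01)
    print(xd, yd)
    print(undistort_point(xd, yd, k1=-0.2, k2=0.01))
```

Under such a model, a label is just the tuple of coefficients used to warp the image, and a correction method is judged by how well it recovers them or the undistorted pixel positions.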

You can see an example of how to load and use these files in a linked notebook – https://github.com/IgorJanos/stuFootball360/blob/main/notebooks/explore.ipynb

[table id=4 /]

Igor Janos

Posted on 19. October 2020 by Dipl.-Ing. Wanda Benešová, PhD.

Explainable 3D convolutional neural network using GMM encoding

Abstract: The aim of this paper is to propose a novel method to explain, interpret, and support the decision-making process of a deep Convolutional Neural Network (CNN). This is achieved by analyzing neuron activations of a trained 3D-CNN on selected layers via a Gaussian Mixture Model (GMM) … more

Posted on 7. August 2019 (updated 19. October 2020) by Dipl.-Ing. Wanda Benešová, PhD.

MODELLING OF HUMAN VISUAL ATTENTION

Ing. Patrik Polatsek – Dissertation thesis

Degree Course: Applied Informatics
Author: Ing. Patrik Polatsek
Supervisor: doc. Ing. Vanda Benesova, PhD.
Slovak University of Technology Bratislava, FACULTY OF INFORMATICS AND INFORMATION TECHNOLOGIES, May 2019, … more

Posted on 7. May 2019 (updated 15. April 2021) by Dipl.-Ing. Wanda Benešová, PhD.

Segmentation of anatomical organs in medical data – Master Thesis : Bc. Martin Tamajka

Download: Master Thesis- Martin Tamajka: Segmentation of anatomical organs in medical data

Annotation:

2016, May
Medical image segmentation is an important part of medical practice. For radiologists in particular, it simplifies everyday tasks and allows them to use their time more effectively, because in most cases radiologists have only a limited amount of time to spend examining a patient’s data. Computer-aided diagnosis is also a powerful instrument for eliminating possible human error.
In this work, we propose a novel approach to human organ segmentation. We primarily concentrate on segmenting the human brain from MR volumes. Our method is based on oversegmenting the 3D volume into supervoxels using the SLIC algorithm. Individual supervoxels are described by features based on the intensity distribution of the contained voxels and on their position within the brain. Supervoxels are classified by neural networks trained to assign supervoxels to individual tissues. To give our method additional precision, we use information about the shape and inner structure of the organ. Overall, we propose a 6-step segmentation method based on classification.
We compared our results with those of state-of-the-art methods and we can conclude that the results are clearly comparable.
Apart from the global focus of this thesis, our goal is to apply engineering skills and best practices to implement proposed method and necessary tools in such a way that they can be easily extended and maintained in the future.

Posted on 7. May 2019 (updated 19. October 2020) by Lukas Martak

Automatic Music Transcription using WaveNet

Deep generative models such as WaveNet are surprisingly good at various modelling tasks. We exploit the modelling capacity of the WaveNet architecture in a setup that is quite different from the original generative case: feature extraction and pattern recognition for polyphonic music transcription. The model is trained end-to-end to perform the underlying task of multiple fundamental frequency estimation by processing raw waveforms of digital audio signals. … more

Posted on 26. February 2019 (updated 20. October 2020) by Patrik Polatsek

Exploring Visual Saliency of Real Objects at Different Depths

Depth cues are important aspects that influence the visual saliency of objects around us. However, the depth aspect and its quantified impact on visual saliency have not yet been thoroughly examined in real environments. We designed and carried out an experimental study to examine the influence of depth cues on the visual saliency of objects in a scene. The study took place with 28 participants under laboratory conditions, with the objects arranged in various depth configurations in a real scene. Visual attention data were measured by wearable eye-tracking glasses. … more

Posted on 28. August 2018 (updated 20. October 2020) by Patrik Polatsek

Color Saliency

Color is a fundamental component of visual attention. Saliency is usually associated with color contrasts. Besides this bottom-up perspective, some recent works indicate that psychological aspects should be considered too. However, relatively little research has been done on the potential impact of color psychology on attention. To the best of our knowledge, a publicly available fixation dataset specialized in color features does not exist. We therefore conducted a novel eye-tracking experiment with color stimuli. We studied fixations of 15 participants to find out whether color differences can reliably model color saliency or whether particular colors are preferentially fixated regardless of scene content, i.e. a color prior. … more

Posted on 25. August 2018 (updated 7. July 2019) by Patrik Polatsek

Effects of individual’s emotions on saliency and visual search

Patrik Polatsek, Miroslav Laco, Šimon Dekrét, Wanda Benesova, Martina Baránková, Bronislava Strnádelová, Jana Koróniová, Mária Gablíková

Abstract.

While psychological studies have confirmed a connection between emotional stimuli and visual attention, there is a lack of evidence of how much influence an individual’s mood has on the visual processing of emotionally neutral stimuli. In contrast to prior studies, we explored whether bottom-up low-level saliency could be affected by positive mood. We therefore induced positive or neutral emotions in 10 subjects using autobiographical memories during free viewing, memorizing the image content, and three visual search tasks. We explored differences in human gaze behavior between both emotions and related their fixations to bottom-up saliency predicted by a traditional computational model. We observed that positive emotions produce a stronger saliency effect only during free exploration of valence-neutral stimuli. However, the opposite effect was observed during task-based analysis. We also found that tasks could be solved less efficiently when experiencing a positive mood, and we therefore suggest that it rather distracts users from a task.

download: saliency-emotions

Please cite this paper if you use the dataset:

Polatsek, P., Laco, M., Dekrét, Š., Benesova, W., Baránková, M., Strnádelová, B., Koróniová, J., & Gablíková, M. (2019)

Effects of individual’s emotions on saliency and visual search

Posted on 30. July 2018 (updated 25. March 2019) by Patrik Polatsek

Computational Models of Shape Saliency

Patrik Polatsek, Marek Jakab, Wanda Benesova, Matej Kužma

Abstract. Computational models predicting stimulus-driven human visual attention usually incorporate simple visual features, such as intensity, color and orientation. However, the saliency of shapes and their contour segments influences attention too. Therefore, we built 30 shape saliency models of our own based on existing shape representation and matching techniques and compared them with 5 existing saliency methods. Since available fixation datasets were usually recorded on natural scenes where various factors of attention are present, we performed a novel eye-tracking experiment that primarily focuses on shape and contour saliency. Fixations from 47 participants who looked at silhouettes of abstract and real-world objects were used to evaluate the accuracy of the proposed saliency models and to investigate which shape properties attract the most attention. The results showed that visual attention integrates local contour saliency, saliency of global shape features and shape dissimilarities. Fixation data also showed that intensity and orientation contrasts play an important role in shape perception. We found that humans tend to fixate first on irregular geometrical shapes and on objects whose similarity to a circle differs from that of other objects.

shapeSal dataset contains an extended version of this eye-tracking experiment including images and fixation data (73 participants, 158 scenes).

download: shapeSal.zip [V2.0; update: 25.3.2019]

Please cite this paper if you use the dataset:

Polatsek, P., Jakab, M., Benesova, W., & Kužma, M. (2019)
Computational Models of Shape Saliency
11th International Conference on Machine Vision (ICMV 2018) (Vol. 11041)
International Society for Optics and Photonics

https://doi.org/10.1117/12.2522779

Posted on 1. February 2018 (updated 24. October 2022) by Patrik Polatsek

Exploring Visual Attention and Saliency Modeling for Task-Based Visual Analysis

Patrik Polatsek, Manuela Waldner, Ivan Viola, Peter Kapec, Wanda Benesova

Abstract. Memory, visual attention, and perception play a critical role in the design of visualizations. The way users observe a visualization is affected by salient stimuli in a scene as well as by domain knowledge, interest, and the task. While recent saliency models manage to predict the users’ visual attention in visualizations during exploratory analysis, there is little evidence of how much influence bottom-up saliency has on task-based visual analysis. Therefore, we performed an eye-tracking study with 47 users to determine the users’ path of attention when solving three low-level analytical tasks using 30 different charts from the MASSVIS database. We also compared our task-based eye-tracking data to the data from the original memorability experiment by Borkin et al. We found that solving a task leads to more consistent viewing patterns compared to exploratory visual analysis. However, bottom-up saliency of visualization has negligible influence on users’ fixations and task efficiency when performing a low-level analytical task. Also, the efficiency of visual search for an extreme target data point is barely influenced by the target’s bottom-up saliency. Therefore, we conclude that bottom-up saliency models tailored toward information visualization are not suitable for predicting visual attention when performing task-based visual analysis. We discuss potential reasons and suggest extensions to visual attention models to better account for task-based visual analysis.

TASKVIS dataset contains eye-tracking data from this task-based visual analysis experiment.

download: taskvis.zip

Please cite this paper if you use the dataset:

Polatsek, P., Waldner, M., Viola, I., Kapec, P., & Benesova, W. (2018)
Exploring Visual Attention and Saliency Modeling for Task-Based Visual Analysis
Computers & Graphics, 72, 26-38

https://doi.org/10.1016/j.cag.2018.01.010