Modeling human attention by learning from large amount of emotional images

Macario O. Cordel II
August, 2021


The recent resurgence of neural networks in computer vision has resulted in tremendous improvements in saliency prediction, eventually saturating some saliency metrics. This has led researchers to turn to higher-level concepts in images in order to match the key image regions attended to by human observers. In this paper, we propose a saliency model that uses a top-down attention mechanism by incorporating emotion-inducing region information into the predictor’s feature space. The proposed framework is inspired by psychological and neurological studies showing that emotion attracts attention. Using three publicly available datasets of emotion-rich images, we show that awareness of emotion-inducing regions improves saliency prediction. Saliency metrics for probabilistic models, particularly information gain and KL divergence, improve relative to the same architecture without emotion information. Statistical tests show that emotional regions generally improve more than neutral regions, corroborating psychological studies showing that emotion attracts attention.

Keywords: deep learning, attention, emotion stimuli map, visual saliency model
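
For readers unfamiliar with the two probabilistic metrics named in the abstract, the following is a minimal sketch of how KL divergence and information gain are commonly computed for saliency maps. It assumes maps are given as 2-D lists of non-negative values and that information gain is measured in bits at fixated pixels against a baseline map. The function names, the `eps` smoothing constant, and the toy maps are illustrative assumptions, not the paper's implementation.

```python
import math


def normalize(saliency_map, eps=1e-12):
    """Flatten a 2-D map (list of rows) into a distribution that sums to 1.

    eps is an assumed smoothing constant that avoids log(0).
    """
    flat = [max(v, 0.0) + eps for row in saliency_map for v in row]
    total = sum(flat)
    return [v / total for v in flat]


def kl_divergence(fixation_density, predicted, eps=1e-12):
    """KL(P || Q) in nats, with P the human fixation density and Q the prediction.

    Lower is better; 0 means the two distributions are identical.
    """
    p = normalize(fixation_density, eps)
    q = normalize(predicted, eps)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def information_gain(predicted, baseline, fixations, eps=1e-12):
    """Mean log2-likelihood gain of the prediction over a baseline at fixated pixels.

    Higher is better; positive values mean the model explains the
    fixations better than the baseline does.
    """
    q = normalize(predicted, eps)
    b = normalize(baseline, eps)
    fixated = [bool(v) for row in fixations for v in row]
    gains = [math.log2(qi / bi) for qi, bi, f in zip(q, b, fixated) if f]
    if not gains:
        raise ValueError("fixation map contains no fixated pixels")
    return sum(gains) / len(gains)


# Toy 2x2 example (hypothetical values, for illustration only).
human = [[0.0, 1.0], [0.0, 0.0]]      # fixation density
fixations = [[0, 1], [0, 0]]          # binary fixation map
good = [[0.1, 0.7], [0.1, 0.1]]       # prediction that favors the fixated pixel
uniform = [[1.0, 1.0], [1.0, 1.0]]    # uninformative baseline

print(kl_divergence(human, good), kl_divergence(human, uniform))
print(information_gain(good, uniform, fixations))
```

In this toy case the prediction that concentrates mass on the fixated pixel has a lower KL divergence than the uniform map and a positive information gain over it, which is the direction of improvement the abstract reports.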



author={Cordel, Macario O.},
booktitle={2019 IEEE International Conference on Big Data (Big Data)},
title={Modeling human attention by learning from large amount of emotional images},