Author: Jia He, Xiang Fu, Shangwen Li, Chang-Su Kim, and C.-C. Jay Kuo

Research Problem

The visual attention of an image, also known as its saliency, refers to the regions and contents that attract human attention, such as regions of high contrast, bright luminance, vivid color, or clear scene structure, as well as semantic objects that humans expect to see. Our research aims to learn the visual attention of an image database and then to develop image characterization and classification algorithms based on the learned visual attention features. These algorithms can be applied to image compression, retargeting, annotation, segmentation, image retrieval, and other tasks.
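
As a concrete illustration of the bottom-up cues mentioned above, the sketch below computes a simple contrast-based saliency map in the spirit of the frequency-tuned method of [3]: each pixel's saliency is its Lab-color distance from the global mean color of a slightly blurred image. This is only an illustrative baseline, not our proposed model; the blur kernel size and the use of OpenCV are implementation choices of this sketch.

```python
import cv2
import numpy as np

def frequency_tuned_saliency(bgr_image):
    """Contrast-based bottom-up saliency in the spirit of [3]:
    per-pixel Lab distance from the global mean color of a blurred image."""
    lab = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2LAB).astype(np.float32)
    blurred = cv2.GaussianBlur(lab, (5, 5), 0)   # suppress fine texture and noise
    mean_lab = lab.reshape(-1, 3).mean(axis=0)   # global mean Lab color vector
    saliency = np.linalg.norm(blurred - mean_lab, axis=2)
    return cv2.normalize(saliency, None, 0.0, 1.0, cv2.NORM_MINMAX)

# Example usage: saliency_map = frequency_tuned_saliency(cv2.imread("example.jpg"))
```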

Main Ideas

Image saliency has been widely studied in recent years. However, most work focuses on extracting a saliency map of the image with a bottom-up computational framework [1-5]. Such saliency does not always match human visual attention exactly, since people tend to be attracted by objects of their particular interest. To bridge this gap, visual attention learning should combine bottom-up and top-down frameworks. To achieve this goal, we are building a hierarchical human perception tree and learning image visual attention together with detailed image characteristics, including each salient region's appearance, semantics, attention priority, and intensity. Image classification is then based on the content of the salient regions and their saliency intensities, as sketched below. Our system will capture not only the locations of visual attention regions in an image but also estimate their priorities and intensities.
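
A minimal sketch of this fusion idea follows. It assumes a bottom-up saliency map and a top-down semantic prior (for example, the output of a learned object or interest model) are already available as arrays with values in [0, 1]; the fusion weight alpha, the threshold, and the priority heuristic are placeholders for the quantities that the hierarchical perception tree would actually learn.

```python
import numpy as np
from scipy import ndimage

def rank_attention_regions(bottom_up, top_down, alpha=0.5, threshold=0.5):
    """Fuse a bottom-up saliency map with a top-down semantic prior
    (both HxW arrays in [0, 1]) and rank the resulting attention regions."""
    fused = alpha * bottom_up + (1.0 - alpha) * top_down
    labels, n_regions = ndimage.label(fused > threshold)  # connected salient regions
    regions = []
    for k in range(1, n_regions + 1):
        mask = labels == k
        regions.append({
            "mask": mask,
            "intensity": float(fused[mask].mean()),  # how strongly the region attracts attention
            "area": int(mask.sum()),
        })
    # heuristic priority: stronger and larger regions rank first
    regions.sort(key=lambda r: r["intensity"] * r["area"], reverse=True)
    for rank, region in enumerate(regions, start=1):
        region["priority"] = rank
    return fused, regions
```

In the full system, the semantic labels and intensities of the top-ranked regions would serve as features for saliency-based image classification.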

Future Challenges

Building a hierarchical human perceptual tree for visual attention learning is challenging because of its complexity, and little prior work exists on this kind of modeling. We aim to model the perceptual tree as closely as possible to the real human perceptual system.

References

  • [1] Judd, Tilke, et al. "Learning to predict where humans look." IEEE 12th International Conference on Computer Vision (ICCV), 2009.
  • [2] Borji, Ali, and Laurent Itti. "State-of-the-art in visual attention modeling." IEEE Transactions on Pattern Analysis and Machine Intelligence 35.1 (2013): 185-207.
  • [3] Achanta, Radhakrishna, et al. "Frequency-tuned salient region detection." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
  • [4] Goferman, Stas, Lihi Zelnik-Manor, and Ayellet Tal. "Context-aware saliency detection." IEEE Transactions on Pattern Analysis and Machine Intelligence 34.10 (2012): 1915-1926.
  • [5] Cheng, Ming-Ming, et al. "Global contrast based salient region detection." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.
  • [6] He, Jia, Chang-Su Kim, and C.-C. Jay Kuo. "Interactive Image Segmentation Techniques: Algorithms and Performance Evaluation." Springer Singapore, 2014, pp. 17-62.