ASALTAG: Automatic Image Annotation Through Salient Object Detection and Improved k-Nearest Neighbor Feature Matching
Abstract
Image databases are becoming very large, and there is an increasing need for automatic image annotation to assist in finding a desired image. In this paper, we present a new approach to automatic image annotation, named ASALTAG, that uses salient object detection and an improved k-Nearest Neighbor classifier. ASALTAG consists of three major parts: segmentation using Minimum Barrier Salient Region Segmentation, feature extraction using the Block Truncation Algorithm, the Gray Level Co-occurrence Matrix, and Hu's moments, and classification using an improved k-Nearest Neighbor classifier. As a result, we obtain a maximum accuracy of 79.56% with k = 5, better than earlier research. This is because the salient object detection performed before feature extraction yields a more focused object in the image to annotate. Normalization of the feature vector and the distance measure used in ASALTAG also improve the accuracy of the kNN classifier for labeling images.
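The abstract does not include the implementation details of the classification stage. As a rough, hedged sketch of the general idea (min-max normalization of feature vectors followed by a k-NN majority vote over Euclidean distances), the logic might look like the following; all function names and the toy data are placeholders, not the authors' code:

```python
import math

def min_max_normalize(features):
    """Scale each feature dimension to [0, 1] so no single
    feature dominates the distance measure."""
    mins = [min(col) for col in zip(*features)]
    maxs = [max(col) for col in zip(*features)]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, mins, maxs)]
        for row in features
    ]

def knn_annotate(train_feats, train_labels, query, k=5):
    """Return the majority label among the k training vectors
    closest to the query (Euclidean distance)."""
    neighbors = sorted(
        (math.dist(query, feat), label)
        for feat, label in zip(train_feats, train_labels)
    )[:k]
    votes = {}
    for _, label in neighbors:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Toy example: two well-separated clusters of 2-D feature vectors.
train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]
norm_train = min_max_normalize(train)
```

In the actual system the feature vectors would be the concatenated Block Truncation, GLCM, and Hu-moment descriptors rather than toy 2-D points, and the paper's "improved" kNN and its distance measure may differ from this plain Euclidean vote.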