Classification of Mobile Application Reviews using Word Embedding and Convolutional Neural Network


I Made Mika Parwita, Daniel Siahaan


App reviews are useful to app developers because they contain valuable information such as bug reports, feature requests, user experience, and ratings. This information can be used to better understand user needs and application defects during the software maintenance and evolution phase. However, the growing number of reviews makes analysis difficult for developers. Reviews in textual form are hard to interpret because the semantics between sentences must be taken into account, and manual analysis is time-consuming, labor-intensive, and costly. Previous research shows that review collections contain non-informative reviews, that is, reviews that carry no valuable information. Such reviews are considered noise and should be eliminated, especially before classification. Moreover, semantic relations between sentences have not been considered in review classification. The purpose of this research is to automatically classify user reviews into three classes: bug report, feature request, and non-informative. User reviews are converted into vectors using word embedding to handle the semantic problem. The vectors are the input to a first classifier that separates informative from non-informative reviews. The informative reviews produced by the first classifier are then passed to a second classifier that determines their category, i.e. bug report or feature request. The experiment uses 306,849 review sentences crawled from Google Play and F-Droid. The results show that the proposed model is able to classify mobile application reviews, achieving a best accuracy of 0.79, precision of 0.77, recall of 0.87, and F-measure of 0.81.
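The two-stage flow described in the abstract can be sketched in a few lines of Python. This is only an illustration of the cascade, not the authors' implementation: the tiny embedding table, the averaging of word vectors, and the two rule-based stand-in classifiers (`toy_is_informative`, `toy_categorize`) are all assumptions made for the example, whereas the paper trains convolutional neural networks on real word embeddings.

```python
from typing import Callable, Dict, List

Vector = List[float]

# Hypothetical 2-dimensional embeddings, invented for this illustration.
# Axis 0 loosely stands for "defect" words, axis 1 for "request" words.
TOY_EMBEDDINGS: Dict[str, Vector] = {
    "crash": [1.0, 0.0],
    "crashes": [1.0, 0.0],
    "bug": [1.0, 0.0],
    "add": [0.0, 1.0],
    "please": [0.0, 1.0],
    "feature": [0.0, 1.0],
}


def embed(sentence: str, table: Dict[str, Vector] = TOY_EMBEDDINGS,
          dim: int = 2) -> Vector:
    """Average the vectors of known words; unknown words are skipped.

    Averaging is a simplifying assumption; a CNN would normally consume
    the full sequence of word vectors instead.
    """
    vectors = [table[w] for w in sentence.lower().split() if w in table]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def classify_review(sentence: str,
                    is_informative: Callable[[Vector], bool],
                    categorize: Callable[[Vector], str]) -> str:
    """Stage 1 filters out non-informative reviews; stage 2 labels the rest."""
    vector = embed(sentence)
    if not is_informative(vector):
        return "non-informative"
    return categorize(vector)


# Stand-ins for the two trained classifiers (assumptions, not the paper's CNNs).
def toy_is_informative(vector: Vector) -> bool:
    return any(abs(x) > 0.0 for x in vector)


def toy_categorize(vector: Vector) -> str:
    return "bug" if vector[0] >= vector[1] else "feature request"


if __name__ == "__main__":
    for review in ["App crashes on start", "please add dark mode", "love it"]:
        print(review, "->",
              classify_review(review, toy_is_informative, toy_categorize))
```

Keeping the two classifiers as separate callables mirrors the design in the abstract: the second classifier only ever sees reviews that the first one judged informative, so noise is removed before the bug report versus feature request decision is made.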




How to Cite
MIKA PARWITA, I Made; SIAHAAN, Daniel. Classification of Mobile Application Reviews using Word Embedding and Convolutional Neural Network. Lontar Komputer : Jurnal Ilmiah Teknologi Informasi, [S.l.], p. 1-8, May 2019. ISSN 2541-5832.

