Text Based Approach For Similar Traffic Incident Detection from Twitter
Abstract
Microblogs have been used as an information source for detecting real-world events. Several related studies have retrieved road traffic events based on textual content. Beyond detecting traffic incidents, we found that it is also necessary to recognize statuses that report similar traffic incident content. A better representation of traffic information helps the related parties handle traffic incidents. This study proposes a text-based approach for identifying similar traffic incidents from Twitter posts. The proposed approach extracts traffic incident information and weights that information according to its textual similarity to the traffic incident information already gathered. We evaluate the proposed method using a traffic incident information retrieval system and an Indonesian-language corpus of traffic incident tweets. The best average F-measure, 70%, was achieved by the retrieval system when tested with the Jaccard coefficient. This suggests that text matching with a measure such as the Jaccard coefficient is more suitable for very short text documents, such as extracted tweet documents. The experimental results indicate that the proposed approach can be used to identify similar traffic incident information from Twitter.
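The Jaccard-based matching mentioned in the abstract can be illustrated with a short sketch. This is a minimal illustration and not the authors' implementation: the whitespace tokenizer stands in for the paper's information extraction step, the sample reports are invented, and the function names and the 0.4 threshold are assumptions chosen only for this example.

```python
def tokenize(text):
    """Lowercase and split on whitespace.

    Assumption: a stand-in for the paper's traffic information
    extraction step, which is not specified here.
    """
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two token sets."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


def similar_reports(query, candidates, threshold=0.4):
    """Return the candidates whose Jaccard score against the query
    meets the (hypothetical) threshold."""
    q = tokenize(query)
    return [c for c in candidates if jaccard(q, tokenize(c)) >= threshold]


# Invented sample reports, for illustration only.
reports = [
    "kecelakaan truk jalan sudirman arah blok m",
    "kecelakaan truk di jalan sudirman",
    "banjir jalan thamrin arah monas",
]

matches = similar_reports(reports[0], reports[1:])
print(matches)
```

Because the coefficient compares token sets rather than term frequencies, it does not depend on how often a word repeats, which is one plausible reason it behaves well on very short documents where most terms occur only once.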
The Authors submitting a manuscript do so on the understanding that if accepted for publication, the copyright of the article shall be assigned to Jurnal Lontar Komputer as the publisher of the journal. Copyright encompasses exclusive rights to reproduce and deliver the article in all forms and media, as well as translations. The reproduction of any part of this journal (printed or online) will be allowed only with written permission from Jurnal Lontar Komputer. The Editorial Board of Jurnal Lontar Komputer makes every effort to ensure that no wrong or misleading data, opinions, or statements be published in the journal.
This work is licensed under a Creative Commons Attribution 4.0 International License.