KEBI 1.0: Indonesian Spelling Error Detection System for Scientific Papers using Dictionary Lookup and Peter Norvig Spelling Corrector
Abstract
Indonesian spelling errors are common in published research papers written by academics across institutions such as research institutes, government agencies, schools, and universities. These errors typically involve punctuation, letter usage, word writing, words borrowed from foreign or regional languages (loanwords), affixed words, and ineffective sentences. Because academics usually guide students in writing undergraduate theses, theses, dissertations, and other documents and scientific papers, their mistakes are passed on and become a cycle in the academic environment. Therefore, this research proposes an application that helps authors of scientific papers produce quality scientific works based on the General Guidelines for Indonesian Spelling published by the Agency for Development and Language Development. The application, named KEBI 1.0 Checker (Indonesian Spelling Error 1.0 Checker), is a web-based application with a built-in algorithm to detect and correct Indonesian spelling in scientific papers. The experimental results show that the application's best accuracy in correcting non-standard words and typographical errors reached 100% and 55.52%, respectively. The application also detected 209 meaningless words. The processing time is relatively low: the average time needed to correct non-standard words is 0.016 seconds, and for typo words it is 14.58 seconds. KEBI 1.0 Checker is helpful for end users in academia, but it needs a larger corpus vocabulary covering various fields of science to improve typo correction.
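The two techniques named in the title can be illustrated with a minimal Python sketch. It is not the KEBI 1.0 implementation: the word lists, frequencies, and function names (`NONSTANDARD`, `WORD_FREQ`, `check_word`) are hypothetical, and only candidates at edit distance 1 are considered, in the style of Peter Norvig's spelling corrector. A word is first looked up in a non-standard-to-standard dictionary; if it is neither listed there nor known, the most frequent known word one edit away is suggested. Treating the remaining words as "unknown" is an assumption that may roughly correspond to the meaningless words mentioned in the abstract.

```python
from collections import Counter
import string

# Hypothetical non-standard -> standard pairs (dictionary lookup step).
NONSTANDARD = {"analisa": "analisis", "resiko": "risiko", "praktek": "praktik"}

# Hypothetical toy frequencies standing in for a large Indonesian corpus.
WORD_FREQ = Counter({"data": 60, "penelitian": 50, "analisis": 40,
                     "sistem": 30, "risiko": 20, "praktik": 15})

LETTERS = string.ascii_lowercase


def edits1(word):
    """Return all strings one edit away (delete, transpose, replace, insert)."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in LETTERS]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def check_word(word):
    """Classify a word and suggest a correction where one is available."""
    w = word.lower()
    if w in NONSTANDARD:                      # dictionary lookup
        return ("non-standard", NONSTANDARD[w])
    if w in WORD_FREQ:                        # already a known word
        return ("ok", w)
    candidates = {c for c in edits1(w) if c in WORD_FREQ}
    if candidates:                            # Norvig-style: most frequent candidate
        return ("typo", max(candidates, key=WORD_FREQ.get))
    return ("unknown", None)                  # no suggestion found


for token in ["resiko", "penelitan", "data", "xyzzy"]:
    print(token, check_word(token))
```

In this sketch the dictionary lookup is a single hash access, while the typo path generates and filters many candidate strings per word, which is consistent with typo correction being the slower of the two steps.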
Authors submitting a manuscript do so on the understanding that, if it is accepted for publication, the copyright of the article shall be assigned to Jurnal Lontar Komputer as the publisher of the journal. Copyright encompasses exclusive rights to reproduce and deliver the article in all forms and media, as well as translations. The reproduction of any part of this journal (printed or online) will be allowed only with written permission from Jurnal Lontar Komputer. The Editorial Board of Jurnal Lontar Komputer makes every effort to ensure that no wrong or misleading data, opinions, or statements are published in the journal.
This work is licensed under a Creative Commons Attribution 4.0 International License.