Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Principal Component Analysis (PCA) for Anomaly Detection

  • Hanna Arini Parhusip Master of Data Science Department, FSM/UKSW
  • Suryasatriya Trihandaru Master of Data Science FSM UKSW
  • Bambang Susanto Master of Data Science FSM UKSW
  • Adrianus Herry Heriadi PT Artha Puncak Semesta Indonesia (APSI)
  • Petrus Priyo Santosa
  • Yohanes Sardjono BRIN Jogjakarta
  • Johanes Dian Kurniawan Master of Data Science FSM UKSW

Abstract

This research addresses a critical issue in industrial environments: air quality, specifically the concentrations of PM 1.0 and PM 2.5 particulate matter. High concentrations of these particles pose significant health risks. The study uses AIOT-Particle devices to measure six features (temperature, humidity, pressure, altitude, PM 1.0, and PM 2.5) and analyzes them with Principal Component Analysis (PCA). Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is then used to detect anomalies during the observation period. Judged by the PM 1.0 and PM 2.5 values, anomalies occur when the altitude is between 65 and 70 units. The positions of the anomalies are illustrated in terms of altitude, temperature, pressure, and concentration. The results show that altitude is the most dominant feature, and that altitude, PM 1.0, and PM 2.5 together are the dominant features. The study therefore confirms the effectiveness of PCA and recommends these three features as inputs to DBSCAN for anomaly detection. Overall, the research highlights the novelty and success of AIOT-Particle devices in industrial environments.
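
The PCA-then-DBSCAN workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no data, parameter values, or code, so the synthetic readings, the choice of two principal components, the values of eps and min_samples, and the helper names pca_scores and dbscan are all assumptions made for illustration. The DBSCAN routine is a small hand-written version; points labeled -1 are noise and are treated here as anomalies.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Standardize each column, then project onto the leading principal axes."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # SVD of the standardized data: rows of Vt are the principal axes.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)  # share of variance per component
    return Z @ Vt[:n_components].T, Vt[:n_components], explained[:n_components]

def dbscan(P, eps, min_samples):
    """Minimal DBSCAN. Returns one label per point; -1 marks noise."""
    n = len(P)
    dist = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)
    neighbors = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    # A core point has at least min_samples points (itself included) within eps.
    core = np.array([len(nb) >= min_samples for nb in neighbors])
    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not core[i]:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            j = stack.pop()
            if not core[j]:
                continue  # border point: joins the cluster but does not expand it
            for k in neighbors[j]:
                if labels[k] == -1:
                    labels[k] = cluster
                    stack.append(k)
        cluster += 1
    return labels

# Synthetic readings (assumption, not the study's data). Columns:
# temperature, humidity, pressure, altitude, PM 1.0, PM 2.5.
rng = np.random.default_rng(0)
n_normal = 200
means = np.array([30.0, 70.0, 1005.0, 68.0, 20.0, 30.0])
sds = np.array([1.0, 3.0, 0.5, 1.0, 2.0, 3.0])
normal = means + sds * rng.standard_normal((n_normal, 6))

# Three hypothetical readings with strongly elevated PM values.
anomalies = np.tile(means, (3, 1))
anomalies[:, 4] += [30.0, 34.0, 38.0]  # PM 1.0 spikes
anomalies[:, 5] += [45.0, 51.0, 57.0]  # PM 2.5 spikes
X = np.vstack([normal, anomalies])

scores, axes, explained = pca_scores(X, n_components=2)
labels = dbscan(scores, eps=0.8, min_samples=5)  # eps, min_samples are assumed values

print("explained variance ratio:", explained)
print("loadings of the first component:", axes[0])
print("rows flagged as anomalies:", np.flatnonzero(labels == -1))
```

The loadings printed for the first component indicate which measured features contribute most to it, which is the kind of evidence a feature-dominance claim would rest on. In practice eps and min_samples need tuning to the data, for example with a k-distance plot.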

Published
2024-07-04
How to Cite
PARHUSIP, Hanna Arini et al. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Principal Component Analysis (PCA) for Anomaly Detection. Lontar Komputer : Jurnal Ilmiah Teknologi Informasi, [S.l.], v. 15, n. 02, p. 75-86, july 2024. ISSN 2541-5832. Available at: <https://ojs.unud.ac.id/index.php/lontar/article/view/109995>. Date accessed: 08 feb. 2025. doi: https://doi.org/10.24843/LKJITI.2024.v15.i02.p01.