Comparison of Decision Tree, Support Vector Machine, and K-Nearest Neighbors Models for Predicting Drinking Water Quality

Thomas Brian, Alief Nur Aisyi Maulidhia, Evi Nafiatus Sholikhah, Sekarsari Wibowo

Abstract

Demand for drinking water is increasing, so reliable methods are needed to determine water potability. This study implements three machine learning models, Decision Tree, Support Vector Machine, and K-Nearest Neighbors, to determine which best classifies drinking water quality on the Kaggle Water Quality dataset. The dataset consists of 3,276 records with nine parameters (ph, Hardness, Solids, Chloramines, Sulfate, Conductivity, Organic_carbon, Trihalomethanes, and Turbidity) and one target attribute, Potability, which indicates whether the water is fit for consumption. In trials using 20% and 30% of the data for testing, the two splits produced nearly the same values for the confusion-matrix evaluation metrics (Accuracy, F1 Score, Precision, and Recall). The Decision Tree model achieved the best Accuracy of the three models: 70.50% with 20% testing data and 70.98% with 30% testing data. However, Support Vector Machine was chosen as the final classification model because it scored highest on three of the four criteria tested, with F1 Score, Precision, and Recall values of 82.40% each.
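The four evaluation metrics named in the abstract are all computed from the counts in a binary confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' code. It assumes that potable water (Potability = 1) is treated as the positive class, and the counts in `example` are made up purely for illustration; they are not results from the study.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a binary confusion matrix (positive class assumed to be potable)."""
    tp: int  # potable samples predicted potable
    fp: int  # non-potable samples predicted potable
    fn: int  # potable samples predicted non-potable
    tn: int  # non-potable samples predicted non-potable


def metrics(c: Confusion) -> dict:
    """Return accuracy, precision, recall and F1 for one confusion matrix."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    # Guard against empty denominators (no positive predictions or no positives).
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not taken from the paper).
example = Confusion(tp=40, fp=10, fn=20, tn=30)
scores = metrics(example)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Because accuracy counts true negatives while precision, recall, and F1 do not, a model can rank first on accuracy and still rank lower on the other three metrics. That is the kind of split between Decision Tree and Support Vector Machine that the abstract reports.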
