Comparison of F1-Score Naive Bayes, Logistic Regression, K-Nearest Neighbors, and SVM for Sentiment Classification X in Police Institutions
Background: Social media, especially platform X, is a main channel through which the public expresses opinions about public institutions, including the police. Analyzing public sentiment on this platform can provide insight into police performance.
Objective: This study compares four machine learning algorithms, namely Naive Bayes, Logistic Regression, K-Nearest Neighbors (KNN), and Support Vector Machine (SVM), in classifying negative sentiment towards policing on X, and addresses class imbalance in the data using the SMOTE method.
Method: The dataset consisted of 1,274 Indonesian-language entries collected by crawling, then preprocessed with text cleaning, stopword removal, and TF-IDF feature extraction. Each algorithm was tested with and without SMOTE for data balancing, and model performance was evaluated using the F1-score.
Result: Without SMOTE, all algorithms failed to recognize the neutral class. With SMOTE, Logistic Regression performed best with an F1-score of 80.85%, followed by SVM, Naive Bayes, and KNN. Applying SMOTE significantly improved the models' ability to classify negative sentiment.
Conclusion: Among the approaches tested, the combination of Logistic Regression and SMOTE was the most effective for classifying public sentiment towards policing, which can help police agencies understand public sentiment more accurately.
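The abstract relies on two ideas: the per-class F1-score and SMOTE oversampling. The following is a minimal, standard-library-only sketch of both, not the study's implementation (the abstract does not name the tooling used). The function names `f1_per_class` and `smote`, the toy labels, and the 2-D points are all hypothetical and exist only for illustration. It shows why a class that is never predicted receives an F1-score of zero, and how SMOTE creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours.

```python
import random
from math import dist


def f1_per_class(y_true, y_pred):
    """Return {label: F1}, where F1 = 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true positives scores 0, however accurate the rest is.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from a list of minority-class tuples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy labels (made up, not study data): the classifier never predicts "neutral".
y_true = ["negative"] * 6 + ["positive"] * 3 + ["neutral"]
y_pred = ["negative"] * 6 + ["positive"] * 3 + ["negative"]
print(f1_per_class(y_true, y_pred))

# Toy 2-D feature vectors standing in for TF-IDF rows of the minority class.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(smote(minority_points, n_new=5))
```

In practice, SMOTE is applied to the training split only, so that synthetic samples do not leak into the data used to compute the F1-score.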
Copyright (c) 2026 Robertos Hartanto Wijaya

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


