Predicting Software Bug Resolution Time: A Comparative Study of Machine Learning Algorithms

Harun Kunovac; Zerina Altoka

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 187 - Issue 94

Published: March 2026

10.5120/ijca2026926634

Harun Kunovac, Zerina Altoka. Predicting Software Bug Resolution Time: A Comparative Study of Machine Learning Algorithms. International Journal of Computer Applications. 187, 94 (March 2026), 48-54. DOI=10.5120/ijca2026926634

@article{10.5120/ijca2026926634,
  author    = {Harun Kunovac and Zerina Altoka},
  title     = {Predicting Software Bug Resolution Time: A Comparative Study of Machine Learning Algorithms},
  journal   = {International Journal of Computer Applications},
  year      = {2026},
  volume    = {187},
  number    = {94},
  pages     = {48-54},
  doi       = {10.5120/ijca2026926634},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}

%0 Journal Article
%D 2026
%A Harun Kunovac
%A Zerina Altoka
%T Predicting Software Bug Resolution Time: A Comparative Study of Machine Learning Algorithms
%J International Journal of Computer Applications
%V 187
%N 94
%P 48-54
%R 10.5120/ijca2026926634
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Software maintenance is one of the costliest activities in the software development process, and bug fixing is among its most time-consuming tasks. Estimating how long a bug fix will take is a persistent challenge for developers and project managers, because the estimate directly affects task ordering, release planning, and customer satisfaction. This study investigates the prediction of bug resolution time by classifying bugs into fast and slow groups using machine learning. It uses publicly available issue-tracking datasets, combining structured metadata features (e.g., severity, priority, and comment count) with textual features from bug report summaries. The text was preprocessed with NLP methods, either TF-IDF vectorization or text embeddings depending on the model type. Random Forest, Logistic Regression, LightGBM, SGD Classifier, and Multi-Layer Perceptron (MLP) classifiers were tuned and evaluated on the classification task. Random Forest achieved the highest F1-score (0.772), marginally ahead of the next-best models, MLP (0.745) and LightGBM (0.725). The number of comments, priority, and severity, together with the main text features, contributed most to the predictions. The experiments confirm that combining structured metadata with textual information improves classification accuracy and yields actionable feedback that helps teams improve prioritization and the allocation of bug-fixing effort.
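
The general approach described in the abstract, TF-IDF features from bug summaries combined with structured metadata and fed to a Random Forest that separates fast from slow bugs, can be sketched with scikit-learn as follows. This is a minimal illustration and not the authors' pipeline: the toy bug reports, the column names (summary, severity, priority, comment_count, resolution_days), the median-based fast/slow threshold, and the hyperparameters are all assumptions made for the example.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hand-made toy bug reports (hypothetical; not the datasets used in the study).
bugs = pd.DataFrame({
    "summary": [
        "Typo in settings dialog label",
        "Crash when opening large project file",
        "Wrong icon shown in toolbar",
        "Memory leak in background sync service",
        "Tooltip text is truncated",
        "Data corruption after concurrent writes",
        "Button misaligned on login page",
        "Intermittent crash during export",
        "Spelling error in help page",
        "Slow query on reports page with large data",
        "Missing translation for menu label",
        "Deadlock in sync service under load",
    ],
    "severity": ["minor", "critical", "minor", "major", "minor", "critical",
                 "minor", "major", "minor", "major", "minor", "critical"],
    "priority": ["P4", "P1", "P4", "P2", "P5", "P1",
                 "P3", "P2", "P5", "P2", "P4", "P1"],
    "comment_count": [1, 14, 2, 22, 0, 31, 2, 9, 1, 11, 3, 27],
    "resolution_days": [1, 30, 2, 45, 3, 60, 1, 21, 2, 14, 4, 90],
})

# Binary target: 1 = slow, 0 = fast. Splitting at the median is an assumption.
threshold = bugs["resolution_days"].median()
bugs["slow"] = (bugs["resolution_days"] > threshold).astype(int)

feature_cols = ["summary", "severity", "priority", "comment_count"]

# Text goes through TF-IDF, categorical metadata is one-hot encoded,
# and the numeric comment count is passed through unchanged.
features = ColumnTransformer([
    ("text", TfidfVectorizer(), "summary"),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["severity", "priority"]),
    ("num", "passthrough", ["comment_count"]),
])

model = Pipeline([
    ("features", features),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
])

# Simple positional split for the toy data: first 8 rows train, last 4 test.
train, test = bugs.iloc[:8], bugs.iloc[8:]
model.fit(train[feature_cols], train["slow"])

pred = model.predict(test[feature_cols])
score = f1_score(test["slow"], pred, zero_division=0)
print(f"F1 on held-out toy rows: {score:.3f}")
```

On real issue-tracker data, the same structure would apply with a proper train/test split and hyperparameter search; the score printed here says nothing about the results reported in the study.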

Index Terms

Computer Science

Information Sciences

Keywords

Machine learning, predictive modeling, classification, bug resolution time, software maintenance, feature importance