Research Article

Predicting Software Bug Resolution Time: A Comparative Study of Machine Learning Algorithms

by Harun Kunovac, Zerina Altoka
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Issue 94
Published: March 2026
Authors: Harun Kunovac, Zerina Altoka
10.5120/ijca2026926634

Harun Kunovac, Zerina Altoka. Predicting Software Bug Resolution Time: A Comparative Study of Machine Learning Algorithms. International Journal of Computer Applications. 187, 94 (March 2026), 48-54. DOI=10.5120/ijca2026926634

@article{10.5120/ijca2026926634,
  author    = {Harun Kunovac and Zerina Altoka},
  title     = {Predicting Software Bug Resolution Time: A Comparative Study of Machine Learning Algorithms},
  journal   = {International Journal of Computer Applications},
  year      = {2026},
  volume    = {187},
  number    = {94},
  pages     = {48-54},
  doi       = {10.5120/ijca2026926634},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}

%0 Journal Article
%D 2026
%A Harun Kunovac
%A Zerina Altoka
%T Predicting Software Bug Resolution Time: A Comparative Study of Machine Learning Algorithms
%J International Journal of Computer Applications
%V 187
%N 94
%P 48-54
%R 10.5120/ijca2026926634
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Software maintenance is one of the costliest activities in the software development process, and bug fixing is among its most time-consuming tasks. Estimating how long a bug fix will take is a major challenge for developers and project managers because it directly affects task ordering, release planning, and customer satisfaction. This study investigates the prediction of bug resolution time by classifying bugs into fast and slow groups using machine learning. Publicly available issue-tracking datasets were used, combining structured metadata features (e.g., severity, priority, and comment count) with textual features from bug report summaries. Textual features were preprocessed with NLP methods such as TF-IDF vectorization and text embeddings, depending on the model type. Random Forest, Logistic Regression, LightGBM, SGD Classifier, and Multi-Layer Perceptron (MLP) classifiers were tuned and evaluated on the classification task. Random Forest performed best, with an F1-score of 0.772, marginally ahead of the next-best models, MLP (0.745) and LightGBM (0.725). The number of comments, priority, and severity, together with the main text features, contributed most to the predictions. The experiments confirm that combining structured metadata with text information improves classification accuracy and provides actionable feedback that helps teams prioritize bugs and allocate bug-fixing effort.
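As a rough illustration of the fast/slow framing and of the F1-score used to compare the models, the following minimal Python sketch labels bugs against a resolution-time threshold and computes F1 for one set of predictions. The abstract does not say how the fast/slow boundary was chosen or which F1 variant was reported, so the median threshold, the choice of "slow" as the positive class, the function names, and the sample data are all assumptions made for illustration only.

```python
from statistics import median


def label_fast_slow(resolution_days, threshold=None):
    """Return 1 (slow) or 0 (fast) for each bug.

    Assumption: the cut-off defaults to the median resolution time;
    the study may have used a different boundary.
    """
    if threshold is None:
        threshold = median(resolution_days)
    return [1 if days > threshold else 0 for days in resolution_days]


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical resolution times (in days) and hypothetical model predictions.
resolution_days = [1, 2, 3, 10, 30, 45]
y_true = label_fast_slow(resolution_days)  # median is 6.5 days
y_pred = [0, 0, 1, 1, 1, 0]

print(y_true)
print(round(f1_score(y_true, y_pred), 3))
```

With a median split the two classes are balanced by construction, which is one reason such a threshold is a common choice; a fixed cut-off (for example, a number of days agreed with the project team) could be passed through the `threshold` argument instead.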

Index Terms
Computer Science
Information Sciences
Keywords

Machine learning, predictive modeling, classification, bug resolution time, software maintenance, feature importance
