Research Article

A Machine Learning Method for Detecting Depression Among College Students

by Peter J. Yu
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 185 - Issue 24
Published: Jul 2023
Authors: Peter J. Yu
10.5120/ijca2023923003

Peter J. Yu. A Machine Learning Method for Detecting Depression Among College Students. International Journal of Computer Applications. 185, 24 (Jul 2023), 44-51. DOI=10.5120/ijca2023923003

@article{ 10.5120/ijca2023923003,
  author    = { Peter J. Yu },
  title     = { A Machine Learning Method for Detecting Depression Among College Students },
  journal   = { International Journal of Computer Applications },
  year      = { 2023 },
  volume    = { 185 },
  number    = { 24 },
  pages     = { 44-51 },
  doi       = { 10.5120/ijca2023923003 },
  publisher = { Foundation of Computer Science (FCS), NY, USA }
}

%0 Journal Article
%D 2023
%A Peter J. Yu
%T A Machine Learning Method for Detecting Depression Among College Students
%J International Journal of Computer Applications
%V 185
%N 24
%P 44-51
%R 10.5120/ijca2023923003
%I Foundation of Computer Science (FCS), NY, USA

Abstract

As depression becomes more prevalent on college campuses, it is an increasingly critical topic to investigate. Recent studies have begun using machine learning techniques to predict depression and other mental illnesses, but there is still little understanding of why these problems occur. This study examines the causes of depression among college students posting on the popular social media platform Reddit and compares several machine learning classifiers for depression detection. Of the 7,680 semi-anonymous Reddit posts examined, 552 contained depression-related keywords. After a series of natural language processing (NLP) techniques was applied, three primary areas associated with depression emerged among college students: institutions and programs; academic projects and assignments; and the college environment. The study also reports the performance of the different classifiers. Adaptive Boosting (AdaBoost) achieved the highest accuracy, detecting depression with 99% accuracy, while the Random Forest classifier achieved the highest F1 score, 1.0.
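
The abstract reports both accuracy and F1 score, and the two can diverge when one class is rare. The sketch below illustrates this under stated assumptions: it takes the 552 keyword-matching posts out of 7,680 as the positive class (the abstract does not confirm this was the labeling used for training), and every confusion-matrix count is hypothetical, chosen only for illustration. It is not the author's evaluation code.

```python
# Accuracy vs. F1 on an imbalanced dataset (illustrative sketch only).

def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all posts that were classified correctly."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0

def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Counts taken from the abstract.
TOTAL_POSTS = 7680
POSITIVE_POSTS = 552                              # posts with depression-related keywords
NEGATIVE_POSTS = TOTAL_POSTS - POSITIVE_POSTS     # 7128

# Hypothetical baseline that labels every post as "not depression-related".
baseline_acc = accuracy(tp=0, fp=0, fn=POSITIVE_POSTS, tn=NEGATIVE_POSTS)
baseline_f1 = f1_score(tp=0, fp=0, fn=POSITIVE_POSTS)

# Hypothetical classifier: misses 50 positive posts, raises 27 false alarms.
model_acc = accuracy(tp=502, fp=27, fn=50, tn=7101)
model_f1 = f1_score(tp=502, fp=27, fn=50)

print(f"baseline: accuracy={baseline_acc:.3f}, F1={baseline_f1:.3f}")
print(f"model:    accuracy={model_acc:.3f}, F1={model_f1:.3f}")
```

Under these assumptions the trivial baseline already exceeds 92% accuracy while its F1 score is zero, which is why the two metrics are best read together when comparing classifiers on data of this shape.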

Index Terms
Computer Science
Information Sciences
Keywords

College, College Students, Depression, Mental Health, Machine Learning, Natural Language Processing (NLP), Latent Dirichlet Allocation (LDA), Social Media, Reddit
