CRISP-DM-based Machine Learning Models for Analyzing the Depression Level in Students of the National Polytechnic School
Abstract
This project analyzes depression rates among students of the Escuela Politécnica Nacional (EPN). A total of 302 students from different EPN degree programs voluntarily and anonymously completed an online version of the Beck Depression Inventory-II (BDI-II). They also answered 19 questions about the lifestyle of an EPN student; a professional in the field of psychology reviewed these questions and endorsed their possible relationship with depressive disorders. The project followed the CRISP-DM methodology, whose phases covered analysis of the current situation, objective setting, data collection, data preparation, construction of ML models that predict the degree of depression based on the BDI-II metrics, and evaluation of those models. The resulting model has an accuracy score of 0.59 and indicates that gender, age, and relationships are significant variables in determining depression severity.
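As a rough illustration of how a BDI-II response can become a class label and how an accuracy score such as the one reported is computed, the sketch below sums the 21 item scores (each 0 to 3) and maps the total onto the severity bands commonly given for the instrument (0-13 minimal, 14-19 mild, 20-28 moderate, 29-63 severe). The abstract does not state which cut-offs, labels, or tooling the authors used, so the function names and the band table here are assumptions for illustration only, not the project's implementation.

```python
from typing import Sequence

# Commonly cited BDI-II severity bands (upper bound of total score, label).
# Assumption: the abstract does not say which cut-offs the study applied.
BDI_II_BANDS = [
    (13, "minimal"),
    (19, "mild"),
    (28, "moderate"),
    (63, "severe"),
]


def bdi_ii_total(item_scores: Sequence[int]) -> int:
    """Sum the 21 BDI-II item scores, each of which must be 0, 1, 2, or 3."""
    if len(item_scores) != 21:
        raise ValueError("BDI-II has exactly 21 items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each BDI-II item is scored from 0 to 3")
    return sum(item_scores)


def bdi_ii_severity(total: int) -> str:
    """Map a total score (0-63) to a severity label using BDI_II_BANDS."""
    if not 0 <= total <= 63:
        raise ValueError("BDI-II total must be between 0 and 63")
    for upper_bound, label in BDI_II_BANDS:
        if total <= upper_bound:
            return label
    raise AssertionError("unreachable: bands cover 0-63")


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


if __name__ == "__main__":
    # Hypothetical respondent who answered 1 on every item.
    total = bdi_ii_total([1] * 21)
    print(total, bdi_ii_severity(total))
    # Hypothetical true vs. predicted severity labels.
    print(accuracy(["mild", "severe", "minimal"], ["mild", "mild", "minimal"]))
```

With four severity classes, an accuracy of 0.59 means that roughly 59 of every 100 students in the evaluation set were assigned the same band by the model as by their BDI-II total, under the labeling assumption above.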
This article is published by LAJC under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This means that non-exclusive copyright is transferred to the National Polytechnic School. The author(s) give their consent to the Editorial Committee to publish the article in the issue that best suits the interests of this journal.