Academic performance prediction model for the propedeutic course of the Escuela Politécnica Nacional and the implementation of an automated supervised learning model
This article applies a supervised machine learning model to predict the probability that a student of the National Polytechnic School will pass the leveling course. To this end, it describes a statistical methodology based on gradient boosting and logistic regression, in which the learning problem is formulated as the minimization of an error function by gradient descent. To explain the probability of passing, the model considers dimensions suggested by the literature: socioeconomic, demographic, family, institutional, and academic performance variables, the last of these covering both the student's application and the leveling course. The decision tree model achieves a precision of 96% on the test data set, with an area under the ROC curve of 89.1, levels that are generally regarded as acceptable. The logistic regression results, in turn, suggest that factors such as the weighted grade for the first two months, the grade with which the student applied, the study schedule, and the student's geographical place of origin, among others, affect the probability of passing the leveling course.
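As a minimal illustration of the formulation mentioned in the abstract, the sketch below fits a logistic regression by minimizing the mean log-loss with batch gradient descent. It is not the authors' implementation: the function names (`fit_logistic`, `predict_proba`), the learning rate, the number of epochs, and the two toy features are all assumptions made for illustration, and the data are invented rather than taken from the study.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    # Estimated probability of passing for one feature vector x.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.5, epochs=5000):
    # Batch gradient descent on the mean log-loss.
    # lr and epochs are illustrative choices, not values from the article.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # d(loss)/d(logit)
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data: [scaled first-two-months grade, scaled admission grade]
X = [[0.2, 0.3], [0.3, 0.1], [0.4, 0.4], [0.7, 0.6], [0.8, 0.9], [0.9, 0.7]]
y = [0, 0, 0, 1, 1, 1]  # 1 = passed the leveling course

w, b = fit_logistic(X, y)
print(predict_proba(w, b, [0.9, 0.9]), predict_proba(w, b, [0.1, 0.1]))
```

On this toy set the fitted coefficients are positive, so higher grades raise the estimated probability of passing; the sign and size of each coefficient is what a logistic regression of this kind lets one interpret.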
By participating as Author(s) in LAJC, authors transfer non-exclusive copyright to the National Polytechnic School, represented by the Department of Informatics and Computer Sciences, to publish the submitted material on institutional websites or in the institution's print materials.
The National Polytechnic School and the Department of Informatics and Computer Sciences guarantee that the material will be neither released nor used internally for profit through paid subscriptions. The submitted material will be used only for academic and scientific purposes.