Classification of Failure Using Decision Trees Induced by Genetic Programming

Keywords: decision trees, multiclass classification, fault detection, genetic programming


Fault classification in industrial processes is important because it allows preventive and corrective measures to be taken before catastrophic failures occur, which can otherwise lead to significant repair costs and production losses. The purpose of this study was to develop a classification model that combines Decision Trees with Genetic Programming. The proposed model randomly generates a set of decision trees from an adapted Tennessee Eastman dataset. These trees are not built with classical construction logic; instead, their structure and characteristics are determined at random and adjusted throughout the evolutionary process. This approach enables a broader exploration of the search space and may lead to diverse solutions. The results were moderate, largely because the high number of target classes (21) led to complex trees. The average accuracy on the test data was 0.75, which indicates that the algorithm needs further alternatives and enhancements to improve its results.
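The general idea of evolving randomly generated decision trees can be illustrated with a minimal sketch. It is not the authors' implementation: the tuple-based tree encoding, the function names, the tournament selection, the elitism, the depth limit, and all parameter values are assumptions chosen for illustration, and the features are assumed to be scaled to [0, 1].

```python
import random

# Assumed encoding: a leaf is ("leaf", label); a split is
# ("split", feature_index, threshold, left_subtree, right_subtree).

def random_tree(n_features, classes, max_depth, rng):
    """Grow a tree whose splits and leaf labels are chosen at random."""
    if max_depth == 0 or rng.random() < 0.3:
        return ("leaf", rng.choice(classes))
    return ("split", rng.randrange(n_features), rng.random(),  # threshold in [0, 1)
            random_tree(n_features, classes, max_depth - 1, rng),
            random_tree(n_features, classes, max_depth - 1, rng))

def predict(tree, x):
    """Follow the splits until a leaf is reached and return its label."""
    while tree[0] == "split":
        _, feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
    return tree[1]

def accuracy(tree, X, y):
    """Fitness: fraction of samples classified correctly."""
    return sum(predict(tree, x) == label for x, label in zip(X, y)) / len(y)

def depth(tree):
    return 0 if tree[0] == "leaf" else 1 + max(depth(tree[3]), depth(tree[4]))

def node_paths(tree, path=()):
    """Yield the path (sequence of child indices) to every node."""
    yield path
    if tree[0] == "split":
        yield from node_paths(tree[3], path + (3,))
        yield from node_paths(tree[4], path + (4,))

def get_node(tree, path):
    for index in path:
        tree = tree[index]
    return tree

def replace_node(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    node = list(tree)
    node[path[0]] = replace_node(tree[path[0]], path[1:], new)
    return tuple(node)

def crossover(a, b, rng):
    """Subtree crossover: graft a random subtree of b into a random spot of a."""
    donor = get_node(b, rng.choice(list(node_paths(b))))
    return replace_node(a, rng.choice(list(node_paths(a))), donor)

def mutate(tree, n_features, classes, rng):
    """Replace a random node with a small, freshly generated subtree."""
    target = rng.choice(list(node_paths(tree)))
    return replace_node(tree, target, random_tree(n_features, classes, 2, rng))

def evolve(X, y, classes, pop_size=60, generations=30, max_depth=8, seed=0):
    rng = random.Random(seed)
    n_features = len(X[0])
    fitness = lambda tree: accuracy(tree, X, y)
    population = [random_tree(n_features, classes, 4, rng) for _ in range(pop_size)]
    history = []  # best training accuracy seen in each generation
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        children = [best]  # elitism: the best tree survives unchanged
        while len(children) < pop_size:
            parent1 = max(rng.sample(population, 3), key=fitness)  # tournament
            parent2 = max(rng.sample(population, 3), key=fitness)
            child = crossover(parent1, parent2, rng)
            if rng.random() < 0.2:
                child = mutate(child, n_features, classes, rng)
            # Crude bloat control: oversized children are replaced by a parent.
            children.append(child if depth(child) <= max_depth else parent1)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history
```

Calling evolve(X, y, classes) on a list of feature vectors X and labels y returns the best tree found and the best training accuracy per generation. In this sketch, fitness is plain training accuracy; with many classes, as in the 21-class setting described above, a single tree needs many leaves to separate them, which is one way trees can become complex.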



A. Rajkomar, J. Dean, and I. Kohane, “Machine Learning in Medicine,” New England Journal of Medicine, vol. 380, no. 14, pp. 1347–1358, Apr. 2019.

E. F. Brown et al., “Hierarchical decision trees for anomaly detection in interconnected systems,” in Proceedings of the International Conference on Industrial Engineering, pp. 126–132, 2020.

J. R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, 1992.

R. K. DeLisle and S. L. Dixon, “Induction of decision trees via evolutionary programming,” Journal of Chemical Information and Computer Sciences, vol. 44, no. 3, pp. 862–870, 2004.

L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees. Chapman & Hall, 1984.

A. Silva, T. Killian, I. D. Jimenez Rodriguez, S. Son, and M. Gombolay, “Optimization Methods for Interpretable Differentiable Decision Trees in Reinforcement Learning,” arXiv, 2019.

J. R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, 1992.

Q. U. Nguyen, M. Zhang, K. Zhang, and S. Li, “Evolutionary construction of decision trees for multiclass classification,” IEEE Transactions on Evolutionary Computation, vol. 19, no. 6, pp. 822–834, 2015.

N. Javed, F. Gobet, and P. Lane, “Simplification of genetic programs: a literature survey,” Data Mining and Knowledge Discovery, vol. 36, no. 4, pp. 1279–1300, Apr. 2022.

R. W. J. Westerhout, F. J. J. Verhagen, and P. M. J. van den Hof, “Monitoring and diagnosis of industrial processes using chemometric techniques,” Computers & Chemical Engineering, vol. 27, no. 9, pp. 1259–1273, 2003.

P. Wang and H. Wang, “A review of data-driven approaches for process systems fault detection and diagnosis,” Computers & Chemical Engineering, vol. 94, pp. 188–200, 2016.

M. F. D’Angelo, R. M. Palhares, R. H. Takahashi, and R. H. Loschi, “Fuzzy/Bayesian change point detection approach to incipient fault detection,” IET Control Theory & Applications, vol. 5, pp. 539–551, 2011.

How to Cite
R. Rocha, L. Santos, R. Almeida Soares, F. Alves Barbosa, and M. Silveira Vasconcelos D’Angelo, “Classification of Failure Using Decision Trees Induced by Genetic Programming”, LAJC, vol. 11, no. 2, pp. 60-69, Jul. 2024.
Research Articles for the Regular Issue