Setting a generalized functional linear model (GFLM) for the classification of different types of cancer
Abstract
This work aims to classify DNA sequences as healthy or malignant (cancerous). To this end, supervised and unsupervised classification methods are applied in a functional data context, i.e. each strand of DNA is treated as one observation. Because the observations are recorded in discretized form, different ways of representing them as functions are evaluated. In addition, an exploratory study is carried out by estimating the functional mean and variance for each type of cancer. For unsupervised classification, hierarchical clustering with different functional distance measures is used. For supervised classification, a functional generalized linear model is used, with the first and second derivatives of the curves included as discriminating variables. One advantage of working in the functional context, verified in this study, is that the resulting model classified the cancers with 100% accuracy. The methods were implemented with the fda.usc R package, which includes all the functional data analysis techniques used in this work as well as others developed in recent decades. For more details on these techniques, see Ramsay and Silverman (2005) and Ferraty and Vieu (2006).
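The unsupervised steps named in the abstract (discretized curves, pointwise mean and variance, derivatives, a functional distance, and hierarchical clustering) can be illustrated with a minimal sketch. It is written in plain Python rather than with the fda.usc package, so it is not the authors' implementation. All function names are hypothetical, the data are synthetic toy curves rather than DNA or gene expression data, and the choice of the L2 distance with average linkage is an assumption made for illustration. The functional generalized linear model is not covered here.

```python
import math

def trapezoid(values, grid):
    """Approximate the integral of a discretized function (trapezoidal rule)."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    )

def l2_distance(x, y, grid):
    """L2 distance between two curves observed on the same grid."""
    squared_diff = [(a - b) ** 2 for a, b in zip(x, y)]
    return math.sqrt(trapezoid(squared_diff, grid))

def derivative(x, grid):
    """First derivative by finite differences (one-sided at both ends)."""
    n = len(grid)
    result = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        result.append((x[hi] - x[lo]) / (grid[hi] - grid[lo]))
    return result

def pointwise_mean(curves):
    """Mean curve, computed separately at each grid point."""
    return [sum(column) / len(column) for column in zip(*curves)]

def pointwise_variance(curves):
    """Sample variance curve (divisor n - 1) at each grid point."""
    mean = pointwise_mean(curves)
    n = len(curves)
    return [
        sum((curve[j] - mean[j]) ** 2 for curve in curves) / (n - 1)
        for j in range(len(mean))
    ]

def average_linkage(curves, grid, n_clusters, dist=l2_distance):
    """Agglomerative clustering: repeatedly merge the two closest clusters,
    where closeness is the average pairwise distance between their members."""
    clusters = [[i] for i in range(len(curves))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(
                    dist(curves[i], curves[j], grid)
                    for i in clusters[a]
                    for j in clusters[b]
                )
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(cluster) for cluster in clusters]

# Synthetic toy data (assumption): two groups of three curves on [0, 1].
grid = [i / 10 for i in range(11)]
group_a = [[math.sin(2 * math.pi * t) + s for t in grid] for s in (0.0, 0.1, -0.1)]
group_b = [[math.cos(2 * math.pi * t) + 2 + s for t in grid] for s in (0.0, 0.1, -0.1)]
curves = group_a + group_b

clusters = average_linkage(curves, grid, n_clusters=2)
mean_a = pointwise_mean(group_a)
var_a = pointwise_variance(group_a)
# Clustering on first derivatives instead of the raw curves.
deriv_curves = [derivative(c, grid) for c in curves]
deriv_clusters = average_linkage(deriv_curves, grid, n_clusters=2)
```

Passing a different `dist` function to `average_linkage`, or clustering the derivative curves as in the last lines, corresponds to the idea of comparing several functional distance measures; which distances the study actually used is not stated in the abstract.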
References
Cuevas A., Febrero M., Fraiman R. 2001. Cluster Analysis: a further approach based on density estimation. Computational Statistics and Data Analysis 36: 441–456.
Dudoit et al. (2002). Comparison of discrimination methods for the classification of tumors using gene expression data, Journal of the American Statistical Association, 97 (457), 77-87.
Febrero-Bande, M. and Oviedo de la Fuente, M. 2012. Statistical computing in functional data analysis: The R package fda.usc. Journal of Statistical Software, 51(4):1-28.
Febrero-Bande, M. and Gonzalez-Manteiga, W. 2012. Generalized additive models for functional data. TEST, 22(2):278-292.
Ferraty, F. and Vieu, P. 2006. Nonparametric Functional Data Analysis: Theory and Practice. Springer-Verlag, New York, pp. 113-146.
Fraiman R. and Muniz G. 2001. Trimmed means for functional data, Test, 10(2), 419-440.
Lopez-Pintado, S., Romo, J., Torrente, A. 2010. Robust depth-based tool for the analysis of gene expression data. Biostatistics 11, 2, pp. 254-264.
Ramsay, J. O. and Silverman, B. W. 2005. Functional Data Analysis, 2nd ed., Springer-Verlag, New York, pp. 147-325.
Romualdi C., Campanaro S., Campagna D., Celegato B., Cannata N., Toppo S., Valle G. and Lanfranchi G. 2003. Pattern recognition in gene expression profiling using DNA array: a comparative study of different statistical methods applied to cancer classification. Human Molecular Genetics 12, 823-836.
Singh D. et al. 2002. Gene expression correlates of clinical prostate cancer behavior, Cancer Cell, 1 (2), 203-209.
Tárraga J., Medina I., Carbonell J., Huerta-Cepas J., Mínguez P., Alloza E., Al-Shahrour F., Vegas-Azcarate S., Gotz S., Escobar P. and others 2008. GEPAS, a web-based tool for microarray data analysis and interpretation. Nucleic Acids Research 36, W308-W314.
Wessels L.F.A., Reinders M.J.T., Hart A.A.M., Veenman C.J., Dai H., He Y.D. and Van’t Veer L.J. 2005. A protocol for building and evaluating predictors of disease state based on microarray data. Bioinformatics 21, 3755-3762.
Zuo Y., Serfling R. 2000. General notions of statistical depth function. Annals of Statistics 28: 461–482.
This article is published by LAJC under a Creative Commons Attribution-Non-Commercial-Share-Alike 4.0 International License. This means that non-exclusive copyright is transferred to the National Polytechnic School. The Author (s) give their consent to the Editorial Committee to publish the article in the issue that best suits the interests of this Journal. Find out more in our Copyright Notice.