Evaluation of Imputation Methods for Gene Expression Data


Book Summary



Microarray technology, introduced in the mid-1990s, allows the expression levels of thousands of genes to be analyzed simultaneously. Identifying genes whose expression levels differ markedly from the rest is crucial for finding the possible causes of certain illnesses and can help in developing treatments for them. For many reasons related to microarray technology, missing values are common in the gene expression matrix. Another characteristic of the gene expression matrix is its high dimensionality: it has a very large number of columns representing the genes and few rows representing the arrays, which come from samples taken from patients. Imputing missing values is necessary for several data mining and knowledge discovery tasks in bioinformatics, one of which is the identification of differentially expressed genes. Several imputation methods exist for this kind of data. Unfortunately, most of them have been tested on only one or two datasets, and until now there has been no general evaluation of the imputation methods. In this thesis, five methods for imputing gene expression data are compared on six well-known cancer-related gene expression datasets. The comparison uses two criteria: the normalized root mean squared error (NRMSE) and the percentage of differentially expressed genes lost after imputation. Finally, a recommendation on the use of the imputation methods is given, and the reasoning behind that recommendation is discussed.
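To make the NRMSE criterion concrete, here is a minimal Python sketch. It rests on assumptions: the summary does not state the exact NRMSE formula, so the code uses one common convention (root mean squared error over the imputed entries, divided by the standard deviation of the true values). Per-gene mean imputation serves only as a simple baseline and is not claimed to be one of the five methods compared in the thesis. The function names `nrmse` and `impute_gene_means` are hypothetical.

```python
import numpy as np


def nrmse(true_values, imputed_values):
    """NRMSE under one common convention (an assumption, not taken from
    the thesis): sqrt(mean squared error / variance of the true values)."""
    true_values = np.asarray(true_values, dtype=float)
    imputed_values = np.asarray(imputed_values, dtype=float)
    mse = np.mean((imputed_values - true_values) ** 2)
    return float(np.sqrt(mse / np.var(true_values)))


def impute_gene_means(matrix):
    """Baseline imputation: replace each missing entry (NaN) with the mean
    of the observed values in its column. Rows are arrays (samples) and
    columns are genes, matching the layout described in the summary."""
    matrix = np.asarray(matrix, dtype=float)
    gene_means = np.nanmean(matrix, axis=0)  # per-gene mean, ignoring NaN
    filled = matrix.copy()
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = gene_means[cols]
    return filled
```

A typical evaluation would start from a complete matrix, hide a known fraction of entries, impute them, and pass only the hidden entries and their imputed values to `nrmse`. Under this convention, a value near 0 means the imputed entries are close to the truth, while a value near 1 means the method does no better than filling in the overall mean of the hidden values.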


