Imputation of Missing Gene Expressions from Microarray Dataset: A Review

International Journal of Computer Trends and Technology (IJCTT)
© 2017 by IJCTT Journal
Volume-46 Number-1
Year of Publication : 2017
Authors : Chanda Panse (Wajgi), Manali Kshirsagar, Dipak Wajgi
DOI : 10.14445/22312803/IJCTT-V46P104


Chanda Panse (Wajgi), Manali Kshirsagar, Dipak Wajgi, "Imputation of Missing Gene Expressions from Microarray Dataset: A Review", International Journal of Computer Trends and Technology (IJCTT), V46(1):15-22, April 2017. ISSN: 2231-2803. Published by Seventh Sense Research Group.

Abstract -
DNA microarray technology captures the expression levels of thousands of genes simultaneously. However, when these expression values are recorded by software after the array is scanned, some entries in the resulting dataset are missing. Causes include hybridization failures, artifacts on the microarray, insufficient resolution, and noisy or corrupted images; missing values may also arise systematically from the spotting process. Missing values hinder the performance of downstream analysis. Several remedies have been proposed in the literature, but because of their limitations, imputation of the missing values is preferred as the best solution. This paper presents a review of existing methods for imputing missing values, along with their advantages and limitations.

Keywords -
Microarray, Hybridization, Imputation.