Prediction of Human Disease-Related Gene Clusters by Clustering Analysis

Sun, Peng Gang; Gao, Lin; Han, Shan

doi:10.7150/ijbs.7.61

Int J Biol Sci 2011; 7(1):61-73. doi:10.7150/ijbs.7.61

Research Paper

Peng Gang Sun¹✉, Lin Gao¹✉, Shan Han²

1. School of Computer Science and Technology, Xidian University, Xi'an, 710071, China
2. Faculty of Science, University of Copenhagen, Copenhagen, 1307K, Denmark

Citation:

Sun PG, Gao L, Han S. Prediction of Human Disease-Related Gene Clusters by Clustering Analysis. Int J Biol Sci 2011; 7(1):61-73. doi:10.7150/ijbs.7.61. https://www.ijbs.com/v07p0061.htm

Abstract

Genes associated with similar diseases or disorders show an increased tendency for their protein products to interact with each other through protein-protein interactions (PPI), so clustering analysis offers an efficient way to predict human disease-related gene clusters/subnetworks. First, we used three clustering algorithms, the Markov cluster algorithm (MCL), Molecular complex detection (MCODE) and the Clique percolation method (CPM), to decompose the human PPI network into dense clusters that serve as candidate disease-related clusters. We then proposed a log-likelihood model that integrates multiple lines of biological evidence to score these dense clusters. Finally, we identified the dense clusters with higher scores as disease-related clusters. Performance was evaluated by a leave-one-out cross-validation procedure. Our method achieved a success rate of 98.59% and recovered the hidden disease-related cluster in 34.04% of cases when one known disease gene and all its gene-disease associations were removed. We found that the clusters produced by CPM outperformed those from MCL and MCODE as candidate disease-related clusters, with well-supported biological significance in the biological process, molecular function and cellular component categories of Gene Ontology (GO) and in expression across human tissues. We also found that most of the disease-related clusters consisted of tissue-specific genes that were highly expressed in only one or a few tissues, while a few were composed of housekeeping genes (maintenance genes) that were ubiquitously expressed across most tissues.
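The decompose-then-score idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy network, the gene labels A to G, the set of "known disease genes", and the function names are all hypothetical. The clique percolation step follows the standard definition (k-cliques that share k-1 nodes are merged into one cluster). The abstract does not specify the log-likelihood model, so the score below is only an assumed stand-in: the log ratio of the disease-gene fraction inside a cluster to the background fraction, with a pseudocount.

```python
from itertools import combinations
from math import log


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def find_k_cliques(adj, k):
    """Return all k-node cliques (brute force; only suitable for toy graphs)."""
    return [frozenset(c) for c in combinations(sorted(adj), k)
            if all(v in adj[u] for u, v in combinations(c, 2))]


def clique_percolation(adj, k=3):
    """Merge k-cliques that share k-1 nodes; each merged group is one cluster."""
    cliques = find_k_cliques(adj, k)
    parent = list(range(len(cliques)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted((frozenset(g) for g in groups.values()), key=sorted)


def log_likelihood_score(cluster, disease_genes, all_genes, pseudo=0.5):
    """ASSUMED stand-in score, not the paper's model: log of the ratio between
    the smoothed disease-gene fraction in the cluster and in the whole network."""
    in_cluster = (len(cluster & disease_genes) + pseudo) / (len(cluster) + 2 * pseudo)
    background = (len(disease_genes) + pseudo) / (len(all_genes) + 2 * pseudo)
    return log(in_cluster / background)


# Hypothetical toy PPI network: two triangles sharing an edge, a bridge, one triangle.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]
DISEASE_GENES = {"A", "B", "C"}  # hypothetical known disease genes

adj = build_adjacency(EDGES)
clusters = clique_percolation(adj, k=3)
for cluster in clusters:
    score = log_likelihood_score(cluster, DISEASE_GENES, set(adj))
    print(sorted(cluster), round(score, 3))
```

In this toy setting, a cluster with a positive score is enriched for known disease genes relative to the network as a whole. A leave-one-out check in the spirit of the abstract would remove one gene from `DISEASE_GENES`, rescore, and test whether the cluster containing that gene still ranks highly.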

Keywords: Disease-related gene cluster, Clustering analysis, PPI network, Gene expression data

This is an open access article distributed under the terms of the Creative Commons Attribution (CC BY-NC) License. See http://ivyspring.com/terms for full terms and conditions.