Department of Bioengineering @ UIC
  

 
 
 

 
 
Computational Reconstruction of Genomewide Knowledge-based interactome

 

Incorporating Knowledge of Topology Improves Reconstruction of Interaction Networks from Microarray Data
Peter Larsen, Eyad Almasri, Guanrao Chen and Yang Dai
Lecture Notes in Bioinformatics, Vol. 4983 (eds. by I.I. Mandoiu, Raj Sunderraman, and A. Zelikovsky), Springer Verlag, pp. 434-443, 2008. PDF

Abstract Reconstruction of biological interaction networks from high-throughput experimental data is one of the most challenging problems in bioinformatics. These networks have specific topologies, whose characteristics are defined by evolutionary relationships between proteins and the physical limitations imposed on proteins interacting in three-dimensional space. In this study, a method is proposed applying the topology of known biological networks to the analysis of microarray data for protein-protein binding interactions. In this method, genomic biological networks are derived from the body of published scientific literature. The numbers of interacting neighbors for proteins of specific molecular functions are observed. That information is used in the analysis of microarray expression data to regenerate biological networks using a rank-based algorithm, Gene Ontology Restricted Value Neighborhood (GRV-N). The results of this analysis demonstrate that incorporating knowledge of network topology improves the ability of expression analysis to reconstruct interaction networks with a high degree of biological relevance.

Incorporating Literature Knowledge in Bayesian Network for Inferring Gene Networks with Gene Expression Data
Eyad Almasri, Peter Larsen, Guanrao Chen and Yang Dai
Lecture Notes in Bioinformatics, Vol. 4983 (eds. by I.I. Mandoiu, Raj Sunderraman, and A. Zelikovsky), Springer Verlag, pp. 184-195, 2008. PDF

Abstract The reconstruction of gene networks from microarray gene expression has been a challenging problem in bioinformatics. Various methods have been proposed for this problem. The incorporation of various genomic and proteomic data has been shown to enhance the learning ability in the Bayesian Network (BN) approach. However, the knowledge embedded in the large body of published literature has not been utilized in a systematic way. In this work, prior knowledge on gene interaction was derived based on the statistical analysis of published interactions between pairs of genes or gene products. This information was used (1) to construct a structure prior and (2) to reduce the search space in the BN algorithm. The performance of the two approaches was evaluated and compared with the BN method without prior knowledge on two time course microarray gene expression data related to the yeast cell cycle. The results indicate that the proposed algorithms can identify edges in learned networks with higher biological relevance. Furthermore, the method using literature knowledge for the reduction of the search space outperformed the method using a structure prior in the BN framework.

Rank-based edge reconstruction for scale-free genetic regulatory networks
Guanrao Chen, Peter Larsen, Eyad Almasri, Yang Dai
BMC Bioinformatics, 9:75, 2008. PDF

Background The reconstruction of genetic regulatory networks from microarray gene expression data has been a challenging task in bioinformatics. Various approaches to this problem have been proposed; however, they do not take into account the topological characteristics of the targeted networks while reconstructing them.
Results In this study, an algorithm that explores the scale-free topology of networks was proposed based on the modification of a rank-based algorithm for network reconstruction. The new algorithm was evaluated with the use of both simulated and microarray gene expression data. The results demonstrated that the proposed algorithm outperforms the original rank-based algorithm. In addition, in comparison with the Bayesian Network approach, the results show that the proposed algorithm gives much better recovery of the underlying network when the sample size is much smaller than the number of genes.
Conclusions The proposed algorithm is expected to be useful in the reconstruction of biological networks whose degree distributions follow the scale-free topology.
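
To give a rough sense of what "rank-based" edge selection can mean, here is a minimal sketch. It is not the published algorithm: it is a generic mutual-rank rule, assumed here purely for illustration, in which an edge between two genes is kept only when each gene appears among the other's top-k partners by absolute Pearson correlation. The function names (`pearson`, `mutual_rank_edges`) and the parameter `k` are hypothetical, and the sketch does not model the scale-free modification described in the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def mutual_rank_edges(profiles, k):
    """Keep edge {g, h} only if h is in g's top-k and g is in h's top-k.

    profiles: dict mapping gene name -> list of expression values.
    k: hypothetical neighborhood size (an assumption of this sketch).
    """
    genes = sorted(profiles)
    top = {}
    for g in genes:
        others = [h for h in genes if h != g]
        # Rank candidate partners by descending absolute correlation.
        others.sort(key=lambda h: -abs(pearson(profiles[g], profiles[h])))
        top[g] = set(others[:k])
    return {frozenset((g, h)) for g in genes for h in top[g] if g in top[h]}


# Toy expression profiles (made-up values, not data from the paper).
toy_profiles = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10.5],
    "C": [5, 1, 4, 2, 3],
    "D": [4, 2, 5, 1, 3],
}
toy_edges = mutual_rank_edges(toy_profiles, k=1)
```

A topology-aware variant would presumably let the neighborhood size vary from gene to gene rather than fixing a single `k`, but how the paper does this is not stated in the abstract.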

 

A statistical method to incorporate biological knowledge for generating testable novel gene regulatory interactions from microarray experiments
Peter Larsen, Eyad Almasri, Guanrao Chen, Yang Dai
BMC Bioinformatics, 8:317, 2007. PDF

Background The incorporation of prior biological knowledge in the analysis of microarray data has become important in the reconstruction of transcription regulatory networks in a cell. Most of the current research has been focused on the integration of multiple sets of microarray data as well as curated databases for a genome scale reconstruction. However, individual researchers are more interested in the extraction of the most useful information from the data of their hypothesis-driven microarray experiments. How to compile the prior biological knowledge from literature to facilitate new hypothesis generation from a microarray experiment is the focus of this work. We propose a novel method based on the statistical analysis of reported gene interactions in PubMed literature.
Results Using Gene Ontology (GO) Molecular Function annotation for reported gene regulatory interactions in PubMed literature, a statistical analysis method was proposed for the derivation of a likelihood of interaction (LOI) score for a pair of genes. The LOI-score and the Pearson correlation coefficient of gene profiles were utilized to check if a pair of query genes would be in the above specified interaction. The method was validated in the analysis of two gene sets formed from the yeast Saccharomyces cerevisiae cell cycle microarray data. It was found that a high percentage of identified interactions share GO Biological Process annotations (39.5% for a 102 interaction enriched gene set and 23.0% for a larger 999 cyclically expressed gene set).
Conclusions This method can uncover novel biologically relevant gene interactions. With stringent confidence levels, small interaction networks can be identified for further establishment of a hypothesis testable by biological experiment. This procedure is computationally inexpensive and can be used as a preprocessing procedure for screening potential biologically relevant gene pairs subject to the analysis with sophisticated statistical methods. An Excel template of the program for calculating LOI-scores is available here.
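
The screening idea in this abstract, combining an LOI-score with the Pearson correlation of expression profiles, can be sketched roughly as follows. This is an illustration under stated assumptions rather than the authors' program: the LOI-scores are taken as given in a lookup table (the paper derives them from GO Molecular Function annotations, which is not reproduced here), and the function name `screen_pairs`, the thresholds `min_loi` and `min_abs_r`, and all sample values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def screen_pairs(profiles, loi_scores, min_loi, min_abs_r):
    """Return gene pairs that pass both an LOI cutoff and a correlation cutoff.

    profiles:   dict gene -> list of expression values.
    loi_scores: dict frozenset({gene1, gene2}) -> precomputed LOI-score
                (placeholder values here; pairs without a score are skipped).
    """
    hits = []
    for g1, g2 in combinations(sorted(profiles), 2):
        loi = loi_scores.get(frozenset((g1, g2)))
        if loi is None or loi < min_loi:
            continue
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= min_abs_r:
            hits.append((g1, g2, loi, r))
    return hits


# Made-up profiles and LOI-scores, for illustration only.
demo_profiles = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [4, 3, 2, 1.5],
    "D": [1, 1, 1, 1],
}
demo_loi = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 0.1,
    frozenset(("B", "D")): 3.0,
}
demo_hits = screen_pairs(demo_profiles, demo_loi, min_loi=1.0, min_abs_r=0.8)
```

Applying the cheaper LOI lookup before computing the correlation keeps the screen inexpensive, in line with the abstract's description of the procedure as a preprocessing step.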

 
 
 
 
 

Copyright 2006 UIC Bioengineering Department Dai Lab. All rights reserved