Incorporating Knowledge of Topology Improves Reconstruction of Interaction Networks from Microarray Data
Peter Larsen, Eyad Almasri, Guanrao Chen and Yang Dai
Lecture Notes in Bioinformatics, Vol. 4983 (eds. by I.I. Mandoiu, Raj Sunderraman, and A. Zelikovsky), Springer Verlag, pp. 434-443, 2008.
PDF
Abstract Reconstruction of biological interaction networks from high-throughput
experimental data is one of the most challenging problems
in bioinformatics. These networks have specific topologies, whose
characteristics are defined by evolutionary relationships between
proteins and the physical limitations imposed on proteins interacting in
three-dimensional space. In this study, a method is proposed that applies
the topology of known biological networks to the analysis of
microarray data for protein-protein binding interactions. In this method,
genomic biological networks are derived from the body of published
scientific literature. The numbers of interacting neighbors for proteins
of specific molecular functions are observed. That information is used
in the analysis of microarray expression data to regenerate biological
networks using a rank-based algorithm, Gene Ontology Restricted
Value Neighborhood (GRV-N). The results of this analysis demonstrate
that incorporating knowledge of network topology improves the ability
of expression analysis to reconstruct interaction networks with a high
degree of biological relevance.
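The idea of a rank-based reconstruction constrained by expected neighbor counts can be illustrated with a minimal sketch. It is not the published GRV-N algorithm: the function names, the use of absolute Pearson correlation as the ranking score, and the per-gene `max_neighbors` caps (standing in for neighbor counts observed for a molecular function) are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def rank_based_edges(profiles, max_neighbors):
    """Keep edge (g, h) only if each gene ranks the other within its own cap.

    profiles:      dict mapping gene name -> list of expression values
    max_neighbors: dict mapping gene name -> hypothetical neighbor cap,
                   e.g. chosen from the gene's molecular function (assumption)
    """
    top = {}
    for g in profiles:
        others = [h for h in profiles if h != g]
        # Rank candidate partners by absolute correlation, strongest first.
        others.sort(key=lambda h: abs(pearson(profiles[g], profiles[h])),
                    reverse=True)
        top[g] = set(others[:max_neighbors[g]])
    # An edge survives only when the neighborhood membership is mutual.
    return {tuple(sorted((g, h))) for g in top for h in top[g] if g in top[h]}


# Toy data (made up): A and B are perfectly correlated, C is only weakly related.
profiles = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [1.0, -1.0, 1.0, -1.0],
}
caps = {"A": 1, "B": 1, "C": 1}
edges = rank_based_edges(profiles, caps)
```

Under these assumptions, raising a gene's cap lets it keep more candidate partners, which is one simple way that topology knowledge could shape the reconstructed network.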
Incorporating Literature Knowledge in Bayesian Network for Inferring Gene Networks with Gene Expression Data
Eyad Almasri, Peter Larsen, Guanrao Chen and Yang Dai
Lecture Notes in Bioinformatics, Vol. 4983 (eds. by I.I. Mandoiu, Raj Sunderraman, and A. Zelikovsky), Springer Verlag, pp. 184-195, 2008.
PDF
Abstract The reconstruction of gene networks from microarray gene
expression has been a challenging problem in bioinformatics. Various
methods have been proposed for this problem. The incorporation of various
genomic and proteomic data has been shown to enhance the learning
ability in the Bayesian Network (BN) approach. However, the knowledge
embedded in the large body of published literature has not been utilized
in a systematic way. In this work, prior knowledge on gene interaction
was derived based on the statistical analysis of published interactions
between pairs of genes or gene products. This information was used (1)
to construct a structure prior and (2) to reduce the search space in the
BN algorithm. The performance of the two approaches was evaluated
and compared with the BN method without prior knowledge on two
time-course microarray gene expression data sets related to the yeast cell
cycle. The results indicate that the proposed algorithms can identify edges
in learned networks with higher biological relevance. Furthermore, the
method using literature knowledge for the reduction of the search space
outperformed the method using a structure prior in the BN framework.
Rank-based edge reconstruction for scale-free genetic regulatory networks
Guanrao Chen, Peter Larsen, Eyad Almasri, Yang Dai
BMC Bioinformatics, 9:75, 2008.
PDF
Background
The reconstruction of genetic regulatory networks from microarray gene expression data has been a challenging task in bioinformatics. Various approaches to this problem have been proposed; however, they do not take into account the topological characteristics of the targeted networks while reconstructing them.
Results
In this study, an algorithm that exploits the scale-free topology of networks was proposed based on the modification of a rank-based algorithm for network reconstruction. The new algorithm was evaluated with the use of both simulated and microarray gene expression data. The results demonstrated that the proposed algorithm outperforms the original rank-based algorithm. In addition, in comparison with the Bayesian Network approach, the results show that the proposed algorithm gives much better recovery of the underlying network when the sample size is much smaller than the number of genes.
Conclusions
The proposed algorithm is expected to be useful in the reconstruction of biological networks whose degree distributions follow the scale-free topology.
A statistical method to incorporate biological knowledge for generating testable novel gene regulatory interactions from microarray experiments
Peter Larsen, Eyad Almasri, Guanrao Chen, Yang Dai
BMC Bioinformatics, 8:317, 2007.
PDF
Background
The incorporation of prior biological knowledge in the analysis of microarray data has become important in the reconstruction of transcription regulatory networks in a cell. Most current research has focused on the integration of multiple sets of microarray data as well as curated databases for a genome-scale reconstruction. However, individual researchers are more interested in extracting the most useful information from the data of their hypothesis-driven microarray experiments. How to compile the prior biological knowledge from literature to facilitate new hypothesis generation from a microarray experiment is the focus of this work. We propose a novel method based on the statistical analysis of reported gene interactions in PubMed literature.
Results
Using Gene Ontology (GO) Molecular Function annotation for reported gene regulatory interactions in PubMed literature, a statistical analysis method was proposed for the derivation of a likelihood of interaction (LOI) score for a pair of genes. The LOI-score and the Pearson correlation coefficient of gene profiles were utilized to check whether a pair of query genes is likely to participate in such an interaction. The method was validated in the analysis of two gene sets formed from the yeast Saccharomyces cerevisiae cell cycle microarray data. It was found that a high percentage of identified interactions share GO Biological Process annotations (39.5% for a 102 interaction enriched gene set and 23.0% for a larger 999 cyclically expressed gene set).
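As a rough illustration of how a literature-derived score and expression correlation might be combined to screen gene pairs, consider the sketch below. It does not reproduce the paper's LOI formula: the LOI values are supplied as made-up input keyed by unordered pairs of GO Molecular Function labels, and the function name `screen_pairs`, the thresholds, and the toy data are all hypothetical. Treating term pairs as unordered is a simplification, since regulatory interactions are directional.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def screen_pairs(profiles, go_term, loi, min_loi, min_abs_r):
    """Return gene pairs that pass both the LOI and the correlation cutoff.

    profiles:  dict gene -> expression profile (list of floats)
    go_term:   dict gene -> one GO Molecular Function label (simplification)
    loi:       dict frozenset of GO labels -> precomputed LOI score (assumed input)
    min_loi, min_abs_r: hypothetical confidence thresholds
    """
    hits = []
    for a, b in combinations(sorted(profiles), 2):
        score = loi.get(frozenset((go_term[a], go_term[b])))
        if score is None or score < min_loi:
            continue  # the literature gives too little support for this term pair
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= min_abs_r:
            hits.append((a, b, score, r))
    return hits


# Toy inputs (all values invented for illustration).
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, -1.0, 1.0, -1.0],
}
go_term = {"g1": "TF", "g2": "kinase", "g3": "kinase"}
loi = {frozenset(("TF", "kinase")): 2.0, frozenset(("kinase",)): 0.1}
hits = screen_pairs(profiles, go_term, loi, min_loi=1.0, min_abs_r=0.8)
```

Only the cheap LOI lookup is needed to discard most pairs before any correlation is computed, which fits the abstract's description of the procedure as an inexpensive preprocessing screen.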
Conclusions
This method can uncover novel biologically relevant gene interactions. With stringent confidence levels, small interaction networks can be identified to help establish a hypothesis testable by biological experiment. This procedure is computationally inexpensive and can be used as a preprocessing step for screening potential biologically relevant gene pairs before analysis with more sophisticated statistical methods. An Excel template of the program for calculating LOI-scores is available here.