Reverse Engineering of Temporal Gene Expression Data Using Dynamic Bayesian Networks And Evolutionary Search

Salehi, Maryam
Reverse Engineering of Gene Regulatory Networks, Dynamic Bayesian Networks, Covariance Matrix Adaptation Evolutionary Search, Gene Expression Analysis
Capturing the mechanism of gene regulation in a living cell is essential for predicting the behavior of the cell in response to intercellular or extracellular factors. Such a prediction capability can potentially lead to the development of improved diagnostic tests and therapeutics [21]. Dynamic Bayesian Networks (DBNs) are among the reverse engineering approaches that aim to model gene regulation. DBNs are of particular interest because they can discover causal relationships between genes while dealing with noisy gene expression data. At the same time, finding the optimal DBN model makes structure learning of DBNs a challenging problem. This is mainly due to the high dimensionality of the search space for gene expression data, which makes exhaustive search for the best DBN structure impractical. In this work, a covariance-based evolutionary search algorithm is applied, for the first time, to structure learning of DBNs. In addition, the convergence time of the proposed algorithm is improved compared to previously reported covariance-based evolutionary search approaches. This is achieved by keeping a fixed number of good sample solutions from previous iterations. Finally, the proposed approach, M-CMA-ES, unlike gradient-based methods, has a high probability of converging to a global optimum. To assess the efficiency of this approach, a temporal synthetic dataset is developed. The proposed approach is then applied to this dataset as well as to the Brainsim dataset, a well-known simulated temporal gene expression dataset [58]. The results indicate that the proposed method is efficient in reconstructing the networks in both the synthetic and Brainsim datasets. Furthermore, it outperforms other algorithms in terms of both the accuracy of the predicted structure and the mean square error of the reconstructed time series of gene expression data.
For validation purposes, the proposed approach is also applied to a biological dataset composed of 14 cell-cycle regulated genes in the yeast Saccharomyces cerevisiae. Taking the KEGG pathway as the target network, the proposed reverse engineering approach significantly improves on the results of two previous studies of yeast cell cycle data in terms of capturing the correct interactions.
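The abstract describes speeding up convergence by keeping a fixed number of good sample solutions from previous iterations. The sketch below illustrates only that idea. It is not the M-CMA-ES algorithm: it swaps in a much simpler evolution strategy with isotropic Gaussian sampling and no covariance matrix adaptation, and it minimizes a toy objective rather than a DBN score. The function names (`sphere`, `elitist_search`) and all parameter values are assumptions made for illustration.

```python
import random


def sphere(x):
    """Toy objective (hypothetical stand-in for a DBN score): sum of squares."""
    return sum(v * v for v in x)


def elitist_search(objective, dim, iterations=200, pop_size=20,
                   n_parents=5, n_keep=3, sigma=0.5, seed=0):
    """Simplified evolution strategy that carries n_keep elite solutions
    from one iteration to the next (assumption: minimization)."""
    rng = random.Random(seed)
    mean = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
    archive = []   # (score, solution) pairs kept from previous iterations
    history = []   # best score seen so far, recorded once per iteration

    for _ in range(iterations):
        # Sample a new population around the current mean.
        samples = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                   for _ in range(pop_size)]
        scored = [(objective(s), s) for s in samples]

        # Rank new samples together with the retained elite solutions.
        pool = sorted(scored + archive, key=lambda pair: pair[0])

        # Move the mean to the average of the best candidates.
        parents = pool[:n_parents]
        mean = [sum(p[1][i] for p in parents) / n_parents for i in range(dim)]

        # Keep a fixed number of good solutions for the next iteration.
        archive = pool[:n_keep]

        # Fixed step-size decay (a simplification; CMA-ES adapts this).
        sigma *= 0.98
        history.append(archive[0][0])

    best_score, best = archive[0]
    return best, best_score, history


best, best_score, history = elitist_search(sphere, dim=5)
print("best score:", best_score)
```

Because the retained solutions always compete with the new samples, the best score recorded in `history` can never get worse from one iteration to the next. The same property holds for any choice of `n_keep` of at least 1.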