(173e) An Integer Programming Formulation to Identify the Network Architecture Governing Embryonic Stem Cell Differentiation | AIChE

Authors 

Banerjee, I. - Presenter, University of Pittsburgh


The primary purpose of modeling the gene regulatory network of a developmental process is to reveal the pathways that govern cellular differentiation to specific phenotypes. Knowledge of the differentiation network will enable the generation of desired cell fates through careful alteration of the governing network by appropriate manipulation of the cellular environment. We have developed a method to reconstruct the underlying regulatory architecture of a differentiating cell population from discrete temporal gene expression data. We exploit an inherent feature of biological networks, sparsity, to formulate the network reconstruction problem as a bi-level mixed-integer programming problem. The formulation optimizes the network topology at the upper level based on the notion of sparsity and determines the connection strengths at the lower level. The upper level is thus formulated as an integer programming problem, with a binary variable for each possible network connection, and is solved using an evolutionary algorithm. The lower level consists of continuous variables and is essentially a parameter estimation problem for each candidate network topology. The lower level is solved both by conventional regression techniques such as least squares and by evolutionary techniques such as a genetic algorithm. The non-convexity of the objective function gives rise to multiple local minima, which can be found with least squares by varying the initial guess. However, the evolutionary technique proved superior in searching the parameter space for multiple solutions, which were more informative than a single global solution given the uncertainty associated with biological systems. Likewise, the upper level showed redundancy in network connectivity. However, the connections with higher connectivity strength were found to be invariant across different solutions. Constraints of sparsity and robustness were imposed to determine the optimal network connectivity. The method is first validated on in silico data before being applied to the more complex system of embryonic stem (ES) cell differentiation.
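The bi-level structure described above can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: a linear discrete-time model x(t+1) = W x(t), a score equal to the squared residual plus a per-connection penalty, and exhaustive enumeration of topologies at the upper level in place of the evolutionary algorithm (feasible only for very small networks). The names `fit_weights`, `bilevel_search`, `penalty`, and the test matrix `W_true` are hypothetical, and the data are synthetic.

```python
import itertools
import numpy as np

def fit_weights(X, mask):
    """Lower level: least-squares connection strengths for one fixed topology.

    X has shape (T, n): T time points, n genes. mask[i, j] == 1 means gene j
    may regulate gene i. Assumed model (not from the abstract): X[t+1] = W @ X[t].
    """
    n = X.shape[1]
    past, future = X[:-1], X[1:]
    W = np.zeros((n, n))
    for i in range(n):
        active = np.flatnonzero(mask[i])
        if active.size:
            coef, *_ = np.linalg.lstsq(past[:, active], future[:, i], rcond=None)
            W[i, active] = coef
    residual = float(np.sum((future - past @ W.T) ** 2))
    return W, residual

def bilevel_search(X, penalty=1e-3):
    """Upper level: pick the binary topology minimizing residual + penalty * edges.

    Exhaustive enumeration stands in for the evolutionary algorithm here.
    """
    n = X.shape[1]
    best = None
    for bits in itertools.product((0, 1), repeat=n * n):
        mask = np.array(bits).reshape(n, n)
        W, residual = fit_weights(X, mask)
        score = residual + penalty * mask.sum()
        if best is None or score < best[0]:
            best = (score, mask, W)
    return best

# Synthetic, noise-free test data from a hypothetical sparse 3-gene network.
W_true = np.array([[0.9, 0.0, 0.0],
                   [0.5, 0.6, 0.0],
                   [0.0, -0.4, 0.8]])
X = [np.array([1.0, 0.5, -0.5])]
for _ in range(7):
    X.append(W_true @ X[-1])
X = np.array(X)

score, mask, W = bilevel_search(X)
print(mask)
```

On this noise-free example the sparsity penalty favors the smallest topology that reproduces the data. With noisy measurements several topologies may score similarly, which is the redundancy the abstract mentions.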

The differentiation process is considered to occur as a sequence of cascades, which enables modular treatment of the overall ES cell differentiation network. The current work concentrates on a specific stage of pancreatic differentiation, marked by Pdx-1 expression. Murine ES cells are first differentiated to definitive endoderm, followed by induction of the pancreatic lineage through up-regulation of the transcription factor Pdx-1. The pancreatic differentiation stage is tracked by analyzing the differentiating cell population for relevant transcription factors by qPCR. The temporal profiles of these transcription factors constitute the input data to the bi-level programming formulation, which efficiently identifies the nature and strength of interactions within the network. The identified topology successfully captured interactions that have been reported in the literature on pancreas development, confirming the validity of the proposed methodology. In the next step, the identified network is used to predict the pathway to the next stage of differentiation, which is marked by higher expression of neurogenin3. The model prediction identifies down-regulation of Foxa2 as a likely pathway leading to a significant increase in neurogenin3 expression, and this prediction is tested by performing concurrent experiments with Foxa2 siRNA. Encouragingly, up-regulation of neurogenin3 was the strongest effect of experimental Foxa2 silencing, which validates the prediction of the developed mathematical modeling framework.
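One way an identified network could be used to screen candidate perturbations is sketched below in Python. This is only an illustration under stated assumptions: the same linear model x(t+1) = W x(t) as a stand-in for the actual model, a knockdown represented by scaling one gene's level at every step, and a made-up weight matrix `W_demo` whose values are not taken from the study. The gene indices do not correspond to Foxa2, neurogenin3, or any measured factor.

```python
import numpy as np

def simulate(W, x0, steps, knockdown=None, factor=0.2):
    """Iterate x[t+1] = W @ x[t]; optionally scale one gene by `factor` each step."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        if knockdown is not None:
            x[knockdown] *= factor  # crude stand-in for siRNA silencing
        x = W @ x
    return x

def rank_knockdowns(W, x0, target, steps=5):
    """Rank genes by how much silencing each one raises the target's final level."""
    baseline = simulate(W, x0, steps)[target]
    effects = {}
    for gene in range(len(x0)):
        if gene != target:
            knocked = simulate(W, x0, steps, knockdown=gene)[target]
            effects[gene] = float(knocked - baseline)
    return sorted(effects.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical network: gene 0 represses gene 2, gene 1 activates gene 2.
W_demo = np.array([[0.9, 0.0, 0.0],
                   [0.0, 0.9, 0.0],
                   [-0.5, 0.3, 0.8]])
ranked = rank_knockdowns(W_demo, x0=[1.0, 1.0, 1.0], target=2)
print(ranked)
```

In this toy network, silencing the repressor (gene 0) ranks first because it removes a negative input to the target, while silencing the activator lowers the target.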

The developed methodology will have a significant impact on extracting regulatory information from a differentiating population of embryonic stem cells. Although the methodology is currently applied to pancreatic differentiation of murine ES cells, the framework is general enough to be applicable to any developmental system.