In silico Model for Mining the Cis-Regulatory Determinants of Tissue-Specific Gene Expression | AIChE

Authors 

Megraw, M. - Presenter, Oregon State University
Fraser, V. N., Oregon State University
Ansariola, M., Oregon State University
O'Neil, S., Oregon State University
Filichkin, S., Oregon State University
Gene expression across tissues is regulated by an unknown number of determinants, including the prevalence of transcription factors (TFs) and their binding sites, along with other aspects of cellular state. Recent studies have suggested the importance of both genetic and epigenetic factors, at least two of which have substantial literature support as causal determinants of tissue specificity: TF binding sites and chromatin accessibility at those sites. To investigate the extent and relative contributions of these potential determinants, we produced three datasets for both leaf and root tissues of Arabidopsis thaliana plants: TSS-seq data to identify transcription start sites (TSSs), OC-seq data to identify regions of open chromatin, and RNA-seq data to assess gene expression levels. For genes that are differentially expressed between root and leaf, we constructed a model incorporating chromatin accessibility with TF binding information upstream of TSS locations, with the goal of predicting the tissue in which each of these genes would be upregulated. The resulting model was highly accurate when both chromatin structure and sequence were considered (over 90% auROC and auPRC), allowing one to predict the tissue in which a given gene will be expressed. Specifically, one can use the model to (1) create “in silico knockouts” of TFs that strongly influence the predicted tissue of expression, and (2) identify collections of TFs whose absence moves a specific native gene promoter across the decision boundary to predict expression in a different tissue. The ultimate goal of this work is to develop a rational design process for the tissue-specific targeting of plant gene promoters.
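
The two uses of the model described in the abstract can be illustrated with a minimal sketch. The abstract does not state the model class, so this example assumes a simple logistic model over per-TF features (imagined here as binding-site scores weighted by chromatin accessibility). The TF names, weights, bias, and feature values are all hypothetical and chosen only for illustration; they are not results from this work.

```python
import math
from itertools import combinations

# Hypothetical weights (assumption): positive values push the prediction
# toward "root", negative values toward "leaf".
WEIGHTS = {"TF_A": 1.5, "TF_B": -2.0, "TF_C": 0.8}
BIAS = -0.2


def predict_root_probability(features):
    """Logistic score for a promoter described by per-TF feature values."""
    z = BIAS + sum(w * features.get(tf, 0.0) for tf, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def predicted_tissue(features):
    """Classify at the 0.5 decision boundary."""
    return "root" if predict_root_probability(features) >= 0.5 else "leaf"


def knockout(features, tfs):
    """In silico knockout: copy the features with the given TFs set to zero."""
    return {tf: (0.0 if tf in tfs else value) for tf, value in features.items()}


def single_knockout_effects(features):
    """Change in predicted root probability when each TF is removed alone."""
    base = predict_root_probability(features)
    return {
        tf: predict_root_probability(knockout(features, {tf})) - base
        for tf in features
    }


def flipping_knockout_sets(features, max_size=2):
    """TF collections whose removal moves the promoter across the boundary."""
    base = predicted_tissue(features)
    hits = []
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(features), size):
            if predicted_tissue(knockout(features, set(combo))) != base:
                hits.append(combo)
    return hits


# Hypothetical promoter: feature values are made up for this example.
promoter = {"TF_A": 0.9, "TF_B": 0.3, "TF_C": 0.5}
print(predicted_tissue(promoter))
print(single_knockout_effects(promoter))
print(flipping_knockout_sets(promoter, max_size=2))
```

With these made-up numbers, the promoter is predicted as root-expressed, and removing `TF_A` (alone or together with `TF_C`) flips the prediction to leaf. An exhaustive search over TF subsets grows combinatorially, so a real analysis would likely need to cap the subset size or use a guided search.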