A Bayesian Hidden Markov Model for Detecting Differentially Methylated Regions | AIChE

Authors 

Ji, T. - Presenter, University of Missouri at Columbia
Alterations in DNA methylation have been linked to the development and progression of many diseases. Bisulfite sequencing provides methylation profiles at single-base resolution: counts of methylated and unmethylated reads provide information about the methylation level at each CpG site. As more bisulfite sequencing data become available, there is a growing need to use such data to infer methylation aberrations related to a given disease of interest. Specifically, automated and powerful algorithms are needed that can accurately identify differentially methylated regions between treatment groups. We adopt a Bayesian approach using a hidden Markov model to account for the inherent dependence in read count data. Because sequencing experiments are expensive, few replicates are available for each treatment group, and a Bayesian approach that borrows information across an entire chromosome improves the reliability of statistical inference. The proposed hidden Markov model accounts for location dependence among genomic loci by incorporating correlation structures that are a function of genomic distance. An iterative algorithm based on expectation-maximization is designed for parameter estimation. Methylation states are inferred by identifying the most likely sequence of latent states given the observations. Real datasets, together with simulation studies based on the structure of those datasets, are used to illustrate the reliability and effectiveness of the proposed method.
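
The abstract does not give the model's equations, so the following Python sketch only illustrates the general idea of decoding latent states from read counts when the dependence between neighboring CpG sites weakens with genomic distance. Everything specific in it is an assumption made for illustration and is not taken from the authors' method: the two states ("low" and "high" methylation in a single sample), the binomial emission model, the fixed methylation probabilities, the exponential decay of correlation with a 500 bp scale, and the function names. Parameters are fixed by hand rather than estimated by expectation-maximization, and the comparison between treatment groups is left out.

```python
import math

# Illustrative assumptions, not values from the abstract.
STATES = ("low", "high")
METH_PROB = {"low": 0.1, "high": 0.9}  # assumed per-state methylation probability
DECAY_BP = 500.0                       # assumed correlation decay scale in base pairs


def log_binom(k, n, p):
    """Log-probability of k methylated reads out of n under Binomial(n, p)."""
    log_coef = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coef + k * math.log(p) + (n - k) * math.log(1.0 - p)


def log_transition(same_state, distance_bp):
    """Log transition probability between adjacent CpG sites.

    Nearby sites tend to keep the same state; as the distance grows,
    the transition approaches a fair coin flip (no dependence).
    """
    rho = math.exp(-distance_bp / DECAY_BP)
    prob = 0.5 + 0.5 * rho if same_state else 0.5 - 0.5 * rho
    return math.log(max(prob, 1e-300))  # guard against log(0) at distance 0


def viterbi(positions, meth, total):
    """Return the most likely state sequence for the given read counts."""
    score = [{s: math.log(0.5) + log_binom(meth[0], total[0], METH_PROB[s])
              for s in STATES}]
    back = []
    for t in range(1, len(positions)):
        dist = positions[t] - positions[t - 1]
        row, ptr = {}, {}
        for s in STATES:
            # Best predecessor state for ending in state s at site t.
            prev = max(STATES,
                       key=lambda r: score[-1][r] + log_transition(r == s, dist))
            row[s] = (score[-1][prev] + log_transition(prev == s, dist)
                      + log_binom(meth[t], total[t], METH_PROB[s]))
            ptr[s] = prev
        score.append(row)
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    positions = [100, 150, 200, 5000, 5050, 5100]  # CpG coordinates (bp)
    meth = [1, 0, 2, 9, 10, 8]                     # methylated read counts
    total = [10, 10, 10, 10, 10, 10]               # total read counts
    print(viterbi(positions, meth, total))
    # prints ['low', 'low', 'low', 'high', 'high', 'high']
```

In this sketch an ambiguous site, for example 5 methylated reads out of 10, is assigned the state of its close neighbors, because switching states over a short distance is penalized. That is one simple way a distance-dependent correlation structure can stabilize calls when read depth or the number of replicates is low.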