CaltechAUTHORS
  A Caltech Library Service

Controlling for conservation in genome-wide DNA methylation studies

Singer, Meromit and Pachter, Lior (2015) Controlling for conservation in genome-wide DNA methylation studies. BMC Genomics, 16. Art. No. 420. ISSN 1471-2164. PMCID PMC4448855. doi:10.1186/s12864-015-1604-3. https://resolver.caltech.edu/CaltechAUTHORS:20170303-133219740

PDF - Published Version (Creative Commons Attribution, 1MB)
PDF (Supplementary figures) - Supplemental Material (Creative Commons Attribution, 1MB)
PDF (Supplementary text) - Supplemental Material (Creative Commons Attribution, 260kB)

Abstract

BACKGROUND: A commonplace analysis in high-throughput DNA methylation studies is the comparison of methylation extent between different functional regions, computed by averaging methylation states within region types and then comparing averages between regions. For example, it has been reported that methylation is more prevalent in coding regions as compared to their neighboring introns or UTRs, leading to hypotheses about novel forms of epigenetic regulation.

RESULTS: We have identified and characterized a bias present in these seemingly straightforward comparisons that results in the false detection of differences in methylation intensities across region types. This bias arises due to differences in conservation rates, rather than methylation rates, and is broadly present in the published literature. When controlling for conservation at coding start sites the differences in DNA methylation rates disappear. Moreover, a re-evaluation of methylation rates at intron-exon junctions reveals that the magnitude of previously reported differences is greatly exaggerated. We introduce two correction methods to address this bias, an inference-based matrix completion algorithm and an averaging approach, tailored to address different underlying biological questions. We evaluate how analysis using these corrections affects the detection of differences in DNA methylation across functional boundaries.

CONCLUSIONS: We report here on a bias in DNA methylation comparative studies that originates in conservation rate differences and manifests itself in the false discovery of differences in DNA methylation intensities and their extents. We have characterized this bias and its broad implications, and show how to control for it so as to enable the study of a variety of biological questions.
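The kind of averaging bias the abstract describes can be illustrated with a toy example. The sketch below is not the authors' COMPARE software and does not reproduce either of their correction methods; the data, the function names `naive_region_means` and `paired_region_means`, and the use of `None` to mark a site missing from one region (for example, because it is not conserved) are all assumptions made for illustration. Every gene has the same methylation level in both regions, yet averaging each region over whatever values happen to be present suggests a difference.

```python
# Toy data, invented for illustration: each row is a gene, each column a region type.
# Within a gene the methylation level is identical in both regions, so there is
# no true difference between regions. None marks a missing (unobserved) site.
methylation = [
    # (region_a, region_b)
    (0.9, 0.9),
    (0.9, 0.9),
    (0.9, None),   # highly methylated gene, missing in region_b
    (0.1, 0.1),
    (None, 0.1),   # lowly methylated gene, missing in region_a
    (None, 0.1),
]


def naive_region_means(rows):
    """Average each region over whatever values are present in that region."""
    means = []
    for col in range(2):
        values = [row[col] for row in rows if row[col] is not None]
        means.append(sum(values) / len(values))
    return tuple(means)


def paired_region_means(rows):
    """Average each region using only genes observed in both regions."""
    complete = [row for row in rows if None not in row]
    return naive_region_means(complete)


naive = naive_region_means(methylation)
paired = paired_region_means(methylation)
print("naive means (region_a, region_b): ", naive)
print("paired means (region_a, region_b):", paired)
```

In this toy setting the naive means differ only because the pattern of missing values differs between regions, while restricting to genes observed in both regions removes the apparent difference. Dropping incomplete genes is just one simple way to make the comparison fair here; it is a stand-in for, not a description of, the averaging and matrix completion corrections the paper introduces.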


Item Type: Article
Related URLs:
URL | URL Type | Description
http://dx.doi.org/10.1186/s12864-015-1604-3 | DOI | Article
http://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-015-1604-3 | Publisher | Article
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4448855/ | PubMed Central | Article
ORCID:
Author | ORCID
Pachter, Lior | 0000-0002-9164-6231
Additional Information: © Singer and Pachter; licensee BioMed Central. 2015. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Received: 15 April 2015. Accepted: 1 May 2015. Published: 30 May 2015.
We thank Yael Mandel-Gutfreund, Idit Kosti and Asaf Zemach for helpful feedback, as well as Nicolas Bray and other members of the Pachter lab for many insightful discussions. L.P. and M.S. were partially funded by NIH R01 HG006129.
Authors’ contributions: MS and LP conceived the study and conducted the mathematical characterization, statistical analysis and design of correction methods. MS implemented the COMPARE software and conducted the data analysis. MS and LP wrote the manuscript. Both authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Funders:
Funding Agency | Grant Number
NIH | R01 HG006129
Subject Keywords: Averaging; Conservation; Comparative analysis; Missing data; DNA methylation; Junctions; Intron; Exon; Coding
PubMed Central ID: PMC4448855
DOI: 10.1186/s12864-015-1604-3
Record Number: CaltechAUTHORS:20170303-133219740
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20170303-133219740
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 74707
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 03 Mar 2017 23:41
Last Modified: 11 Nov 2021 05:29
