Published June 2012 | Supplemental Material + Accepted Version
Journal Article Open

Systematic evaluation of factors influencing ChIP-seq fidelity


We evaluated how variations in sequencing depth and other parameters influence interpretation of chromatin immunoprecipitation–sequencing (ChIP-seq) experiments. Using Drosophila melanogaster S2 cells, we generated ChIP-seq data sets for a site-specific transcription factor (Suppressor of Hairy-wing) and a histone modification (H3K36me3). We detected a chromatin-state bias: open chromatin regions yielded higher coverage, which led to false positives if not corrected. This bias had a greater effect on detection specificity than any base-composition bias. Paired-end sequencing revealed that single-end data underestimated ChIP-library complexity at high coverage. Removal of reads originating at the same base reduced false positives but had little effect on detection sensitivity. Even at a mappable-genome coverage depth of ~1 read per base pair, ~1% of the narrow peaks detected on a tiling array were missed by ChIP-seq. Evaluation of widely used ChIP-seq analysis tools suggests that adjustments or algorithm improvements are required to handle data sets with deep coverage.
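As a rough illustration of the filtering step mentioned in the abstract ("removal of reads originating at the same base"), the following minimal Python sketch keeps one read per start position. It is not the authors' pipeline: the `Read` record, the function names, and the choice to treat the two strands separately are assumptions made for this example. The helper `reads_per_mappable_base` likewise reflects one possible reading of "~1 read per base pair" (mapped reads divided by mappable genome size), which the abstract does not define precisely.

```python
from typing import Iterable, List, NamedTuple


class Read(NamedTuple):
    """Hypothetical minimal read record (not a real library type)."""
    chrom: str
    start: int   # 5' start position of the read, 0-based
    strand: str  # "+" or "-"


def remove_same_start_reads(reads: Iterable[Read]) -> List[Read]:
    """Keep the first read seen at each (chrom, start, strand) position.

    Assumption: reads on opposite strands that share a coordinate are
    treated as distinct, which is a common but not universal convention.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.start, read.strand)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


def reads_per_mappable_base(n_reads: int, mappable_bp: int) -> float:
    """Assumed depth measure: mapped reads per mappable base pair."""
    if mappable_bp <= 0:
        raise ValueError("mappable_bp must be positive")
    return n_reads / mappable_bp


# Toy input: two reads share the same chromosome, start and strand.
reads = [
    Read("chr2L", 100, "+"),
    Read("chr2L", 100, "+"),  # same start and strand as the read above
    Read("chr2L", 100, "-"),  # same coordinate, opposite strand: kept
    Read("chr2L", 250, "+"),
]
unique_reads = remove_same_start_reads(reads)
```

In this toy input, only the second read is dropped. Real data sets would normally be filtered on aligned reads with an established tool rather than on an in-memory list.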

Additional Information

© 2012 Nature America, Inc. Received 23 November 2011; Accepted 26 March 2012; Published online 22 April 2012.

We thank the authors of all of the algorithms that we evaluated in this study: H. Ji, R. Jothi, P. Kharchenko, W. Li, D. Nix, J. Rozowsky and A. Valouev. We thank N. Bild, D. Roqueiro and M. Sabala for help in performing PeakSeq on the Bionimbus Cloud, D. Schmidt and D. Odom for sharing their sequencing data of the ENCODE spike-in sample, A. Kundaje for sharing his unpublished results on IDR analysis of H3K36me3 in humans, N. Rashid for sharing the mappability data of the Drosophila genome, M. Greenberg for support in the early stage of this project, and E. Birney, M. Snyder, J. Ahringer, M. Gerstein, M. Kellis, P. Park and other members of the modENCODE consortium for helpful discussions.

This work was partially funded by the US National Institutes of Health (HG4069 to X.S.L., 3U01 HG004270-03S1 to X.S.L. and J.D.L., and U01HG004264 to K.P.W.).

Author Contributions: Y.C. performed bioinformatic analysis. N.N. performed cell culture, ChIP experiments and library preparation with help from J.Z. J.O.M. performed library preparation and sequencing experiments. Q.L. and P.J.B. contributed code for the IDR method. Q.L. participated in writing the description of the IDR method and in interpreting the IDR analysis result. M.S. performed ChIP–quantitative (q)PCR validation of the selected array-specific Su(Hw) peaks and analyzed the ChIP-qPCR data. T.L., Y.Z., T.-K.K., H.H.H., Y.R., R.M.M. and B.J.W. contributed to the early development of the project. B.J.W., K.P.W., J.D.L. and X.S.L. conceived the project. T.-K.K., H.H.H., Y.R. and R.M.M. performed pilot experiments. Y.C., J.D.L. and X.S.L. wrote the manuscript with help from the other authors.

The authors declare no competing financial interests.

Attached Files

Accepted Version - nihms-397907.pdf

Supplemental Material - nmeth.1985-S1.pdf



Additional details

August 22, 2023
October 17, 2023