A Caltech Library Service

Effects of sequence variation on differential allelic transcription factor occupancy and gene expression

Reddy, Timothy E. and Gertz, Jason and Pauli, Florencia and Kucera, Katerina S. and Varley, Katherine E. and Newberry, Kimberly M. and Marinov, Georgi K. and Mortazavi, Ali and Williams, Brian A. and Song, Lingyun and Crawford, Gregory E. and Wold, Barbara J. and Willard, Huntington F. and Myers, Richard M. (2012) Effects of sequence variation on differential allelic transcription factor occupancy and gene expression. Genome Research, 22 (5). pp. 860-869. ISSN 1088-9051. PMCID PMC3337432.

PDF - Published Version
Creative Commons Attribution Non-commercial.

PDF - Supplemental Material (3 files)
Creative Commons Attribution Non-commercial.


A complex interplay between transcription factors (TFs) and the genome regulates transcription. However, connecting variation in genome sequence with variation in TF binding and gene expression is challenging due to environmental differences between individuals and cell types. To address this problem, we measured genome-wide differential allelic occupancy of 24 TFs and EP300 in a human lymphoblastoid cell line GM12878. Overall, 5% of human TF binding sites have an allelic imbalance in occupancy. At many sites, TFs clustered in TF-binding hubs on the same homolog in especially open chromatin. While genetic variation in core TF binding motifs generally resulted in large allelic differences in TF occupancy, most allelic differences in occupancy were subtle and associated with disruption of weak or noncanonical motifs. We also measured genome-wide differential allelic expression of genes with and without heterozygous exonic variants in the same cells. We found that genes with differential allelic expression were overall less expressed both in GM12878 cells and in unrelated human cell lines. Comparing TF occupancy with expression, we found strong association between allelic occupancy and expression within 100 bp of transcription start sites (TSSs), and weak association up to 100 kb from TSSs. Sites of differential allelic occupancy were significantly enriched for variants associated with disease, particularly autoimmune disease, suggesting that allelic differences in TF occupancy give functional insights into intergenic variants associated with disease. Our results have the potential to increase the power and interpretability of association studies by targeting functional intergenic variants in addition to protein coding sequences.
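As an illustration of what "allelic imbalance in occupancy" can mean in practice, the sketch below tests whether reads at a heterozygous site split evenly between the two alleles. It is a minimal example under stated assumptions, not the method used in the article: the exact two-sided binomial test against a 50/50 null, the function name `binomial_two_sided_p`, and the read counts are all hypothetical choices made for illustration.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for an allelic read split.

    Assumption (not taken from the article): under the null hypothesis,
    each read comes from the reference allele with probability p.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance

    def pmf(k: int) -> float:
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    observed = pmf(ref_reads)
    # Sum the probabilities of all outcomes that are no more likely than
    # the observed one; the small tolerance guards against float ties.
    total = sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical read counts (reference allele, alternate allele) per site.
example_sites = {"site_A": (10, 10), "site_B": (20, 0), "site_C": (14, 6)}
for name, (ref, alt) in example_sites.items():
    print(name, ref, alt, binomial_two_sided_p(ref, alt))
```

A genome-wide analysis would also need a correction for multiple testing and for mapping bias toward the reference allele; neither is handled in this sketch.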

Item Type: Article
Related URLs:
URL Type: DOI, Description: Article
URL Type: Central, Description: Article
ORCID:
Marinov, Georgi K.: 0000-0003-1822-7273
Mortazavi, Ali: 0000-0002-4259-6362
Wold, Barbara J.: 0000-0003-3235-8130
Additional Information: © 2012 Cold Spring Harbor Laboratory Press. This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date. After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License). Received August 26, 2011; accepted February 1, 2012. We thank Chris Gunter, Greg Cooper, and the members of the Myers lab for contributions and suggestions. This work was funded by NHGRI ENCODE Grant 5U54HG004576 to R.M.M. and B.W. Support for T.E.R. was from NIH/NIAMS fellowship 5T32AR007450. Authors’ contributions: T.E.R., J.G., K.E.V., H.F.W., and R.M.M. conceived and designed the study. T.E.R. performed and interpreted the analysis and wrote the manuscript. J.G. carried out the cloning-based validation of the RNA-seq experiments. F.P. and K.M.N. carried out the ChIP-seq experiments and RNA-seq experiments for the clonal isolates of GM12878. L.S. and G.E.C. performed and contributed to the interpretation of the DNase I hypersensitivity experiments. K.S.K. and H.F.W. designed and created the clonal GM12878 isolates, including determining the X inactivation state. G.K.M., A.M., B.A.W., and B.W. designed and performed the RNA-seq experiments. All authors contributed to the editing of the manuscript.
Funders:
Funding Agency: NIH Predoctoral Fellowship, Grant Number: 5T32AR007450
Issue or Number: 5
PubMed Central ID: PMC3337432
Record Number: CaltechAUTHORS:20120530-114833949
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 31714
Deposited By: Jason Perez
Deposited On: 30 May 2012 23:13
Last Modified: 29 Oct 2019 23:03