In this study we report genome-wide methylation patterns measured with 450k methylation chips in multiple peripheral and internal tissues from two independent sets of donors. Although the 450k platform interrogates only a small subset of the ~28M CpG sites in the human genome, it covers promoter regions and CpG islands relatively comprehensively, and it also covers other potentially relevant features, including downstream genic and intergenic regions. A new algorithm identified statistically robust tDMRs, as illustrated by a statistically significant overlap in the location of tDMRs between the two datasets. The biological relevance of the identified tDMRs was highlighted by the observation that they mapped to genes with tissue-specific expression and were hypomethylated specifically in the tissue expressing those genes. Annotation of tDMRs showed that they can occur irrespective of their position relative to genes or of local CpG density. Tissue-specific DNA methylation was most evident, however, in both absolute and relative terms, in regions outside CGIs and CGI flanking regions. This confirms previous studies reporting a high prevalence of CpG-poor regions near genes with tissue-specific expression in both humans [2, 3, 7] and animals [24, 25].
One of our key findings is that the role of non-CGI tDMRs may frequently involve the regulation of alternative transcription. Tissue-specific methylation was associated with alternative transcription start sites and, despite their sparse coverage on the 450k chip, with mutually exclusive exons and cassette exons. A previous study adopting a descriptive approach combined with functional validation suggested a primary role for DNA methylation at CGIs in alternative transcription. Although we could confirm tissue-specific methylation at the CGIs with a validated effect on alternative transcription from that study, our statistical approach highlighted the role of non-CGI regions at alternative transcription start sites. Interestingly, a recent study also supported a role for DNA methylation in controlling mutually exclusive exons, underlining the validity of our results. The link between DNA methylation, non-CGI sequences and alternative transcription arising from our data is in line with their hypothesized role in vertebrate evolution.
Recent studies of differential methylation between tissues emphasized the occurrence of tDMRs outside proximal promoters, whether CGI or non-CGI. For example, studies of animal models [5, 9] and subsequently of humans underscored the occurrence of tDMRs in gene-body CGIs. Although the 450k chip comprehensively assesses methylation at CGIs, only ~4% of the tDMRs detected in our study mapped to a gene-body CGI. Another feature that has attracted significant attention is CGI shores, the 2 kb regions flanking CGIs. Irizarry et al. reported that 76% of the tDMRs they identified overlapped with CGI shores. Inspired by this work, the 450k chip was designed with the specific aim of covering CGI shores. Nevertheless, CGI-shore tDMRs made up only ~25% of the total number of tDMRs in our data. Our data did indicate that tissue-specific methylation at CGIs and CGI shores may be more relevant in downstream genic regions, which remain poorly studied. Of note, we found that differentially methylated CGI shores, like differentially methylated CGIs, were associated with genes involved in housekeeping and developmental processes. In contrast, tDMRs overlapping with so-called CGI shelves (the regions flanking CGI shores) mapped to genes associated with tissue-specific processes, as was observed for non-CGI tDMRs. Our results indicate that the occurrence of tDMRs may be less biased towards previously suggested annotations, including gene-body CGIs and CGI shores, and reinforce the potential utility of reconsidering current definitions of CGI annotations [12, 28–30].
The annotation of tDMRs has thus far primarily focussed on CG content and location relative to genes. Growing knowledge of genome biology allows a more in-depth annotation. The ENCODE project mapped DNase I hypersensitive sites (DHSs), which are informative markers of regulatory DNA, and transcription factor binding sites (TFBSs) across 349 cell lines. Both DHSs and TFBSs were enriched for tDMRs in our study. TFBS enrichment was observed for transcription factors (TFs) with a tissue-specific function, and the binding sites of these TFs were hypomethylated in the tissue in which the TFs are expressed. These results are in accordance with the hypothesis that TF binding is associated with hypomethylation of TFBSs [31, 32].
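As an illustration of how an enrichment of this kind can be quantified, the sketch below computes a fold enrichment and a one-sided hypergeometric p-value for the overlap between tDMRs and an annotation such as DHSs. It is a minimal example under stated assumptions, not the test used in this study: the choice of the hypergeometric model, the function names and all counts are hypothetical and serve only to show the calculation.

```python
from math import comb


def fold_enrichment(total, annotated, tdmrs, tdmrs_annotated):
    """Fraction of tDMRs inside the annotation divided by the background fraction."""
    return (tdmrs_annotated / tdmrs) / (annotated / total)


def enrichment_p_value(total, annotated, tdmrs, tdmrs_annotated):
    """One-sided hypergeometric tail: P(X >= tdmrs_annotated).

    total            number of tested regions (background)
    annotated        background regions overlapping the annotation (e.g. DHSs)
    tdmrs            number of regions called as tDMRs
    tdmrs_annotated  tDMRs that overlap the annotation
    """
    upper = min(annotated, tdmrs)
    # Count the ways to draw k annotated and (tdmrs - k) non-annotated regions.
    tail = sum(
        comb(annotated, k) * comb(total - annotated, tdmrs - k)
        for k in range(tdmrs_annotated, upper + 1)
    )
    return tail / comb(total, tdmrs)


# Hypothetical counts, chosen only to demonstrate the calculation.
example_fold = fold_enrichment(1000, 200, 100, 40)
example_p = enrichment_p_value(1000, 200, 100, 40)
```

The tail is summed with exact integer arithmetic and converted to a float only once, which avoids rounding error from adding many small terms. For array-scale counts the exact sum becomes slow, and a library implementation of the same distribution would be the more practical choice.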
Although the largest variation in DNA methylation was observed between tissues, inter-individual variation is more relevant from the perspective of epigenetic epidemiology, which aims to identify epigenetic risk factors for disease. Epidemiological studies, however, often have to rely on accessible peripheral tissues as proxies for the internal organs directly involved in the aetiology of the disease of interest. Our exploration of the concordance between blood and internal tissues at CpG sites with variable DNA methylation suggested good correlations for a subset of variable CpG sites, many of which were locus- and tissue-specific. Variable CpGs that correlate across blood and all internal tissues may be primarily mediated by the effects of SNPs on DNA methylation and may not necessarily represent a genuine epigenetic correlation. The initial evidence that blood DNA methylation may correlate with that of internal tissues, as presented here, and with that of brain regions, as reported previously, warrants investigations of more individuals and more tissues, such as those in the GTEx project, to work towards an atlas cataloguing those variably methylated regions in internal tissues that could potentially be studied indirectly by assessing their DNA methylation in specific peripheral tissues.
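The concordance analysis described above can be sketched as a per-CpG correlation across individuals between methylation in blood and in one internal tissue. The example is illustrative only: the use of Pearson correlation, the threshold of 0.8, the CpG identifiers and the beta values are all assumptions introduced here, not parameters or data from this study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; NaN if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def concordant_cpgs(blood, tissue, min_r=0.8):
    """Return {cpg: r} for CpGs whose blood and tissue values correlate at r >= min_r.

    blood and tissue map a CpG identifier to beta values listed in the same
    order of individuals. min_r is a hypothetical cut-off.
    """
    result = {}
    for cpg in blood.keys() & tissue.keys():
        r = pearson(blood[cpg], tissue[cpg])
        if r >= min_r:  # a NaN correlation compares False and is skipped
            result[cpg] = r
    return result


# Hypothetical beta values for four individuals (not data from this study).
blood = {"cg_a": [0.10, 0.20, 0.30, 0.40], "cg_b": [0.10, 0.20, 0.30, 0.40]}
liver = {"cg_a": [0.20, 0.40, 0.60, 0.80], "cg_b": [0.40, 0.30, 0.20, 0.10]}
hits = concordant_cpgs(blood, liver)
```

A high correlation at a single CpG does not by itself distinguish a genuine epigenetic correlation from a SNP-mediated one, so a screen of this kind would need to be followed by an assessment of nearby genetic variation.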