DNA demethylation by 5-aza-2′-deoxycytidine is imprinted, targeted to euchromatin, and has limited transcriptional consequences
© Ramos et al.; licensee BioMed Central. 2015
Received: 24 November 2014
Accepted: 25 February 2015
Published: 17 March 2015
DNA methylation can be abnormally regulated in human disease and associated with effects on gene transcription that appear to be causally related to pathogenesis. The potential to use pharmacological agents that reverse this dysregulation is therefore an attractive possibility. To test how 5-aza-2′-deoxycytidine (5-aza-CdR) influences the genome therapeutically, we exposed non-malignant cells in culture to the agent and used genome-wide assays to assess the cellular response.
We found that cells allowed to recover from 5-aza-CdR treatment only partially recover DNA methylation levels, retaining an epigenetic ‘imprint’ of drug exposure. We show very limited transcriptional responses to demethylation of not only protein-coding genes but also loci encoding non-coding RNAs, with a limited proportion of the induced genes acquiring new promoter activation within gene bodies. The data revealed an uncoupling of DNA methylation effects at promoters, with demethylation mostly unaccompanied by transcriptional changes. The limited panel of genes induced by 5-aza-CdR resembles those activated in other human cell types exposed to the drug and represents loci targeted for Polycomb-mediated silencing in stem cells, suggesting a model for the therapeutic effects of the drug.
Our results do not support the hypothesis that DNA methylation has a predominant role in regulating transcriptional noise in the genome and indicate that DNA methylation acts only as part of a larger complex system of transcriptional regulation. The targeting of 5-aza-CdR effects, with its clastogenic consequences, to euchromatin raises concerns that the use of 5-aza-CdR has innate tumorigenic consequences, requiring its cautious use in diseases involving epigenetic dysregulation.
With the increasing recognition that disturbances in DNA methylation (5-methylcytosine (5mC)) occur in a variety of human diseases, attention is focusing on how these insights could translate into therapeutic approaches. The field of epigenetic therapeutics has its foundations in cancer biology, but the recognition that epigenetic regulatory mechanisms appear to be contributing to diseases other than cancer has prompted discussion of the use of these agents in a broader spectrum of diseases. Epigenetic therapies target DNA methylation and post-translational modifications of histones, including acetylation and methylation, through the enzymes that add these covalent marks. As DNA methylation is currently the best studied of all candidate epigenetic regulators in human diseases, much attention has focused on DNA methyltransferase (DNMT) inhibitors. Several agents have been described to act as DNMT inhibitors: the nucleoside inhibitors 5-azacytidine (5-aza-CR), 5-aza-2′-deoxycytidine (5-aza-CdR), and zebularine; the non-nucleoside inhibitors procaine, epigallocatechin-3-gallate (EGCG), and hydralazine; and the direct DNMT inhibitor RG108 [3,4]. Of these, 5-aza-CdR (decitabine) has been found to be the most effective at demethylating DNA and is approved for the treatment of myelodysplastic syndrome (MDS) in human subjects.
Once incorporated into the genome, 5-aza-CdR is recognized by mammalian DNMT1, which becomes irreversibly bound to the nucleoside and unable to perform its catalytic functions, and is then prematurely degraded, potentially through ubiquitin-dependent proteasomal degradation. The demethylation of the genome, especially in promoter regions, is a goal of oncological therapy, prompted by observations of the acquisition of DNA methylation at transcription start sites and the associated transcriptional silencing of tumor-suppressor genes. Resistance to 5-aza-CdR has been found to involve differences in rates of incorporation of the nucleoside into DNA. We have previously found that CD34+ hematopoietic stem and progenitor cells (HSPCs) from patients with MDS have distinctive DNA methylation patterns when compared with CD34+ HSPCs from control subjects and that treatment with 5-aza-CR induces loss of DNA methylation at promoters in these cells. In cell models of leukemia, genomic studies have indicated that 5-aza-CR and 5-aza-CdR both induce demethylation of CG dinucleotide-rich CpG islands at promoters, but these promoter changes are not associated with transcriptional effects at those genes. We have also previously observed that long-term hematopoietic stem cells (HSCs, lineage−/CD34+/CD38−/CD90+) in MDS have abnormal DNA methylation compared with the same cell type from healthy control subjects and that treatment with 5-aza-CR does not influence the levels of mosaicism for cytogenetic abnormalities in these HSCs, indicating that the therapeutic response is through effects on the functional properties of these neoplastic cells rather than their eradication.
A major concern with the use of DNMT inhibitors is their potential to induce genomic rearrangements, traditionally attributed to global demethylation based on cytogenetic observations made in the immunodeficiency, centromeric region instability, facial anomalies (ICF) syndrome but also attributable to the formation of DNMT1 adducts in cells treated by 5-aza-CdR.
The genomic response to DNMT inhibitors is one of global demethylation, but there is some heterogeneity of response of loci within the genome. Among the regions undergoing demethylation, some lose while others retain nucleosomal occupancy, indicating that transcriptional regulatory processes are not primarily driven by DNA methylation and can be decoupled. With advances in technologies that allow genome-wide studies of DNA methylation, chromatin constituents, and transcription, we now have greater opportunity for more extensive insights into the effects of DNMT inhibitors than prior studies, which tended to focus on gene promoter effects. In particular, we were interested in following up on a prior paradoxical observation that demonstrated DNA methylation to be enriched at DNase hypersensitive, early-replicating euchromatin in non-cancer cell lines, suggesting that the effects of a DNA demethylating agent would be more likely to target this gene- and transcription-rich genomic compartment. Another question raised was whether global DNA demethylation is causally associated with the emergence of cryptic promoters. This idea was prompted by a long-established hypothesis that DNA methylation evolved to allow the suppression of transcriptional noise as genome sizes expanded, helping to prevent spurious polymerase-DNA interactions. With these questions in mind, we performed a study to test how 5-aza-CdR influences DNA methylation, RNA polymerase II localization, and gene expression in cultured human embryonic kidney (HEK) 293 T cells. As our previous studies were performed in primary cell types [8,10], our preference was to avoid using the cell lines derived from malignant cells that are typically used in DNMT1 inhibitor studies. However, to allow our studies to be reproduced, we chose to use the commonly used HEK 293 T cell line, which is transformed but not derived from cancer cells.
Imprinted global demethylation following 5-aza-CdR treatment
DNA demethylation by 5-aza-CdR is predominantly targeted to euchromatin
Limited transcriptional effects of DNA demethylation by 5-aza-CdR
Effects of DNA demethylation within bodies of genes
Characteristics of subset of protein-coding genes with transcriptional changes
With increasing interest in the use of epigenetically active drugs to ameliorate human diseases, it is valuable to define how these drugs work on a genomic level, an area of research that has been relatively under-explored to date, especially in non-malignant cells. We confirm previous observations that 5-aza-CdR is extremely powerful in demethylation of the genome, with the novel observation that the demethylation targets gene-rich, transcriptionally active euchromatin. We also observe a long-lasting imprint in cells no longer exposed to the drug, which has previously been found to occur at the majority of loci in the genome of HCT116 cells treated with 0.3 μM of 5-aza-CdR, with the demethylation imprint still present 42 days after drug exposure. This is of interest as a potential model of acquired resistance to an epigenetically active drug - if a locus is already demethylated, it has become refractory to further demethylation. In spite of the disproportionate targeting of euchromatin, our data are consistent with prior reports indicating that transcriptional effects are limited to a small subset of protein-coding genes, with a similar effect on lncRNAs. This is despite the measurable demethylation of RefSeq and RNAPII-Ser5 (P)-defined promoters and the emergence of new promoters defined by loci acquiring RNAPII-Ser5 (P). A clue to the mechanism of transcriptional activation by 5-aza-CdR comes from the discovery that the intragenic acquisition of RNAPII-Ser5 (P) appears to be associated with the activation of previously silenced genes, possibly supporting a model of limited cryptic promoter activation. The fact that we do not observe substantial gene activation is unexpected, as the common belief is that DNMT1 inhibitors should have this property. It may be significant that many previous studies have used malignant cells whereas we are using transformed HEK 293 T cells, which are not derived from a primary cancer.
In defining promoter categories, we made the unexpected discovery that the previous definition of LCG promoters appears to over-represent those used in vivo, based on a comparison with those defined by ChIP-seq data for RNAPII-Ser5 (P). As the majority of the genome is depleted in CG dinucleotides, a random sampling of the genome included in this kind of analysis would relatively inflate the LCG proportion, suggesting that RefSeq-defined promoters in the LCG category should be viewed as annotations of questionable reliability. Another finding of interest about promoters was that RNAPII-Ser5 (P) can be located at methylated DNA and that the associated genes show patterns of expression that appear comparable to those in the remainder of the genome. This is a puzzling observation. One concern with any study that finds co-localization of genomic regulatory events is that co-localization does not necessarily imply the presence of the two events on the same alleles in the population of cells being tested. Scrutinizing the HELP-tagging data of Figure 5, it is clear that a substantial proportion of the readings are at loci of intermediate methylation, which supports a model of mixed allelic populations, allowing the RNAPII-Ser5 (P) to be present on a subset of alleles that are unmethylated. Further work will be necessary to resolve this unexpected finding.
The effect of pharmacological demethylation on promoters and transcription was limited and frequently uncoupled - promoters with methylation changes did not necessarily change the expression status of the associated gene. It has already been shown that some promoters can remain occupied by nucleosomes following pharmacological demethylation, and it is also possible that the lack of cognate transcription factors may be a reason why the acquisition of permissiveness, in terms of DNA methylation, is not accompanied by the induction of gene expression. The results of this study serve to emphasize that transcriptional regulation is a complex system that cannot readily be interpreted in terms of a single component like DNA methylation.
The relationship of DNA methylation with transcription in gene bodies has been recognized for some time to be different from that at promoters, referred to as ‘the methylation paradox’, possibly serving a role to repress cryptic intragenic transcription, although this role has been questioned. DNA methylation is also linked to co-transcriptional splicing regulation and to modulation of the rate of RNA polymerase passage through the transcribed region. Together, these prior observations indicate that we should expect changes in qualitative aspects of gene expression when DNA demethylation is occurring and disproportionately targeting transcribed regions of the genome. While there is some evidence for de novo establishment of RNAPII-Ser5 (P) promoters within gene bodies associated with 5-aza-CdR exposure, with limited transcriptional consequences, little evidence exists for altered primary transcript processing. While there is some evidence for activation of a small number of cryptic intragenic promoters, the effect is so small that it argues against suppression of transcriptional noise being a major function of DNA methylation, as previously proposed.
A goal of the project was to understand why 5-aza-CdR targeted certain loci preferentially. As would be predicted, loci that are heavily methylated before treatment are more likely to become unmethylated as a response to the drug, which allows us to define euchromatin as the major target compartment, as well as specific promoter subtypes that are more likely to be methylated. Unexpectedly, we found evidence that the genes that we found to change expression in HEK 293 T cells are significantly similar to those changing expression in malignant multiple myeloma and pancreatic cells exposed to 5-aza-CdR in vitro [29,30]. This indicates that there is a common set of human genes that respond to 5-aza-CdR, suggesting that the data we generated in HEK 293 T cells may be more broadly applicable to other human cell types. The genes induced to express show a strong enrichment for those implicated by Ben-Porath and colleagues to represent the genes at which Polycomb-mediated silencing in ES cells is targeted. A model for cancer formation involving Polycomb proposes that silencing to create a pluripotent stem-cell-like epigenetic pattern is part of the induction of a self-renewal program that favors neoplasia. The preferential demethylation and gene activation by 5-aza-CdR at targets of Polycomb-mediated silencing represents a tenable model for the drug’s therapeutic effects in MDS, consistent with the emerging body of evidence that MDS has a stem cell origin. Set against this potential therapeutic benefit is the concern that 5-aza-CdR targets euchromatin, where associated chromosomal breakage would be directed towards the most gene- and transcriptionally enriched genomic compartment. This issue raises concerns about the risk:benefit ratio with DNMT1 inhibitor use in human disease, which may be justifiable in cancer treatment, as currently permitted, but may be more difficult to justify in less life-threatening disorders involving epigenetic changes.
The use of powerful genome-wide assays to study the effects of pharmacological DNMT1 inhibition has confirmed several prior findings, including limited transcriptional effects despite profound global demethylation. New findings include the observation that demethylation can persist as an epigenetic imprint following drug withdrawal and cell recovery. In addition to the limited effects of demethylation by 5-aza-CdR on protein-coding genes, we see similar limited effects on lncRNAs and primary RNA transcript processing. This is an unexpected finding given the multiple prior reports of transcriptional activation following DNMT1 inhibitor treatment, but we note that the HEK 293 T cells that we study are not derived from malignant cells as are typically studied, potentially helping to explain this difference. Our use of ChIP-seq to define promoters reveals some potential limitations to current genomic annotations of transcriptional start sites and some new insights about the CG dinucleotide content and DNA methylation patterns at these in vivo promoters. We find evidence that the limited effects on transcription of protein-coding genes in HEK 293 T cells may involve activation of intragenic promoters, while similarities observed in the genes induced to express in our study with those in other human cell types treated in the same manner suggest that our results may apply to other cell types. We find support for 5-aza-CdR having effects that reverse Polycomb-mediated silencing, suggesting a mechanism for its therapeutic effect in MDS. 
While HEK 293 T cells are not derived from malignant cells, neither are they primary, untransformed cells, so we have to be cautious about interpreting how our findings translate to other cell types. Nevertheless, the preferential targeting of euchromatin for demethylation raises the concern that 5-aza-CdR may also have clastogenic properties, requiring caution in the clinical use of this drug for non-cancer conditions involving epigenetic dysregulation.
Cell culture and 5-aza-CdR treatment
We chose to work with the widely available and well-characterized HEK 293 T transformed, non-malignant cells, in order to avoid any bias that passage number could introduce in the analysis. HEK 293 T cells were grown in DMEM supplemented with 10% fetal bovine serum and 2 mM penicillin-streptomycin at 37°C in 5% CO2. 5-aza-CdR was dissolved in water to a final concentration of 10 mg/mL and stored in aliquots at −20°C. The 5-aza-CdR treatment was optimized to establish a working concentration, using a range from 0.25 to 2.0 μM. The cells were exposed to 5-aza-CdR for 3 days to allow the drug to be incorporated into DNA. Tissue culture medium was changed every day for both control and treated cells, to maintain the drug stability during treatment.
HEK 293 T cells were grown in triplicate for each condition, including a control group with no drug exposure. To allow recovery of the cells after the 3-day treatment, we maintained the treated cells in culture with fresh media lacking drug. Cells were passaged only when reaching 80% to 90% confluence, for a total period of 1 month. Cells were washed before trypsinizing to ensure that only adherent, viable cells were passaged. DNA, RNA, protein, and cross-linked chromatin were extracted from the same cell batch for each condition.
Cell metabolism assay
Cells were seeded in 96-well plates using the same experimental conditions as for the 5-aza-CdR experiments. At the end of each time point, the WST-1 assay (Clontech) was performed following the manufacturer’s recommendations. Absorbance reads were registered after 40 min of incubation for 16 replicates at each time point and normalized by the number of cells present to quantify the net metabolic activity of the cells in culture. A total of 12 replicates were performed.
DNA, RNA, and cross-linked chromatin were extracted from the same cell batch for each condition. For DNA extraction, cells were re-suspended in lysis buffer (10 mM Tris EDTA (TE), 150 mM ethylenediaminetetraacetic acid (EDTA), and 1% sodium dodecyl sulfate (SDS)) supplemented with 10 mg/mL of RNase A and 20 mg/mL of Proteinase K and incubated at 50°C overnight. The lysed cells were phenol-chloroform extracted, and the resultant DNA was dialyzed in 0.2× SSC buffer (300 mM NaCl and 3 mM Na3C6H5O7, pH 7.0) for 24 h. The DNA sample was concentrated in the dialysis bags using polyethylene glycol, following which the quality and concentration of the DNA were measured by NanoDrop spectrophotometry. RNA was extracted using TRIzol (Invitrogen) following the manufacturer’s protocol. The quality and integrity of the RNA were measured using NanoDrop spectrophotometry and Bioanalyzer (Agilent).
For chromatin, cells were processed using the Myers Lab ChIP-seq protocol. The untreated and 1.0 μM acutely treated cells were cross-linked in culture media with 1% formaldehyde for 10 min, quenching with 0.125 M Glycine. The cells were then washed with cold PBS, collected by centrifugation at 2,000 rpm for 5 min at 4°C and re-suspended in cold lysis buffer (5 mM PIPES (pH 8.0), 85 mM KCl, 0.5% NP-40, supplemented with fresh protease inhibitor cocktail (Roche)). Following collection, the crude nuclear preparation was re-suspended in 300 μL of cold radioimmunoprecipitation assay (RIPA) buffer (1× PBS, 1% NP-40, 0.5% sodium deoxycholate, 0.1% SDS, supplemented with fresh protease inhibitor cocktail (Roche)) and processed in a Bioruptor at the high setting. HEK 293 T cells were sonicated for a total of 15 min, in cycles of 30 s on/30 s off. The sonicated mixture was spun at 16,000 × g for 15 min at 4°C, and the chromatin was collected from the supernatant. The sample volume was brought to 1 mL with RIPA buffer, and 100 μL was saved as an input control.
Luminometric methylation assay (LUMA)
To quantify global DNA methylation changes, LUMA analysis was performed. For each time point, each of the DNA samples was digested in triplicate with 20 U EcoRI and either MspI or HpaII at 37°C overnight, purified and submitted to our institutional Genomics Core Facility for pyrosequencing. The percentage of methylation was calculated by the ratio of the incorporated (C + G) nucleotides after the HpaII digestion compared with the MspI digestion and normalized to the values obtained by EcoRI digestion ((A + C)/2).
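The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function and argument names are hypothetical, and expressing the result as one minus the normalized HpaII/MspI ratio is an assumption based on HpaII being blocked by CG methylation while MspI is not.

```python
import statistics


def luma_methylation_percent(hpaii_cg, hpaii_ecori, mspi_cg, mspi_ecori):
    """Estimate percent methylation from LUMA pyrosequencing signals.

    hpaii_cg / mspi_cg: (C + G) signal from the HpaII or MspI digest.
    hpaii_ecori / mspi_ecori: EcoRI-derived signal from the same reaction,
    used here as the internal normalizer (hypothetical argument names).
    """
    if hpaii_ecori <= 0 or mspi_ecori <= 0 or mspi_cg <= 0:
        raise ValueError("normalizer and MspI signals must be positive")
    hpaii_ratio = hpaii_cg / hpaii_ecori  # methylation-sensitive digest
    mspi_ratio = mspi_cg / mspi_ecori     # methylation-insensitive digest
    # Assumption: the fraction cut by HpaII is the unmethylated fraction.
    return 100.0 * (1.0 - hpaii_ratio / mspi_ratio)


def mean_of_triplicates(measurements):
    """Average the percent methylation over replicate digests."""
    return statistics.mean(luma_methylation_percent(*m) for m in measurements)
```

For example, an HpaII (C + G) signal one quarter the size of the MspI signal, with equal EcoRI signals, corresponds to 75% methylation under these assumptions.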
As previously described, 1 μg of DNA was used to generate HELP-tagging libraries. The indexed adapters used are listed in Additional file 1: Table S7. Genomic DNA was digested with either MspI or HpaII at 37°C overnight, purified and ligated to the first pre-annealed TruSeq-indexed Illumina adapters containing the T7 promoter sequence, as well as the EcoP15I recognition site (AE adapters). After ligation, the DNA samples were digested with EcoP15I at 37°C overnight, end-filled, 3′ terminal A extended and ligated to the second pre-annealed Illumina adapter (AS adapter). Samples were then in vitro transcribed using the MEGAshort kit (Ambion), followed by retrotranscription (SuperScript III kit, Invitrogen) before amplification. Libraries were multiplexed for 50 bp single-end sequencing on the Illumina HiSeq 2500 platform at the institutional Epigenomics Shared Facility.
Directional RNA-seq assay
DNase-treated, rRNA-depleted (Ribozero, Epicentre) RNA was used as a template for SuperScript III first-strand cDNA synthesis (Invitrogen), using oligo-dT as well as random hexamers. Actinomycin D was added to the reaction to prevent any possible amplification from contaminating genomic DNA. During second-strand synthesis, a dU/VTP mix was used to create directional libraries. Before library preparation, cDNA samples were Covaris-fragmented to 300-bp fragments. The samples were then end-filled, 3′ terminal A extended and ligated to pre-annealed TruSeq-indexed Illumina adapters. Uracil-DNA-glycosylase (UDG) treatment preceded the PCR reaction to amplify exclusively the originally oriented transcripts. Libraries were amplified using P5 and P7 Illumina primers and gel-extracted for size selection and primer-dimer removal. Before sequencing, libraries were tested using the BioAnalyzer to ensure library quality, in terms of size and primer-dimer depletion. Indexed libraries were multiplexed for 100-bp single-end sequencing on the Illumina HiSeq 2500 platform at the institutional Epigenomics Shared Facility. The indexed adapters used are listed in Additional file 1: Table S8. Given the concordance of extremely limited effects on transcription for different time points and drug dosages, and following recommendations of ENCODE for RNA-seq experiments, we allowed the four treatment conditions to act as biological replicates, increasing confidence in our findings of minimal transcriptional effects of 5-aza-CdR.
Chromatin immunoprecipitation was performed using 3 × 10^6 cells using the Myers Lab ChIP-seq protocol. HEK 293 T cells were crosslinked in culture media with 1% formaldehyde for 10 min, quenching with 0.125 M Glycine. The cells were then washed with cold PBS, collected by centrifugation at 2,000 rpm for 5 min at 4°C, and resuspended in cold Farnham lysis buffer (5 mM PIPES (pH 8.0), 85 mM KCl, 0.5% NP-40, supplemented with fresh protease inhibitor cocktail (Roche)). Following collection, the crude nuclear preparation was resuspended in 300 μL of cold RIPA buffer (1× PBS, 1% NP-40, 0.5% sodium deoxycholate, 0.1% SDS, supplemented with fresh protease inhibitor cocktail (Roche)) and processed in a Bioruptor at the high setting. HEK 293 T cells were sonicated for a total of 15 min, in cycles of 30 s on/30 s off. The sonicated mixture was spun at 16,000 × g for 15 min at 4°C, and the chromatin was collected from the supernatant. The sample volume was brought to 1 mL with RIPA buffer, and 100 μL was saved as an input control. Magnetic beads (Invitrogen) were washed three times with 5 mg/mL BSA in PBS and supplemented with freshly added protease inhibitors. RNAPII-Ser5 (P) antibody (5 μg, Active Motif catalog #61085) was added to the bead slurry, incubating the mixture overnight at 4°C. The antibody-coupled beads were washed three times with PBS/BSA, added to the chromatin sample, and incubated in a rotor at 4°C overnight.
After immunoprecipitation, beads were collected by magnetic separation and washed five times with cold LiCl Wash buffer (100 mM Tris–HCl pH 7.5, 500 mM LiCl, 1% NP-40, 1% sodium deoxycholate). After a final wash with cold 1× TE buffer, the beads were resuspended in 200 μL IP elution buffer (1% SDS, 0.1 M NaHCO3, containing proteinase K and 0.2 M NaCl), and both the immunoprecipitated and input samples were de-crosslinked at 65°C overnight. ChIP products were purified with the DNA Clean and Concentrator kit (Zymo) and eluted in 60 μL of elution buffer. The efficiency of the ChIP was tested by enrichment quantification of immunoprecipitated/input DNA ratios at candidate positive loci compared with those at negative regions, using real-time quantitative PCR (RT-qPCR). The primers used are presented in Additional file 1: Table S9. For library preparation, the samples were end-filled, 3′ terminal A extended and ligated to pre-annealed TruSeq-indexed Illumina adapters. The indexed adapters used are listed in Additional file 1: Table S10. Libraries were amplified using P5 (5′-AATGATACGGCGACCACCGA-3′) and P7 (5′-CAAGCAGAAGACGGCATACGAGAT-3′) Illumina primers and gel extracted for size selection and primer-dimer removal. Before sequencing, the quality of the libraries was checked using the Agilent BioAnalyzer to confirm the correct size (250 to 500 bp) and primer-dimer depletion. Libraries were multiplexed and single-end sequenced on the Illumina HiSeq 2500 with 100-bp read length at the institutional Epigenomics Shared Facility.
HELP-tagging data analysis
Sequencing reads were aligned by the WASP pipeline version 3.1.4 (rev. 6598), using the CASAVA aligner from Illumina (ELAND 1.7.0). Transformation of raw sequence reads to the angle calculation previously described was performed using an MspI reference from HEK 293 T cells and scaled to create a 0 to 100 range. Data analysis was performed using a bespoke pipeline in the R environment (version 2.15.0). A median of 4.5 million reads was obtained for each sample, with 98% of them passing filter and 76% aligning to the reference genome.
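As a rough illustration of an angle-based score on a 0 to 100 scale, the Python sketch below takes the angle of the point (MspI count, HpaII count) and rescales it. The text here does not give the exact normalization or the orientation of the published scale, so the function name, the assumption that counts are already depth-normalized, and the direction of the scale are all assumptions for illustration only.

```python
import math


def scaled_angle(hpaii_count, mspi_count):
    """Return the angle of (mspi_count, hpaii_count) rescaled to 0-100.

    Assumes both counts are non-negative and already normalized for
    sequencing depth. Returns None when the site has no reads at all.
    """
    if hpaii_count < 0 or mspi_count < 0:
        raise ValueError("counts must be non-negative")
    if hpaii_count == 0 and mspi_count == 0:
        return None
    # atan2 gives an angle in [0, pi/2] for non-negative inputs.
    angle = math.atan2(hpaii_count, mspi_count)
    return 100.0 * angle / (math.pi / 2)
```

In this sketch a larger value means more HpaII cutting relative to the MspI reference, that is, less methylation at the site; the published scale may be oriented the other way.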
RNA-seq data analysis
By calculating the ratio between control and tested samples, and then considering the genes with a ratio greater than 2× the standard deviation, we obtained a list of the candidate genes with the most extreme intronic retention.
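The thresholding step can be sketched as follows. This is a hedged illustration: the per-gene retention scores are taken as given, the ratio is assumed to be treated over control, and "greater than 2× the standard deviation" is interpreted as exceeding the mean ratio by more than two standard deviations. All names are hypothetical.

```python
import statistics


def extreme_retention_genes(control, treated, n_sd=2.0):
    """List genes whose treated/control retention ratio is an upper outlier.

    control, treated: dicts mapping gene name to an intronic retention
    score. Genes missing from either dict, or with a non-positive control
    score, are skipped.
    """
    ratios = {
        gene: treated[gene] / control[gene]
        for gene in control
        if gene in treated and control[gene] > 0
    }
    if len(ratios) < 2:
        return []  # a standard deviation needs at least two values
    values = list(ratios.values())
    cutoff = statistics.mean(values) + n_sd * statistics.stdev(values)
    return sorted(gene for gene, ratio in ratios.items() if ratio > cutoff)
```

Because the cutoff is derived from the ratios themselves, the number of genes flagged depends on how heavy-tailed the distribution is, which suits a candidate list better than a formal test.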
ChIP-seq data analysis
ChIP-seq reads were aligned by the WASP Pipeline version 3.1.5, using MACS version 1.4.2 as the peak finder and bowtie version 0.12.7 as the aligner. A median of 19 million reads was obtained for each sample, with 89% of them passing filter and 77% aligning to the reference genome. Data analysis was performed using a bespoke pipeline within the R environment (version 2.15.0).
Self-organizing map analysis
To examine the relationships between five different genomic variables, we used an artificial neural-learning-based approach, the self-organizing map (SOM). We used 100-kb sliding windows with a step size of 50 kb as we have previously described. To build the SOM, we used information from HELP-tagging data for control samples and the acute values for the 1.0 μM 5-aza-CdR-treated cells, the mean of gene expression levels of all RefSeq genes within the window, and the cumulative number of HpaII sites per window. All vectors were tagged by 5-aza-CdR response and whether they belonged to one of the five quantile groups of DNA methylation and gene expression levels. After training, all data were re-introduced to the grid a final time and selected annotations revealed as a dual color intensity graph in order to examine the distribution of features. Overall clustering patterns in the data were also examined using a U-matrix representation of the grid, which represents a similarity graph where a linear grayscale is used to indicate how similar a node vector is to its immediate neighbors in vector space.
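The windowing step that feeds the SOM can be sketched as below. This covers only the 100-kb window, 50-kb step tiling and a per-window mean, not the SOM training itself. The function names, the half-open coordinate convention, and the handling of the final partial window are assumptions.

```python
def sliding_windows(chrom_length, window=100_000, step=50_000):
    """Yield half-open (start, end) windows tiling a chromosome.

    The last window is truncated at the chromosome end, and tiling stops
    once a window reaches that end.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    start = 0
    while start < chrom_length:
        end = min(start + window, chrom_length)
        yield start, end
        if end == chrom_length:
            break
        start += step


def window_means(sites, chrom_length, window=100_000, step=50_000):
    """Average per-site values (e.g. methylation scores) within each window.

    sites: iterable of (position, value). Windows with no sites get None.
    """
    sites = list(sites)
    summary = []
    for start, end in sliding_windows(chrom_length, window, step):
        values = [value for pos, value in sites if start <= pos < end]
        mean = sum(values) / len(values) if values else None
        summary.append((start, end, mean))
    return summary
```

With a 50-kb step each site contributes to two overlapping windows, which smooths the per-window vectors before they are presented to the map.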
Replicating a previously published approach, we categorized promoters based on CG dinucleotide content. A cutoff of 0.366 was established at the local minimum of the bimodal distribution for the observed/expected CG ratio of all RefSeq promoters (±1.5 kb flanking the TSS), while the cutoff for the RNAPII-Ser5 (P) promoters was 0.401.
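A minimal Python sketch of this classification is given below. It assumes the commonly used definition of the observed/expected CG ratio, (number of CG dinucleotides × sequence length) / (number of C × number of G); the text does not spell out the exact formula, and the function names are hypothetical.

```python
def cg_obs_exp(sequence):
    """Observed/expected CG dinucleotide ratio of a DNA sequence.

    Uses (count of CG * length) / (count of C * count of G), which is an
    assumed definition. Returns 0.0 if the sequence lacks C or G.
    """
    seq = sequence.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (n_c * n_g)


def classify_promoter(sequence, cutoff=0.366):
    """Label a promoter sequence HCG or LCG relative to the cutoff.

    The default cutoff is the RefSeq value quoted in the text; 0.401 would
    be passed for RNAPII-Ser5 (P)-defined promoters.
    """
    return "HCG" if cg_obs_exp(sequence) >= cutoff else "LCG"
```

In practice the input would be the ±1.5 kb of sequence flanking each TSS, and the cutoff would be re-derived from the bimodal distribution of the promoter set being analyzed rather than reused blindly.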
Public availability of data
All genome-wide data are available from the GEO resource at http://www.ncbi.nlm.nih.gov/geo/ under the accession number GSE62590.
- CG:
CpG, cytosine-guanine dinucleotide
- ChIP-seq:
massively parallel sequencing of chromatin immunoprecipitated DNA
- ES:
embryonic stem (cells)
- FPKM:
fragments per kilobase per million reads
- GSEA:
Gene Set Enrichment Analysis
- H3K36me3:
trimethylation of lysine 36 of histone H3
- HCG:
high-CG dinucleotide observed/expected ratio
- HSPC:
hematopoietic stem and progenitor cell
- ICF:
immunodeficiency, centromeric region instability, facial anomalies (syndrome)
- IRS:
Intronic Retention Score
- LCG:
low-CG dinucleotide observed/expected ratio
- lncRNA:
long non-coding RNA
- LUMA:
Luminometric Methylation Assay
- RNAPII-Ser5 (P):
serine 5-phosphorylated RNA polymerase II
- RNA-seq:
massively parallel sequencing of ribonucleic acid (RNA)
The work was performed with the involvement of the Center for Epigenomics of the Albert Einstein College of Medicine. Brent Calder is thanked for advice regarding computational genomics analytical approaches.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.