BSAS was developed to quantitatively and accurately measure 5-mC levels in genomic regions of interest. For many hypothesis-driven investigations or validation of genome-wide methylation studies, analysis of specific gene promoters, CpG islands or differentially methylated regions is needed. Combining existing approaches in bisulfite conversion with tagmentation-based library preparation and benchtop sequencing produces a fast, accurate and customizable approach applicable to a wide variety of epigenetic studies. Using BSAS for targeted quantitative DNA methylation analysis, we demonstrated a 16-fold (mouse controls) and 5-fold (rat controls) decrease in the error of methylation quantitation over traditional direct bisulfite PCR amplicon Sanger sequencing and ESME. Whole genome DNA methylation standards were used to show this increase in accuracy of BSAS methylation quantitation across the dynamic range (from 0% to 100%) over the traditional approach in two regions from rat and mouse genomes. The increased accuracy of the BSAS method was attributed to the decrease in quantitation error relative to the traditional approach, as seen in the standard curves generated. This decrease in error was most likely due to the digital quantitation of NGS, as opposed to the analog nature of Sanger sequencing, and to the sequencing depth achieved in the BSAS method. Rho promoter methylation differences corroborated mRNA expression from retina and cerebellar tissue, with BSAS generating more precise data. In general, sequencing depth in the BSAS method ranged from 1,000 × saturation of the targeted region to well over 500,000 × saturation at any given CpG site measured. These levels of depth increased the accuracy of quantitation of the methylation controls in both the rat and mouse sets. Achieving 1,000 × sequencing depth was sufficient for accurate methylation quantitation.
Using the theoretical model presented, the confidence interval does not improve significantly at depths greater than 1,000 ×. Therefore, when designing BSAS experiments, a target of 1,000 × would be a sufficient depth for accurate methylation quantitation. Our empirical data fit well within the theoretical model; however, there was a slight inflation of the confidence interval beyond that expected for both the rat and mouse data sets. The source of this error was most likely the methylation standards. Overall, the digital quantitation of BSAS improved the quantitation and statistical power over the Sanger method, which has an analog output in which CpG methylation quantitation is a function of the area under the curves for the C and T traces. Thus, because BSAS is digital and reaches the sequencing depths required for accurate quantitation, BSAS was superior to the Sanger method.
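The relationship between sequencing depth and confidence interval width can be illustrated with a minimal sketch. It assumes, as a simplification that is not necessarily the theoretical model used in this study, that each read covering a CpG site is an independent binomial draw, and it uses the Wilson score interval for the resulting proportion. The function names are hypothetical.

```python
import math
from typing import Tuple


def methylation_fraction(c_reads: int, t_reads: int) -> float:
    """Digital methylation estimate at one CpG: C calls / (C + T calls)."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no coverage at this CpG site")
    return c_reads / total


def wilson_interval(c_reads: int, depth: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is ~95%).

    Assumption: reads are independent binomial draws; this ignores PCR
    duplicates, conversion failure, and error in the standards.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    p = c_reads / depth
    denom = 1 + z * z / depth
    centre = (p + z * z / (2 * depth)) / denom
    half = z * math.sqrt(p * (1 - p) / depth + z * z / (4 * depth * depth)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def ci_width(fraction: float, depth: int) -> float:
    """Width of the ~95% interval for a given true fraction and read depth."""
    lo, hi = wilson_interval(round(fraction * depth), depth)
    return hi - lo


if __name__ == "__main__":
    # 50% methylation is the worst case for binomial variance.
    for depth in (10, 100, 1_000, 10_000, 100_000):
        width = 100 * ci_width(0.5, depth)
        print(f"{depth:>7} x depth: 95% CI width = {width:.2f} percentage points")
```

Under this assumption the interval width scales roughly with the inverse square root of depth, so each tenfold increase in depth narrows the interval by only about a factor of three, and the absolute gain shrinks as depth grows.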
In addition to better quantitation, BSAS offered several practical benefits. BSAS does not require sequencing primers, unlike pyrosequencing and direct bisulfite amplicon Sanger sequencing, where the primers can limit the quality of sequence obtained. The use of sequencing primers also restricts these methods to one direction of one target region, limiting the throughput and read length of pyrosequencing and Sanger sequencing reactions. Pyrosequencing is often limited to shorter reads (approximately 100 bp), requiring multiple tiled reads to achieve the coverage observed with BSAS. In BSAS, the sequencing step is random and unbiased; thus multiple target genomic regions in one sample, and multiple samples, can be sequenced together on one flow cell. Cloning was not needed in BSAS, which considerably reduced the overall time of the method and significantly improved the ease of library construction. Additionally, because of the digital nature of BSAS, there is no need to generate a standard curve for each target when quantifying regions of interest. In the analog methods, standard curves are necessary because quantitation is based on an assay signal, or output, rather than on counting. This greatly reduces the number of samples that need to be run with BSAS compared to pyrosequencing or Sanger sequencing.
Previous targeted methylation analysis approaches that incorporated NGS into their protocols have depended on targeting of CpG sites by using hybridization arrays, padlock probe sets, and capture probes for whole genome methylation analysis [15–17]. Massively parallel PCR amplification with the Raindance technology followed by NGS has also been described. These methods are well suited to ‘medium’ scale discovery, but the complexity of the methods, the requirement for additional specialized equipment, and their costs limit their application, especially for highly targeted studies with large numbers of samples. The BSAS approach required only standard molecular biology equipment and access to a benchtop sequencer. Where all of these methods excel is in their use of NGS for digital methylation quantitation. There are multiple reports profiling methylomes using NGS on tissues such as the human placenta, cancer cells, rodent animal models, and disease states such as diabetes. These findings break new ground in understanding DNA methylation across large regions of the genome, but a targeted approach to methylation analysis, like BSAS, is highly applicable to quantitative analysis of certain genes or regions, especially when a large number of samples is required. In particular, the rapid and cost-effective nature of tagmentation-based library preparation and benchtop sequencing makes BSAS an easily adopted method. An alternative to NGS-based focused methylation analysis is the MassARRAY mass spectrometry approach. This approach has been used successfully in a number of studies but does require specific instrumentation, and non-sequence data can lead to ambiguity in determining base-specific methylation. Therefore, whole methylome studies are useful for broad discovery efforts, while BSAS is a tool for answering hypothesis-driven research questions about specific target genes or genomic regions identified in initial methylome analyses.
The BSAS method demonstrated the utility of the Nextera XT NGS library generation technology, which greatly reduces the amount of input DNA (1 ng), decreases library generation time (approximately 2 to 4 h), and increases the throughput of library generation by performing the protocol in a 96-well plate. Tagmentation also removes the need for stepwise DNA shearing, end repair, 3’ adenylation, and adapter ligation, combining these steps into one. The feasibility and benefits of tagmentation-based library preparations have been discussed previously. Another benefit of NGS and Nextera XT library generation is dual indexing of libraries. Dual indexing allows a high level of sample multiplexing; 96 samples can be multiplexed onto a single-lane flow cell. Within each sample, multiple target regions can be analyzed. The high level of multiplexing possible with BSAS, both in number of samples and in regions of interest, increases the throughput over traditional methods. The current cost of tagmentation-based library construction is lower than that of ligation-based library construction methods. Provided there is sufficient flow-cell capacity, additional time and costs with BSAS are limited as sample size increases. By comparison, Sanger sequencing and pyrosequencing require significant time and cost for each additional sample.
In addition to the simpler and more rapid library generation, we established the utility of the Illumina MiSeq in targeted DNA methylation analysis. While previous reports have used the MiSeq for high output cytosine modification validation or for comparison with traditional NGS library construction methods, we show the MiSeq as a tool for precise orthogonal absolute 5-mC quantitation and validation. The increased availability, short run times, and low cost of running benchtop sequencers make them an attractive tool for targeted analyses. The Illumina MiSeq is currently capable of generating up to 15 Gb of sequence and 50 million paired-end reads passing quality filters on a single-lane flow cell. Additionally, sensitive optics and more precise base calling allow low-diversity samples to be sequenced on the MiSeq. This decreases the amount of control library, such as PhiX, that has to be added to improve base calling diversity, and reduces the amount of sequence lost to control sequences. Based on the performance of the method and the most conservative error rate model, more than 2 kb of target region(s) in 96 samples can be analyzed with high accuracy on 1 flow cell in a week by this method. Our findings show the utility of the MiSeq in future targeted epigenetic studies. Not only can the MiSeq generate enough data to quantify DNA methylation accurately, it can also be scaled in a cost-effective manner, depending on the amount of sequence needed (for example, number of samples, targets, and size of targets), through the use of the different size single-lane flow cells. Additionally, the ability of the MiSeq base-calling algorithms to handle low-diversity samples makes it an excellent tool for sequencing bisulfite-converted, cytosine-deficient DNA.
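The capacity argument above can be checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: it uses the figures quoted in the text (96 samples, 2 kb of target, a 1,000 × depth target, up to 15 Gb per run) and assumes perfectly even coverage, which ignores the drop-off at amplicon ends, any PhiX spike-in, and reads lost to quality filtering. The function names are hypothetical.

```python
def required_yield_bp(n_samples: int, target_bp: int, depth: int) -> int:
    """Bases needed to cover every target base in every sample at `depth`.

    Assumption: coverage is perfectly even across targets and samples.
    """
    return n_samples * target_bp * depth


def fraction_of_run(required_bp: float, run_yield_bp: float) -> float:
    """Share of a run's nominal yield consumed by the experiment."""
    return required_bp / run_yield_bp


if __name__ == "__main__":
    needed = required_yield_bp(n_samples=96, target_bp=2_000, depth=1_000)
    run_yield = 15_000_000_000  # 15 Gb, the upper figure quoted in the text
    share = fraction_of_run(needed, run_yield)
    print(f"required: {needed / 1e6:.0f} Mb")
    print(f"share of a 15 Gb run: {share:.2%}")
```

Under these idealized assumptions the experiment needs 192 Mb, a small fraction of the nominal run yield, which leaves headroom for uneven coverage or for a smaller flow cell.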
Limitations of the BSAS method, as described here, could arise from (1) PCR bias in the original bisulfite-specific PCR, (2) bias in the transposome-mediated DNA fragmentation and adapter ligation, and (3) the inability of BSAS to distinguish between 5-mC and 5-hmC. PCR bias is not an issue when primers are designed properly. Parameters for designing bisulfite PCR primers that avoid bias have been addressed in the literature. We observed no PCR bias in the data presented. Upon quantitative methylation analysis of the CpG sites targeted here, there was also no bias in the strand direction of the called variants. Additionally, the counts at the CpG sites were required to be present in both the forward and reverse reads. Furthermore, we observed no GC bias in the tagmentation reaction within our amplicons sequenced using the Nextera XT protocol. Because specific genomic regions are targeted, GC bias is less likely to occur. Previous studies validating the tagmentation-based library generation protocol demonstrated no GC bias in transposome tagmentation. Additionally, previous studies have shown no differential GC bias between traditional ligation chemistry-based library generation methods and tagmentation-based methods with bisulfite conversion. One limitation of this method is the drop-off in sequence depth at the ends of the amplicons sequenced. This was attributed to the reduced likelihood of the transposome inserting at the ends of the amplicons. Finally, BSAS does not allow separate quantitation of 5-mC and 5-hmC, so our analysis may include both of these modifications. Future modifications of the protocol to distinguish between these two modifications will be beneficial as studies of epigenetics progress. For example, oxidizing agents or glucosylation could be coupled to bisulfite conversion and analysis through BSAS for separate 5-mC and 5-hmC quantitation [29, 30].
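As an illustration of how such a future extension might separate the two modifications, the sketch below covers only the oxidative approach. It assumes standard bisulfite sequencing reports 5-mC plus 5-hmC, while bisulfite sequencing after oxidation reports 5-mC alone, so that 5-hmC is estimated by subtraction. This is not part of the BSAS protocol described here, the function name is hypothetical, and a glucosylation-based approach would need a different calculation.

```python
from typing import Tuple


def split_5mc_5hmc(bs_fraction: float, oxbs_fraction: float) -> Tuple[float, float]:
    """Estimate (5-mC, 5-hmC) fractions at one CpG site.

    Assumptions: `bs_fraction` is the C fraction from standard bisulfite
    sequencing (5-mC + 5-hmC) and `oxbs_fraction` is the C fraction after
    oxidation (5-mC only). Negative differences, which can arise from
    sampling noise, are clamped to zero as a simple illustrative choice.
    """
    for value in (bs_fraction, oxbs_fraction):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    five_mc = oxbs_fraction
    five_hmc = max(0.0, bs_fraction - oxbs_fraction)
    return five_mc, five_hmc


if __name__ == "__main__":
    mc, hmc = split_5mc_5hmc(bs_fraction=0.80, oxbs_fraction=0.60)
    print(f"5-mC: {mc:.2f}, 5-hmC: {hmc:.2f}")
```

Because the 5-hmC estimate is a difference of two measured fractions, its uncertainty combines the error of both measurements, so such an extension would likely need greater depth per site than 5-mC quantitation alone.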