Evaluation of whole-genome DNA methylation sequencing library preparation protocols
Epigenetics & Chromatin volume 14, Article number: 28 (2021)
With rapidly dropping sequencing cost, the popularity of whole-genome DNA methylation sequencing has been on the rise. Multiple library preparation protocols currently exist. We have performed 22 whole-genome DNA methylation sequencing experiments on snap frozen human samples, and extensively benchmarked common library preparation protocols for whole-genome DNA methylation sequencing, including three traditional bisulfite-based protocols and a new enzyme-based protocol. In addition, different input DNA quantities were compared for two kits compatible with a reduced starting quantity. In addition, we also present bioinformatic analysis pipelines for sequencing data from each of these library types.
An assortment of metrics were collected for each kit, including raw read statistics, library quality and uniformity metrics, cytosine retention, and CpG beta value consistency between technical replicates. Overall, the NEBNext Enzymatic Methyl-seq and Swift Accel-NGS Methyl-Seq kits performed quantitatively better than the other two protocols. In addition, the NEB and Swift kits performed well at low-input amounts, validating their utility in applications where DNA is the limiting factor.
The NEBNext Enzymatic Methyl-seq kit appeared to be the best option for whole-genome DNA methylation sequencing of high-quality DNA, closely followed by the Swift kit, which potentially works better for degraded samples. Further, a general bioinformatic pipeline is applicable across the four protocols, with the exception of extra trimming needed for the Swift Biosciences’s Accel-NGS Methyl-Seq protocol to remove the Adaptase sequence.
The dynamic interplay between various epigenetic modifications influence cellular differentiation, lineage specification, tissue development, and can also promote oncogenic states through changes in histone modifications and DNA methylation. In mammals, DNA methylation typically occurs at the 5’ position of CpG dinucleotides (denoted 5-mC) and remains the single best studied epigenetic mark, partly due to its robustness through most storage conditions and histological preparations, such as formalin-fixed, paraffin-embedded (FFPE) samples .
DNA methylation can be assayed with various principles , such as bisulfite conversion , restriction enzyme digestion, or differential affinity for methylated DNA binding proteins. Array-based and sequencing-based methods built upon these principles have been developed and benchmarked [4,5,6]. The current gold-standard approach to examine genome-wide DNA methylation composition and differences is through chemical modification of unmethylated cytosines using sodium bisulfite . Bisulfite deaminates unmethylated cytosines (Cs) to uracils that are converted to thymines (Ts) during PCR amplification. Methylated cytosines (mCs) remain unaltered through this process. The end result yields stable genetic differences between methylated (C) and unmethylated cytosines (T), reflecting the underlying DNA methylation landscape, effectively turning the epigenetic difference into a genetic difference, which can then be studied using conventional genome-scale methods, such as microarrays or sequencing. Various generations of bisulfite-based microarrays have been used to profile hundreds of thousands of human samples due to the low cost and easy, standardized data processing and analysis. With dropping sequencing cost in recent years, the popularity of sequencing-based methods has been on the rise .
Whole-genome bisulfite sequencing (WGBS) provides the most comprehensive single base resolution DNA methylation maps. It was successfully applied to Arabidopsis thaliana in 2008 [9, 10] and then to humans in 2009 . In these early methods, adapter-ligated library material undergoes bisulfite conversion, leading to sheared and degraded library fragments and overall lower quantities and diversity of sequenceable material. A post-bisulfite adapter tagging (PBAT) method [12, 13] was developed to overcome this hurdle, effectively decreasing the input range to nanogram level. Notably, this method has been used for single-cell WGBS profiling . More recently, Swift Biosciences has produced a kit that is an alternative approach to the post-bisulfite library preparation. The alternative approach maintains the low DNA input capabilities of PBAT, while also including a low-complexity sequence on the 3′ end of the ssDNA during library preparation that serves as a scaffold for sequencing adapter attachment (Accel-NGS Methyl-Seq protocol, Swift Biosciences).
The conditions needed for bisulfite conversion are known to be harsh on the DNA and cause degradation. In recent years, it has become clear that this conversion can also be achieved with an enzymatic approach. 5-mCs can be converted to 5-hydroxymethylcytosine (5-hmC), then to 5-formylcytosine (5-fC), and eventually to 5-carboxylcytosine (5-caC) by the ten–eleven translocation (TET) family dioxygenases . Further, the apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3A (APOBEC3A) deaminates methylated and unmethylated Cs into thymines and uracils, respectively . While both 5-mCs and Cs are affected by this process, the TET-oxidized methylcytosines, 5-hmC, 5-fC, and 5-caC are minimally impacted . Based on this principle, an enzymatic methyl-seq (EM-seq) method was recently developed by New England Biolabs. Their method uses TET2 to oxidize methylated cytosines and subsequent APOBEC3A treatment to convert unmethylated cytosines to uracils . WGBS and EM-seq, collectively referred to as whole-genome methylation sequencing (WGMS), both convert a 5-mC/C difference to a C/T difference; therefore, analysis tools developed for WGBS are also applicable to EM-seq.
In this study, we extensively benchmark the performance of three most commonly used protocols for bisulfite-based whole-genome DNA methylation profiling including the KAPA Hyper Prep kit (Kapa), the Miura and Ito post-bisulfite adapter tagging (PBAT) method, and the Swift Biosciences Accel-NGS Methyl-Seq DNA library kit (Swift), as well as the new EM-seq protocol from New England Biolabs, the NEBNext Enzymatic Methyl-seq kit (NEB), on fresh–frozen human tissue samples. For each technique, we evaluate input quantities, read mapping statistics, library complexity, insert size, cytosine retention, as well as reproducibility between replicates. We also present bioinformatic analysis pipelines for each of these library types.
Benchmarking studies often use cell lines due to the largely isogenic background and reduced biological variance to probe the reproducibility of the method. However, it is often of interest to apply these approaches to complex tissue sources for primary research. Thus, we chose to use frozen normal primary human solid tissue to benchmark these kits in a more common research scenario. We used human fallopian tube samples, which are believed to host the presumed cell-of-origin for high-grade serous ovarian cancer . The results presented are based on two snap-frozen primary patient fallopian tube samples (denoted Biological Replicates A and B). DNA from each sample was prepared using one of the four library preparation protocols, as summarized in Table 1. With the exception of the PBAT protocol, two aliquots from the same DNA extraction were used to produce technical replicates for each protocol (denoted Technical Replicates 1 and 2). The PBAT protocol does not have a technical replicate due to the poor quality of the initial sequencing run and not enough leftover library material to generate additional sequencing information for either replicate. In addition to generating samples using the suggested amount of DNA input, a smaller DNA input (10 ng each) was used in the NEB and Swift protocols to test the effectiveness of low DNA inputs on these protocols, which performed best in the initial testing. (NEB and Swift state they can go as low as 10 ng  and 100 pg , respectively.) Table 1 shows the details for each protocol used in this study.
The first set of metrics compared between the four preparations is related to the quality of raw reads received from the sequencer, including base quality and adapter contamination, and the effects of trimming on those reads. In general, the raw base qualities, percentage of reads with adapter contamination, and percentage of bases trimmed are comparable between the Kapa, NEB, and Swift protocols (Fig. 1 and Additional file 1: Figure S1). However, the PBAT protocol suffers from a higher percentage of low-quality bases along the length of the read, leading to a higher percentage of trimmed bases during the trimming stage. Because the PBAT protocol does not contain an amplification step during library preparation, the higher percentage of trimmed bases relative to the other preparations has a larger effect on the amount of usable data available from this protocol.
Following adapter and quality trimming of the raw reads, the reads were mapped to the human genome and the fraction of optimally aligned (defined as MAPQ \(\ge 40\)), sub-optimally aligned (MAPQ \(<40\)), and not aligned read fragments was calculated. For Sample A, the NEB and Swift protocols had about the same fraction of read fragments that were optimally aligned (\(\sim 85\)%), regardless of the amount of input DNA (Fig. 2A). The Kapa protocol was slightly behind (\(\sim 80\)%), while the PBAT protocol was closer to \(\sim 75\)%. The lower percentage of optimally aligned reads for the PBAT protocol, when coupled with the substantially lower number of read fragments, means there is much less data to use when performing analyses using this protocol.
Another metric that can speak to the quality of a library preparation is the insert size, which relates to the size of the sequenced DNA fragments. For these experiments, DNA used in the Kapa, NEB and Swift libraries were generated in a single reaction, then split into individual aliquots for library generation. Given the uniformity of input, it would be expected that the resulting libraries would also have identical size profiles. However, bisulfite preparations (Kapa and Swift) led to shorter fragments compared to the enzymatic preparation (NEB), which retained a wider range of fragment lengths that are generally longer than the bisulfite fragments (Fig. 2B). The retention of a smaller range of shorter fragments by the bisulfite preparations suggest degradation of the DNA during library generation, which is consistent with the known tendency of bisulfite conversion to degrade samples. Whereas the other samples were sheared prior to conversion, the PBAT samples were only sheared by the bisulfite process itself. This process is quite destructive, leaving shorter fragments than the other bisulfite conversion protocols. The Swift protocol had the longest and more consistent insert size out of all bisulfite-based methods.
With regards to the fraction of reads with MAPQ \(\ge 40\) marked as duplicates (Fig. 2C), the standard-input NEB, PBAT, and Swift protocols each had about 10% of duplicate reads. The low-input NEB protocol was slightly higher at \(\sim 15\)%. The Kapa (standard-input) and low-input Swift protocols had the highest percentage of reads, sitting closer to 25%.
The library complexity is a metric that can be used to determine if a library has reached a saturation point, where sequencing deeper will only gain a marginal amount of unique (i.e., not duplicate) reads. While none of the samples were sequenced to saturation, the Kapa and low-input Swift samples showed a lower level of complexity compared with the NEB and standard-input Swift samples (Fig. 2D). Due to the PBAT samples having much fewer reads, it is difficult to ascertain where the PBAT library complexity ranks compared to data derived from the NEB or Kapa protocols at higher read depths. At the PBAT sequencing depth in this study, it appears to have a trend similar to the Kapa protocol complexities, implying an overall lower level of library complexity relative to Swift and NEB data.
To determine the uniformity of reads distributed across the genome, the ratio of the observed read coverage to the expected coverage was calculated across several regions (Fig. 3A–B and Additional file 1: Figure S7). In general, there is consistent coverage across the genic, intergenic, and repeat-masked regions, with each sample having less than a 5% departure from expected uniformity (closer to 1.0 is better) in these regions (Additional file 1: Figure S7A–C). In contrast, exonic regions, all CpGs, and CpG islands show greater heterogeneity across kits and larger departures from uniform coverage (Fig. 3A–B and Additional file 1: Figure S7D). PBAT had a much higher observed rate of coverage compared to expected. The other protocols all have lower coverage than would be expected, with the NEB samples having the closest to uniform coverage and the Kapa samples having the lowest coverage of these regions. The undercoverage of CpGs by the Kapa protocol is consistent with a prior study . One example of the differences in coverage uniformity across the protocols can be found in the EPCAM promoter region (Fig. 4).
When performing WGMS, a library preparation’s coverage of cytosines, particularly in a CpG context, is important to assess the DNA methylation landscape of a given sample. When comparing the protocols used in this analysis, the Kapa protocol has the lowest percentage of CpGs covered in a number of different regions on these samples (Fig. 3C–D and Additional file 1: Figure S8), which is consistent with previous work . Generally, the other preparations have coverage over 85% at 150 million mapped reads, with the exception of the Swift samples’ coverage of CpG islands, which is in the 75-80% range. It should be noted that the low DNA input runs of the NEB protocol produced coverage levels that are consistent with the standard DNA input runs across samples and technical replicates.
These methods, bisulfite- or enzyme-based, all distinguish DNA methylation states by converting unmethylated cytosines into uracils and subsequently thymines during PCR, while sparing methylated cytosines. Therefore, effective conversion of unmethylated cytosines, but not methylated cytosines, is key to the accuracy of these methods. Using mitochondrial DNA, which is consistently unmethylated, or spike-in controls, such as unmethylated lambda phage or methylated pUC19 vectors, the effectiveness of the conversion can be tested. Fig. 5 shows the results of cytosine conversion on lambda phage (Fig. 5A), pUC19 (Fig. 5B), and mitochondrial DNA (Fig. 5C). Generally, the cytosine conversion behaved as expected, with the exception of mitochondrial DNA in the PBAT protocol. The beta values show a broad distribution centered slightly below 0.2 for Sample A and about 0.25 for Sample B. Interestingly, the mitochondrial DNA-based incomplete-conversion rate was different from that based on lambda phage for PBAT. This is likely because mitochondrial DNA is circular and can become supercoiled. Unlike the other protocols, there is no mechanical shearing of the DNA in PBAT, which could explain this difference. This also shows the limitation of using mitochondrial DNA as negative controls.
The read-averaged cytosine retention for all protocols reflect what would be expected for a WGMS run, namely CpG retention above 70% and the other cytosine contexts around 1% (Fig. 6). Moreover, the technical replicates showed consistent retention, while the two biological replicates showed a bigger difference. Across all protocols, Sample A consistently had higher CpA retention, but not CpC or CpT retention. It is known that biological CpH retention tends to occur in the CpA dinucleotide context . The higher CpA retention, in contrast to CpC and CpT retention, shows Sample A likely has true CpH methylation, a process previously thought to be largely restricted to embryonic stem cells and neurons . Our results also confirm that, unlike CpA methylation, CpC and CpT retention likely do not reflect true biological methylation, but incomplete conversion, at least in mammalian samples. The Swift and NEB protocols generally showed the lowest amount of such technical artifact (both below 0.5%), except for one replicate of the low input Swift preparation. Similar results can be seen for the base-averaged cytosine retention (Additional file 1: Figure S10).
The M-bias plots  (Additional file 1: Figure S11) show a consistently lower CpG retention across the entire read length for both reads 1 and 2 of the PBAT protocol. For the Kapa protocol, there is a consistently higher CpG retention rate for read 1 than the other preparations. However, read 2 tends to be more in line with the retention rate seen in the others. For CpH retention, each protocol has approximately the same level of retention, with slight deviations in the first 5 bp. Due to the adapter trimming performed (see "Methods" for details), retention rates can behave erratically at the end of reads depending on the base content of the adapter that is trimmed. It should also be noted that, by default, the aligner used in this analysis (see "Methods" section) does not include cytosines in the first and last 3 bp of a read when determining methylation beta values, so erratic behavior on the ends of the reads is not included in methylation-related metrics.
To compare the consistency of CpG beta values, Spearman correlation coefficients were calculated between technical replicates of each protocol (Fig. 7), as well as between preparations of the same sample (Additional file 1: Table S1). Again, all samples were subsampled to 150 million reads (equivalent of \(\sim\)4.8X coverage) to ensure fair comparison. The correlation would be higher if the number of reads increases. The correlations between the NEB technical replicates, both the standard and low-input samples, had the highest correlation at 0.9 or higher. The Kapa protocol had the lowest correlation (i.e., less consistency in beta values between replicates) of just under 0.75. The standard-input Swift sample did better than the low-input sample, with correlations of 0.873 and 0.814, respectively. A correlation for the PBAT protocol could not be calculated because PBAT did not have a technical replicate. When comparing preparations to one another, the Kapa protocol had low correlations, with the maximum correlation of 0.62 to the Swift protocol (Additional file 1: Table S1). Only two of the other ten correlations were 0.8 or below, both of which were between the PBAT and Swift protocols. Of the NEB and Swift protocols, the NEB protocol had the highest cross-protocol correlations, with overall greater correlations compared to the Swift protocols.
To further compare the consistency of CpG beta values, a principal component analysis (PCA) was performed (Fig. 8). The PCA shows the data generally splits along two variables: the sample the data came from and the protocol used to generate the library. With respect to the sample split, Sample A clusters in the upper right of the plot, while Sample B clusters in the lower left. The protocol split occurs consistently in both samples, with PBAT and Kapa each separated into their own clusters, while Swift and NEB yielded similar results.
Because of the different approach to cytosine conversion used by the NEB protocol relative to others (enzymatic versus chemical), the difference in beta values between the standard-input NEB and Swift protocols were compared to look for bias in the methylation level of the NEB protocol. After calculating the beta value differences for Sample A, only four CpGs with \(|diff| > 0.5\) for both Technical Replicate 1 and 2 were found (Additional file 1: Figure S13). However, in Sample B, only one (which is not one of the four CpGs in Sample A) was seen, so this difference was taken to be sample-dependent and not due to an inherent bias in the enzymatic conversion process.
After comparing the four different WGMS library preparations, we noted that the NEBNext Enzymatic Methyl-seq kit (EM-seq) performed better in almost every metric, both with a standard amount (200 ng) and a low amount (10 ng) of input DNA. The NEB samples had higher quality libraries, with larger insert sizes and higher library complexity than the other kits, while having at least a comparable, if not more uniform, distribution of reads across the genome. The larger insert sizes lead to better mappability during alignment (i.e., a higher fraction of optimally aligned reads). Moreover, it also facilitates read-level analysis, such as epi-polymorphism  and allele-specific DNA methylation analyses. The higher library complexity implies more unique information could be gleaned from deeper sequencing of these libraries. Furthermore, the low-input NEB samples showed little reduction in library complexity relative to the standard-input NEB samples. Because of this, it may be possible to push the lower threshold of DNA input below NEB’s recommended limit of 10 ng. When comparing methylation-related metrics, the NEB samples were among the highest in percentage of CpGs covered, with consistent cytosine retention across all reads and a sub-one percent CpH retention level.
Among traditional bisulfite-based methods, the Swift Biosciences Accel-NGS Methyl-Seq DNA library kit performed best, which is consistent with a previous study that compared Swift with two other bisulfite-based protocols, TruSeq and QIAseq . Its results were comparable enough to the NEB protocol, particularly at the standard-input level (100 ng), such that conclusions drawn from both protocols should be fairly consistent. Its performance slightly trailed that of the NEB as the input amount dropped in terms of library complexity, but nonetheless performed well. It is noteworthy that the comparisons in this study were performed on high-quality DNA from snap-frozen samples. For more challenging specimens such as FFPE samples, or non-conventional samples, bisulfite-based methods may outperform enzymatic methods. In addition, Swift and Kapa both use a uracil tolerant polymerase, which increases processivity for deaminated nucleotides present in FFPE and ancient DNA samples. Moreover, the adaptase module in Swift makes it even more effective for FFPE, as it is designed to work with single stranded material. Therefore, it likely can better tolerate nicked strands from the FFPE process (as well as those generated from bisulfite conversion). Indeed, in our preliminary studies, the Swift kit performed best for FFPE samples (data not shown).
All kitted protocols had comparable operability in the lab. Kapa and Swift took roughly 7 h to process, which can fit in a single workday. The NEB enzymatic protocol was a 9-h protocol but had several convenient stopping points noted to easily break processing into 2 days. Actual hands-on time for all three protocols was about 4.5 h. The PBAT protocol was the only technically difficult and time-consuming one, mostly because of the requirement to make and quality-control solutions. This protocol took between 14 and 16 h to complete, about 7 of which were hands on.
One notable difference separating the four DNA methylation library generation methods is library uniformity, particularly in CpG island regions (CGIs). Kapa library coverage was underrepresented and PBAT overrepresented in these regions, while Swift and NEB generated relatively uniform libraries (Figs. 3 and 4). We propose that mechanistic differences in library generation account for this phenomenon, with both the relative timing of bisulfite conversion and the choice of random priming versus adapter ligation affecting the CGI coverage in the downstream dataset. In the Kapa protocol, sheared and adapter-ligated libraries undergo bisulfite conversion, causing preferential strand breakage in CGI regions and inhibiting post-bisulfite conversion PCR steps, as both priming regions are no longer present on the library fragment. In the PBAT protocol, libraries are not sheared. Instead, the bisulfite conversion induces random DNA breaks [25,26,27,28], again more prevalently in the CGI regions, making these small fragments more available for continued library generation than larger fragments. In addition, CGIs are G:C rich; during the random priming step of the PBAT library creation process, these higher G:C regions generate more stable primer annealing conditions and increase the likelihood of extension and incorporation into the final library due to a slightly higher regional melting temperature and exhibition of G:C clamping features . The more even CGI coverages exhibited by the Swift and NEB libraries are also protocol related: in the Swift protocol, bisulfite conversion precedes manual shearing and adapter ligation, avoiding the strand-breakage issue and generating relatively diverse libraries, and NEB’s process completely avoids the deleterious bisulfite conversion step through enzymatic reactions.
It should be noted that, while the PBAT protocol favors CpG islands, we do not necessarily suggest using this protocol if one is specifically targeting CpG island regions using sequencing. As they are not PCR amplified, PBAT libraries produce relatively little sequenceable material and, as a result, it was difficult to get enough depth to complete our analysis. This is evident in our lack of replication of the PBAT library, as our technical replicate did not produce enough information for a thorough analysis despite exhausting the library during the sequencing run. Further, as shown here, both NEB and Swift, to a lesser degree, are able to push the input threshold well below that of PBAT, while maintaining better quality libraries and more uniform coverage across the genome.
The clinical utility of DNA methylation has been long recognized, explored, and established, particularly for cell-free DNA (cfDNA)-based early detection of neoplasm. The current gold-standard technologies for DNA methylation profiling are bisulfite-based. Harsh bisulfite treatment is known to cause heavy degradation of the template, which is often scarce to begin with for clinical cfDNA samples. cfMeDIP-seq  has been developed to overcome this limitation, pushing the lower input limit to 1–10 ng. However, it is associated with similar limitations of common affinity-based methods [2, 30]. As the enzymatic conversion by the NEB protocol still demonstrates a library complexity at 10 ng comparable to that of regular bulk DNA-based methods, the NEB protocol may prove to be a good alternative approach for clinical early detection work. In addition, with its template-preserving nature and excellent performance at low-input levels, the NEB protocol could hold more promise for single-cell DNA methylation profiling.
In this study, we compared four commonly used WGMS library preparation protocols, including three bisulfite-based protocols and one enzyme-based protocol. Table 1 shows a summary of the protocols used, while Table 2 summarizes some of the results found in the analysis. We found the NEBNext Enzymatic Methyl-seq and the Accel-NGS Methyl-Seq kits performed quantitatively better than the other two protocols at the standard-input level of DNA for each kit. We found the NEB kit to perform comparably across biological and technical replicates for two different amounts of DNA input, whereas the Swift kit showed some decline with the lower amount of input. Based on these results, we recommend use of the NEBNext Enzymatic Methyl-seq kit for whole-genome DNA methylation sequencing.
Fallopian tube sample preparation
Fallopian tubes from two primary patients were delivered in saline from a local hospital. Upon arrival, samples were washed using sterile 1x Phosphate Buffered Saline and snap-frozen in liquid nitrogen. At a later date, the fallopian tubes were thawed and minced. DNA was extracted from the minced fallopian tubes using the tissue protocol from Qiagen’s DNeasy Blood & Tissue Kit (69504). Following extraction, dsDNA was quantified using Invitrogen’s Qubit 3.0 Fluorometer.
Whole-genome methylation sequencing libraries
Methylated pUC19 and unmethylated lambda phage DNA (0.0005% and 0.01%, respectively) were added to each high molecular weight genomic DNA sample. These are included as methylation controls in the NEB protocol and were added to the DNA used in each protocol for consistency. The DNA was then sheared to approximately 350 bp in average size for all prepared libraries, with the exception of the post-bisulfite adapter tagging (PBAT) libraries, where the DNA was not initially sheared.
Libraries were prepared from the KAPA Hyper Prep kit (v6.17) (Roche Sequencing, Cat. #KK8504) with an input of 300 ng of sheared DNA following the manufacturer’s protocol with the following modifications. Illumina TruSeq Nano adapters at a concentration of 10 \(\mu\)M were used. The post ligation cleanup elution was reduced to 20 \(\mu\)L and the entire DNA elution went into the EZ DNA Methylation-Gold kit (Zymo Research, Cat. #D5005). The bisulfite converted DNA was eluted in 20 \(\mu\)L and 10 cycles of library amplification were performed using the KAPA HiFi HotStart Uracil+ ReadyMix (Roche Sequencing, Cat. #KK2800).
Libraries were prepared from the Accel-NGS Methyl-Seq DNA library kit (v3.0) (Swift Biosciences, Cat. #30024) with an input of either 10 ng or 100 ng of sheared DNA following manufacturer’s protocol with the following modifications. The DNA was bisulfite converted using the EZ DNA Methylation-Gold kit (Zymo Research, Cat. #D5005) with an elution volume of 15 \(\mu\)L. Following adapter ligation, either 8 cycles (10 ng DNA input) or 4 cycles (100 ng DNA input) of library amplification were performed.
Libraries were prepared from the NEBNext Enzymatic Methyl-seq kit (New England Biolabs, Cat. #E7120L) using an input of either 10 ng or 200 ng of sheared DNA and libraries were made according to the manufacturer’s protocol. The denaturation method used was 0.1 N sodium hydroxide, according to the protocol, and either 8 cycles (10 ng DNA input) or 4 cycles (200 ng DNA input) of PCR amplification were performed.
Libraries were prepared from the PBAT method described by Miura and Ito  using an input of 100 ng of genomic DNA that went directly into the EZ DNA Methylation-Gold kit (Zymo Research, Cat. #D5005) with an elution volume of 21 \(\mu\)L. Then, 10 \(\mu\)L of the bisulfite converted DNA was used for making the PBAT libraries as previously described in  with the modification of using KAPA HiFi HotStart Uracil+ ReadyMix (Roche Sequencing, Cat. #KK2800) for the DNA template extension step.
KAPA pure beads (Roche Sequencing, Cat. #KK8001) were used for cleanup steps for all prepared libraries.
Quality and quantity of the finished libraries were assessed using a combination of the Agilent High Sensitivity DNA chip (Agilent Technologies, Inc., Cat. #5067-4626), QuantiFluor® dsDNA System (Promega Corp., Cat. #E2670), and Kapa Illumina Library Quantification qPCR assay (Roche Sequencing, Cat. #KK4824). 100 bp paired-end sequencing was performed on an Illumina NovaSeq6000 sequencer using an S4, 200 bp sequencing kit (Illumina Inc., San Diego, CA, USA), with 10% PhiX. Base calling was done by Illumina RTA3 and output of NCS was demultiplexed and converted to FASTQ format with Illumina Bcl2fastq (v1.9.0).
Alignment and methylation extraction
Upon receipt of the FASTQ files, the files were trimmed using Trim Galore  version 0.6.4_dev (using Cutadapt version 2.10). Default inputs were used, other than: –illumina –trim-n –paired –cores 4 –fastqc –fastqc_args "–noextract". In addition, the FASTQ files for the Swift samples included –clip_R2 14, due to the Adaptase™ method used by Swift Biosciences .
The trimmed FASTQ files were aligned to GRCh38  using BISCUIT  version 0.3.16. An index for the reference genome was created using biscuit index GRCh38.p13.genome.fa, followed by aligning each sample to the indexed reference. The alignment step used the default options for biscuit align, with these exceptions: -M -t 20 -R sample_specific_read_group. Each sample received its own read group (-R tag). The aligned reads were duplicate marked using Samblaster  version 0.1.25, with the -M flag and defaults. The reads were then coordinate sorted and indexed using Samtools  version 1.10 (with htslib 1.10.2). Default options were used, with the following exceptions -@ 20 -m 5G -o sample_name.sorted.markdup.bam -O BAM (sort) and -@ 20 (index).
The extraction of cytosine methylation information proceeded as follows. Pileup VCF files were generated using biscuit pileup with default parameters. bgzip and tabix (included with htslib version 1.10.2) were used to compress and index the VCF files. Default parameters were used for bgzip and tabix, with the exception of -p vcf in the call for tabix. The VCF files were then processed through biscuit vcf2bed, bedtools sort (bedtools  version 2.29.2), and biscuit mergecg to create coordinate sorted BED files containing CpG methylation beta value information. For each command, the default parameters were used. After creating the BED files, they were compressed and indexed using bgzip and tabix, with default parameters being used in both cases, with tabix also including -p bed.
Any code not explicitly stated in this and subsequent sections can be found on GitHub .
FASTQ and alignment quality control
Quality-control data were collected from a number of sources and viewed using MultiQC  version 1.9. Quality control for the FASTQ files was generated by FastQC  version 0.11.9 and Cutadapt during the trimming process. Command arguments used can be found in the previous section. Statistics on the percentage of duplicate marked reads were produced by Samblaster. Library complexity was calculated by Preseq  version 2.0.3 using these options, c_curve -B -P -v -o sample.complex.ccurve.txt, where “sample” is the name of each sample. Quality controls from Samtools were generated with samtools stats and samtools flagstat, with default parameters and -@ 20. BISCUIT includes a custom BASH script to generate quality-control statistics related to data aligned by BISCUIT. This script was run with this command, QC.sh -v samp.pileup.vcf.gz -o samp_QC hg38_assets GRCh38.p13.genome.fa samp samp.sorted.markdup.bam. In each case, “samp” corresponds to the name of the processed sample.
The hg38_assets mentioned in the BISCUIT quality-control script command can be found in a zip file on the BISCUIT GitHub release page .
Library protocol comparison analysis
To collect statistics related to the raw reads stored in the FASTQ files, a custom Python (version 3.7.6) script was written. It uses the gzip, glob, and time Python base modules and these additional Python packages: argparse (version 1.1), numpy (version 1.18.1), and biopython (version 1.76). Statistics regarding the trimmed reads were collected from log files generated by Cutadapt, as described previously.
For a number of the analyses, the aligned BAM files were subsampled before calculating the corresponding metric. The BAMs were subsampled using samtools view -hbu -F 0x4 -q 40 sample.bam | samtools view -hbu -s FRAC -. “FRAC” was calculated as
for each sample. The subsampled BAMs were sorted and indexed using Samtools. Pileup VCF files and merged CpG BED files were generated in a similar manner to the original BAMs, as described previously.
Using the subsampled BAMs, the average coverage and percentage of covered CpGs within different genomic regions, including CpG islands, exons, genes, and repeat-masked regions, were calculated using custom scripts. The CpGs that fell in each region were determined by intersecting a BED file containing CpG coordinates and the coverage at those locations with a BED file containing the region’s coordinates. The coverage was determined using bedtools genomecov, while the intersection was done using bedtools intersect. The average coverage was calculated by taking a weighted average of the coverage for each CpG. The percentage of covered CpGs was calculated by
The scripts used to calculate these values made use of GNU parallel .
The observed coverage to expected coverage ratio was calculated as \(ratio = Obs / Exp\), where:
This formulation assumes all bases are not equally accessible when sequencing. The expected coverage takes into account this difference in accessibility by including mappability scores based on the Bismap k100 multi-read mappability scores . The observed coverage does not include these scores, as mappability is assumed to be inherently included when performing DNA sequencing. Because Bismap did not include a mappability score for every base in the genome, the expected and observed coverage calculations were restricted to those bases that included a mappability score. Only mapped reads (FLAG field in BAM does not include 0x4 flag) with MAPQ score \(\ge 40\) were included in this calculation. The values for each observed to expected ratio were calculated using custom BASH scripts that used bedtools and GNU awk (version 4.0.2).
The beta values for the lambda phage and pUC19 methylation controls were extracted using the same method as the genomic methylation extraction (see Alignment and Methylation Extraction for the details), with the one exception being that only one read was required to cover a CpG. The coverage requirement difference was due to the fractional amount of lambda phage and pUC19 that were included in the sequencing compared with the amount of genomic DNA. The mitochondrial DNA beta values were taken directly from the genomic methylation BED file.
To calculate correlations between samples, the genome was binned into 100 kb bins and the average beta value calculated for CpGs in each window. The bins were determined via bedtools makewindows -w 100000 -g GRCh38.p13.genome.fa.fai | sort -k1,1 -k2,2n. The “.fai” file was generated via samtools faidx GRCh38.p13.genome.fa. The average beta value for each bin was calculated via bedtools map -a bins.bed -b sample.bed -c 4,5 -o mean | gzip. bins.bed contains the 100 kb bins, while sample.bed is the merged CpG BED file generated from the subsampled BAMs. Correlations between the 100 kb averaged beta values were calculated using the Spearman correlation coefficient, as implemented in Python’s scipy package (version 1.4.1).
The PCA was performed using the average beta values in 100 kb bins, which were calculated in the same way as the correlation analysis. After calculating the beta values, they were transformed into smoothed M-values via:
where M is the number of methylated cytosines and U is the number of unmethylated cytosines at a given CpG. The smoothing factor, k, eliminates the infinities that occur at 0 and 1 in the logit transformation. The numpy logit function was used to apply the transformation. This conversion turns the beta-distributed beta values into the more Gaussian-distributed M-values. After converting to M-values, the data was standardized using the StandardScaler function, as implemented in scikit-learn (sklearn version 0.24.0). The PCA was performed using scikit-learn’s PCA function, keeping only the first two components (n_components=2).
The analysis for methylation bias in the NEBNext Enzymatic Methyl-seq kit was performed by extracting CpGs that had more than 20 reads covering them for both the NEB and Swift protocols for Sample A replicate 1 and 2 or the NEB and Swift protocols for Sample B replicate 1 and 2. Methylation bias was determined by requiring both Equations 6 and 7 to be true.
This was done separately for the standard DNA input NEB and Swift protocols of Samples A and B.
The CpG and other genomic region BED files mentioned in this section can be found on the GitHub release page for the analysis code.
Availability of data and materials
Sequencing data are from human tissue and will be available through dbGaP. Software used in processing the data is available on GitHub at https://github.com/jamorrison/wgms_kit_comparison/releases/tag/v1.0.4.
Polymerase chain reaction
Whole-genome bisulfite sequencing
Whole-genome (DNA) methylation sequencing
Post-bisulfite adapter tagging
Apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3A
Laird PW. The power and the promise of DNA methylation markers. Nat Rev Cancer. 2003;3:253–66.
Laird PW. Principles and challenges of genome-wide DNA methylation analysis. Nat Rev Genet. 2010;11:191–203.
Clark SJ, Harrison J, Paul CL, Frommer M. High sensitivity mapping of methylated cytosines. Nucleic Acids Res. 1994;22:2990–7.
Bock C, Tomazou EM, Brinkman AB, Müller F, Simmer F, Gu H, et al. Quantitative comparison of genome-wide DNA methylation mapping technologies. Nat Biotechnol. 2010;28:1106–14.
Harris RA, Wang T, Coarfa C, Nagarajan RP, Hong C, Downey SL, et al. Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications. Nat Biotechnol. 2010;28:097–1105.
The BLUEPRINT consortium. Quantitative comparison of DNA methylation assays for biomarker development and clinical applications. Nat Biotechnol. 2016;34:726–37.
Clark SJ, Statham A, Stirzaker C, Molloy PL, Frommer M. DNA methylation: Bisulphite modification and analysis. Nat Protoc. 2006;1:2353–64.
Zhou W, Dinh HQ, Ramjan Z, Weisenberger DJ, Nicolet CM, Shen H, et al. DNA methylation loss in late-replicating domains is linked to mitotic cell division. Nat Genet. 2018;50:591–602.
Lister R, O’Malley RC, Tonti-Filippini J, Gregory BD, Berry CC, Millar AH, et al. Highly Integrated Single-Base Resolution Maps of the Epigenome in Arabidopsis. Cell. 2008;133:523–36.
Cokus SJ, Feng S, Zhang X, Chen Z, Merriman B, Haudenschild CD, et al. Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature. 2008;452:215–9.
Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, et al. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature. 2009;462:315–22.
Miura F, Enomoto Y, Dairiki R, Ito T. Amplification-free whole-genome bisulfite sequencing by post-bisulfite adaptor tagging. Nucleic Acids Res. 2012;40:e136.
Miura F, Ito T. Post-Bisulfite Adaptor Tagging for PCR-Free Whole-Genome Bisulfite Sequencing. In: DNA Methylation Protocols. New York, NY: Springer New York; 2018. p. 123–136.
Smallwood SA, Lee HJ, Angermueller C, Krueger F, Saadeh H, Peat J, et al. Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nat Methods. 2014;11:817–20.
Vaisvila R, Ponnaluri VKC, Sun Z, Langhorst BW, Saleh L, Guan S, et al. EM-seq: Detection of DNA Methylation at Single Base Resolution from Picograms of DNA. bioRxiv. 2019. https://doi.org/10.1101/2019.12.20.884692.
Labidi-Galy SI, Papp E, Hallberg D, Niknafs N, Adleff V, Noe M, et al. High grade serous ovarian carcinomas originate in the fallopian tube. Nat Commun. 2017;8:1093.
New England Biolabs Inc . NEBNext Enzymatic Methyl-seq (EM-seq) Technical Report; 2019. Available from: https://www.neb.com/products/e7120-nebnext-enzymatic-methyl-seq-kit#Product%20Information.
Swift Biosciences. Swift Protocol: Accel-NGS Methyl-Seq DNA Library Kit; 2020. Available from: https://swiftbiosci.com/wp-content/uploads/2020/02/PRT-019-Methyl-Seq-Protocol-Rev-3.pdf.
Nair SS, Luu PL, Qu W, Maddugoda M, Huschtscha L, Reddel R, et al. Guidelines for whole genome bisulphite sequencing of intact and FFPET DNA on the Illumina HiSeq X Ten. Epigenetics & Chromatin. 2018;11:24.
Schultz MD, He Y, Whitaker JW, Hariharan M, Mukamel EA, Leung D, et al. Human body epigenome maps reveal noncanonical DNA methylation variation. Nature. 2015;523:212–6.
He Y, Ecker JR. Non-CG Methylation in the Human Genome. Annu Rev Genomics Hum Genet. 2015;16:55–77.
Hansen KD, Langmead B, Irizarry RA. BSmooth: from whole genome bisulfite sequencing reads to differentially methylated regions. Genome Biol. 2012;13:R83.
Landan G, Cohen NM, Mukamel Z, Bar A, Molchadsky A, Brosh R, et al. Epigenetic polymorphism and the stochastic formation of differentially methylated regions in normal and cancerous tissues. Nat Genet. 2012;44:1207–14.
Zhou L, Ng HK, Drautz-Moses DI, Schuster SC, Beck S, Kim C, et al. Systematic evaluation of library preparation methods and sequencing platforms for high-throughput whole genome bisulfite sequencing. Sci Rep. 2019;9:10383.
Munson K, Clark J, Lamparska-Kupsik K, Smith SS. Recovery of bisulfite-converted genomic sequences in the methylation-sensitive ddPCR. Nucleic Acids Res. 2007;35:2893–903.
Tanaka K, Okamoto A. Degradation of DNA by bisulfite treatment. Bioorg Med Chem Lett. 2007;17:1912–5.
Grunau C, Clark SJ, Rosenthal A. Bisulfite genomic sequencing: systematic investigation of critical experimental parameters. Nucleic Acids Res. 2001;29:E65.
Mill J, Petronis A. Profiling DNA methylation from small amounts of genomic DNA starting material: efficient sodium bisulfite conversion and subsequent whole-genome amplification. Methods Mol Biol. 2009;507:371–81.
Dieffenbach CW, Lowe TM, Dveksler GS. General concepts for PCR primer design. PCR Methods Appl. 1993;3(3):S30–7.
Shen SY, Singhania R, Fehringer G, Chakravarthy A, Roehrl MHA, Chadwick D, et al. Sensitive tumour detection and classification using plasma cell-free DNA methylomes. Nature. 2018;563:579–83.
Krueger F. TrimGalore; 2019. Available from: https://github.com/FelixKrueger/TrimGalore.
Swift Biosciences. Tail Trimming for Better Data: Accel-NGS Methyl-Seq, Adaptase Module and 1S Plus DNA Library Kits; 2018. Available from: https://swiftbiosci.com/wp-content/uploads/2019/02/16-0853-Tail-Trim-Final-442019.pdf.
GENCODE. Human Release 32; 2019. Available from: https://www.gencodegenes.org/human/release_32.html.
Zhou W. BISCUIT: BISulfite-seq CUI Toolkit; 2020. Available from: https://github.com/huishenlab/biscuit.
Faust GG, Hall IM. SAMBLASTER: fast duplicate marking and structural variant read extraction. Bioinformatics. 2014;30:2503–5.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–9.
Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–2.
Morrison J. WGMS Kit Comparison Source Code; 2021. Available from: https://github.com/jamorrison/wgms_kit_comparison/releases/tag/v1.0.4.
Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016;32:3047–8.
Andrews S. FastQC: A Quality Control Tool for High Throughput Sequence Data; 2010. Available from: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
Daley T, Smith AD. Predicting the molecular complexity of sequencing libraries. Nat Methods. 2013;10:325–7.
Zhou W. BISCUIT Version 0.3.16 Release Page; 2020. Available from: https://github.com/huishenlab/biscuit/releases/tag/v0.3.16.20200420.
Tange O. GNU Parallel 20200522 (‘Kraftwerk’). Zenodo. 2020. https://doi.org/10.5281/zenodo.3841377.
Karimzadeh M, Ernst C, Kundaje A, Hoffman MM. Umap and Bismap: quantifying genome and methylome mappability. Nucleic Acids Res. 2018;46:e120.
Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, et al. Integrative Genomics Viewer. Nat Biotechnol. 2011;29:24–6.
Computation for the work described in this paper was supported by the High Performance Cluster and Cloud Computing (HPC3) Resource at the Van Andel Research Institute.
This study was funded by the National Institutes of Health/National Cancer Institute [R37 CA230748] and the Ovarian Cancer Research Fund (now Ovarian Cancer Research Alliance) .
Ethics approval and consent to participate
This study received ethics approval from the Van Andel Research Institute Institutional Review Board (IRB #19017).
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Morrison, J., Koeman, J.M., Johnson, B.K. et al. Evaluation of whole-genome DNA methylation sequencing library preparation protocols. Epigenetics & Chromatin 14, 28 (2021). https://doi.org/10.1186/s13072-021-00401-y