Here we show that four mutants of Dam (R95A, R116A, N126A, and N132A) each substantially reduce the noise seen in DamID for the transcription factor Tcf7l2. For two of these (R95A and N126A), we confirm that this holds across the whole genome, resulting in less background methylation and higher spatial resolution. We strongly suspect that these conclusions will also apply to the other two mutants.
We are not sure precisely what causes the background methylation observed with wild-type Dam, and hence why these mutants show an increased signal-to-noise ratio. Based on the observations of such mutations in [19], it could be a combination of a reduced methylation rate leading to only longer-lived interactions being recorded, lower processivity preventing spreading methylation, or reduced DNA binding preventing Dam from dragging its linked transcription factor to a new location. The observation that unfused Dam mutants closely resemble wild-type Dam-Tcf7l2 favours the last of these: that wild-type Dam binds DNA strongly enough to drag Tcf7l2 to locations that Dam normally prefers. If the improved signal were instead due to disrupted processivity, then the correlation between wild-type Dam and Dam-Tcf7l2 should be stronger than that between mutant Dam and wild-type Dam-Tcf7l2. Alternatively, if the cause were a reduced methylation rate capturing only longer-lived interactions, then one would expect the mutant Dam-only samples to show less total methylation than the corresponding Dam-Tcf7l2 samples; the opposite was observed.
A caveat to our results is that these Dam constructs were expressed from a dox-inducible promoter at a high level, in contrast to the recommended method of using low expression from a leaky uninduced promoter. It is possible that there exists a lower concentration and duration of wild-type Dam-Tcf7l2 expression with signal-to-noise properties similar to those of the N126A and R95A variants. During our initial test of wild-type Dam-Tcf7l2, however, we found no concentration or duration of dox exposure that further improved the enrichment at Tcf7l2-bound sites (by qPCR), nor did the uninduced promoter provide detectable signal (these observations may be specific to the quick replication of mESCs diluting away methylation that is produced too slowly). Furthermore, previous studies all show low spatial resolution and high correlation between unfused Dam and fusions with transcription factors despite attempts to maintain low levels of expression.
Of these, the most comparable is a recent DamID experiment by Cheetham et al. [10] profiling Oct4 binding in mESCs, owing to its explicit comparison to ChIP-seq and the similarly focal DNA binding of Tcf7l2 and Oct4, with ChIP-seq peaks of \(\sim \) 100 bp. Despite maintaining very low expression of the Dam-Oct4 fusion through translation reinitiation, a comparison with Oct4 ChIP-seq shows methylation at many disparate sites and a low spatial resolution similar to what we observe for wild-type Dam-Tcf7l2 (50% decay at > 500 bp). While a portion of these sites may be true Oct4 binding events, the high specificity of ChIP-seq for transcription factor binding, combined with the high correlation (median of 0.77) to unfused Dam, indicates that this methylation is mostly driven by Dam-specific effects. This matches our observations for Tcf7l2 fused to wild-type Dam, which is more strongly correlated with unfused Dam than the N126A or R95A Dam-Tcf7l2 fusions are. Thus, it seems unlikely that the increase in signal-to-noise seen with the Dam mutants is achievable through further optimisation of Dam fusion expression. More generally, the strong DNA binding and processivity of wild-type Dam [16,17,18] indicate that for any protein with similar or weaker DNA affinity, fusing it to Dam will result in off-target methylation regardless of the level of total methylation. Very low expression also adds a source of cell-to-cell variability due to the stochastic fluctuations inherent to having few mRNAs.
The spreading methylation of wild-type Dam spans multiple GATCs. The more localised methylation by mutant Dam, however, makes the frequency of GATCs the new limit for spatial resolution. We addressed this by developing a DamID-seq protocol that captures individual methylated sites, rather than reading out the correlation between adjacent pairs, which increases how frequently methylation is sampled across the genome; several Tcf7l2 binding sites were detected by only a single GATC. Additionally, this protocol reduces the number of steps required by using the initial ligated adapter directly for sequencing, instead of separating amplification of methylated fragments from later sequencing library preparation (as in [7]), and produces a more interpretable output: a read count at each GATC rather than a signal smeared out into a peak. The frequency with which binding can be detected could be increased further by combining these Dam mutants with K9A, which allows Dam to methylate at sequences other than GATC, and by detecting the resulting methylation by immunoprecipitation [22, 23].
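To make the resolution argument concrete, the following is a minimal Python sketch, not the analysis code used in this study. It locates GATC sites in a toy sequence, reports the gaps between them (the sampling limit once methylation is localised), and contrasts a per-site readout with a pair-based readout. The sequence, the set of methylated sites, and the function names are all hypothetical, and the pair-based model assumes that a fragment is recovered only when both flanking GATCs are methylated, which is a simplification.

```python
import re

def gatc_positions(seq):
    """Return the 0-based start position of every GATC in seq (case-insensitive)."""
    return [m.start() for m in re.finditer("GATC", seq.upper())]

def gatc_gaps(positions):
    """Distances between consecutive GATC sites; these bound the spatial resolution."""
    return [b - a for a, b in zip(positions, positions[1:])]

def per_site_readout(positions, methylated):
    """Per-site model: every methylated GATC is reported individually."""
    return [p for p in positions if p in methylated]

def pair_readout(positions, methylated):
    """Pair-based model (assumption): a fragment is recovered only if
    both of its flanking, adjacent GATC sites are methylated."""
    return [(a, b) for a, b in zip(positions, positions[1:])
            if a in methylated and b in methylated]

# Hypothetical toy sequence with four GATC sites.
toy = "tt" + "GATC" + "a" * 5 + "GATC" + "cc" + "GATC" + "g" * 9 + "GATC"
sites = gatc_positions(toy)
gaps = gatc_gaps(sites)

# Hypothetical case: a binding event methylates only one GATC.
methylated = {11}
single_hits = per_site_readout(sites, methylated)
pair_hits = pair_readout(sites, methylated)
```

In this toy case the per-site readout reports the lone methylated GATC, whereas the pair-based model recovers no fragment, mirroring the binding sites above that were detected by only a single GATC.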
The recommended method for dealing with background activity is to express an unfused Dam control and hope that it recapitulates the off-target methylation of the fusion construct. Interestingly, when we tried this, we instead observed a decrease in signal with respect to Tcf7l2 ChIP-seq. Since both background Dam methylation and transcription factor binding tend to occur within open chromatin regions, the unfused Dam control is already partially predictive of binding sites. Confounding factors, such as differences in background methylation rates between unfused Dam and Dam-Tcf7l2 due to faster diffusion of the smaller unfused Dam, would result in normalisation creating false negatives.
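The way control normalisation can produce false negatives is sketched below in Python. This is an illustration only: the counts are invented toy values rather than data from this study, and the log2 ratio with a pseudocount is assumed here as a generic normalisation, not necessarily the one used in any particular DamID pipeline.

```python
import math

def log2_ratio(fusion, control, pseudocount=1.0):
    """Per-site log2((fusion + pc) / (control + pc)) for paired count lists."""
    return [math.log2((f + pseudocount) / (c + pseudocount))
            for f, c in zip(fusion, control)]

# Hypothetical per-GATC read counts (toy values, not measured data):
#   site 0: truly bound, in open chromatin where unfused Dam is also high
#   site 1: unbound, low in both samples
#   site 2: truly bound, in a region where unfused Dam is low
fusion_counts = [63, 7, 31]
control_counts = [63, 7, 3]

ratios = log2_ratio(fusion_counts, control_counts)
```

Here site 0 ends up with the same ratio as the unbound site 1, even though it carries the most fusion signal, because the unfused Dam control is itself enriched in open chromatin. Only site 2 stands out after normalisation.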
A previous paper proposed the Dam mutant L122A to increase the signal-to-noise ratio of DamID [12]. However, the authors report a higher correlation (\(\sim \) 0.7) between unfused Dam and the Dam transcription factor fusion than we observe (Fig. 4), and they provide no evidence for the claim of increased signal-to-noise with the L122A mutant. Additionally, this mutant was reported to show a preference for methylating already hemimethylated sites [20, 24]. While this is of interest as a possible way to maintain Dam methylation through DNA replication, preferential propagation of existing methylation through cell division would abolish independence between individual methylation events, confounding any statistical inference.
In this study, we focused on a specific transcription factor, Tcf7l2, and showed that mutations in Dam improved detection of its binding to DNA. Because Tcf7l2 has no unusual features (it neither binds particularly strongly nor has easy-to-predict binding), we expect these benefits to apply generally to other transcription factors. Since the correlation between unfused Dam mutants and wild-type Dam-Tcf7l2 suggests that off-target effects are due to strong DNA binding by Dam, rather than processivity or kinetics, DamID with wild-type Dam would be most reliable for strongly binding proteins, such as CTCF, pioneer factors, or Cas9, while mutant Dam would provide the most benefit for more transiently binding proteins.
With the growing appreciation of cellular heterogeneity, it is of interest to study transcription factor binding at finer resolution than the bulk cell cultures or tissues that are required by ChIP-seq. DamID provides unique benefits for measuring protein-DNA interactions in such situations, as the construct can be expressed in response to certain perturbations or in specific cell types (including within a whole organism), and the methylated DNA can be easily isolated later owing to the persistence of adenine methylation throughout further experimental processing. Because ChIP-seq is subject to artefacts, DamID is also useful for independently verifying binding sites, particularly those lacking a clear motif to explain binding. Since the noisiness of DamID has been a constant barrier to applying it more broadly, we hope that these improvements to its specificity and sensitivity for transcription factor binding will aid in the development of such experiments.