Automated in situ profiling of chromatin modifications resolves cell types and gene regulatory programs

Our understanding of eukaryotic gene regulation is limited by the complexity of protein-DNA interactions that comprise the chromatin landscape and by inefficient methods for characterizing these interactions. We recently introduced CUT&RUN, an antibody-targeted nuclease-cleavage method that profiles DNA-binding proteins, histones and chromatin modifying proteins in situ with exceptional sensitivity and resolution. Here we describe an automated CUT&RUN platform and apply it to characterize the chromatin landscapes of human cell lines. We find that CUT&RUN profiles of histone modifications crisply demarcate active and repressed chromatin regions, and we develop a continuous metric to identify cell-type specific promoter and enhancer activities. We test the ability of automated CUT&RUN to profile frozen tumor samples, and find that our method readily distinguishes two diffuse midline gliomas by their subtype-specific gene expression programs. The easy, cost-effective workflow makes automated CUT&RUN an attractive tool for high-throughput characterization of cell types and patient samples.


Abstract
Our understanding of eukaryotic gene regulation is limited by the complexity of protein-DNA interactions that comprise the chromatin landscape and by inefficient methods for characterizing these interactions. We recently introduced CUT&RUN, an antibodytargeted nuclease-cleavage method that profiles DNA-binding proteins, histones and chromatin modifying proteins in situ with exceptional sensitivity and resolution. Here we describe an automated CUT&RUN platform and apply it to characterize the chromatin landscapes of human cell lines. We find that CUT&RUN profiles of histone modifications crisply demarcate active and repressed chromatin regions, and we develop a continuous metric to identify cell-type specific promoter and enhancer activities. We test the ability of automated CUT&RUN to profile frozen tumor samples, and find that our method readily distinguishes two diffuse midline gliomas by their subtype-specific gene expression programs. The easy, cost-effective workflow makes automated CUT&RUN an attractive tool for high-throughput characterization of cell types and patient samples.
Cells establish their distinct identities by altering activity of the cis-regulatory DNA elements that control gene expression 1,2 . Promoter elements lie near the 5' transcriptional start sites (TSSs) of all genes, whereas distal cis-regulatory elements such as enhancers often bridge long stretches in the DNA to interact with select promoters and direct cell-type-specific gene expression 1,2 . Defects in the nuclear proteins that recognize these cis-regulatory elements underlie many human diseases that often manifest in specific tissues and cell-types 3-7 . To provide a reference for molecular diagnosis of patient samples, efforts are underway to generate a comprehensive atlas of cells in the human body 8,9 . Characterizing cell-type specific chromatin landscapes is essential for this atlas; however, technical limitations have prevented implementation of traditional approaches for genome-wide profiling of chromatin proteins on the scales necessary for this project.
Despite the growing awareness that epigenetic derangements underlie many human diseases 10 , very few methods for high-throughput profiling of epigenomic information are available. Realizing the clinical potential of epigenomic technologies requires robust, scalable approaches that can profile large numbers of patient samples in parallel. Chromatin immunoprecipitation with antigen-specific antibodies combined with massively parallel sequencing (ChIP-seq) has been used extensively for epigenome profiling, but this method is labor-intensive, prone to artifacts 11 , and requires high sequencing depth to distinguish weak signals from genomic background noise.
The combination of these factors has prevented implementation of ChIP-seq in clinical laboratory settings. Recently, we introduced CUT&RUN as an alternative chromatin profiling technique that uses factor-specific antibodies to tether micrococcal nuclease (MNase) to genomic binding sites 12,13 . The targeted nuclease cleaves chromatin around the binding sites, and the released DNA is sequenced using standard library preparation techniques, resulting in efficient mapping of protein-DNA interactions. CUT&RUN has very low backgrounds, which greatly reduces sample amounts and sequencing costs required to obtain high-quality genome-wide profiles 12,14 .
Here we modify the CUT&RUN protocol to profile chromatin proteins and modifications in a 96-well format on a liquid handling robot. By applying this method to the H1 human embryonic stem cell (hESC) line and K562 leukemia cell line, we demonstrate AutoCUT&RUN can be used to identify cell-type specific promoter and enhancer activities, providing a means to quantitatively distinguish cell-types based on their unique gene regulatory programs. In addition, we show that this method is able to define chromatin features from frozen solid tumor samples, setting the stage to analyze typical clinical specimens. AutoCUT&RUN is ideal for high-throughput studies of chromatin-based gene regulation, allowing for examination of chromatin landscapes in patient samples and expanding the toolbox for epigenetic medicine.

An automated platform for genome-wide profiling of chromatin proteins
To adapt CUT&RUN to an automated format we equipped a Beckman Biomek FX liquid handling robot for magnetic separation and temperature control (Fig. 1a). First, cells are bound to Concanavalin A-coated magnetic beads, allowing all subsequent washes to be performed by magnetic separation. Bead-bound samples are then incubated with antibodies, and up to 96 samples are arrayed in a plate (Fig. 1a). Successive washes, tethering of a proteinA-MNase fusion protein, cleavage of DNA, and release of cleaved chromatin fragments into the sample supernatant are performed on the Biomek ( Supplementary Fig. 1a). A major stumbling block to automating epigenomics protocols is they typically require purification of small amounts of DNA prior to library preparation.
To overcome this hurdle, we developed a method to polish the DNA ends in chromatin fragments for direct ligation of Illumina library adapters ( Supplementary Fig. 1a). Endpolishing and adapter ligation are performed on a separate thermocycler and deproteinated CUT&RUN libraries are purified on the Biomek using Ampure XP magnetic beads both before and after PCR enrichment. This AutoCUT&RUN protocol allows a single operator to generate up to 96 libraries in 2 days that are ready to be pooled and sequenced (Fig. 1a) (Appendix 1).
To test the consistency of AutoCUT&RUN, we simultaneously profiled two biological replicates of H1 hESCs and K562 cells using antibodies targeting four histone modifications that mark active chromatin states (H3K4me1, H3K4me2, H3K4me3, and H3K27ac) and one repressive modification (H3K27me3). Comparing the global distribution of reads for each histone mark, we found that samples highly correlate with their biological replicate, and cluster together in an unbiased hierarchical matrix (Fig.   1b). Additionally, the genome-wide profiles of the four active histone marks clustered together within a given cell type, and separated away from the repressive histone mark H3K27me3 (Fig. 1b). These profiles represent antibody-specific signals, as all five are poorly correlated with an IgG negative control. Together, these results indicate that AutoCUT&RUN chromatin profiling reproducibly captures the cell-type specific distributions of histone marks.
Histones are tightly associated with DNA in chromatin, so we also examined whether AutoCUT&RUN can be applied to mapping DNA-binding transcription factors that have lower residence times. We tested the performance of AutoCUT&RUN with two transcription factors, the histone locus-specific gene regulator NPAT, and the insulator protein CTCF 15,16 . AutoCUT&RUN profiles of both NPAT and CTCF are highly specific for their expected targets in both H1 and K562 cells ( Supplementary Fig. 1b, c), and the signal sensitivity of CTCF in K562 cells was comparable to previous results 13 . Thus, AutoCUT&RUN is suitable for high-throughput, genome-wide profiling of diverse DNA binding proteins.
AutoCUT&RUN recapitulates these features of hESCs; we observed that H1 cells have increased H3K27ac as compared to the lineage-restricted K562 cell line, whereas domains of the repressive histone mark H3K27me3 are rare in H1 cells, but prevalent in K562 cells (Fig. 2a). We also observed extensive overlap between H3K27me3 and H3K4me2 signals in H1 cells, but not K562 cells (Fig. 2a, b). Thus, Auto CUT&RUN profiles are consistent with the specialized chromatin features found in hESCs.
Post-translational modifications to the H3 histone tail closely correlate with transcriptional activity 21 . To determine whether our AutoCUT&RUN profiles of histone modifications are indicative of transcriptional activity, we examined the distribution of the five histone marks around the transcriptional start sites (TSSs) of genes, rank-ordered according to RNA-seq expression data (Fig. 2c, d) 22 . We find the active mark H3K4me3 is the most highly correlated with expression in both cell types (r = 0.70 and 0.81 for H1 and K562 respectively), followed by H3K4me2 and H3K27ac (Fig. 2c, d). The repressive histone mark H3K27me3 is anti-correlated with expression (r = -0. 16 and -0.53 in H1 and K562 respectively) (Fig. 2c, d). We conclude AutoCUT&RUN for these marks recapitulates transcriptional activity, providing a strategy to identify cell-type specific gene regulatory programs.

Modeling cell-type specific gene expression from AutoCUT&RUN profiles
To use AutoCUT&RUN data to compare cell-types and distinguish their gene regulatory programs we wanted to develop a continuous metric that incorporates both active and repressive chromatin marks. RNA-seq has been widely used to identify cell-type specific gene expression programs 22 , so we used RNA-seq data as a reference for training a weighted linear regression model that incorporates normalized H3K4me2, H3K27ac, and H3K27me3 read counts to assign promoters a relative activity score. We initially focused our analysis on genes with a single TSS that could be unambiguously assigned RNA-seq values. H3K4me2 was selected over H3K4me3 and H3K4me1 because H3K4me2 is uniquely applicable for modeling the activity of both proximal and distal cisregulatory elements (see below). When applied to K562 cells, promoter chromatin scores correlate very well with RNA-seq values (r=0.83) (Fig. 3a), providing a comparable power for predicting gene expression as similar models that used up to 39 histone modifications mapped by ChIP-seq (r=0.81) 21 . In addition, our weighted model trained on K562 cells performs well when applied to H1 cells ( Supplementary Fig. 2a, b), indicating that the linear model and data quality are sufficiently robust to assign promoter scores to diverse cell types.
Next, we examined if AutoCUT&RUN accurately identifies promoters with celltype specific activity. By calling promoter scores that were enriched more than two-fold in either H1 or K562 cells we identified 2,168 cell-type specific genes and approximately 40% of these genes (865) were also differentially-enriched between H1 and K562 cells according to RNA-seq ( Fig. 3b-d). However, promoter activity modeling did not capture transcriptional differences for 1149 genes (Fig 3d, Supplementary Fig. 2c, d), implying that these genes are differentially expressed without changes in the chromatin features included in our model. This differential sensitivity between methods suggests the three histone marks included in our chromatin model may more accurately predict the cell type specific expression of certain classes of genes than others. Indeed, we find the 865 cell-type specific genes identified by both promoter activity modeling and RNA-seq are highly enriched for developmental regulators, whereas the genes called by either promoter scores or RNA-seq alone are not nearly as enriched for developmental GO terms (Fig. 3d, Supplementary Fig. 2e-g, Supplementary table 1). In addition, only 35 genes display contradictory cell-type specificities according to promoter chromatin scores and RNA-seq (Fig. 3d). This demonstrates AutoCUT&RUN profiling of these widely studied modification to the H3 histone tail can be applied to accurately identify cell-type specific developmental regulators.
To determine whether AutoCUT&RUN data recapitulates the expression of celltype-specific transcription factors, we expanded our analysis to include all promoters.
We find that components of the hESC pluripotency network (NANOG, SOX2, SALL4,  Table 1) 23,24 . This method also identifies activities of alternative promoters (e.g. at the OTX2 and TAL1 genes), providing an indication of the specific gene isoforms that are expressed in a given cell-type (Fig. 3e). We conclude that AutoCUT&RUN allows identification of master regulators of cellular identity, providing a powerful tool to characterize cell-types in a high-throughput format.

Profiling tumors by AutoCUT&RUN
Traditional methods to profile protein-DNA interactions (e.g. ChIP-seq) are generally unable to handle clinically-relevant samples, which often contain small amounts of starting material and have been flash-frozen. To test whether AutoCUT&RUN is suitable for profiling frozen tumor specimens, we obtained two diffuse midline glioma (DMG) patient-derived cell lines (VUMC-10 and SU-DIPG-XIII) that were autopsied from similar regions of the brainstem, but differ in their oncogenic backgrounds 25 . Both of these DMG cell lines readily form xenografts in murine models, and we applied AutoCUT&RUN to profile histone modifications in VUMC-10 and SU-DIPG-XIII xenografts that were seeded in the brains of mice, and then resected upon tumor formation and frozen under typical clinical conditions (Fig. 4a). For comparison, on the same AutoCUT&RUN plate we profiled the parental DMG cell lines grown in culture ( Fig. 4a). Again, we found that replicates were highly concordant, so we combined them for further analysis. Importantly, cell culture samples were highly correlated with the same mark profiled in the corresponding frozen xenografts, and AutoCUT&RUN on xenograft tissues and cell culture samples produced similar data quality (Fig. 4b,   Supplementary Fig. 3). Thus, AutoCUT&RUN reliably generates genome-wide chromatin profiles from frozen tissue samples.
Stratification of patient malignancies is becoming increasingly dependent on molecular diagnostic methods that distinguish tumor subtypes derived from the same tissues. Our VUMC-10 and SU-DIPG-XIII samples provide an excellent opportunity to explore the potential of using AutoCUT&RUN to classify tumor specimens according to their subtype-specific regulatory elements. By applying our model to these samples, we identified 5,006 promoters that show differential activity between VUMC-10 and SU-DIPG-XIII cells (Fig. 5a, Supplementary Table 1). Consistent with the glial origins of these tumors, both the VUMC-10 and SU-DIPG-XIII specific promoters are significantly enriched for genes involved in nervous system development ( Supplementary Fig. 4a, b).
Genes involved in cell-signaling are also overrepresented in SU-DIPG-XIII cells ( Supplementary Fig. 4b); for example, the promoters of the PDGFR gene as well as its ligand PDGF are highly active in SU-DIPG-XIII cells (Fig. 5a). This is consistent with the observation that DMGs frequently contain activating mutations in PDGFR-α that promote tumor growth 5 . In addition, one promoter of the SMAD3 gene, a component of the TGF-β signaling pathway 26 , is specifically active in SU-DIPG-XIII cells, whereas two different SMAD3 promoters are active in VUMC-10 cells (Fig. 5a, Supplementary Fig.   3). In comparison, our model indicates that only 388 promoters differ between VUMC-10 xenografts and cultured cells, and 1,619 promoters differ between SU-DIPG-XIII samples (Fig. 5b, Supplementary Fig. 4c). In addition, comparing promoter chromatin scores in an unbiased correlation matrix also indicates DMG xenografts are far more similar to their corresponding cell culture samples than they are to other DMG subtypes or to H1 or K562 cells (Fig. 5c). This suggests that AutoCUT&RUN can be applied to identify promoters that display tumor subtype-specific activity, providing a reliable method to assign cellular identities to frozen tumor samples, as well as an indication of the signaling pathways that may be driving tumor growth and potential susceptibility to therapeutic agents.

High-throughput mapping of cell-type specific enhancers
The cell-type specific activities of gene promoters are often established by incorporating signals from distal cis-regulatory elements, such as enhancers 1,2 . Similar to promoters, enhancers also display H3K4me2 27 , and active enhancers are typically marked by H3K27ac, whereas repressed enhancers are marked by H3K27me3 20,28,29 . Therefore, we reasoned the AutoCUT&RUN profiles we used to model promoter activity should also allow identification of cell-type specific enhancers. To investigate this possibility, we first compared our H1 data to available chromatin accessibility maps generated by ATAC-seq, which are enriched for both active promoters and enhancers 30,31 . Of the marks we profiled, we find H3K4me2 peaks show the highest overlap with ATAC-seq (Fig 6a, Supplementary Fig. 5a), and identify 36,725/52,270 ATAC-seq peaks (~70%).
Interestingly, H3K4me2 defines an additional 71,397 peaks that were not called by ATAC-seq (Fig. 6a, Supplementary Fig. 5a). Many of these H3K4me2-specific peaks show a low, but detectable ATAC-seq signal ( Supplementary Fig. 5b), indicating they may correspond to repressed promoters and enhancers. Consistent with this interpretation, on average H3K4me2+/ATAC-TSSs have higher H3K27me3 signals than H3K4me2+/ATAC+ TSSs ( Supplementary Fig. 5c). H3K4me2+/ATAC+ peaks that overlap with annotated TSSs are enriched for H3K4me3, while those peaks that do not overlap TSSs are enriched for H3K4me1 (Fig. 6b&c, Supplementary Fig. 5d), suggesting that many of these distal peaks are enhancers 20,32 . Thus, mapping sites of H3K4me2 by AutoCUT&RUN provides a sensitive method for defining the repertoire of active cis-regulatory elements that control gene expression programs.
Finally, we examined whether AutoCUT&RUN can be used to identify cell-typespecific enhancers. To expand the number of putative enhancer sites, we compiled a list of non-TSS peaks called on H3K4me2 profiles from all six cell lines and xenograft samples. Using our linear regression model, we then assigned these elements chromatin scores and examined their correlations between different cell types. We find that these chromatin scores of DMG cell culture samples and xenografts are highly correlated (r=0.75 and 0.87 for SU-DIPG-XIII and VUMC-10 samples respectively) (Fig.   6d). In contrast, the chromatin scores of SU-DIPG-XIII cells show a weak positive correlation with VUMC-10 cells (e.g. r=0.19), indicating tumor subtype-specific differences. For example, the different enhancers near the SOX2 pluripotency gene are active in SU-DIPG-XIII and VUMC-10 cells (Fig. 6e), indicating that SU-DIPG-XIII cells resemble a more primitive neural stem cell-type than VUMC-10 cells, as has been previously suggested 33 . Thus, AutoCUT&RUN provides a stringent method for stratifying cell-types and tissue samples.

Discussion
We adapted the CUT&RUN technique to an automated platform by developing direct ligation of chromatin fragments for Illumina library preparation, and implementing magnetic separation for the wash steps and library purification. AutoCUT&RUN generates 96 genome-wide profiles of antibody-targeted chromatin proteins in just two days, dramatically increasing the throughput and potential scale of studies to interrogate the chromatin landscape at a fraction of the cost of comparable lower-throughput methods. We show that profiling just three histone modifications (H3K27ac, H3K27me3 and H3K4me2) is sufficient to determine the cell-type specific activities of developmentally regulated promoters and enhancers, providing a powerful quantitative metric to compare the epigenetic regulation of different cell-types. This summary metric of chromatin features could be used to assess new cell types and tissue samples and to place them within a reference map of both healthy and diseased cell types. The automated workflow reduces technical variability between experiments, generating consistent profiles from biological replicates and from different sample types.
To continue optimizing AutoCUT&RUN one could envision hardware modifications and computational development. By screening various antibody collections, the repertoire of nuclear proteins that can be efficiently profiled using AutoCUT&RUN would expand dramatically. In addition, the current AutoCUT&RUN protocol is optimized for a popular liquid handling robot, but a custom robot incorporating a reversibly magnetic thermocycler block would allow the CUT&RUN reaction and library preparation to be carried out in place, streamlining the protocol even further. Last, metrics distinguishing cell types could be improved by incorporating additional aspects of the data, such as using a combination of both enhancer and promoter activities.
The excellent reproducibility of profiling frozen tissue samples by AutoCUT&RUN has the potential to transform the field of epigenomic medicine 10 . Compared to other genomics approaches that are currently used for patient diagnosis, AutoCUT&RUN has the unique capacity to profile chromatin proteins within diseased cells. For example, cancers caused by fusions between chromatin proteins could be profiled by AutoCUT&RUN to provide a molecular diagnosis based on their chromatin landscapes, while simultaneously mapping the loci that are disrupted by the mutant fusion protein.
This could provide a powerful tool for patient stratification, as well as a direct read-out of whether chromatin-modulating therapies such as histone deacetylase or histone methyltransferase inhibitors are having their intended effects.

AutoCUT&RUN
A detailed AutoCUT&RUN protocol is provided (Appendix 1). Briefly, cells or tissue samples are bound to Concanavalin A coated magnetic beads (Bangs Laboratories, ca. no. BP531), permeabilized with digitonin, and bound with a protein specific antibody as previously described 12

Patient-derived Xenografts
All mouse studies were conducted in accordance with Institute of Animal Care and Use Committee-approved protocols. NSG mice were bred in house and aged to 2-3 months prior to tumor initiation. Intracranial xenografts were established by stereotactic injection of 100,000 cells suspended in 3uL at a position of 2 mm lateral and 1mm posterior to lambda. Symptomatic mice were euthanized and their tumors resected for analysis.

Annotation and Data Analysis
We aligned paired-end reads using Bowtie2 version 2.2.5 with options: --local --verysensitive-local --no-unal --no-mixed --no-discordant --phred33 -I 10 -X 700. For mapping spike-in fragments, we also used the --no-overlap --no-dovetail options to avoid crossmapping of the experimental genome to that of the spike-in DNA 36 . Files were processed using bedtools and UCSC bedGraphToBigWig programs 37,38 .
To examine correlations between the genome-wide distributions of various samples, we generated bins of 500 bp spanning the genome, creating an array with approximately 6 million entries. Reads in each bin were counted and the log 2 transformed values of these bin counts were used to determine a Pearson correlation score between different experiments. Hierarchal clustering was then performed on a matrix of the Pearson scores.
To examine the distribution of histone mark profiles around promoters, a reference list of genes for build hg19 were downloaded from the UCSC table browser (https://genome.ucsc.edu/cgi-bin/hgTables) and oriented according to the directionality of gene transcription for further analysis. Genes with TSSs within 1 kb of each other were removed, as were genes mapping to the mitochondrial genome, creating a list of 32,042 TSSs. RNA-sequencing data was obtained from the ENCODE project for H1 and K562 cells (ENCSR537BCG and ENCSR000AEL). RNA reads were counted using featureCounts (http://bioinf.wehi.edu.au/featureCounts/), and converted to Fragments Per Kilobase per Million mapped reads (FPKM) and assigned to the corresponding TSS as a gene expression value. ATAC-sequencing data for H1 cells was obtained from gene omnibus expression (GEO) (GSE85330) and mapped to hg19 using bowtie2.
Mitochondrial DNA accounted for ~50% of the reads and were removed in this study.
All heat maps where generated using DeepTools 39 . All of the data was analyzed using either bash or python. The following packages were used in python: Matplotlib, NumPy, Pandas, Scipy, and Seaborn.

Training the linear regression model
To ensure the accuracy of fitting histone modification data at promoters to RNA-seq values, genes with more than one promoter were removed from the previously generated TSS list. The genes RPPH1 and RMRP were expressed at extremely high levels in H1 cells, and so were considered to be outliers and were removed to avoid skewing the regression, leaving a list of n=12,805 genes.
To assign a relative CUT&RUN signal to promoters for each histone mark, denoted by C, base pair read counts +/-1 kb of the TSS were normalized by both sequencing depth over the promoters being scored and the total number of promoters examined. The prior normalization is to account for both sequencing depth and sensitivity differences among antibodies, and the latter normalization is included so that the model can be applied to different numbers of cis-regulatory elements without changing the relative weight of each element. FPKM values were used for RNA-seq. The linear regression model was trained to fit profiles of histone marks to RNA expression values as previously described 21 . Briefly, we used a linear combination of histone data fitted to the RNA-seq expression values: = ! ! + ⋯ + ! ! , where ! is the weight for each histone modification and ! is denoted by ! = ln ( ! + ! ), where C is the normalized base-pair counts described above and is a pseudo-count to accommodate genes with no expression. The RNA-seq values were similarly transformed as ! = ln ( ! + !,! ). Logarithmic transformations were used to linearize the data. A minimization step was then performed to calculate pseudo-counts and weights for each histone modification that would maximize a regression line between CUT&RUN data and RNA-seq.
We expected that the histone marks H3K27ac, H3K27me3, and H3K4me2 would provide the least redundant information. The optimized three histone mark model for K562 cells is described by: This equation was used to generate all chromatin activity scores.

Calling chromatin domain for overlap analysis
To compare the global chromatin landscape of H1 and K562 cells, chromatin domains were called using a custom script that enriched for regions relative to an IgG CUT&RUN control. Enriched regions among marks were compared and overlaps were identified by using bedtools intersect. Overlapping regions were quantified by the number of common base pairs, and these were used to generate the Venn diagrams.

Venn Diagrams
All Venn Diagrams were generated using the BaRC webtool, publicly available from the Whitehead Institute (http://barc.wi.mit.edu/tools/venn/). Each list of genes was classified by gene ontology (http://geneontology.org/) to identify statistically enriched biological processes.

Calculating cell-type specific promoter activity scores
To examine the relative similarities between cell-types based on their promoter activities, scores for all promoters >1 kb apart were used to generate an array, and Spearman correlations were calculated for each pair-wise combination of the samples.
Hierarchal clustering of the Spearman correlation values was used to visualize the relative similarities between cell-types.

Peak calling on AutoCUT&RUN and ATAC-seq data
Biological replicates profiled by AutoCUT&RUN were highly correlated (Fig. 1b), so replicates were joined prior to calling peaks. The tool MACS2 was used to call peaks 40 .

Calculating cell-type specific enhancer activity scores
To assemble a list of distal cis-regulatory elements in the human genome, we used MACS2 to call peaks on H3K4me2 profiles from each of our samples using the same flags described in the 'Peak calling on AutoCUT&RUN and ATAC-seq' methods section.
To distinguish between TSSs and putative enhancers, peaks <2.5 kb away from an annotated TSS were removed, and windows +/-1kb around these putative enhancers were assigned chromatin activity scores using the algorithm trained to predict promoter activity. Correlation matrices comparing the enhancer scores between samples were generated in the same manner as the correlation matrix comparing promoter scores between samples.

Acknowledgments
We

Competing Interests
There are no competing interests.         • Roche Complete Protease Inhibitor EDTA-Free tablets (Sigma-Aldrich, cat. no.
• CAUTION: Digitonin is toxic and PPE including a mask and gloves should be worn when weighing out the powder. A digitonin stock may be prepared by dissolving in dimethylsulfoxide (DMSO) but be aware that DMSO can absorb through the skin.

Binding buffer
Mix 20 mL of Binding Buffer in a 50 mL conical tube. Store the buffer at 4 o C for up to 6 months. 2) Centrifuge 3 min 600 x g at room temperature and withdraw liquid.
3) Resuspend in 1.5 mL room temperature Wash Buffer by gently pipetting and transfer if necessary to a 2 mL tube.

4)
Centrifuge 3 min 600 x g at room temperature and withdraw liquid.

5)
Repeat steps 3 and 4 two more times.
• CRITICAL STEP: Thorough washing removes free sugars and other molecules that can compete for binding to the Concanavalin A coated-beads, ensuring efficient binding and recovery of the cells of interest.

6)
Resuspend in 1 mL room temperature Wash Buffer by gently pipetting.

7)
While gently vortexing the cells at room temperature, add the Concanavalin A coatedbead suspension.

8)
Place on tube nutator at room temperature for 5-10 min.

9)
Mix well by vigorous inversion to ensure the bead-bound cells are in a homogenous suspension and divide into aliquots in 0.6-mL low-bind tubes, one for each antibody to be used.

10)
Place on the magnet stand to clear and pull off and discard the liquid.

11)
Place each tube at a low angle on the vortex mixer set to low (~1100 rpm) and squirt 150 µL of the Antibody Buffer per sample along the side while gently vortexing to allow the solution to dislodge most or all of the beads. Tap to dislodge the remaining beads.
• CRITICAL STEP: Permeabilizing the cells with digitonin and chelating divalent cations with EDTA serves to quickly halt metabolic processes and prevent endogenous DNAse activity. This helps to preserve the native chromatin state and reduce background noise in the final CUT&RUN libraries. Thus, it is recommended to work quickly to get cells into Antibody Buffer.

12)
Mix in the primary antibody to a final concentration of 1:100 or to the manufacturer's recommended concentration for immunofluorescence.
• CRITICAL STEP: To evaluate success of the procedure without requiring sequencing, include in parallel a positive control antibody (e.g. anti-H3K27me3) and a negative control antibody (e.g. anti-mouse IgG). Do not include a no-antibody control, as the lack of tethering may allow any unbound pA-MNase to act as a "time-bomb" and digest accessible DNA, resulting in a background of DNA-accessible sites.

13)
Place on the tube nutator at room temperature for 2 hrs.
• PAUSE POINT Antibody incubation may proceed overnight at 4 o C.

Bind secondary antibody (as required)
• TIMING 1 hr-overnight • CRITICAL STEP: The binding efficiency of Protein A to the primary antibody depends on host species and IgG isotype. For example, Protein A binds well to rabbit and guinea pig IgG but poorly to mouse and goat IgG, and so for these latter antibodies a secondary antibody, such as rabbit anti-mouse is recommended.

14)
Remove liquid from the cap and side with a quick pulse on a micro-centrifuge.
• CRITICAL STEP: After mixing, but before placing a tube on the magnet stand, a very quick spin on a micro-centrifuge (no more than 100 x g) will minimize carry-over of antibody and pA-MN that could result in overall background cleavages during the digestion step.

15)
Place on the magnet stand to clear (~30 s) and pull off all of the liquid.

16)
Add 150 µL Digitonin Buffer to wash, mix by inversion, or by gentle pipetting using a 1 mL tip if clumps persist, and remove liquid from the cap and side with a quick pulse on a micro-centrifuge.

18)
Place on the magnet stand to clear and pull off all of the liquid.

19)
Place each tube at a low angle on the vortex mixer set to low (~1100 rpm) and squirt

40)
Remove pre-annealed 0.15 µM TruSeq adapters from freezer and allow to thaw on ice.
• CRITICAL STEP: To facilitate multiplexing during sequencing each adapter includes a sixnucleotide index or "barcode". For subsequent data analysis make sure to keep track of which adapter is used for each sample. To allow for addition of adapters using a Multi- Biomek deck and seal with an adhesive cover for subsequent analysis.

68)
Determine the size distribution of libraries by Agilent 4200 TapeStation analysis.

69)
Quantify library yield using dsDNA-specific assay, such as Qubit.

Data processing and analysis
• TIMING 1 d (variable)
• CRITICAL STEP: Separation of sequenced fragments into ≤120 bp and ≥150 bp size classes provides mapping of the local vicinity of a DNA-binding protein, but this can vary depending on the steric access to the DNA by the tethered MNase.