Chromatin accessibility dynamics of Chlamydia-infected epithelial cells

Chlamydia are Gram-negative, obligate intracellular bacterial pathogens responsible for a broad spectrum of human and animal diseases. In humans, Chlamydia trachomatis is the most prevalent bacterial sexually transmitted infection worldwide and is the causative agent of trachoma (infectious blindness) in disadvantaged populations. Over the course of its developmental cycle, Chlamydia extensively remodels its intracellular niche and parasitises the host cell for nutrients, with substantial resulting changes to the host cell transcriptome and proteome. However, little information is available on the impact of chlamydial infection on the host cell epigenome and global gene regulation. Regions of open eukaryotic chromatin correspond to nucleosome-depleted regions, which in turn are associated with regulatory functions and transcription factor binding. We applied Formaldehyde-Assisted Isolation of Regulatory Elements enrichment followed by sequencing (FAIRE-Seq) to generate temporal chromatin maps of C. trachomatis-infected human epithelial cells in vitro over the chlamydial developmental cycle. We detected both conserved and distinct temporal changes to genome-wide chromatin accessibility associated with C. trachomatis infection. The observed differentially accessible chromatin regions, including several Clusters of Open Regulatory Elements (COREs) and temporally-enriched sets of transcription factors, may help shape the host cell response to infection. These regions and motifs were linked to genomic features and genes associated with immune responses, re-direction of host cell nutrients, intracellular signaling, cell-cell adhesion, extracellular matrix, metabolism and apoptosis. This work provides another perspective to the complex response to chlamydial infection, and will inform further studies of transcriptional regulation and the epigenome in Chlamydia-infected human cells and tissues


Introduction
Members of the genus Chlamydia are Gram-negative, obligate intracellular bacterial pathogens responsible for a broad spectrum of human and animal diseases [1]. In humans, Chlamydia trachomatis is the most prevalent bacterial sexually transmitted infection (STI) [2], causing substantial reproductive tract disease globally [3], and is the causative agent of trachoma (infectious blindness) in disadvantaged populations [4]. All members of the genus exhibit a unique biphasic developmental cycle where the non-replicating infectious elementary bodies (EBs) invade host cells and differentiate into replicating reticulate bodies (RBs) within a membrane-bound vacuole, escaping phagolysomal fusion [5]. Chlamydia actively modulates host cell processes to establish this intracellular niche, using secreted effectors and other proteins to facilitate invasion, internalisation and replication, while countering host defence strategies [6,7]. At the end of the developmental cycle, RBs condense into EBs, which are released from the host cell by lysis or extrusion to initiate new infections [8].
Bacterial interactions with mammalian cells can induce dynamic transcriptional responses from the cell, either through bacterial modulation of host cell processes or from innate immune signalling cascades and other cellular responses [9][10][11]. In addition, effector proteins specifically targeting the nucleus (nucleomodulins) can influence cell physiology and directly interfere with transcriptional machinery including chromatin remodelling, DNA replication and repair [12]. Host cell epigenetic-mediated transcriptional regulatory changes, including histone modifications, DNA methylation, chromatin accessibility, RNA splicing, and non-coding RNA expression [13][14][15] may also be arbitrated by bacterial proteins and effectors. Consistent with host cell interactions with other bacterial pathogens, C. trachomatis infection alters host cell transcription over the course of its developmental cycle [16] and may also modulate the host cell epigenome. For example, NUE (NUclear Effector), a C. trachomatis type III secreted effector with methyltransferase activity, enters the host nucleus and methylates eukaryotic 4 / 41 histones H2B, H3 and H4 in vitro [17]. However, the ultimate gene targets of NUE activity or the affected host transcriptional networks are uncharacterised, as is the influence of chlamydial infection on the host cell epigenome in general.
To examine the impact of chlamydial infection on host cell chromatin dynamics, we applied FAIRE-Seq (Formaldehyde-Assisted Isolation of Regulatory Elements sequencing) [18] to C. associated with regulatory factor binding sites. In FAIRE-Seq, libraries are generated from these enriched fragments, followed by sequencing and read mapping to a reference genome [18], allowing patterns of chromatin accessibility to be identified [19]. We identify infectionresponsive changes in chromatin accessibility over the chlamydial developmental cycle, and identify several candidate host transcription factors that may be relevant to the cellular response to chlamydial infection.

Cell culture, infection and experimental design
HEp-2 cells (American Type Culture Collection, ATCC No. CCL-23) were grown as monolayers in 6 x 100mm TC dishes until 90% confluent. Monolayers were infected with C.
trachomatis serovar E in SPG as previously described [20]. Additional monolayers were mockinfected with SPG only. The infection was allowed to proceed 48 hours prior to EB harvest, as previously described [20]. C. trachomatis EBs and mock-infected cell lysates were subsequently used to infect fresh HEp-2 monolayers. Fresh monolayers were infected with C.
trachomatis serovar E in 3.5 mL SPG buffer for an MOI ~ 1 as previously described [20], using centrifugation to synchronize infections. Infections and subsequent culture were performed in the absence of cycloheximide or DEAE dextran. A matching number of HEp-2 monolayers were also mock-infected using uninfected cell lysates. Each treatment was incubated at 25°C for 2h and subsequently washed twice with SPG to remove dead or nonviable EBs. 10 mL fresh medium (DMEM + 10% FBS, 25μg/ml gentamycin, 1.25μg/ml Fungizone) was added and cell monolayers incubated at 37°C with 5% CO 2 . Three biological replicates of infected and mock-infected dishes per time were harvested post-infection by scraping and resuspending cells in 150μL sterile PBS. Resuspended cells were stored at -80°C.
We note that the experimental design used here cannot distinguish Chlamydia-mediated effects from infection-specific or non-specific host cell responses. Further experiments with inactivated Chlamydia or selected gene knock-outs or knock-downs will help to elucidate the extent of specific Chlamydia-mediated interference with the host cell epigenome. We also note that the use of in vitro immortalized HEp-2 epithelial cells means that, despite their utility and widespread use in chlamydial research, the full diversity of host cell responses that are likely to be found within in vivo infections will not be captured.

FAIRE enrichment and sequencing
Formaldehyde-crosslinking of cells, sonication, DNA extraction of FAIRE-enriched fractions and Illumina library preparation was performed as previously described [18]. Libraries were sequenced on the Illumina 2500 platform at the Genome Resource Centre, Institute for Genome Sciences, University of Maryland School of Medicine.
Peak calling of open chromatin regions was performed using MACS2 (2.1.1) [26] in pairedend mode, with additional parameters of '-no-model -broad -q 0.05' and MACS2 predicted extension sizes. All replicates were called separately, with significant peaks determined against the software-predicted background signal. Any peaks that fell within ENCODE blacklisted regions (regions exhibiting ultra-high signal artefacts) [27], or were located on non-standard chromosomes such as (ChrMT and ChrUn) were removed.
Consensus peak sets were created by combining significant peaks from the infected and mockinfected replicates for each time using Diffbind [28]. Peaks were removed if they appeared in less than two replicates. Reads were counted under each peak within each consensus peak set; the resulting read depths were normalised to their relative library sizes. Peaks with less than 3 mapped reads after normalisation were also removed. The resulting count matrices from each consensus peak set were used to look at the differences in chromatin accessibility between Annotation of the set of differential chromatin accessible regions was performed with Homer (v4.9) [29] and separated into three main categories: Intragenic, Promoter and Intergenic.
Intergenic: located >1kbp upstream of the transcriptional start site (TSS), or downstream from the transcription termination site (TTS); Promoter: located within 1kb upstream or 100bp downstream of the TSS (all promoter regions taken from RefSeq); and, Intragenic: annotated to a 3'UTR, 5'UTR, intron, exon, TTS, miRNA, ncRNA or a pseudogene. To identify enhancers, all intergenic regions were compared against experimentally validated enhancer regions from HeLa cells (S3 and S4) using Enhancer-atlas [30] and dbSuper [31].
Clusters of Open Regulatory Elements (COREs) were identified from the set of significant differential chromatin accessible regions using CREAM [32]. A window size of 0.5 and a peak range of 2:5 was initially set to separate COREs encompassing multiple genes from COREs overlapping individual genes. Subsequent filtering removed COREs with < 3 peaks and limited peak width to < 500,000 bp. Each CORE was visually inspected in the Integrative Genomics Viewer (IGV) to identify COREs that overlapped a single gene and to ensure all peaks had a fold-change of at least > 2 or < -2.
Motif analysis was performed with Homer [29]. Target sequences were regions with significant differential chromatin accessibility as identified by DESeq2, while the number of background sequences were randomly sampled regions throughout the human genome. Additional parameters included using a hypergeometric distribution, allowing for two mismatches and searching for motifs between 8-14 bp long. Motif enrichment was also performed with Homer [29], followed by filtering and assessment of human tissue specificity of the enriched transcription factors (TF) (p-value < 0.05, >5% of target sequences). For significant de novo TFs, motif matrices were compared against the Jaspar [33] and TomTom [34] databases, where enriched TFs were discarded unless the Homer annotation matched top hits in either database, and were also human-tissue specific.
Sequence data is available from the NCBI GEO archive GSE132448.

Chromatin accessibility landscapes of Chlamydia-infected and mock-infected cells
We applied FAIRE-Seq to C. trachomatis serovar E-infected and mock-infected human HEp-2 epithelial cells in triplicate at 1, 12, 24, and 48 hours post-infection (hpi). Following initial quality control measures, a single C. trachomatis-infected replicate was identified as an outlier and was removed from further analysis. The remaining replicates were mapped to the human genome (GRCh38), resulting in 52,584,839 mapped reads for mock-infected replicates and 98,802,927 mapped reads for Chlamydia-infected replicates (151,387,766 in total) ( Table 1) hours (mock-infected). The remaining peak sets exhibit tight clustering between mock-infected and infected conditions respectively at each time. Total consensus peak numbers increased across the chlamydial developmental cycle, independent of the total mapped reads over time.

C. trachomatis infection is associated with temporal changes to chromatin accessibility in host cells
We identified genomic regions with significant differences in chromatin accessibility between At 12 hours, the number of significant differentially accessible regions was lower (8%), compared to the other times (64% at 1 hpi, 43% at 24 hpi and 72% at 48 hpi). The number of mapped reads was similar for all 12 hour replicates across conditions, and similar to other times, suggesting minimal bias from the variability of the underlying mapped reads ( Table 1) and significant peaks ( Figure 1A). In addition, each replicate had consistent peak coverage across the human genome (Additional File 1). Furthermore, 12 hour peak annotation is similar to other times ( Figure  Most infection-associated differential chromatin accessible regions map to intergenic and intronic regions ( Figure 3B-C, Additional File 2), consistent with other chromatin accessibility studies [35,36], and the overall distribution of protein-coding genes within the human genome [37]. The distribution of differential chromatin-accessible regions around TSSs (+/-5kb) at 12 and 48 hpi suggests that the majority of differential chromatin accessible regions are in proximity to TSSs. However, at 1 hpi there is no obvious distribution, while 24 hpi exhibits a bi-modal distribution (Figure 3D), suggesting that additional mechanisms, such as alternative splicing that may be contributing to the regulatory response to infectionassociated events.

Differential chromatin accessibility at promoters and enhancers identify infectionassociated host regulatory activity
The proportion of all differentially accessible regions mapping to promoter regions is 4 (0.5%) at 1 hpi, 14 (4.8%) at 12 hpi, 21 (1.5%) at 24 hpi and 265 (8.5%) at 48 hpi ( Figure 4A). which has a promoter exhibiting an increase in chromatin accessibility, is a key regulator of copper transport into phagosomes as part of a host cell response to intracellular infection [38,39].
Fifteen promoter-specific differentially accessible regions are found at two or more times. Two promoter regions are associated with genes encoding sorting nexin 16 (SNX16) and oligosaccharyltransferase complex subunit (OSTC) respectively ( Figure 4B). The promoter region of OSTC exhibits increased chromatin accessibility at 24 and 48 hours; OSTC is linked to cellular stress responses [40]. Conversely, SNX16 shows a reduction in chromatin accessibility at both 1 and 48 hpi. Sorting nexins are a family of phosphatidylinositol binding proteins sharing a common PX domain that are involved in intracellular trafficking. Sorting nexins are a key component of retromer, a highly conserved protein complex that recycles host protein cargo from endosomes to plasma membranes or the Golgi [41]. Retromer is targeted by several intracellular pathogens, including Chlamydia, as a key strategy for intracellular survival [42]. The C. trachomatis effector protein, IncE, binds to sorting nexins 5 and 6, disrupting retromer-mediated host trafficking pathways [42] and potentially perturbing the endolysomalmediated bacterial destruction capacity of the host cell [43]. However, SNX16 is a unique member of this family, containing a coiled-coil domain in addition to a PX domain, and is not associated with retromer [44]. SNX16 is instead associated with the recycling and trafficking of E-cadherin [44], which mediates cell-cell adhesion in epithelial cells, and is associated with a diversity of tissue specific processes, including fibrosis and epithelial-mesenchymal transition (EMT) [45]. Separately, C. trachomatis infection has been shown to downregulate Ecadherin expression via increased promotor methylation, potentially contributing to EMT-like changes [46]. Thus, downregulation of SNX16, as inferred by the observed reduction in promotor-associated chromatin accessibility may contribute to chlamydial fibrotic scarring outcomes. In other bacterial pathogens, modulation of E-cadherin is a known virulence mechanism where it is degraded by proteases, such as HtrA, disrupting tight and adherens junctions to facilitate invasion through the epithelial barrier [47,48]. Although chlamydial HtrA has been detected outside the inclusion and in exported blebs [49], E-cadherin has not yet been identified as a chlamydial HtrA target. Nevertheless, HtrA has been shown to be critical for in vivo chlamydial infections, indicating that this functionality may be revealed in the future [50].
Changes in chromatin accessibility of regions overlapping tissue-specific enhancers from experimentally validated databases were examined, identifying 211 enhancers and seven "super-enhancers" (Figure 5A). All super-enhancers exhibited an increase in chromatin accessibility, and were associated with genes mediating cell growth (KLF5), cell structure and signalling (FLNB, PTP4A2 and MSN), and innate immunity (IER3) (Additional File 3).
Infection-responsive chromatin accessible regions occurring at three or more times over the chlamydial developmental cycle (all exhibiting an increase in chromatin accessibility) identified known enhancers that influence DNA/RNA-polymerase activity (AFF1, POLR2M, TCEB1, CHMP4C and POLL), including elongation factors, chromatin remodelling and DNA repair ( Figure 5B). The manipulation of these genes and underlying functions are suggestive of nucleomodulin activity, which are a class of bacterial effectors that directly target the host cell nucleus to manipulate host defences and machinery [12]. One example of a C. trachomatis specific nucleomodulin is NUE, which is directed to the nucleus and performs methyltransferase activity [17]. However, as noted above, our experimental design does not distinguish Chlamydia-mediated effects from infection-specific or non-specific host cell responses.
In addition, three enhancer-linked genes that recur three or more times over the developmental cycle and show an increase in chromatin accessibility, are involved in ubiquitination and protein quality control (KLHL8, FBXO3 and EDEM3). The eukaryotic ubiquitination modification marks proteins for degradation and regulates cell signalling of a variety of cellular processes, including innate immunity and vesicle trafficking [51]. The deposition of ubiquitin onto intracellular pathogens is a conserved mechanism found in a diverse range of hosts [52].
In Chlamydia, host cell ubiquitin systems can mark chlamydial inclusions for subsequent destruction [53] and there is emerging evidence that various Chlamydia species, using secreted effectors and other proteins, are able to subvert or avoid these host ubiquitination marks for intracellular survival [53,54]. Our observation of increased chromatin accessibility of enhancer elements linked to ubiquitination genes, putatively augmenting expression of these genes, further highlights the complex role of ubiquitination in chlamydial infection.

Conserved and time-specific host responses to infection over the chlamydial developmental cycle
Differential chromatin accessible regions that are present at all four times during infection demonstrate a conserved host cell response to chlamydial infection ( Figure 2B). Time-specific differential chromatin accessibility is also evident over the chlamydial developmental cycle ( Figure 2B). To investigate the conserved host cell response, we focused upon 63 of the 120 differential chromatin accessible regions (intragenic, promoter or enhancer regions) identified above, excluding the likely ambiguous intergenic regions ( Figure 6A). 56 Figure 6A). The remaining conserved differentially accessible regions were associated with genes involved in infection-relevant cellular processes, including C8A as part of the complement cascade, and lipase activity from LIPI that is essential for chlamydial replication [55]; while multiple genes (HDAC2, HNRNPUL1, NCOA7 and YAP1) are known transcriptional regulators. We also examined any differential chromatin accessible regions that appeared across three times. This identified further effects of infection on the complement cascade. Key components of the membrane attack complex (MAC) and complement activation pathways exhibit increased differential chromatin accessibility (C8B at 1, 12 and 24 hours and CFHR5 at 24 and 48 hours). Conversely, C6 exhibits decreased chromatin accessibility at 48 hours.
We identified unique differentially accessible regions across the chlamydial developmental cycle (Figure 6B). At 1, 12 and 24 hpi, there are a relatively small number of significant differential chromatin accessible regions. In contrast, 48 hpi exhibits over 1,400 regions, further reflecting the diverse processes associated with the end of the in vitro developmental cycle as indicated previously. As with the conserved differential regions above, we focused on differential chromatin accessibility within promoters, enhancers and intragenic regions (50 at 1 hpi, 17 at 12 hpi, 27 at 24 hpi and 866 at 48 hpi) (Figure 6B, Additional File 4).  [61]. Notably, epidermal growth factor receptor (EGFR), a member of the ErbB family, is the target receptor for C. pneumoniae Pmp21 as an EGFR-dependent mechanism of host cell entry [62]. The C. trachomatis Pmp21 ortholog, PmpD, also has adhesin-like functions [63], however the host ligands are unknown.
Nevertheless, EGFR inhibition results in small, immature C. trachomatis inclusions, with calcium mobilisation and F-actin assembly disrupted [64], indicating the functional importance of EGFR and the ErbB signaling pathway for C. trachomatis attachment and development.
Three enriched biological processes share the term 'cell-cell adhesion via plasma membrane adhesion molecules' (GO:0098742, GO:0016339 and GO:0007157). Several genes common to these categories with infection-responsive differential chromatin accessibility are associated with cadherins (CDH4, CDH12, CDH17, CDH20, FAT4 and PTPRD). Disruption of cadherin function has been described in C. trachomatis infection, and is linked to the alteration of adherens junctions and the induction of epithelial-mesenchymal transition (EMT) events that may underlie chlamydial fibrotic outcomes [46,65]. Altered chromatin accessibility for other cadherin-relevant loci over the chlamydial developmental cycle is apparent in this data, are one of three sub-types of Smads that inhibit intracellular signalling of TGF-β by various mechanisms including receptor-mediated inhibition [68]. In addition, Smad2 contains two closed chromatin accessibility regions at an enhancer and intragenically respectively. Smad2 is part of the R-Smad sub-family that regulates TGF-β signalling directly [69,70]. TGF-β has been hypothesised to be a central component of dysregulated fibrotic processes in Chlamydiainfected cells, provoking runaway positive feedback loops that generate excessive ECM deposition and proteolysis, potentially leading to inflammation and scarring [16]. We also identify down-regulation of the ontology 'Kinesin binding (GO:0019894). Kinesins belong to a class of motor proteins that move along microtubule filaments (from the centre of the cell outwards) supporting cell functions including transport and cell division [71]. C. trachomatis expresses an effector protein (CEP170) that recruits host microtubules into the vicinity of the mature inclusion, enabling microtubule-dependent traffic to be re-directed to the inclusion [72].

Clusters of Open Regulatory Elements
Clusters of Open Regulatory Elements (COREs) are multiple areas of open chromatin in close proximity to each other, which may represent regions of coordinated chromatin accessibility linked to multiple regulatory elements [73]. We focused on differential chromatin regions spanning less than 500k bp that contain a cluster of at least three regions (Figure 8A). This identified 18 COREs across three times post-infection consisting of regions with the same foldchange direction and overlapping a single gene (Figure 8B). A CORE is apparent at 1 hpi in proximity to laminin (LAMA2). Laminins are a component of the extracellular matrix and basement membranes that influence cell differentiation, migration, and adhesion. As noted above, dysregulation of ECM moieties has been hypothesised to be a key mechanism of chlamydial scarring, in which immune-mediated positive feedback loops are induced on infection as part of an early, aberrant wound response to chlamydial infection, creating inflammatory accumulations of ECM constituents [16]. Combined with the observed chromatin accessibility changes to several cadherin and cadherin-associated genes and TGF-β-mediated Smad signalling in this work, a CORE within the laminin gene provides further support for the key role of dysregulated ECM in chlamydial disease outcomes.
At 48 hours, eleven COREs were identified, overlapping six protein-coding and five noncoding genes. Two of these genes (DNAH14 and MYo9A) belong to broad families of cytoskeletal motor proteins (dyneins and myosins), with relevance to chlamydial infection. Some members of the myosin family may be involved with chlamydial extrusion through a breakdown of the actin cytoskeleton followed by the release of EB's at the end of the lifecycle [74]. However, MYo9A itself has not been previously linked to chlamydial infection.
Similarly, dynein-based motor proteins have been shown to move the chlamydial inclusion via the internal microtubule network to the MTOC (Microtubule-Organizing Centre); the close proximity to the MTOC is thought to facilitate the transfer of host vesicular cargo to the chlamydial inclusion [75]. However, DNAH14 is an axonemal dynein that causes sliding of microtubules in the axonemes of cilia and flagella, and is typically only expressed in cells with those structures [76]; it is not clear what role it would have in chlamydial infection. A third CORE overlaps DGKB, a diacylglycerol kinase that metabolises 1,2,diacylglycerol (DAG) to produce phosphatidic acid (PA), a key precursor in the biosynthesis of triacylglycerols and phospholipids, and a major signalling molecule [77]. Chlamydia obtains and redirects hostderived lipids through multiple pathways [78], and as further identified in this CORE and enriched gene ontologies (above).

Identification of transcription factor motifs
Putative transcription factor (TFs) motifs were identified from all significant differential chromatin accessible regions at each time post-infection (Additional File 5). Ten significant TF motifs were identified, spanning the developmental cycle ( Table 2). IRF3 (Interferon Regulatory Factor) motifs are enriched at 1 hpi; IRF3 is a key transcriptional regulator of type I interferon (IFN)-dependent innate immune responses and is induced by chlamydial infection.
The type I IFN response to chlamydial infection can induce cell death or enhance the susceptibility of cells to pro-death stimuli [79], but may also be actively dampened by Chlamydia [80,81]. Specificity Protein 1 (Sp1) is a zinc-finger TF that binds to a wide range of promoters with GC-rich motifs. Sp1 may activate or repress transcription in a variety of cellular processes that include responses to physiological and pathological stimuli, cell differentiation, growth, apoptosis, immune responses, response to DNA damage and chromatin remodelling [82,83].
The majority of TF motifs enriched at 48 hours correspond to Krüppel-like-factors (KLFs).
KLFs are zinc-finger TFs in the same family as Sp1, which is also enriched at 48 hours. The members of this large family orchestrate a range of paracrine and autocrine regulatory circuits and are ubiquitously expressed in reproductive tissues [84]. Dysregulation of KLFs and their dynamic transcriptional networks is associated with a variety of uterine pathologies [85]. We find motif enrichment for five distinct KLFs (KLF5, KLF6, KLF9, KLF10 and KLF14) at 48 hours, in addition to further KLFs at 12 and 48 hours (KLF 3 and KLF 4) without the initial filtering steps (Additional File 5). KLF5 is a transcriptional activator found in various epithelial tissues and is linked to regulation of inflammatory signalling, cell proliferation, survival and differentiation [86]. KLF6 is also a transcriptional activator ubiquitously expressed across a range of tissues and plays a crucial role in regulating genes involved with tissue development, differentiation, cell cycle control, and proliferation [87]. Target genes include collagen α1, keratin 4, TGFβ type I and II receptors, and others [88]. KLF9, 10 and 14 act as transcriptional repressors and are ubiquitously expressed across a range of tissues [89].
KLF9 is a tumour suppressor [90] and regulates inflammation, while KLF10 has a major role in TGF-β-linked inhibition of cell proliferation, inflammation and initiating apoptosis [91].
Histone deacetylases (HDACs) modify the core histones of the nucleosome, providing an important function in transcriptional regulation [95], and many bacterial pathogens subvert HDACs to suppress host defences [15]. KLF9, 10 and 14 share the co-factor Sin3A (SIN3 Transcription Regulator Family Member A) [60], which is also a core component of the chromatin-modifying complex mediating transcriptional repression [66]. The Sin3a/HDAC complex is made up of two histone deacetylases HDAC1 and HDAC2. HDAC2 has increased chromatin accessibility at all four time points, and HDAC9 has increased chromatin accessibility at 1, 24 and 48 hours, further supporting the potential for histone modifications to be a component of the host cell response to chlamydial infection, or to be targets of chlamydial effectors [17].

Conclusions
We describe comprehensive changes to chromatin accessibility upon chlamydial infection in epithelial cells in vitro. We identify both conserved and time-specific infection-responsive changes to a variety of features and regulatory elements over the course of the chlamydial developmental cycle that may shape the host cell response to infection, including promotors, enhancers, COREs, and transcription factor motifs. Some of these changes are associated with genomic features and genes known to be relevant to chlamydial infection, including innate reproductive tract pathologies in both men and women [85], thus our discovery of enriched KLF binding motifs in response to chlamydial infection is compelling, given the scale and burden of chlamydial reproductive tract disease globally [3]. 21 / 41 In summary, this is the first genome-scale analysis of the impact of chlamydial infection on the human epithelial cell epigenome, encompassing the chlamydial developmental cycle at early, mid and late times. This has yielded a novel perspective of the complex host epithelial cell response to infection, and will inform further studies of transcriptional regulation and epigenomic regulatory elements in Chlamydia-infected human cells and tissues. Examination of the multifaceted human epigenome, and its potential subversion by Chlamydia, using in vivo mouse models of infection and ex vivo human reproductive tract tissues, will continue to shed light on how the host cell response contributes to infection outcomes.  Identifying significant peaks and creating consensus peaksets A) Significant peaks per replicate (p-value < 0.05). B) Consensus peaks were created for each time by combining significant peaks from Chlamydia-infected and mock-infected conditions, retaining peaks which appeared in > 2 replicates. C) PCA plots demonstrating tight clustering within each consensus peak set grouping infected and mock-infected replicates.    Significant differential peaks annotated as intergenic were compared against experimentally validated tissue-specific enhancers. A) All enhancer regions across each time. Seven super enhancers were identified and are denoted with a star (*). B) Enhancers overlapping three or more times. Red and blue shading indicate fold-changes, while grey indicates that no significant peaks were associated with that enhancer. Some enhancers contain more than one peak, explaining why there are multiple fold-changes at some times.

Figure 6. Conserved host cell response to infection
A) 120 Differentially accessible regions found in all four times were extracted, representing a conserved host cell response to infection. Intergenic regions were removed due to the ambiguity of annotating to the closest feature. If a gene contained more than one peak within a specific time, the different fold changes are split out evenly within the column at that time. B) Venn diagram highlighting the number of time-specific differential regions. Intergenic regions were also removed for the same reasons, with the remaining enhancers, promoters and intragenic regions separated by their chromatin accessibility.

Figure 7. Enrichment of time-specific differential chromatin regions
Annotated time-specific differential chromatin regions associated with 1 hour A), 12 hours B) and 24 hours C). Where genes have been grouped into annotated categories, multiple underlying sources were used for verification. D) At 48 hours, a substancial increase in genes allowed Gene Ontology (GO) enrichment. All three GO categories were enriched, with the top ten p-values across the categories displayed.

Figure 8. COREs (Clusters of Open Regulatory Elements)
A) Number of COREs at each time post-infection using significant differential peaks, separated by width and the number of peaks within each CORE. COREs have a maximum width of 500,000 bp and > 3 peaks. B) 18 significant COREs were identified across three times postinfection. For each CORE, the genomic location, associated number of peaks, where they fall within proximity to a genomic feature, fold-changes, and genetic biotype are shown. Figure 1 1  Target sequences are significant differential peaks and background sequences are randomly selected throughout the genome to determine significance. A star (*) denotes a de-novo motif where various sources were used to annotate the corresponding transcription factor.   ) showing that all replicates contain significant peaks genomewide (human genome) without any visual chromosomal bias.

Additional File 2.xlsx Annotation of all significant peaks
Annotation of all the significant peaks, with tabs separating genomic features and fold-change regulation.

Additional File 3.docx Enrichment of Super-enhancer genes
Super enhancer-linked genes separated by time and biological activity.

Additional File 4.xlsx Time specific regions
The list of time-specific differential chromatin accessible regions. It should be noted that some genes in these lists are repeated at each time due to multiple peaks occurring at an annotated interval, that enhancers can affect more than one gene, and single genes can be affected by more than one enhancer.

Additional File 5.xlsx Complete list of motifs and transcription factors
The complete list of significant motifs and enriched transcription factors.