The bromodomain-containing protein Ibd1 links multiple chromatin-related protein complexes to highly expressed genes in Tetrahymena thermophila

Background The chromatin remodelers of the SWI/SNF family are critical transcriptional regulators. Recognition of lysine acetylation through a bromodomain (BRD) component is key to SWI/SNF function; in most eukaryotes, this function is attributed to SNF2/Brg1. Results Using affinity purification coupled to mass spectrometry (AP–MS), we identified members of a SWI/SNF complex (SWI/SNFTt) in Tetrahymena thermophila. SWI/SNFTt is composed of 11 proteins: Snf5Tt, Swi1Tt, Swi3Tt, Snf12Tt, Brg1Tt, two proteins with potential chromatin-interacting domains and four proteins without orthologs to SWI/SNF proteins in yeast or mammals. SWI/SNFTt subunits localize exclusively to the transcriptionally active macronucleus during growth and development, consistent with a role in transcription. While Tetrahymena Brg1 does not contain a BRD, our AP–MS results identified a BRD-containing SWI/SNFTt component, Ibd1, that associates with SWI/SNFTt during growth but not development. AP–MS analysis of epitope-tagged Ibd1 revealed it to be a subunit of several additional protein complexes, including putative SWRTt and SAGATt complexes as well as a putative H3K4-specific histone methyl transferase complex. Recombinant Ibd1 recognizes acetyl-lysine marks on histones that are correlated with active transcription. Consistent with our AP–MS and histone array data suggesting a role in regulation of gene expression, ChIP-Seq analysis of Ibd1 indicated that it primarily binds near promoters and within gene bodies of highly expressed genes during growth. Conclusions Our results suggest that, by recognizing specific histone marks, Ibd1 targets active chromatin regions of highly expressed genes in Tetrahymena, where it might subsequently coordinate the recruitment of several chromatin-remodeling complexes to regulate the transcriptional landscape of vegetatively growing Tetrahymena cells.
Electronic supplementary material The online version of this article (10.1186/s13072-018-0180-6) contains supplementary material, which is available to authorized users.


Background
Eukaryotic cells possess multiple levels of regulation of mRNA transcription by RNA polymerase II. Many coactivators of transcription exert their function through chromatin-modifying activities. In budding yeast, the SAGA histone acetyl transferase complex co-activates transcription by acetylating specific lysine residues in the N-terminus of histone H3 within the nucleosome, which can then serve as a platform to recruit the SWI/SNF complex via the bromodomain (BRD) present in SNF2/Brg1 [1]. The BRD specifically binds acetyl-lysine (Kac) within proteins such as histones [2]. When recruited to a genomic region, the SWI/SNF complex co-activates transcription in part by hydrolyzing ATP via the Snf2 subunit and remodeling nucleosomes to make promoter sequences available to be bound by general transcription factors (TFs) such as TFIID. Other histone-modifying complexes that function in promoting transcription include the NuA4 histone acetyl transferase, which acetylates nucleosomal H4 [3], and the Set1 and Set2 histone methyl transferases, which methylate nucleosomal H3K4 and H3K36 [4], respectively. Additional protein domains that function in transcription complexes by recognizing some of the diverse histone post-translational modifications (PTMs) include the methyl lysine-recognizing PHD and chromodomains [5]. Other ATP-dependent chromatin-remodeling complexes that function in transcription include the SWR complex, which exchanges core H2A in the nucleosome for the transcription-friendly histone H2A variant Htz1 [6,7], and the INO80 complex, one function of which is to catalyze the reverse reaction [8].
A typical eukaryotic nucleus is composed of regions of transcriptionally inert heterochromatin as well as euchromatic areas which are considered competent for transcription. The ciliate protozoan Tetrahymena thermophila is a unique model system for studying transcription since it segregates germ-line-specific silent (micronucleus-MIC), and somatic transcriptionally active (macronucleus-MAC) chromatin into two distinct nuclei contained within its single cell. The different chromatin structures of the MAC and MIC have their origins in the sexual phase (conjugation) of the life cycle [9]. After pairing, the MIC in each of the two cells undergoes meiosis, generating four haploid meiotic products, only one of which is retained. This gametic nucleus divides mitotically, and one of the two resulting identical haploid nuclei is reciprocally exchanged and fuses with that of its partner to form a genetically identical diploid zygotic nucleus in each cell. The zygotic nucleus divides twice, resulting in four identical products at which point two begin to develop into new MACs (NM). MAC development in the NM of each exconjugant involves extensive programmed DNA rearrangements/irreversible genome silencing that is directly linked to ncRNA-based changes in chromatin structure. These DNA rearrangements include site-specific chromosome fragmentation as well as the deletion of MIC-limited sequences called internal eliminated sequences (IESs) that together result in the loss of ~ 15% of the germ-line genome [10]. IES deletion begins with the bidirectional transcription of RNAs from the meiotic MIC [11,12]. Meiosis is the only stage of the Tetrahymena life cycle where the MIC is transcribed [11,13]. This meiotic MIC-specific transcription is catalyzed by RNAPII [13]. 
A global MIC-specific nuclear run-on analysis showed that meiotic MIC-specific transcription is biased toward IES DNA, implying that initiation/start-site selection of the MIC-specific transcription is regulated and not simply a result of global or random transcription [12,14]. The molecular mechanisms underlying any transcription in Tetrahymena remain poorly understood.
We previously characterized a SNF2-related gene in T. thermophila [15]. Despite high primary sequence similarity of Brg1Tt to the budding yeast Snf2 and human Brg1 through most of the protein, Brg1Tt does not possess a recognizable BRD, and its C-terminal region, unlike the entire protein, is dispensable for growth and development [15], raising the possibility that SWI/SNFTt functions independently of histone acetylation. Here we report a unique BRD-containing protein, Ibd1, which is a component of SWI/SNFTt during vegetative growth but not during conjugation. Recombinant Ibd1 recognizes several Kac marks on histones that are correlated with active transcription in Tetrahymena. AP-MS analysis of Ibd1 revealed it to interact with protein complexes in addition to SWI/SNFTt, including SWRTt and SAGATt, as well as with a novel putative H3K4-specific histone methyltransferase. ChIP-Seq analysis of Ibd1 suggests a role for the protein during transcription. We suggest that Ibd1 coordinates high levels of transcription of highly expressed genes in T. thermophila.

Identification of T. thermophila SWI/SNF complex
We previously cloned and characterized the Snf2/Brg1 ortholog in T. thermophila [15] and predicted it to be a component of a SWI/SNF complex, similar to the situation in Saccharomyces cerevisiae [16] and human cells [17]. We used affinity purification coupled to mass spectrometry (AP-MS) to identify T. thermophila SWI/SNF. Specifically, we profiled and compared the sets of interacting proteins of two distinct putative SWI/SNFTt components: Snf5Tt (TTHERM_00304150), a core subunit of yeast and human SWI/SNF complexes [18], and the Snf5Tt-interacting protein Saf5Tt (TTHERM_00241840). Our comparative sequence analysis shows Snf5Tt to be highly similar to its counterparts in yeast and animal cells across most of the protein (see Additional file 1). We generated stable T. thermophila cell lines expressing FZZ epitope-tagged SNF5Tt and SAF5Tt from their respective macronuclear chromosomal loci by homologous recombination-mediated gene replacement [19]. The FZZ epitope tag contains two protein A moieties and one 3xFLAG separated by a TEV cleavage site [20], permitting tandem affinity purification of an FZZ fusion protein and subsequent analysis of co-purifying proteins by Western blotting and/or mass spectrometry [21]. The SNF5Tt-FZZ and SAF5Tt-FZZ tagging constructs (see Additional files 1, 2) were used to transform growing T. thermophila strains using biolistic transformation. Gene replacement of the WT SNF5Tt and SAF5Tt by homologous recombination [22] and 'phenotypic assortment' (reviewed in [23]) generates homozygosity in the polyploid MAC for the chromosome containing the SNF5Tt-FZZ or SAF5Tt-FZZ gene locus. Western blotting using an FZZ-specific antibody demonstrated expression of the epitope-tagged Snf5Tt or Saf5Tt in whole-cell extracts from Snf5Tt-FZZ- and Saf5Tt-FZZ-expressing strains, respectively (Fig. 1a, left panel, lanes 2 and 4; b, lanes 3 and 4), compared to that of untagged strains (Fig. 1a, left panel, lanes 1 and 3; b, lanes 1 and 2).
Indirect immunofluorescence on Snf5Tt-FZZ and Saf5Tt-FZZ in growing T. thermophila showed localization to the transcriptionally active MAC and not to the silent MIC (Fig. 1c), identical to what we observed previously for Brg1Tt [15] and consistent with the hypothesis that Snf5Tt and Saf5Tt are members of a Brg1Tt-containing SWI/SNFTt. A Brg1Tt-specific antibody [15] demonstrated co-purification of Brg1Tt with Snf5Tt-FZZ and Saf5Tt-FZZ affinity purified from whole-cell extracts of Snf5Tt-FZZ-expressing (Fig. 1a, lanes 3-6) and Saf5Tt-FZZ-expressing (Fig. 1b) strains, but not from untagged strains, during vegetative growth.
We next performed a gel-free LC-MS/MS-based analysis of the Snf5Tt-FZZ and Saf5Tt-FZZ affinity purifications to define their respective sets of interacting proteins. To provide statistical rigor to our AP-MS analyses, all interaction data were filtered using Significance Analysis of INTeractome express (SAINTexpress), which uses semiquantitative spectral counts to assign a confidence value to individual protein-protein interactions [24]. Application of SAINTexpress to the AP-MS data for two biological replicates of Snf5Tt-FZZ and Saf5Tt-FZZ affinity purifications from vegetatively growing T. thermophila, filtered against numerous control AP-MS experiments, revealed sets of interaction partners that pass the cutoff confidence value; these are listed in Table 1. Our previous analysis [15] of the sequenced T. thermophila MAC genome predicted the existence of three potential SWI/SNF proteins in addition to Brg1Tt and Snf5Tt: Swi1Tt (TTHERM_00243900), Swi3Tt (TTHERM_00584840) and Snf12Tt (TTHERM_00925560). The SAINTexpress analysis of the MS data for Snf5Tt-FZZ and Saf5Tt-FZZ (Table 1) identified Brg1Tt, consistent with the Western blotting in Fig. 1a, b, as well as Swi1Tt, Swi3Tt and Snf12Tt (Table 1). Saf5Tt possesses two tandem plant homeodomains (PHD domains). One known function of PHD domains is to mediate specific interactions with methylated lysine on histone proteins to positively regulate transcription [25]. PHD domain-containing proteins are not known to be present in core yeast SWI/SNF but are observed in several animal SWI/SNF complexes [26]. The two PHD domains of Saf5Tt are in the same position as, and are highly similar to, those of zebrafish DPF3 and the mammalian proteins mBAF45a and hBAF45a (see Additional file 2), both of which are members of a cell type-specific SWI/SNF complex [26,27]. DPF3 is part of the BAF chromatin-remodeling complex in zebrafish; it is involved in regulation of muscle development and recognizes histones carrying both specific histone acetylation and methylation marks [27].
Snf5Tt-FZZ additionally co-purified tetrin A (TTHERM_00006320), an insoluble cytoskeletal protein unique to ciliates [28]. We have previously noted a variable affinity of the M2 anti-FLAG antibody for this protein, as was previously observed for other cytoskeletal proteins [29], and therefore decided not to follow up on it here. Both Snf5Tt-FZZ and Saf5Tt-FZZ co-purified with 5 other proteins with no clear orthologs in other described SWI/SNF complexes. The first of these 5 proteins, Saf1 (SWI/SNF-associated factor 1, Table 1), is predicted to have a coiled coil and a transmembrane domain. Saf1 appears to have a homolog in Paramecium tetraurelia (XP_001441480.1) that also possesses the coiled coil domain but not a transmembrane domain. The next 3 proteins, Saf2Tt, Saf3Tt and Saf4Tt, are T. thermophila-specific, meaning that they do not have identifiable known homologs in any other organism. However, all three possess clusters of glutamines in their primary sequence, suggestive of a role in transcription [30]. The fifth protein that SAINTexpress analysis revealed to co-purify with Snf5Tt-FZZ and Saf5Tt-FZZ is TTHERM_00729230 (Table 1), which possesses a canonical BRD. We named this protein Ibd1 (Interactive BromoDomain Protein 1). We suggest that the 11 proteins Swi1Tt, Swi3Tt, Snf5Tt, Snf12Tt, Brg1Tt and Ibd1, together with Saf1-5Tt, define the first known ciliate SWI/SNF complex.
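The filtering step described above (keep only bait-prey pairs whose confidence value passes a cutoff, then compare the two baits) can be sketched as follows. This is a minimal illustration and not the SAINTexpress program itself: the scores, the cutoff of 0.9, and the prey names `PreyX` and `PreyY` are hypothetical placeholders, since the actual cutoff and scores are not given in this section.

```python
from collections import defaultdict

# Hypothetical scored interactions: (bait, prey, confidence in [0, 1]).
# Scores, cutoff and PreyX/PreyY are illustrative, not values from the study.
SCORED = [
    ("Snf5", "Brg1", 1.00),
    ("Snf5", "Swi1", 0.99),
    ("Snf5", "PreyX", 0.40),
    ("Saf5", "Brg1", 1.00),
    ("Saf5", "Swi1", 0.97),
    ("Saf5", "PreyY", 0.10),
]


def high_confidence_preys(scored, cutoff):
    """Return {bait: set of preys whose confidence is >= cutoff}."""
    kept = defaultdict(set)
    for bait, prey, score in scored:
        if score >= cutoff:
            kept[bait].add(prey)
    return dict(kept)


def shared_preys(preys_by_bait):
    """Return the preys recovered with every bait."""
    sets = list(preys_by_bait.values())
    return set.intersection(*sets) if sets else set()


filtered = high_confidence_preys(SCORED, cutoff=0.9)
common = shared_preys(filtered)
```

Preys that pass the cutoff for both baits are the candidates for shared complex membership; preys that pass for only one bait, such as tetrin A in the text, need separate consideration.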

Ibd1-and BRD-containing proteins in T. thermophila
The BRD is highly conserved across eukaryotic species and is present in functionally diverse proteins including histone acetyl transferases (HATs), ATP-dependent chromatin-remodeling complexes, helicases, methyl transferases and transcriptional regulators [31]. Dysfunctional BRD-containing proteins have previously been linked to the development of several human pathologies and are now actively pursued as therapeutic targets [32]. Our finding that a unique BRD-containing protein co-purifies with Snf5Tt-FZZ and Saf5Tt-FZZ prompted us to determine the full repertoire of BRD-containing proteins in Tetrahymena. Our query for BRDs in the Tetrahymena genome database, www.ciliate.org [33], identified 14 proteins (Fig. 2a). As with human BRD-containing proteins [34], the Tetrahymena putative BRD-containing proteins appear functionally diverse, and their BRDs can be found in combination with a variety of other domains (Fig. 2a). However, unlike humans and yeast, where multiple BRDs can be present within the same protein [34,35], the T. thermophila BRDs are present as a single copy.
To classify the T. thermophila BRD-containing proteome, we carried out a phylogenetic analysis and categorized the set of proteins into three groups based on their BRD similarity (Fig. 2b). 'Group I' contains two proteins, Mll1 and BroP-3. 'Group II' (Fig. 2b) can be further divided into two subgroups: 'Group II-A' contains only two proteins, Chd1Tt and BroW1Tt, whereas 'Group II-B' has six proteins, including the Snf5Tt-interacting Ibd1 (or BroP5; see figure legend for nomenclature). Five of the 6 proteins found in 'Group II-B' contain no recognizable domains other than BRDs (Fig. 2a, b). The similarities in domain architecture and grouping pattern suggest that the 'Group II-B' proteins (which include Ibd1) might be functionally more similar to each other than to those found within the other groups. Group III contains four proteins: Gcn5Tt and three proteins that possess an ET (extra-terminal) domain in addition to a BRD. In many eukaryotes, including yeast and humans, bromodomain proteins containing two BRDs followed by an ET domain are referred to as the BET protein family [36]. BRDs generally function to recognize Kac motifs on histones or non-histone proteins to regulate various cellular processes including transcription [34]. The ET domains, in contrast, are thought to recruit effector proteins, which in turn can regulate transcriptional activity [37]. Structural conservation of a protein often yields insights into its functions. To gain insight into the function of Ibd1, we predicted the three-dimensional structure of its BRD and observed that it folds similarly to known BRD structures. For example, the predicted structure can be superimposed on the C-terminal BRD of human SMARCA2 (Fig. 2c). This suggests that the Ibd1 protein may have a similar function in transcription to that of canonical SNF2 proteins in the yeast and animal SWI/SNF complexes, through recognition of a similar or identical Kac substrate in histones.
Fig. 2 b The amino acid sequences of the predicted BRDs were aligned using MUSCLE. The phylogenetic analysis was carried out using the neighbor joining method with 1000 bootstrap replicates (confidence > 90% for all nodes). c Predicted structure (left) of the Ibd1 BRD shown in ribbon diagram with rainbow color scheme. Blue represents the N-terminus, whereas red shows the C-terminus of the predicted structure. The superimposition (right) was carried out using the BRD of human SMARCA2 protein (PDB: 5DKC), which is shown in violet color backbone format. Note: The identified Tetrahymena BRD-containing proteins were named based on the domain architecture if no clear human ortholog was available. BroW1, bromo-WD40 domain protein; BrEt, bromo-Et domain protein; BrAn, Bromo-Ank domain protein; Brop1-6, BRD-containing protein

Ibd1 recognizes Kac and interacts with multiple chromatin-related proteins
Our finding of a distinct BRD-containing protein in SWI/SNFTt is consistent with the fact that a BRD in the catalytic subunit (Snf2/Brg1) has important functions in eukaryotic SWI/SNF complexes. We aligned the primary sequence of the BRD of Ibd1 to those of Gcn5Tt, yGcn5p, yBDF1, yBDF2 and ySWI2/SNF2, which are functional BRD-containing proteins (see Additional file 3). The alignment showed a number of conserved amino acids in the BRD, including the highly conserved asparagine (N) that makes contact with Kac [34,38], suggesting that Ibd1, like other BRD-containing proteins, is likely to bind this mark. We expressed and purified recombinant 6xHIS-Ibd1 and incubated it with a commercially available peptide array that includes a large number of possible histone post-translational modifications, including many histone acetylation sites. Recombinant 6xHIS-Ibd1 displayed strong specificity for acetylated H3K9 and H3K14, acetylated H2AK9 and H2AK13 and tri-acetylated H4K5, H4K8 and H4K12 (Table 2; see Additional file 4 for Raw Data), which are all acetylation patterns associated with the transcriptionally active MAC in T. thermophila [39,40]. When incubated on the same peptide array, control recombinant histone methyltransferase 6xHIS-G9a recognized mono- and di-methylated H3K9 (Table 2; see Additional file 4 for Raw Data), as previously demonstrated [41]. We generated a stable line expressing Ibd1-FZZ from its MAC locus. The IBD1-FZZ tagging construct (see Additional file 2) was used to transform growing T. thermophila strains using biolistic transformation. After selection and phenotypic assortment, Western blotting demonstrated expression of Ibd1-FZZ in whole-cell extracts of transformed strains (Fig. 3a). Similar to Snf5Tt-FZZ, Ibd1-FZZ also co-purifies with Brg1Tt as assessed by Western blotting of affinity-purified material (Fig. 3b). Gel-free LC-MS/MS-based analysis of affinity-purified proteins identified 28 high-confidence Ibd1-FZZ co-purifying proteins (Table 3).
Comparison of the interaction partners recovered from the purifications of Snf5Tt-FZZ, Saf5Tt-FZZ and Ibd1-FZZ (Fig. 3c; Table 3) showed 11 common proteins that co-purify with Ibd1, Saf5Tt and Snf5Tt, namely Swi1Tt, Swi3Tt, Snf5Tt, Snf12Tt, Brg1Tt, Ibd1 and Saf1-5Tt, which together we hypothesize form a putative T. thermophila SWI/SNF complex.
The other 17 high-confidence Ibd1-interacting proteins (Fig. 3c; Table 3) could be divided into three groups, based on similarity to predicted S. cerevisiae orthologs: (1) the SAGATt histone acetyl transferase co-activator complex containing Gcn5Tt, Ada2Tt and a PHD-containing protein, designated Aap1Tt (Ada2-associated protein 1); (2) the SWRTt ATP-dependent chromatin-remodeling complex that in yeast and human cells deposits histone variant Htz1/H2A.Z onto chromatin, comprising Swr1Tt, Yaf9Tt, Rvb1Tt, Rvb2Tt, Swc2Tt, Swc4Tt, Swc5Tt (C-terminal BCNT domain), two actin-like proteins and three predicted Swc4-associated proteins (Sap1-3Tt), one of which possesses an AT-hook (Sap1Tt), whereas the other two (Sap2Tt and Sap3Tt) contain no recognizable domains; and (3) a putative H3K4 methyl transferase (Atrx3/Set1-like). Sap3Tt shares similarity only on a small portion of the protein with hypothetical proteins in P. tetraurelia and Pseudocohnilembus persalinus. Sap4Tt shares similarity throughout the entire protein with a hypothetical protein in P. persalinus. The Ibd1 protein therefore appears to be a component of several chromatin-remodeling complexes (SWI/SNFTt, SAGATt, SWRTt) and of one containing an Atrx3/Set1-like HMT.
To further delineate the Ibd1 protein interaction network, we generated separate stable lines expressing Ada2Tt-FZZ and Swc4Tt-FZZ from their respective MAC loci following the same strategy as outlined above. SAINTexpress analysis of AP-MS data from growing cells showed that Ada2Tt co-purifies with Ibd1 in addition to the Ibd1-interacting Aap1Tt and Gcn5Tt. Additionally, Ada2Tt co-purified with three PHD domain-containing proteins (Aap2Tt, Aap3Tt and Aap4Tt; Fig. 3c; Table 3) and four T. thermophila-specific hypothetical proteins (Aap5Tt, Aap6Tt, Aap7Tt and Aap8Tt; Fig. 3c; Table 3) that we did not find to co-purify with Ibd1-FZZ. We suggest that the Ada2-interacting proteins together represent a Tetrahymena SAGATt complex (Fig. 3c; Table 3).
SAINTexpress analysis of Swc4Tt-FZZ AP-MS revealed it to co-purify a subset of Ibd1-interacting proteins that were predicted to be SWRTt complex proteins (Fig. 3c; Table 3). Swc4Tt-FZZ further interacts with T. thermophila orthologs of the Tra1 and Tra2 PI3 kinases (Fig. 3c; Table 3), neither of which co-purified with Ibd1. In yeast, Swc4 co-purifies with Tra1 via the NuA4 histone acetyltransferase complex, of which Swc4 is a component in addition to SWR-C. We did not observe Swc4Tt-FZZ to co-purify with any protein that would indicate it to be a member of a T. thermophila NuA4 complex. The set of proteins that we hypothesize to constitute SWRTt are listed in Table 3. Although the T. thermophila genome encodes a predicted ortholog of Swc6/Vps71 (TTHERM_01298590), we did not find it to co-purify with Swc4Tt or Ibd1 in growing cells.
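The reasoning in this section (use reciprocal baits to sort the preys of a hub protein into separate complexes, and check that the reciprocal baits do not recover one another's complexes) can be sketched with set operations. The prey lists below are small illustrative subsets drawn from names in the text, not the contents of Table 3, and the Swc4 list in particular is an assumption made for the example.

```python
from itertools import combinations

# Illustrative subsets only; the full lists are in Table 3 of the article.
# The Swc4 entry is an assumed subset chosen for this example.
PREYS = {
    "Ibd1": {"Brg1", "Swi1", "Snf5", "Gcn5", "Ada2", "Aap1",
             "Swr1", "Swc4", "Set1-like"},
    "Snf5": {"Brg1", "Swi1", "Ibd1"},
    "Ada2": {"Ibd1", "Gcn5", "Aap1", "Aap2"},
    "Swc4": {"Ibd1", "Swr1", "Tra1"},
}
COMPLEX_BAITS = {"SWI/SNF": "Snf5", "SAGA": "Ada2", "SWR": "Swc4"}


def partition_hub_preys(preys, hub, complex_baits):
    """Assign each prey of `hub` to the complexes whose bait also recovers it."""
    assignment = {label: set() for label in complex_baits}
    assignment["unassigned"] = set()
    for prey in preys[hub]:
        labels = [label for label, bait in complex_baits.items()
                  if prey == bait or prey in preys[bait]]
        for label in labels:
            assignment[label].add(prey)
        if not labels:
            assignment["unassigned"].add(prey)
    return assignment


def cross_complex_preys(preys, hub, complex_baits):
    """Preys other than the hub that two different complex baits share."""
    return {(la, lb): (preys[a] & preys[b]) - {hub}
            for (la, a), (lb, b) in combinations(complex_baits.items(), 2)}


groups = partition_hub_preys(PREYS, "Ibd1", COMPLEX_BAITS)
overlaps = cross_complex_preys(PREYS, "Ibd1", COMPLEX_BAITS)
```

Empty pairwise overlaps are what one would expect if the hub bridges otherwise distinct complexes. A prey that stays unassigned, like the Set1-like protein in this toy input, has no reciprocal bait in the data set.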

Ibd1 function during conjugation
To gain further insight into Ibd1 function, we assessed its expression through growth and sexual development. We performed Western blotting of whole-cell extracts made at different times during the T. thermophila life cycle, probing for Ibd1-FZZ (Fig. 4a, lower panel). We have previously demonstrated Brg1Tt to have relatively constant levels of expression throughout growth and development [15]. We therefore used anti-Brg1Tt as a loading control (Fig. 4a, top panel) and anti-Pdd1 [42] as a development-specific control (Fig. 4a, middle panel) for these experiments. Similar to Brg1Tt, Ibd1 is expressed throughout the T. thermophila life cycle. Indirect immunofluorescence of Ibd1-FZZ performed on growing and conjugating cells (Fig. 4b) demonstrated localization exclusively to the MAC during growth and conjugation, specifically to the parental MAC through early nuclear development including meiosis (Fig. 4b: 0-6 h) before switching to the anlagen midway through sexual development (Fig. 4b: 8 h). This is similar to what was shown previously for Brg1Tt [15]. In particular, as for Brg1Tt, localization of Ibd1-FZZ in the parental macronucleus is lost at the onset of macronuclear development, a stage where the two anterior nuclei (the anlagen) have become visibly larger than the posterior nuclei. Localization of Ibd1 is therefore correlated with the transcriptionally active MAC during growth and nuclear development. To determine whether Ibd1's protein interaction network changes during sexual development, we performed AP-MS using whole-cell extracts prepared from conjugating cells harvested 5 h post-mixing, a time period following meiosis that is marked by a series of rapid post-zygotic nuclear divisions and where Ibd1-FZZ is found exclusively in the parental MAC (Fig. 4b).
SAINT-curated AP-MS data are shown in an Additional file. Members of SWI/SNFTt were associated with Ibd1-FZZ to a lower degree in conjugation than during vegetative growth, while members of the putative SWRTt and SAGATt remained relatively unaffected (Fig. 5a). The recovery, as defined by spectral counts, of SWI/SNFTt members (Fig. 3c; Tables 1, 3) appeared relatively low at this stage when compared to members of SWRTt and SAGATt. To validate this finding, we used M2 agarose to affinity purify Ibd1-FZZ from untagged and Ibd1-FZZ-expressing cells and blotted with anti-Brg1 antibody following SDS-PAGE (Fig. 5b). In these conjugating cells, Ibd1-FZZ did not co-purify with Brg1Tt (Fig. 5b), consistent with the substantially lower amounts of the protein detected by mass spectrometry. These data suggest a profound modulation of the Ibd1 interactome, favoring its association with SWRTt and SAGATt over the SWI/SNFTt complex early in conjugation (5 h post-mixing).
Table 2 Histone peptide-array data reveal the top post-translational modifications recognized by Ibd1. The histone peptide array contains human histone modifications that resemble Tetrahymena's histones. The intensity average columns show the top 10 histone modifications recognized by 6xHIS-Ibd1 and 6xHIS-G9a in italics. Bold italics means that the amino acid is not present in Tetrahymena's histone (see Additional file 4 for Raw Data). Columns: Histone; Modification 1; Modification 2; Modification 3; Modification 4; Intensity average 6xHIS-Ibd1 (4 repetitions); Intensity average 6xHIS-G9a

Ibd1 localizes to transcriptionally active chromatin
As noted above, Ibd1 co-purifies with multiple protein complexes involved in gene expression regulation and in vitro recognizes histone marks associated with an active chromatin state. These observations suggest an intimate role of Ibd1 in transcription regulation. To examine this possibility in more detail, we employed chromatin immunoprecipitation followed by next-generation sequencing (ChIP-Seq). Specifically, we asked whether Ibd1 localizes to specific regions of the genome that correlate with transcriptionally active chromatin. Data for two biological replicates that include DNA from input chromatin as well as Ibd1-FZZ precipitate from two independent experiments were analyzed. Using the available genome annotations [33], we organized our ChIP-Seq data set (GEO accession GSE103318) into all annotated genic regions (open reading frames, ORFs) and intergenic regions. For each, we generated a list of regions displaying greater than or equal to twofold enrichment of Ibd1, ranked in descending order (see Additional files 6, 7, All_ > 2X_Fold_Enrichment tab). From these lists we observed that Ibd1 strongly occupies 837 ORFs and 396 intergenic regions with an enrichment (IP/INPUT) greater than or equal to twofold (Fig. 6a; see Additional files 6, 7, > 2X_Enriched_with_Strong_Peaks tab). We initially focused our attention on the 837 identified ORFs and assessed the transcriptional state of these genes. We utilized previously published RNA-Seq data that have been used to rank genes based on their expression level during vegetative growth (GEO accession GSM692081 [43]). Based on these data, we found that 9% and 29% of genes in Tetrahymena are highly and moderately expressed, respectively (Fig. 6b, left panel; see Additional file 6, RNA-Seq tab). In contrast, we found that 54% (457 ORFs) and 16% (134 ORFs) of genes occupied by Ibd1 are highly and moderately expressed, respectively (Fig. 6b, right panel, c; see Additional file 6, localization tab).
These observations are consistent with our histone peptide-array data and further strengthen the idea that Ibd1 primarily occupies active chromatin regions. Interestingly, Ibd1 showed binding to 114 ORFs with low to no expression during vegetative growth (Fig. 6c; see Additional file 6, localization tab). The overall trend of Ibd1 binding to highly expressed genes is particularly evident for genes that have enrichment greater than or equal to fourfold (298 genes in total) (Fig. 6c). To examine whether these 298 genes are enriched for any particular functional categories, we grouped them using STRING [44] based on their predicted Gene Ontology (GO) terms [45]. We identified 122 genes that are significantly enriched with a particular term related to housekeeping functions, such as biological process, cellular process, translation, metabolic processes and gene expression (Fig. 6d; see Additional file 6, 4X + _GO_Biological_Expression tab). These housekeeping genes are generally highly expressed, consistent with our findings that Ibd1 primarily occupies transcriptionally active chromatin. To compare these data with the overall distribution of all Tetrahymena's annotated genes, the same approach was used (Fig. 6e; see Additional file 6, AllTtGenes_GO_Biological_Proces tab). Figure 6d, e suggests that Ibd1 mainly controls housekeeping genes in vegetative cells.
Curated SAINTexpress data from 2 biological replicates of Ibd1-FZZ, Ada2-FZZ and Swc4-FZZ AP-MS samples
To validate our ChIP-Seq analysis of Ibd1-enriched chromatin, we designed primers for the three genes that showed the highest Ibd1-FZZ fold enrichment (see Additional file 6, > 2X_Enriched_with_Strong_Peaks tab) as well as a fourth, PDD1, which is exclusively developmentally expressed [46] and did not show enrichment for Ibd1-FZZ during growth (see GEO accession GSE103318) ( Table 4). Our ChIP-qPCR analysis of the four genes confirmed specific enrichment of Ibd1-FZZ in HTA3, RPS22 and HFF1 but not PDD1 relative to chromatin made from untagged cells (Fig. 6f; see Additional file 8 for Raw data). We conclude that Ibd1 occupies transcriptionally active chromatin and might have a role in regulating the expression of a subset of genes involved in basal cellular housekeeping functions.
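The enrichment thresholds used in this section can be illustrated with a short sketch. It assumes that each region already has library-size-normalized IP and INPUT signal values, and the region names and numbers in `SIGNALS` are hypothetical. It is not the authors' pipeline, which is described in their Methods.

```python
def fold_enrichment(ip_signal, input_signal):
    """IP/INPUT ratio for one region; both signals assumed normalized."""
    if input_signal <= 0:
        raise ValueError("input signal must be positive")
    return ip_signal / input_signal


def enriched_regions(signals, min_fold):
    """Return [(region, fold)] with fold >= min_fold, highest fold first."""
    scored = [(region, fold_enrichment(ip, inp))
              for region, (ip, inp) in signals.items()]
    kept = [(region, fold) for region, fold in scored if fold >= min_fold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


def percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


# Hypothetical (IP, INPUT) values for four regions.
SIGNALS = {
    "geneA": (80.0, 10.0),
    "geneB": (25.0, 10.0),
    "geneC": (12.0, 10.0),
    "interX": (45.0, 10.0),
}

two_fold = enriched_regions(SIGNALS, min_fold=2.0)
four_fold = enriched_regions(SIGNALS, min_fold=4.0)

# Counts from the text: 457 of 837 Ibd1-occupied ORFs are highly expressed,
# against 9% of all genes genome-wide.
bound_high_pct = percent(457, 837)
over_representation = bound_high_pct / 9.0
```

With the counts quoted in the text, highly expressed genes make up a little over half of the Ibd1-occupied ORFs, roughly six times their genome-wide share of 9%.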

Localization of Ibd1 in Tetrahymena's genome
We next examined our ChIP-Seq data for both ORFs and intergenic regions that showed greater than or equal to fourfold enrichment to determine how Ibd1 is situated in the genome relative to ORF and intergenic regions.
Using this fold-enrichment cutoff, we obtained 298 genic and 140 intergenic regions.
We first investigated the genic regions to assess the Ibd1 peak distribution. Figure 7a shows a representative example of Ibd1 ORF-specific localization where peaks are primarily enriched within the gene body (see Additional file 6, 4X_ + _Ibd1_Occupancy tab for the full list). Next, to classify the 140 intergenic regions, we manually inspected the ChIP-Seq peaks using the genome browser [47] and categorized them into five groups based on their localization (Fig. 7b-f; see Additional file 7, Intergenic_Groups tab). The promoter group showed intergenic localization that was proximal to the 5′ region of 91 single predicted genes (e.g., Fig. 7b). The Ibd1 terminator group showed intergenic localization proximal to the 3′ region of 33 single predicted genes (e.g., Fig. 7c). The third intergenic group showed Ibd1 localization to 2 regions where there is an overlap between the promoter of one predicted gene and the terminator of another (e.g., Fig. 7d). The fourth group showed localization of Ibd1 to 13 single 5′ promoter regions potentially controlling expression of two predicted genes (Fig. 7e). The fifth group showed localization of Ibd1 to 11 single terminator 3′ regions of two distinct predicted genes (Fig. 7f). We found that among the 298 ORFs showing ≥ 4X Ibd1 enrichment, 37 also showed enrichment through the promoter (Fig. 7g; Additional file 7, Combining_Intergenic_and_ORF tab for list) and 19 at the terminator region (Fig. 7h; Additional file 7, Combining_Intergenic_and_ORF tab). Collectively, these data suggest that Ibd1 binds near promoters and within gene bodies, consistent with a role in transcription regulation through its potential role in organizing multiple protein complexes.
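The five intergenic groups described above follow from the orientation of the two genes flanking a peak and from how close the peak lies to each. The authors classified peaks manually in a genome browser; the sketch below is a hypothetical automation of the same logic, in which the `Gene` record, the `window` distance and the example coordinates are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" or "-"


def role_of_left_gene(gene):
    """DNA to the right of a gene is its 3' side on "+", its 5' side on "-"."""
    return "terminator" if gene.strand == "+" else "promoter"


def role_of_right_gene(gene):
    """DNA to the left of a gene is its 5' side on "+", its 3' side on "-"."""
    return "promoter" if gene.strand == "+" else "terminator"


def classify_intergenic_peak(peak, left, right, window):
    """Classify a peak position lying between `left` and `right` genes."""
    roles = []
    if peak - left.end <= window:
        roles.append(role_of_left_gene(left))
    if right.start - peak <= window:
        roles.append(role_of_right_gene(right))
    if not roles:
        return "distal"
    if len(roles) == 1:
        return roles[0]                      # single promoter or terminator
    if roles[0] != roles[1]:
        return "promoter/terminator overlap"
    return "shared " + roles[0]              # shared promoter or terminator


# Hypothetical divergent gene pair: both 5' ends face the intergenic region.
gene_a = Gene("A", 100, 500, "-")
gene_b = Gene("B", 900, 1500, "+")
both_near = classify_intergenic_peak(700, gene_a, gene_b, window=300)
left_only = classify_intergenic_peak(550, gene_a, gene_b, window=100)
```

A divergent gene pair yields a shared promoter, a convergent pair a shared terminator, and a same-strand pair a promoter/terminator overlap. The single-gene groups arise when the peak lies within the window of only one neighbor.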

Ibd1 is a BRD-containing protein that interacts with multiple chromatin-remodeling complexes in T. thermophila
In our previous molecular characterization of Brg1 Tt [15], we reported that it lacks a C-terminal BRD, in contrast to its counterparts in yeast (Snf2/Sth1) and mammalian cells (Brg1/Brahma). We report here that a distinct, BRD-containing protein, Ibd1, is a member of the Tetrahymena SWI/SNF complex. Recombinant Ibd1 recognized several Kac histone PTMs that are correlated with transcription. Ibd1, however, established a large interaction network beyond the SWI/SNF Tt complex, including putative SAGA Tt and SWR Tt complexes as well as an Atrx3/Set1-like HMT that is predicted to be specific for H3K4, a modification linked to transcription. As is standard practice, we used a promiscuous DNase and RNase (benzonase nuclease) in the preparation of whole-cell extracts used for AP-MS (as detailed in "Methods"). Very little, if any, nucleic acid remains in the extract submitted to AP-MS. Also, although Ibd1 AP-MS yielded several putative protein complexes, reciprocal purification of individual complex components co-purified Ibd1 but not the other complexes, consistent with the binding of other proteins to Ibd1 being specific and independent of DNA. That said, we cannot exclude that nucleic acids already bound by proteins are protected from nuclease cleavage and may contribute to the observed binding events.

Characterization of a Tetrahymena SWI/SNF complex
Fig. 7 Ibd1 is localized in promoters, ORFs and terminators. In regions with greater than or equal to fourfold enrichment (IP/INPUT), Ibd1 localizes to eight types of region: a 483 ORFs, b 91 promoters, c 33 terminators, d 2 regions where the promoter of one predicted gene overlaps the terminator of another, e 13 single 5′ promoter regions potentially controlling expression of two predicted genes, f 11 single 3′ terminator regions of two distinct predicted genes. Combining these data for genes that show enrichment in both the ORF and an intergenic region, we found enrichment spanning g the promoter and the ORF in 37 regions and h the ORF and the terminator in 19 regions (see Additional files 6, 7 for raw data). The fold enrichments are presented beside each peak.

The […] suggest that the function of Saf2-4 Tt is co-activation through recruitment of general transcription factors and/or RNA polymerase to promoter regions of highly expressed genes in growing Tetrahymena. The finding that Ibd1 is a member of SWI/SNF Tt is informative in that its BRD interacts with Kac of histone proteins, similar to that observed for Snf2/Sth1 in yeast [50] and Brahma/Brg1 in humans [51]. In addition to the BRD-containing Ibd1, Tetrahymena SWI/SNF also contains a PHD domain-containing protein, Saf5 Tt. One function attributed to PHD domains is the recognition of methylated lysines in proteins such as histones. For example, the PHD domain of human ING2 recognizes H3K4me3 [25]. Thus, SWI/SNF Tt contains two proteins that potentially recognize PTMs on histones: Saf5 Tt, which likely recognizes methyl-lysine (and possibly acetyl-lysine [27]), and Ibd1, which recognizes Kac. The transcriptionally active Tetrahymena MAC contains hyperacetylated histone H3 that is also di- or tri-methylated on H3K4 [40].
We suggest that a subset of these modified H3-containing nucleosomes can be recognized by SWI/SNF, which would then remodel them to facilitate transcription. Additional SWI/SNF co-activator function could be derived from recruitment of general TFs and/or RNA polymerase II by the Saf2-4 proteins with Q-rich regions. Ibd1 may not interact with SWI/SNF in development in the same manner as it does during vegetative growth. We suggest that the function of SWI/SNF during nuclear development is independent of histone acetylation.

Tetrahymena Ibd1-containing SWR, SAGA and HMT complexes
In addition to being a member of SWI/SNF Tt , Ibd1 is also a distinct component of the SWR and SAGA complexes as well as interacting with an uncharacterized H3K4-specific histone methyl transferase that is similar to human Atrx3 and yeast Set1. The function of the SWR complex in fission [52] and budding [6] yeasts is the deposition of the histone H2A variant Pht1/Htz1 (H2A.Z in humans and Hv1 in Tetrahymena). Deposition of Htz1 in budding yeast is linked to NuA4-dependent histone acetylation via the BRD-containing Bdf1 subunit of SWR [53]. In yeast, Bdf1 is also a component of TFIID linking histone acetylation to pre-initiation complex assembly [54]. In Tetrahymena, Ibd1 did not co-purify with any proteins similar to components of the general transcription apparatus. Like Ibd1, Hv1 is localized to transcriptionally active MAC in growing cells [55]. Unlike Ibd1, Hv1 localizes also to the crescent MIC corresponding to meiotic prophase [56], a time period in Tetrahymena where large genome-wide transcription of the MIC by RNAPII occurs (reviewed in [57]).
In budding yeast, SWR is functionally linked to the NuA4 histone acetyl transferase complex via the shared subunits Swc4 and Yaf9. In Tetrahymena, Swc4 Tt did not co-purify with a histone acetyl transferase subunit and may not be a member of a NuA4-type complex. In fact, a strict NuA4-type complex is unlikely to exist in Tetrahymena, despite the presence of three genes encoding MYST family histone acetyl transferases. A previous study did identify an H2A/H4 nucleosomal HAT activity similar to that of NuA4 but also showed by glycerol gradient analysis that the activity purifies at ~ 80 kDa [58]. Consistent with this observation, the MAC does not appear to encode a clear ortholog of a conserved NuA4 subunit such as Epl1/EPC, so it is unclear whether there exists a 'piccolo' NuA4 [59]. Swc4 Tt did co-purify with orthologs of the Tra1 Tt and Tra2 Tt kinases, which did not purify with Ibd1 (Table 3; Fig. 3c). In S. cerevisiae, Tra1 co-purifies with NuA4 [60] and SAGA [61] and contributes to their co-activator function [62]. It will be interesting to determine whether SAGA Tt fulfills the function of SAGA and NuA4 in budding yeast or whether there exists a divergent version of NuA4 in Tetrahymena.
Ibd1 co-purifies with Gcn5 Tt and Ada2 Tt in addition to the PHD domain-containing A2A1 Tt. Ada2 Tt co-purifies with these proteins in addition to seven others, including three additional PHD domain-containing proteins, A2A2-4 Tt. Thus, Ada2 Tt co-purifies with four distinct PHD domain-containing proteins. Further work will be necessary to determine whether the set of Ada2-interacting proteins represents a single assemblage or whether Ibd1, Ada2 and Gcn5 represent a 'core' of the Tetrahymena SAGA complex whose specificity depends on which PHD protein it is interacting with at a particular time.

Model for Ibd1 function
We hypothesize that Ibd1 has a common function that it performs in diverse chromatin-remodeling complexes. Consistent with a function in promoting transcription, Ibd1-FZZ specifically localized to the coding regions of multiple highly transcribed genes during vegetative growth. A model for Ibd1 function is that it recognizes one or more specific histone Kac marks that are associated with transcription and recruits multiple chromatin-related complexes to the region to further acetylate nearby chromatin (SAGA Tt), to remodel nucleosomes (SWI/SNF Tt), to deposit Hv1 (SWR Tt), and to di- or tri-methylate histone H3K4 (Atrx3/Set1-like histone methyl transferase). SWI/SNF, SAGA, SWR and H3K4 methylation are all linked to transcription in other experimental systems. We predict that Ibd1 is particularly important for maintaining high rates of transcription on highly expressed genes such as those encoding the core histones or ribosomal proteins. Our ChIP-Seq analysis of Ibd1 supports this hypothesis, showing strong occupancy of the coding regions of the core histone genes HHT1 and HHF1. ChIP-Seq of complex-specific members of the Ibd1-containing complexes (i.e., Snf5 Tt, Swr1 Tt, Ada2 Tt) will be required to test the validity of this hypothesis. As well as being found in coding regions, Ibd1 also localizes to the regulatory regions of several genes. Further work will be necessary to determine whether Ibd1 is required for the recruitment of SWI/SNF Tt, SAGA Tt, SWR Tt and the HMT to the ORFs and regulatory regions identified in our ChIP-Seq analysis. It will also be interesting to determine whether the regulatory regions enriched in Ibd1 contain conserved DNA sequences, which could indicate that specific DNA-binding transcription factors recruit Ibd1-containing protein complexes to regulatory regions.

BRD proteins in Tetrahymena
We have identified 14 BRD-containing proteins in Tetrahymena and performed a phylogenetic analysis on them. Ibd1 is a member of a grouping that includes six proteins, five of which are like Ibd1 in possessing a single BRD and no other recognizable domains. Four of these five are similar in length to Ibd1, suggesting relatively recent evolutionary divergence of the four. BRD inhibitors are currently of significant clinical interest in the development of drugs to treat parasitic infections, as a number of apicomplexan protozoan parasites possess lineage-specific BRD proteins that appear to be important for various stages of their life cycle [63]. Because the ciliates and apicomplexans are closely related in evolution, we suggest Tetrahymena may provide a tractable model for molecular analysis of some of these BRD proteins.

Conclusions
In multicellular eukaryotes, precisely how chromatin-remodeling complexes function is poorly understood. Alteration or loss of factors involved in these complexes through mutation has been shown to be associated with cancer. We utilized a protist model, the alveolate Tetrahymena thermophila, which segregates transcriptionally active and silent chromatin into two distinct nuclei, the macronucleus (MAC) and micronucleus (MIC), respectively, contained in the same cell. Through the discovery of a bromodomain-containing protein, Ibd1, we advanced the knowledge of chromatin-remodeling complexes in protists by defining for the first time the protein complements of SWI/SNF, SWR and SAGA complexes. In addition, we present a model where a single protein, Ibd1, coordinates the action of multiple chromatin-remodeling complexes to achieve high levels of transcription. Our research will contribute to the current understanding of transcription in ciliates and, more broadly, of the function and diversity of chromatin-remodeling complexes in eukaryotes.

Protein sequence alignments
Multiple sequence alignments of Snf5, Saf5 and Ibd1 amino acid sequences from various model organisms were performed using Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/) and then shaded by importing the ALN file into the Boxshade server (http://www.ch.embnet.org/software/BOX_form.html). SMART [64] was used to find the beginning and end of the domains.

Oligonucleotides
See Additional file 9 for a list of the oligonucleotides used during this study.

DNA manipulations
Whole-cell DNA was isolated from T. thermophila strains as described [66]. Molecular biology techniques were carried out using standard protocols or by following a supplier's instructions.

Affinity purification, sample preparation and mass spectrometric analysis
AP-MS analysis was performed as per [21] with minor modifications (see Additional file 10).

ChIP
ChIP was performed as described [67] with modifications described in Additional file 10.

NGS
Sequencing and analysis of DNA co-purifying with ChIP of Ibd1-FZZ are described in Additional file 10.

ChIP-qPCR
Four ChIP biological repetitions for the Ibd1-FZZ and three ChIP repetitions for the untagged cell lines were quantified (NanoDrop, Thermo Scientific) and diluted to reach the smallest DNA concentration found in a