Table 3 Homo sapiens (GM12878 lymphoblastoid cell line) data

From: Predicting expression: the complementary power of histone modification and transcription factor binding data

Data type | Data source | Notes
RNA-seq | ENCODE [59] | 49,488 genes; FPKM-normalised [60]
TSS | Ensembl hg19/GRCh37 [27] | Consider only most 5′-located TSS for each gene
TF ChIP-seq | ENCODE [59] | c-Fos, Ctcf, Egr1, Nrf1, Nrsf, Pou2f2, Sp1, Srf, Stat3, Usf1 and Yy1
HM ChIP-seq | ENCODE [59] | H3K4me1, H3K4me2, H3K4me3, H4K20me1, H3K27me3 and H3K36me3
DNase-seq | ENCODE [59] | DNase I hypersensitivity
Gene ontology | [33, 34] | GOC validation date: 21 March 2014; structure from GO.db R package
Housekeeping annotations | [61, 62] | 3,804 genes; using RNA-seq data GSE30611
  1. Genes corresponding to haplotype variants, unmapped contig regions and low-confidence RNA-seq mappings were removed, leaving a set of 38,041 genes for analysis.
  2. ChIP, chromatin immunoprecipitation; GOC, Gene Ontology Consortium; HM, histone modification; seq, sequencing; TF, transcription factor; TSS, transcription start site.
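
The TSS row states that only the most 5′-located TSS is kept for each gene. A minimal sketch of that selection rule follows, assuming that "most 5′" means the smallest coordinate on the forward strand and the largest coordinate on the reverse strand, and that all TSSs of a gene share one strand. The `Tss` record, the function name `most_five_prime_tss` and the gene identifiers are hypothetical illustrations, not taken from the paper or from any Ensembl tool.

```python
from typing import Dict, Iterable, NamedTuple


class Tss(NamedTuple):
    """One annotated transcription start site (hypothetical record layout)."""
    gene_id: str
    strand: str    # "+" (forward) or "-" (reverse)
    position: int  # genomic coordinate of the TSS


def most_five_prime_tss(records: Iterable[Tss]) -> Dict[str, Tss]:
    """Return, for each gene, the TSS lying furthest towards the 5' end.

    Assumption: on the "+" strand the 5' end is the lowest coordinate,
    on the "-" strand it is the highest coordinate.
    """
    chosen: Dict[str, Tss] = {}
    for rec in records:
        if rec.strand not in ("+", "-"):
            raise ValueError(f"unexpected strand {rec.strand!r} for {rec.gene_id}")
        best = chosen.get(rec.gene_id)
        if best is None:
            # First TSS seen for this gene.
            chosen[rec.gene_id] = rec
        elif rec.strand == "+" and rec.position < best.position:
            # Forward strand: a smaller coordinate is further 5'.
            chosen[rec.gene_id] = rec
        elif rec.strand == "-" and rec.position > best.position:
            # Reverse strand: a larger coordinate is further 5'.
            chosen[rec.gene_id] = rec
    return chosen


# Illustrative input with made-up gene identifiers and coordinates.
example = [
    Tss("geneA", "+", 1500),
    Tss("geneA", "+", 1200),
    Tss("geneA", "+", 1800),
    Tss("geneB", "-", 9000),
    Tss("geneB", "-", 9400),
]
selected = most_five_prime_tss(example)
print(selected["geneA"].position)  # 1200
print(selected["geneB"].position)  # 9400
```

Keeping a single TSS per gene gives each gene one reference point against which the ChIP-seq and DNase-seq signals can be summarised; how the original analysis implemented this step is not specified in the table.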