Fig. 8 | Epigenetics & Chromatin

Fig. 8

From: “Gap hunting” to characterize clustered probe signals in Illumina methylation array data

Distributions of standard deviations among six categories of 450k probes. All autosomal probes (n = 473,864) were classified into one of six groups: (1) non-gap probes with no SEED SNP, dbSNP-annotated polymorphism, or UCSC-annotated repeat mapping to the probe (n = 301,590; shown in black), (2) non-gap probes with at least one SEED SNP present in the probe (n = 62,005; shown in red), (3) non-gap probes that do not contain a SEED SNP but either have an annotated variant in the dbSNP138 database or map to a UCSC-annotated repeat (n = 99,262; shown in blue), (4) gap probes with no SEED SNP, dbSNP-annotated polymorphism, or UCSC-annotated repeat mapping to the probe (n = 1808; shown in purple), (5) gap probes with at least one SEED SNP present in the probe (n = 5453; shown in green), (6) gap probes that do not contain a SEED SNP but either have an annotated SNP in the dbSNP138 database or map to a UCSC-annotated repeat (n = 3746; shown in orange). The three non-gap probe distributions are distinct from the gap probe distributions but show some overlap, suggesting that some probes with “gap-like” distributions are not captured by gap hunting (see also Fig. 7 for explanation). The gap probe distributions for probes with annotated SNPs (green and orange) have a slightly higher area under the curve at higher standard deviation values (especially for the Type II design), which is likely due to the generally higher allele frequencies of the annotated SNPs compared with the measured SNPs (see Additional file 8: Figure S33). Gap probes lacking any probe SNPs form a distinct distribution, especially for the Type II design (purple).
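The six-way grouping in the caption can be written as a small decision rule. The sketch below is illustrative only and is not the authors' code: the `Probe` type, its field names, and the `classify` function are hypothetical, and it assumes that a SEED SNP takes precedence over a dbSNP138 or repeat annotation, as the wording of groups 2, 3, 5 and 6 implies. The group sizes are copied from the caption and checked against the stated total.

```python
from typing import NamedTuple


class Probe(NamedTuple):
    # Hypothetical field names, chosen for this sketch only.
    is_gap: bool          # probe was flagged by gap hunting
    has_seed_snp: bool    # at least one SEED SNP lies in the probe
    has_annotation: bool  # dbSNP138 variant or UCSC-annotated repeat


def classify(probe: Probe) -> int:
    """Return the caption's group number (1-6) for one probe."""
    offset = 3 if probe.is_gap else 0  # groups 4-6 mirror groups 1-3
    if probe.has_seed_snp:
        return offset + 2              # SEED SNP present (assumed to take precedence)
    if probe.has_annotation:
        return offset + 3              # no SEED SNP, but annotated variant or repeat
    return offset + 1                  # none of the three features


# Group sizes as reported in the caption.
GROUP_SIZES = {1: 301_590, 2: 62_005, 3: 99_262, 4: 1_808, 5: 5_453, 6: 3_746}
TOTAL = sum(GROUP_SIZES.values())
```

Summing the six reported group sizes reproduces the 473,864 autosomal probes given in the caption, so the groups partition the full probe set.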
