Question: Enrichment of GWAS SNPs in regulatory regions
enricoferrero570
I'm trying to perform an analysis similar to those in ENCODE or FANTOM publications where enrichment of GWAS SNPs in regulatory regions (e.g.: DHSs, CAGE-defined enhancers) is calculated [1,2].

So, I would like to calculate whether a set of GWAS SNPs associated with a disease of interest is enriched in my set of regulatory regions compared to a background distribution of SNPs (i.e.: the 1000 Genomes data).

How am I supposed to set up my contingency table for Fisher's exact test?

My guess would be something like:

Number of GWAS SNPs in regulatory regions | Number of 1000 Genomes SNPs in regulatory regions
Total number of GWAS SNPs | Total number of 1000 Genomes SNPs

And then simply use the fisher.test() function on the matrix.

There's also the fact that the GWAS SNPs are a subset of the 1000 Genomes SNPs: should I subtract them from the superset before performing the test?
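
To make the question concrete, here is a minimal sketch of one possible setup in R, not a settled answer. All counts are made up for illustration. It rests on two assumptions that differ from the guess above: the second row holds SNPs outside regulatory regions rather than totals (Fisher's exact test expects each SNP to be counted in exactly one cell), and the GWAS SNPs are subtracted from the 1000 Genomes background so the two columns do not overlap.

```r
# Hypothetical counts, for illustration only (not real data).
gwas_total  <- 500       # all GWAS SNPs for the disease of interest
gwas_in_reg <- 60        # GWAS SNPs overlapping regulatory regions
bg_total    <- 1000000   # all 1000 Genomes SNPs (background)
bg_in_reg   <- 50000     # background SNPs overlapping regulatory regions

# Assumption: every GWAS SNP is also in the background set, so remove
# the GWAS SNPs from the background to keep the two columns disjoint.
bg_only_total  <- bg_total - gwas_total
bg_only_in_reg <- bg_in_reg - gwas_in_reg

# 2x2 table of mutually exclusive counts. matrix() fills column by column,
# so the first two values form the GWAS column.
tab <- matrix(
  c(gwas_in_reg,    gwas_total - gwas_in_reg,
    bg_only_in_reg, bg_only_total - bg_only_in_reg),
  nrow = 2,
  dimnames = list(c("in_regulatory", "not_in_regulatory"),
                  c("GWAS", "background"))
)

# One-sided test: are GWAS SNPs over-represented in regulatory regions?
res <- fisher.test(tab, alternative = "greater")
res$estimate  # conditional maximum likelihood estimate of the odds ratio
res$p.value
```

With this layout the four cells sum to the original number of 1000 Genomes SNPs, which is a quick sanity check that no SNP has been counted twice.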

