Question: Background for test in SPIA package.
Hi All,

I've been investigating the SPIA package as a more satisfactory way of assessing whether a pathway is perturbed in a set of RNA-seq experiments than simple over-representation analysis on a set of KEGG pathways. One thing I've noticed is that the choice of background for the test makes an enormous difference.

I first noticed this when converting my Ensembl IDs to Entrez IDs inadvertently led to me including non-coding RNAs in the background list passed to the all parameter of spia. Removing these (about 5,000 IDs) from the background reduced the number of differential pathways from 51 to 6. Removing genes that we didn't test for differential expression because they were too lowly expressed reduced it even further (and, annoyingly, removed any interesting pathways from the results).

I further realised that, really, the background set for the over-representation part of the test should only include those genes with a KEGG annotation (only about 5,000 in the case of humans). I'd have thought that SPIA would do this automatically, since it has access to the list of genes that are in any pathway, but from poking around in the code, it doesn't seem to.

I could restrict the background passed to the all parameter to only include genes with KEGG annotations, but would this be the correct thing to do for the IF calculation?

Ian
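To make the background effect concrete, here is a minimal sketch of the over-representation arithmetic only, written as a plain hypergeometric upper-tail test in Python. It is not SPIA's code and does not touch the perturbation part of the test. The function name `hypergeom_sf` and every count below (pathway size, number of DE genes, background sizes) are made-up values for illustration, not numbers from my data.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from a background of N genes,
    K of which belong to the pathway (upper-tail hypergeometric test)."""
    upper = min(K, n)
    # Sum the exact integer numerators, then divide once by the total
    # number of ways to draw n genes from the background.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical pathway: 100 genes, 20 of which are differentially expressed.
pathway_size = 100
de_in_pathway = 20

# Wide background: everything with an ID, including unannotated genes.
# Assumed: 20,000 background genes, 500 DE genes in total.
p_wide = hypergeom_sf(de_in_pathway, 20_000, pathway_size, 500)

# Narrow background: only genes with a pathway annotation.
# Assumed: 5,000 annotated genes, of which 200 are DE
# (DE genes outside the background have to be dropped as well).
p_narrow = hypergeom_sf(de_in_pathway, 5_000, pathway_size, 200)

print(f"wide background:   p = {p_wide:.3g}")
print(f"narrow background: p = {p_narrow:.3g}")
```

With these invented counts, the same 20 DE genes in the pathway look far more surprising against the wide background than against the narrow one, which is the kind of shift I'm seeing in the number of pathways called.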
