Javier Pérez Florido
Dear list,
I'm running a hypergeometric test with hyperGTest from the GOstats and
Category packages, and I have several questions:
* What cutoff value is usually passed to the hypergeometric test for
each gene set collection: GO BP, GO MF, GO CC, Chromosome Bands,
KEGG and PFAM?
* In the nonspecific filtering, I suppose the filter should differ
depending on the gene set collection used. For example, using the
nsFilter function:
o For GO BP:
    nsFilter(OligoEset, require.entrez=TRUE, require.GOBP=TRUE,
             remove.dupEntrez=TRUE, var.func=IQR, var.cutoff=varCutoff,
             filterByQuantile=TRUE, feature.exclude="^AFFX")
o For GO MF:
    nsFilter(OligoEset, require.entrez=TRUE, require.GOMF=TRUE,
             remove.dupEntrez=TRUE, var.func=IQR, var.cutoff=varCutoff,
             filterByQuantile=TRUE, feature.exclude="^AFFX")
o For GO CC:
    nsFilter(OligoEset, require.entrez=TRUE, require.GOCC=TRUE,
             remove.dupEntrez=TRUE, var.func=IQR, var.cutoff=varCutoff,
             filterByQuantile=TRUE, feature.exclude="^AFFX")
o For Chromosome Bands:
    nsFilter(OligoEset, require.entrez=TRUE, require.CytoBand=TRUE,
             remove.dupEntrez=TRUE, var.func=IQR, var.cutoff=varCutoff,
             filterByQuantile=TRUE, feature.exclude="^AFFX")
o For KEGG:
    nsFilter(OligoEset, require.entrez=TRUE,
             remove.dupEntrez=TRUE, var.func=IQR, var.cutoff=varCutoff,
             filterByQuantile=TRUE, feature.exclude="^AFFX")
Therefore, the filter changes depending on the gene set collection.
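The calls above differ only in which require.* flag is set, so they could be generated from one helper. This is only a sketch: filter_args is a hypothetical name, the default var_cutoff of 0.5 is a placeholder, and it assumes that KEGG (and any other collection without a matching require.* flag) needs only the Entrez requirement.

```r
# Hypothetical helper: build the nsFilter argument list for one collection.
# Only the require.* flag changes; everything else is taken from the calls above.
filter_args <- function(collection, var_cutoff = 0.5) {
  # Collections that have a matching require.* argument in the calls above
  require_flags <- c(GOBP     = "require.GOBP",
                     GOMF     = "require.GOMF",
                     GOCC     = "require.GOCC",
                     CytoBand = "require.CytoBand")
  args <- list(require.entrez   = TRUE,
               remove.dupEntrez = TRUE,
               var.func         = IQR,
               var.cutoff       = var_cutoff,
               filterByQuantile = TRUE,
               feature.exclude  = "^AFFX")
  # Assumption: collections not listed (e.g. KEGG) get no extra flag
  if (collection %in% names(require_flags)) {
    args[[require_flags[[collection]]]] <- TRUE
  }
  args
}

# Intended use (needs genefilter and an ExpressionSet, so left as a comment):
# filtered <- do.call(nsFilter, c(list(OligoEset), filter_args("GOBP", varCutoff)))
```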
* Once the hypergeometric test is done, I don't understand some of
the fields of the HyperGResult object. What I understood is:
o ExpCount: for each category term tested, the expected number of
genes from the selected gene list annotated at the term.
o Count: for each category term tested, the number of genes
from the interesting gene list that are annotated at the
term.
o Size: for each category term tested, the number of genes
from the universe gene list that are annotated at the
term.
o OddsRatio: the odds ratio for each category term tested.
If the test is done for over-represented terms, Count is greater
than ExpCount. Otherwise, the test has been performed for
under-represented terms. I don't understand the meaning of
ExpCount. Expected by whom? Should a large difference between
ExpCount and Count be expected? Is there a relationship between
ExpCount, Count and the p-values? I would like to better understand
the HyperGResult object in terms of these fields: ExpCount, Count,
Size and OddsRatio.
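To make the question concrete, here is a minimal base R sketch of how I assume these fields relate for a single term. The counts are made up. The formulas (ExpCount as the mean of the hypergeometric distribution, the p-values from phyper, the odds ratio from the 2x2 table) are an assumption about what hyperGTest reports, not something confirmed from the package.

```r
# Hypothetical counts for one category term (not real data)
universe_size <- 1000  # genes in the universe after filtering
selected_size <- 100   # genes in the interesting (selected) list
size          <- 50    # Size: universe genes annotated at the term
count         <- 12    # Count: selected genes annotated at the term

# Assumed ExpCount: mean of the hypergeometric distribution, i.e. the
# Count expected if the selected genes were drawn at random
exp_count <- selected_size * size / universe_size

# Over-representation p-value: P(X >= count)
p_over <- phyper(count - 1, size, universe_size - size, selected_size,
                 lower.tail = FALSE)

# Under-representation p-value: P(X <= count)
p_under <- phyper(count, size, universe_size - size, selected_size,
                  lower.tail = TRUE)

# Assumed odds ratio from the 2x2 table (selected or not, at term or not)
n12 <- selected_size - count                 # selected, not at the term
n21 <- size - count                          # not selected, at the term
n22 <- universe_size - selected_size - n21   # neither
odds_ratio <- (count * n22) / (n12 * n21)
```

With these numbers ExpCount is 5, so a Count of 12 is above expectation and the over-representation p-value is smaller than the under-representation one.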
Thanks in advance,
Javier