I have a question about an R package: “clusterProfiler”. There are two methods I’m using:
gseKEGG (or gseGO, etc.)
enrichKEGG (or enrichGO, etc.)
I can’t find an answer to what the exact input for these methods should be:
For gseKEGG, I need a gene list with FC values, but does this gene list contain only DE genes? Or does it contain all genes, DE and not DE (after filtering out lowly expressed genes, of course)?
For enrichKEGG, I believe I only need the gene IDs of the DE genes, right?
In general, you can regard the following as being true:
for gseKEGG(), the input is a named numeric vector of fold changes
(the names are the gene IDs), sorted in decreasing order. It can contain
statistically significant genes, non-statistically significant genes,
or both; typically you would pass all genes that survived filtering.
Those that are not statistically significant will usually have smaller
fold changes anyway, and this will be taken into account [via ranking]
when performing the enrichment.
for enrichKEGG() / enrichGO(), yes, these just take a vector of gene IDs; therefore, the
assumption is that these are already genes of particular interest,
i.e., genes that you have found to be statistically significantly
differentially expressed in your study.
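As a rough sketch of how the two inputs described above could be built from one results table: the data frame `res`, its column names (`entrez`, `log2FC`, `padj`), the gene IDs and values, and the cutoffs are all made up for illustration, and Entrez IDs with organism "hsa" are an assumption about your data. The clusterProfiler calls are left commented out because they need the package and an internet connection.

```r
# Hypothetical DE results after filtering out lowly expressed genes.
# Column names and values are invented for this example.
res <- data.frame(
  entrez = c("7157", "1956", "672", "5290", "2064", "3845"),
  log2FC = c(2.1, -1.8, 0.2, -0.1, 1.4, -0.3),
  padj   = c(0.001, 0.004, 0.60, 0.85, 0.02, 0.40),
  stringsAsFactors = FALSE
)

# Input for gseKEGG() / gseGO(): ALL genes, as a named numeric vector
# (names = gene IDs), sorted from highest to lowest fold change.
gene_list <- setNames(res$log2FC, res$entrez)
gene_list <- sort(gene_list, decreasing = TRUE)

# Input for enrichKEGG() / enrichGO(): only the IDs of the DE genes.
# The cutoffs (padj < 0.05, |log2FC| > 1) are example values, not a rule.
de_genes <- res$entrez[res$padj < 0.05 & abs(res$log2FC) > 1]

# With clusterProfiler installed, the calls would look roughly like this:
# library(clusterProfiler)
# gsea_res <- gseKEGG(geneList = gene_list, organism = "hsa")
# ora_res  <- enrichKEGG(gene = de_genes, organism = "hsa")
```

With a real dataset, `gene_list` would hold thousands of genes; the six used here are only enough to show the shape of the two inputs.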
Hi Kevin, is the input for gseKEGG the same as the input for gseGO?
I appreciate any help
Yes, these are the same. See ?gseGO.