Question: Finding genes in a gene set detected by ROAST (or FRY) in limma
kentfung0 wrote:

I used fry() in the limma package and it gave me the following results:

                                                                NGenes Direction       PValue          FDR  PValue.Mixed     FDR.Mixed
KEGG_APOPTOSIS                                                      81      Down 2.853559e-24 5.307619e-22  3.188755e-62  2.281186e-61
KEGG_CYTOKINE_CYTOKINE_RECEPTOR_INTERACTION                        161      Down 1.037278e-23 9.646683e-22 9.176848e-246 5.689646e-244
KEGG_CHEMOKINE_SIGNALING_PATHWAY                                   150      Down 1.723760e-23 1.068731e-21 5.378129e-200 2.000664e-198
KEGG_NATURAL_KILLER_CELL_MEDIATED_CYTOTOXICITY                     103      Down 6.604034e-22 3.070876e-20 1.150026e-118 2.139049e-117
KEGG_EPITHELIAL_CELL_SIGNALING_IN_HELICOBACTER_PYLORI_INFECTION     61      Down 3.996045e-20 1.486529e-18  2.675739e-18  7.899801e-18
KEGG_PHOSPHATIDYLINOSITOL_SIGNALING_SYSTEM                          67      Down 1.318473e-19 4.087265e-18  6.360791e-58  4.381878e-57
KEGG_OLFACTORY_TRANSDUCTION                                         42      Down 5.682502e-17 1.509922e-15  2.971345e-15  8.373791e-15
KEGG_NON_HOMOLOGOUS_END_JOINING                                     13        Up 3.624166e-16 8.426186e-15  4.728688e-01  6.925480e-01
KEGG_RNA_DEGRADATION                                                56        Up 3.536264e-14 7.135116e-13  1.445358e-18  4.336074e-18
KEGG_JAK_STAT_SIGNALING_PATHWAY                                    102      Down 3.883424e-14 7.135116e-13  6.883971e-95  9.849374e-94
KEGG_LEUKOCYTE_TRANSENDOTHELIAL_MIGRATION                           85      Down 4.337482e-14 7.135116e-13  2.992323e-87  3.478576e-86



And so on and so forth. My questions are how I can get the genes involved in the gene sets and how I can interpret the p-value. I want to validate the gene by qPCR but I couldn't find which ones are meaningful in the gene sets. And also there are a lot of p < 0.05 which kind of indicates that it is overly sensitive. Is there an adjusted p-value, or would it make sense to do my own p-value adjustment according to the number of gene sets tested?

modified 3 months ago by Gordon Smyth38k • written 3 months ago by kentfung0
Answer: Finding genes in a gene set detected by ROAST (or FRY) in limma
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth wrote:

how I can get the genes involved in the gene sets

You have input the gene sets into fry yourself, so you must already know which genes are in which gene set. A gene set is just a vector of gene ids.
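To make the "vector of gene ids" point concrete, here is a minimal base R sketch. The gene ids and the two sets are made up for illustration, and the conversion step only mimics what limma's ids2indices() is typically used for; your own index object may have been built differently.

```r
# Hypothetical toy data: six gene ids (the rows of the expression matrix)
# and two made-up gene sets.
gene_ids <- c("CASP3", "BAX", "BCL2", "IL6", "CXCL8", "TNF")
sets <- list(
  SET_A = c("CASP3", "BAX", "BCL2", "FAS"),  # FAS is not among the rows
  SET_B = c("IL6", "CXCL8", "TNF")
)

# Convert gene ids to row positions, as one would before calling fry().
index <- lapply(sets, function(ids) which(gene_ids %in% ids))

# Recover the gene ids of one set from its row positions.
genes_in_set_a <- gene_ids[index[["SET_A"]]]
print(genes_in_set_a)
```

If index already holds gene ids rather than row positions, index[["SET_A"]] is itself the answer.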

how I can interpret the p-value.

The help page for fry (which you can see by typing ?fry or help(fry)) explains what is being tested. If you want even more detail, consult the Wu et al (2010) journal article.

I want to validate the gene by qPCR but I couldn't find which ones are meaningful in the gene sets.

If you want to know which individual genes are DE, then just do a topTable for the genes in the gene set. For example, if index is the index you are inputting to fry and fit is the limma linear model fit object, then you can use something like:

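# Genes (row indices or gene ids) belonging to the set of interest
i <- index[["KEGG_APOPTOSIS"]]
# Rank only those genes by evidence for differential expression
topTable( fit[i,] )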
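For a fuller picture, the same idea can be sketched end to end on simulated data. Everything here (the matrix y, the group labels, the set TOY_SET and the size of the shift) is invented for illustration; the only assumption about the real analysis is that fit has been through eBayes(). Note that number = Inf lists every gene in the set, and that the adjusted p-values are then computed within the subset only.

```r
library(limma)

set.seed(1)
# Simulated log-expression: 100 genes x 6 samples, two groups of three (toy data).
y <- matrix(rnorm(100 * 6), nrow = 100,
            dimnames = list(paste0("gene", 1:100), paste0("s", 1:6)))
group <- factor(rep(c("ctrl", "trt"), each = 3))
design <- model.matrix(~ group)

# Hypothetical gene set: the first 10 rows, shifted down in the treated group.
index <- list(TOY_SET = 1:10)
y[index$TOY_SET, 4:6] <- y[index$TOY_SET, 4:6] - 2

fit <- eBayes(lmFit(y, design))

# All genes in the set, ranked by p-value for the treatment coefficient.
i <- index[["TOY_SET"]]
tab <- topTable(fit[i, ], coef = "grouptrt", number = Inf)
print(tab)
```

The genes at the top of tab, with the largest consistent log-fold-changes, would be the natural candidates for qPCR follow-up.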


there are a lot of p < 0.05 which kind of indicates that it is overly sensitive.

fry is an approximation to roast, which has excellent error rate control (Wu et al 2010). If your experiment has lots of DE genes, then it is quite normal to have lots of DE sets as well.

However, the limma authors themselves use camera rather than fry when testing large numbers of MSigDB sets, as can be seen in the limma RNA-seq 1-2-3 workflow, because it tests a competitive hypothesis that is of more interest in that context. Type ?camera to get a brief summary of how the test is different or read Wu and Smyth (2012) for complete details.
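As a rough sketch of how the two tests are called, the following runs camera and fry on the same simulated data. The data, the set names SET_DOWN and SET_NULL, and the effect size are all invented; with real data, y would be the expression object and index the list of gene sets already passed to fry.

```r
library(limma)

set.seed(2)
# Toy log-expression matrix: 200 genes x 6 samples, two groups of three.
y <- matrix(rnorm(200 * 6), nrow = 200,
            dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))
group <- factor(rep(c("ctrl", "trt"), each = 3))
design <- model.matrix(~ group)

# Two hypothetical sets; only SET_DOWN is shifted in the treated group.
index <- list(SET_DOWN = 1:20, SET_NULL = 21:40)
y[index$SET_DOWN, 4:6] <- y[index$SET_DOWN, 4:6] - 1.5

# Competitive test (camera) and self-contained test (fry) on the same sets.
res_camera <- camera(y, index, design, contrast = 2)
res_fry <- fry(y, index, design, contrast = 2)
print(res_camera)
print(res_fry)
```

camera asks whether the genes in a set are more strongly DE than the genes outside it, so it tends to flag fewer sets than fry when a large fraction of all genes is DE.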

Is there an adjusted p-value

That's what the FDR column is.
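The FDR values in the table above are consistent with a Benjamini-Hochberg adjustment of the PValue column. A minimal check in base R, using the three smallest p-values from the posted output: the number of sets tested (186) is not stated in the thread and is inferred here from the ratio of FDR to PValue in the first row, so treat it as an assumption.

```r
# Three smallest p-values from the posted fry output (rows 1 to 3).
p <- c(2.853559e-24, 1.037278e-23, 1.723760e-23)

# Assumed total number of gene sets tested, inferred from FDR / PValue in row 1.
n_sets <- 186

# Benjamini-Hochberg adjustment as if these were the top 3 of n_sets p-values.
fdr <- p.adjust(p, method = "BH", n = n_sets)

# FDR values reported in the posted table, for comparison.
reported <- c(5.307619e-22, 9.646683e-22, 1.068731e-21)
print(cbind(fdr, reported))
```

The two columns agree to the precision shown in the table, so there is no need for a separate manual adjustment for the number of sets.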

Hi Gordon,

Thank you for suggesting the use of camera. I am quite new to the field and saw a review saying that self-contained gene set analysis is more specific and sensitive, so I just followed without thinking much. I will have a look at the paper and vignette again.

Thanks a lot for your help.

P.S. Sorry, I missed part of your answers at first. I have deleted the follow-up questions. Many thanks for your help.
