help in extracting limma results
1
0
Entering edit mode
@echang4lifeuiucedu-788
Last seen 9.6 years ago
Hi Bioconductor users, I am having trouble understanding how multiple-testing adjustment is done in limma (specifically the decideTest). I am really confused about the meaning between moderated F p-value and the adjusted p-value. If I try two different "flavors" of decideTest (e.g. nestedF and global), I can see that the results are different.... but is it the p-value that is adjusted or the F-statistic is adjusted? And how do I extract the gene lists (cutoff of FDR-adjusted p-value< 0.05)? I basically followed the limma user guide (which is superbly written, thank you!) But after completing the analysis, I am stuck trying to extract the gene lists out of R and into a csv file. Can someone please help me out with the code which would extract the genes under "results2"? My code is as follows: > cont.matrix <- makeContrasts( Ad.V_E2 = Ad.E2-Ad.veh, b20.V_E2 = b20.E2-b20.veh, Veh_modul = b20.veh-Ad.veh, E2_intrxn = (b20.E2-b20.veh)-(Ad.E2-Ad.veh), levels=design) >fit2 <- contrasts.fit(fit, cont.matrix) >fit2 <- eBayes(fit2) >results2 <- decideTests(fit2, method="nestedF") >summary(results2) Ad.V_E2 b20.V_E2 Veh_modul E2_intrxn -1 729 699 1108 224 0 20651 20662 20662 21830 1 903 922 513 229 > results2 <- decideTests(fit2, method="global") > summary(results2) Ad.V_E2 b20.V_E2 Veh_modul E2_intrxn -1 501 460 641 101 0 21174 21166 21311 22064 1 608 657 331 118 >write.csv(topTable(fit2, coef="Ad.V_E2", n=22283), "Ad+E2.csv", row.names=FALSE) ***If I do this, won't I end up with the unadjusted p-values?*** I also see other examples like this....how can I export genes in "selected"? > selected<- p.adjust(fit2$F.p.value, method="fdr") < 0.05 Basically, I'm stuck. Any help would be greatly appreciated. I really did read the limma user guide, but if anyone could point sections which would help me, I'd very much appreciate that as well. Thank you, Edmund Chang
limma limma • 5.2k views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 13 minutes ago
United States
echang4 at life.uiuc.edu wrote: > Hi Bioconductor users, > I am having trouble understanding how multiple-testing adjustment is done > in limma (specifically the decideTest). I am really confused about the > meaning between moderated F p-value and the adjusted p-value. > > If I try two different "flavors" of decideTest (e.g. nestedF and global), > I can see that the results are different.... but is it the p-value that is > adjusted or the F-statistic is adjusted? The statistics themselves are never adjusted in decideTests(), only the p-values. The difference is in how the p-values are adjusted. For the 'global' option, all the contrasts are considered to be independent, and the p-values are adjusted as if you just had a bunch of independent t-tests. The nestedF option is a bit more complicated. First, a bit of background. The F-statistic is used to determine if there are any differences between the samples, but it doesn't tell you which sample(s) are different. You have to fit contrasts to find out which sample(s) are different. So the idea with the nestedF is to adjust the p-values associated with the F-test to find which genes are differentially expressed in at least one sample. Now we have a list of genes that are differentially expressed, but we don't know for which sample(s) that may be true. The t-statistics associated with the contrasts are then inspected and the largest one (in absolute value) is considered significant. Now, there may be other contrasts that are significant as well, so the largest t-statistic is set to the same absolute value as the second largest t-statistic, and the F-statistic is calculated again. If the F-statistic is still significant, the second largest contrast is considered significant. This procedure is continued until the F-statistic is no longer significant. The basic reasoning here is that the largest t-statistic for a set of contrasts is significant if the overall F-statistic is significant. By following this step-wise procedure, we can determine which contrasts are contributing to the overall significance of the F-statistic. > And how do I extract the gene lists (cutoff of FDR-adjusted p-value< 0.05)? There are three ways that I know of. First, you can use write.fit(), which is part of limma. Something like: write.fit(fit2, results2, "myoutputdata.txt", adjust="BH") will output all your genes in a text file that you can then open up in e.g., Excel and filter based on the different columns. Note that the p-values associated with the contrasts will be adjusted, but the p-values associated with the F-statistics will not, so this is equivalent to decideTests() with method = "separate". If you are using Affy chips, you could use limma2annaffy() in affycoretools. This will output either HTML or text tables (or both) containing the expression values, statistics, p-values and links to online databases. I don't know your platform, so I won't go into any more detail other than to say that affycoretools is in the devel repository, so you will either need R-2.3.0 if you want to use biocLite() to get it, or will have to download and install by hand if you have an earlier version of R. A third option would be to extract the information you want and output it by 'hand'. This would involve a bit of work on your part. To fully explain what you need to do would take too long, so I will just give some pointers below. > > I basically followed the limma user guide (which is superbly written, > thank you!) But after completing the analysis, I am stuck trying to > extract the gene lists out of R and into a csv file. > > Can someone please help me out with the code which would extract the genes > under "results2"? My code is as follows: > > >>cont.matrix <- makeContrasts( > > Ad.V_E2 = Ad.E2-Ad.veh, > b20.V_E2 = b20.E2-b20.veh, > Veh_modul = b20.veh-Ad.veh, > E2_intrxn = (b20.E2-b20.veh)-(Ad.E2-Ad.veh), levels=design) > >>fit2 <- contrasts.fit(fit, cont.matrix) >>fit2 <- eBayes(fit2) >>results2 <- decideTests(fit2, method="nestedF") >>summary(results2) > > Ad.V_E2 b20.V_E2 Veh_modul E2_intrxn > -1 729 699 1108 224 > 0 20651 20662 20662 21830 > 1 903 922 513 229 > > >>results2 <- decideTests(fit2, method="global") >>summary(results2) > > Ad.V_E2 b20.V_E2 Veh_modul E2_intrxn > -1 501 460 641 101 > 0 21174 21166 21311 22064 > 1 608 657 331 118 > >>write.csv(topTable(fit2, coef="Ad.V_E2", n=22283), "Ad+E2.csv", > > row.names=FALSE) > ***If I do this, won't I end up with the unadjusted p-values?*** No, you will have "BH" adjusted p-values. You can see what the defaults for a given function are by looking at the man page. For example ?topTable will tell you: adjust.method: method used to adjust the p-values for multiple testing. Options, in increasing conservatism, include '"none"', '"BH"', '"BY"' and '"holm"'. See 'p.adjust' for the complete list of options. A 'NULL' value will result in the default adjustment method, which is '"BH"'. > > I also see other examples like this....how can I export genes in > "selected"? > >>selected<- p.adjust(fit2$F.p.value, method="fdr") < 0.05 'selected' is a logical vector with TRUE for any F-statistic with an adjusted p-value less than 0.05 and FALSE otherwise. You can use this vector to subset other parts of your MarrayLM object and build up a data.frame that you could then output. For instance fit2$t[selected,] will give you a matrix containing all the t-statistics for the set of significant genes. You can get the coefficients using fit2$coefficient[selected,], etc. You might want to read 'An Introduction to R', which can be accessed by typing help.start() and looking at the top left of the page that pops up. This should help explain some of the data structures that are used in R, as well as ways to manipulate them. HTH, Jim > > > Basically, I'm stuck. > Any help would be greatly appreciated. I really did read the limma user > guide, but if anyone could point sections which would help me, I'd very > much appreciate that as well. > > Thank you, > Edmund Chang > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor -- James W. MacDonald Affymetrix and cDNA Microarray Core University of Michigan Cancer Center 1500 E. Medical Center Drive 7410 CCGC Ann Arbor MI 48109 734-647-5623
ADD COMMENT

Login before adding your answer.

Traffic: 773 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6