I am trying to better understand the propTrueNull function in R.
My question: if I have a set of p-values and I just want to check how many of them pass the false positive threshold, would it be correct to interpret the results as follows?
> p <- scan("pvalues.txt")
> library(limma)
> propTrueNull(p, method = "lfdr")
[1] 0.02203173
All the p-values in my vector p that are < 0.022 are significant (they pass the false positive threshold), and the rest are rejected because they may be false positives.
You've misunderstood the purpose of propTrueNull. propTrueNull does not give you information about specific hypothesis tests. It only estimates the overall proportion of tests whose null hypotheses are true. It does not give you a significance threshold. In your case, propTrueNull is estimating that 2.2% of your genes are not differentially expressed, and 97.8% of them are differentially expressed. I would generally be very suspicious of such a high proportion of differentially expressed genes, unless my experimental system was such that I expected the entire genome to be differentially expressed. It is likely that you have misspecified your model or your contrasts.
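To make the distinction concrete, here is a rough base-R illustration of the *kind* of quantity propTrueNull estimates: the overall fraction of tests whose null hypothesis is true, not a per-test cutoff. The data below are simulated (80% true nulls by construction, not your data), and the estimator shown is a simple "mean"-style analogue, not limma's actual lfdr method.

```r
# Simulate 1000 p-values: 800 true nulls (uniform) and 200 "signal"
# tests (p-values concentrated near zero).
set.seed(42)
p <- c(runif(800), rbeta(200, 0.3, 8))

# Under the null, p-values are uniform, so the region above 0.5 is
# populated almost entirely by true nulls; rescaling its frequency
# gives a crude estimate of the true-null proportion (pi0).
pi0_hat <- mean(p > 0.5) / 0.5
pi0_hat  # close to the true value of 0.8
```

Note that pi0_hat is a single number summarizing the whole collection of tests; it says nothing about which individual p-values are significant.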
If you want to compute false discovery rates for your list of p-values, you should look at p.adjust; or better yet, if you are using limma, use the topTable function, which calls p.adjust for you. Also, please make sure you understand the difference in interpretation between a false discovery rate for a list of p-values and a local false discovery rate.
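As a quick sketch of the p.adjust approach (the p-values below are toy values for illustration; substitute your own vector):

```r
# Toy p-values, sorted for readability (not required by p.adjust).
p <- c(0.0001, 0.0005, 0.008, 0.041, 0.27, 0.64, 0.91)

# Benjamini-Hochberg adjusted p-values; p.adjust is in base R's
# stats package, so no extra installation is needed.
q <- p.adjust(p, method = "BH")

# The subset of original p-values passing a 5% false discovery rate.
p[q < 0.05]
```

Unlike propTrueNull, this gives you a per-test decision: each adjusted value can be compared against your chosen FDR.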
Thank you very much, Ryan. I have this list of p-values, and all I want to do is get the subset that are not false positives after multiple-testing correction. This is not a differential expression analysis; I am trying to detect the presence of a motif in individual sequences from a set. I have 213 p-values to test. I tried p.adjust with method "fdr" and it gave me adjusted p-values that were all < 0.05, though about 8 of them at the bottom of my sorted p-value list were in the 0.045-0.049 range. Then I tried lfdr and it gave me 0.022. Is there a way to detect which 2.2% of my p-value list are not significant? Please let me know. Thanks a ton for your response.
To reiterate what I already said, propTrueNull is not telling you anything about significance. It is not telling you that 97.8% of your p-values are significant. If you want to select a subset of p-values with a specified false discovery rate, use p.adjust.
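Concretely, selecting at a chosen FDR looks like this (toy values, not your 213 p-values). Note that p.adjust itself takes no cutoff argument; you pick the rate by thresholding its output:

```r
# Toy p-values standing in for a real vector.
p <- c(0.001, 0.004, 0.012, 0.03, 0.2, 0.55)

# The false discovery rate you are willing to tolerate.
fdr <- 0.10

# Indices of the tests declared significant at that FDR.
keep <- which(p.adjust(p, method = "BH") <= fdr)
keep
```

Changing `fdr` to 0.05 or 0.01 tightens the selection without re-running anything else.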
Thanks for the clarification, Ryan. Appreciate it. I tried this:
> p <- scan("pvalues.txt")
> write.csv(p.adjust(p, method = "BH"), "Q")
This returns 213 adjusted p-values that are all < 0.05. I am still learning, so thanks for your input and patience. Is there a way to pass a specific FDR to the p.adjust function? I was unable to find such an option in the manual.
Please let me know.