edgeR: How is FDR calculated for logFC? (False Discovery Rate)
jol.espinoz ▴ 40

I'm having difficulty conceptualizing how the significance of a logFC is calculated by exactTest(), and how the Benjamini-Hochberg adjusted FDR is then calculated by topTags().

Let's say you had 100 genes with 5 control replicates and 5 experimental replicates. How are the p-values for the logFC calculated, and how are they then adjusted using the Benjamini-Hochberg method? For example, consider gene_1: what if the control counts for gene_1 were u = [5, 6, 5, 7, 5] and the experimental counts for gene_1 were v = [9, 10, 11, 10, 8]? Does the algorithm just do a t-test or a Wilcoxon test between log2(u) and log2(v) to get the p-values? If so, wouldn't you need quite a few samples to get a reliable p-value?

Is this how the p-value is calculated? If so, how is the FDR calculated from it? Sorry for all the questions, I'm just having trouble visualizing it. I just ran edgeR on a dataset with 3 control replicates and 3 experimental replicates. I'm not sure how the p-values and the FDR values are calculated, and I'm trying to avoid blackboxing (i.e. using algorithms I don't understand conceptually).

There's no need to 'blackbox' anything. There is a user's guide that explains all, along with references if you really want to get down to the nitty gritty.
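On the FDR part specifically: topTags() adjusts the per-gene p-values with the Benjamini-Hochberg procedure by default (adjust.method="BH", which is what R's p.adjust() implements). Here is a minimal Python sketch of that adjustment, just to make the arithmetic visible. The function name `bh_adjust` and the example p-values are made up for illustration and are not part of edgeR.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Each p-value is multiplied by m / rank (m = number of tests, rank = its
    position when sorted ascending). A running minimum taken from the largest
    p-value downwards keeps the adjusted values monotone and capped at 1.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes (illustrative only).
raw = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(raw))
```

So the FDR column is not a second test. It is a rescaling of the same p-values that accounts for how many genes were tested: the more genes, the larger the multiplier on the small p-values.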

Thanks for that! So does it fit a negative binomial distribution to the controls and then find the likelihood of the experimental samples under that model?
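Not quite, as far as I understand it. The exact test is described as analogous to Fisher's exact test but adapted for overdispersed counts: under the null hypothesis both groups share one negative binomial mean, and the test conditions on the total count for the gene and asks how extreme the observed split between the two groups is. Below is a rough Python sketch of that idea for a single gene. It assumes equal library sizes and a fixed dispersion that I simply picked (0.1). Real edgeR estimates the dispersion by sharing information across genes, which is why it can work with few replicates, and it also adjusts for library size. The names `nb_log_pmf` and `nb_exact_test` are hypothetical, not edgeR functions.

```python
from math import exp, lgamma, log


def nb_log_pmf(k, mean, dispersion):
    """Log-probability of count k under a negative binomial with
    variance = mean + dispersion * mean**2."""
    size = 1.0 / dispersion
    return (lgamma(k + size) - lgamma(size) - lgamma(k + 1)
            + size * log(size / (size + mean))
            + k * log(mean / (size + mean)))


def nb_exact_test(group1, group2, dispersion):
    """Two-sided p-value for one gene, conditioning on the total count.

    Assumes equal library sizes and a known dispersion (both simplifications).
    The sum of n iid NB(mean, dispersion) counts is NB(n * mean, dispersion / n).
    """
    n1, n2 = len(group1), len(group2)
    s1 = sum(group1)
    total = s1 + sum(group2)
    if total == 0:
        return 1.0  # no counts at all, so no evidence either way
    mu = total / (n1 + n2)  # common per-sample mean under the null
    # Unnormalised log-probability of every possible split of the total.
    log_w = [nb_log_pmf(a, n1 * mu, dispersion / n1)
             + nb_log_pmf(total - a, n2 * mu, dispersion / n2)
             for a in range(total + 1)]
    top = max(log_w)
    w = [exp(x - top) for x in log_w]  # subtract the max for numerical stability
    norm = sum(w)
    lower = sum(w[:s1 + 1]) / norm  # P(group-1 sum <= observed | total)
    upper = sum(w[s1:]) / norm      # P(group-1 sum >= observed | total)
    return min(1.0, 2.0 * min(lower, upper))  # double the smaller tail


u = [5, 6, 5, 7, 5]      # control counts from the question
v = [9, 10, 11, 10, 8]   # experimental counts from the question
print(nb_exact_test(u, v, dispersion=0.1))  # 0.1 is an assumed value
```

Note that no t-test or Wilcoxon test on log2 values is involved. The p-value comes from the count model itself, and a larger dispersion gives a larger p-value for the same counts, so the dispersion estimate matters a lot.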