Question: edgeR: PValue, adjusted PValue, FDR
9 months ago, zhxiaokang wrote:

I'm using edgeR to do a differential expression analysis. Here's part of my code:

fit <- glmFit(y, design)

# conduct likelihood ratio tests for tumour vs normal tissue differences and show the top genes
lrt <- glmLRT(fit)

# the DEA result for all the genes
dea <- lrt$table

# differentially expressed genes
toptag <- topTags(lrt, n = length(geneList), p.value = 0.05)
deg <- toptag$table

I got a 'PValue' column in 'dea', and I'm wondering whether it is a raw p-value or an adjusted p-value. I then passed 'lrt' to 'topTags' to extract the differentially expressed genes with the p.value cutoff set to 0.05, and I'm wondering whether that cutoff is applied to the 'PValue' in 'dea'. However, 'deg' (from 'toptag') also has an 'FDR' column, and every gene in it has FDR < 0.05, whereas not all genes in 'dea' with PValue < 0.05 are listed in 'deg'.

That's what I observed in the results. My current understanding is: the PValue in 'dea' is a raw, unadjusted p-value. The function topTags adjusts those p-values with a method such as 'BH', and then returns the differentially expressed genes whose FDR is below the threshold you set (although the argument is named 'p.value' rather than 'FDR'). So the FDR here is the same thing as the adjusted p-value. Is my understanding right?
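The difference between filtering on raw and on adjusted p-values can be illustrated with base R alone. This is only a sketch: the ten p-values below are made up for illustration and are not output from edgeR or from this dataset; `p.adjust` with `method = "BH"` is the standard Benjamini-Hochberg adjustment from the stats package.

```r
# Hypothetical raw p-values for ten genes (invented for illustration only).
pvalue <- c(0.0001, 0.0004, 0.003, 0.025, 0.03, 0.04, 0.045, 0.2, 0.5, 0.9)

# Benjamini-Hochberg adjustment of the raw p-values.
fdr <- p.adjust(pvalue, method = "BH")

res <- data.frame(PValue = pvalue, FDR = fdr)

# Count genes passing a 0.05 cutoff on the raw and on the adjusted values.
n_raw <- sum(res$PValue < 0.05)
n_adj <- sum(res$FDR < 0.05)

print(res)
cat("raw p < 0.05:", n_raw, " BH-adjusted < 0.05:", n_adj, "\n")
```

Because a BH-adjusted value is never smaller than the raw p-value it comes from, a cutoff on the adjusted values can only keep the same genes or fewer, which matches the pattern described above.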

8 months ago, James W. MacDonald (United States) wrote:

You don't need to try to infer what the output from a function is. You can simply read the help page. For ?glmLRT, under the 'Value' section, which lists the output, I get

  PValue: p-values.

And under the Value section for "topTags", I get


     an object of class 'TopTags' containing the following elements for
     the top 'n' most differentially expressed tags as determined by

   table: a data frame containing the elements 'logFC', the
          log-abundance ratio, i.e. fold change, for each tag in the
          two groups being compared, 'logCPM', the log-average
          concentration/abundance for each tag in the two groups being
          compared, 'PValue', exact p-value for differential expression
          using the NB model. When 'adjust.method' is not '"none"',
          there is an extra column of 'FDR' showing the adjusted
          p-value if 'adjust.method' is one of the '"BH"', '"BY"' and
          '"fdr"', or an extra column of 'FWER' if 'adjust.method' is
          one of the '"holm"', '"hochberg"', '"hommel"', and

Which is, I believe, pretty self-explanatory. You might argue that some or all of that is not actually clear, in which case you could present your argument and say why you think it isn't.
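What the help page says can also be checked directly on a fitted object. The following is a minimal sketch, assuming edgeR is installed and that `lrt` is the result of `glmLRT()`; the helper name `check_topTags_fdr` and its return fields are hypothetical and not part of edgeR.

```r
# Sketch only: check_topTags_fdr() is a hypothetical helper, not an edgeR function.
# It assumes `lrt` was produced by edgeR::glmLRT().
check_topTags_fdr <- function(lrt, alpha = 0.05) {
  # All genes in their original order, with BH-adjusted p-values in the FDR column.
  all_genes <- edgeR::topTags(lrt, n = Inf, adjust.method = "BH",
                              sort.by = "none")$table

  # Only the genes that pass the p.value cutoff given to topTags().
  kept <- edgeR::topTags(lrt, n = Inf, adjust.method = "BH",
                         p.value = alpha)$table

  list(
    # TRUE if the FDR column equals p.adjust() applied to the raw PValue column.
    fdr_is_bh = isTRUE(all.equal(all_genes$FDR,
                                 p.adjust(lrt$table$PValue, method = "BH"))),
    # TRUE if the cutoff selected exactly the genes with adjusted value <= alpha.
    cutoff_on_fdr = nrow(kept) == sum(all_genes$FDR <= alpha),
    # Number of genes below the cutoff on the raw and on the adjusted scale.
    n_raw_below = sum(lrt$table$PValue <= alpha),
    n_fdr_below = nrow(kept)
  )
}
```

Calling `check_topTags_fdr(lrt)` on the object from the question would be expected to report `fdr_is_bh` and `cutoff_on_fdr` as TRUE if the 'p.value' argument is a cutoff on the adjusted values, as the help page describes.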
