I'm running a differential expression pipeline that first calculates DE with DESeq2 and then looks for enrichment in the DE genes with GOseq. This pipeline seems fairly common in the literature.
Although it is well represented in the literature, I'm wondering whether this approach is redundant. GOseq is supposed to prevent biases introduced by the greater power to detect DE in longer transcripts. However, since the variance stabilization approach of DESeq2 should already have effectively compensated for any gene length effects, it seems like the subsequent use of GOseq could result in more false negatives by effectively subjecting longer genes to a second round of increased scrutiny (and possible rejection).
My feeling now is that if a gene makes it through DESeq2 and shows up as DE, then it should remain in the GO enrichment analysis with no additional weighting based on length.
EDIT: In my initial wording, I made some very poor word choices and have tried to amend them. My initial question was poorly formed, and I am hoping that it is now more to the point of what I am asking.
Thanks for the response. My language was very imprecise. I meant to ask whether false negatives (in the GO enrichment results) could be introduced by unnecessarily devaluing longer genes?
As I understand the paper, GOseq assigns weights based on transcript length:
"The PWF quantifies how the probability of a gene selected as DE changes as a function of its transcript length."
But if the DE gene set of interest has already been exposed to a prior round of scrutiny using DESeq2, the rlog transformation should have already minimized any bias due to length differences.
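To make the PWF quoted above concrete, here is a minimal sketch in Python. It is not GOseq's implementation: as I understand it, GOseq fits a smooth monotonic curve, whereas this sketch simply computes the fraction of DE genes within equal-sized length bins. The function name `binned_pwf` and the toy data are hypothetical and only for illustration.

```python
from statistics import median

def binned_pwf(lengths, is_de, n_bins=4):
    """Crude stand-in for a PWF: fraction of DE genes per length bin.

    lengths : transcript lengths, one per gene
    is_de   : 0/1 flags marking genes called DE
    Returns a list of (median length in bin, fraction DE in bin).
    """
    if n_bins < 1 or n_bins > len(lengths):
        raise ValueError("n_bins must be between 1 and the number of genes")
    # Gene indices ordered from shortest to longest transcript.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    bin_size = len(order) // n_bins
    pwf = []
    for b in range(n_bins):
        start = b * bin_size
        # The last bin absorbs any leftover genes.
        end = len(order) if b == n_bins - 1 else start + bin_size
        idx = order[start:end]
        frac_de = sum(is_de[i] for i in idx) / len(idx)
        pwf.append((median(lengths[i] for i in idx), frac_de))
    return pwf

# Made-up toy data: longer genes are flagged DE more often.
toy_lengths = [500, 800, 1200, 1500, 2500, 3000, 5000, 8000]
toy_is_de = [0, 0, 0, 1, 1, 0, 1, 1]
for med_len, frac in binned_pwf(toy_lengths, toy_is_de, n_bins=4):
    print(med_len, frac)
```

If the DE fraction rises across bins, as in this toy input, the DE list is enriched for long genes, which is the trend the PWF is meant to capture.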
Gordon and I agree: there is a difference in statistical power across genes. DESeq2's variance stabilizing transformations cannot change the fact that genes with higher counts will have more power for DE (note that the VST and rlog are not used in DESeq2's testing routines). Gene sets, even of the same size, are not equally powered.
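The point about power and counts can be illustrated with a rough calculation. This is only a sketch and not DESeq2's testing procedure: it assumes negative binomial counts with a known dispersion and uses the delta-method approximation Var(log mean) ≈ (1/mu + dispersion)/n to get the approximate power of a two-sided Wald test on the log fold change. The function name `approx_wald_power` and all default values are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

def approx_wald_power(mean_count, fold_change=2.0, n_per_group=3,
                      dispersion=0.1, alpha=0.05):
    """Approximate power of a two-sided Wald test on the log fold change.

    Assumes negative binomial counts, so that per group
    Var(log of the sample mean) is roughly (1/mu + dispersion) / n.
    """
    mu_a = mean_count
    mu_b = mean_count * fold_change
    # Standard error of the estimated log fold change (delta method).
    se = sqrt((1 / mu_a + dispersion) / n_per_group
              + (1 / mu_b + dispersion) / n_per_group)
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    effect = abs(log(fold_change)) / se
    # Probability that |z| exceeds the critical value under the alternative.
    return norm.cdf(effect - z_crit) + norm.cdf(-effect - z_crit)

# Same fold change and replicate number, different expression levels.
for mu in (5, 50, 500):
    print(mu, round(approx_wald_power(mu), 3))
```

Under these assumptions the power rises with the mean count even though the fold change is identical. Because longer transcripts tend to collect more reads, they tend to sit at the higher-count end, and no transformation applied outside the test changes that.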
Ok. Thanks to you both for helping to clarify this. Much appreciated!