Hi,
I posted this question to Biostars previously, but realised it probably belongs here. Sorry about that.
I have a table of FPKM values generated by DESeq2, and I'm trying to find out what DESeq2 uses as gene lengths when these are not supplied (I want to assess how much my results are likely to change if I supply them).
According to the manual, "feature length is calculated from the rowRanges of the dds object, if a column basepairs is not present in mcols(dds). The calculated length is the number of basepairs in the union of all GRanges assigned to a given row of object, e.g., the union of all basepairs of exons of a given gene."
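To check that I'm reading that correctly, here is a small Python sketch of what I understand "the union of all basepairs" to mean. This is not DESeq2 code, just the arithmetic as I interpret it. The function name `union_length` and the exon coordinates are made up for illustration, and I'm assuming 1-based, inclusive ranges.

```python
def union_length(ranges):
    """Count the basepairs covered by a set of 1-based, inclusive ranges.

    Overlapping or adjacent ranges are merged first, so a basepair that
    belongs to several exons is only counted once.
    """
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(ranges):
        if cur_end is None or start > cur_end + 1:
            # This range does not touch the current merged block:
            # add the finished block's width and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


# Hypothetical gene with three exons, two of which overlap.
exons = [(100, 199), (150, 249), (1000, 1099)]

# Length as the union of exon basepairs (introns excluded).
exon_union = union_length(exons)

# Length as the full genomic span from first to last base (introns included).
gene_span = max(end for _, end in exons) - min(start for start, _ in exons) + 1

print(exon_union, gene_span)
```

For this made-up gene the exon union is 250 bp while the full span is 1000 bp, so the two definitions of "length" can differ several-fold when introns are long.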
Does that mean DESeq2 directly uses the ranges returned by rowRanges(dds)? I'm comparing these values to those obtained with the function getGeneLengthAndGCContent from the EDASeq package. The lengths from rowRanges are sometimes very close to the EDASeq values, but sometimes they differ by a factor of 10. Can someone explain what causes this discrepancy? Or am I simply looking at the wrong values?
Thank you, Best wishes, Hasse
- Biostars post: https://www.biostars.org/p/429296/