I have a question about the appropriate salmon-generated gene counts, which take into account gene length, for the purpose of running eqtl analysis. I would much appreciate your suggestions on this.
I used tximport to obtain gene-level counts/abundance estimates for our RNAseq data (estimated using default countsFromAbundance=no). The downstream steps involve running TMM normalisation (to obtain normalised counts), rank transformation and batch correction, before running eqtl analysis.
With this approach, I am concerned that the counts do not take into account the length of genes. Your document here :
section : Use with downstream Bioconductor differential expression packages
outlines two ways in which one can get the counts ready for DE analysis, by taking into account gene/transcript length. I have a couple of questions:
Do the two methods work in a similar way, and can either be applied to edgeR/DEseq2/limma-voom?
If yes, would counts generated with tximport's countsFromAbundance = "lengthScaledTPM" be the appropriate ones to use for my eqtl pipeline?
If no, would the block of code under edgeR be appropriate to apply for my eqtl processing pipeline too ? I use the output of calcNormFactors to modify my counts. I don't know if/where I should use the offset, or if running calcNormFactors(cts/normMat) would do what I need?
Thanks for your help!