I have some questions about rlog transformation using DESeq2:
1) This transformation normalizes for library size, which is sequencing depth or total number of mapped reads, correct?
2) This transformation does not normalize for gene or transcript length, correct? If so, is there a program you would recommend for normalizing for transcript length using the output of the rlog transformation?
Thanks!
Thank you for the helpful response. I have a follow-up question.
Using the DESeq2 workflow, I performed the rlog transformation on gene-level raw counts to obtain Transformed Gene-level Counts (normalized for sequencing depth). Can I now run the tximport pipeline on these Transformed Gene-level Counts to obtain Transformed Normalized Gene-level Counts, i.e. counts normalized for both sequencing depth and differences in transcript length across samples?
My goal is to use these normalized gene-level counts for differential expression analysis of candidate genes (not genome-wide). Thanks.
If you want to do differential expression with DESeq2, you should just run DESeq() on the dds object containing the raw counts (not normalized or rlog-transformed). Differences in library size are handled within the model, so you should not pre-normalize.
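To illustrate why a per-sample scaling factor accounts for sequencing depth but not for gene length, here is a simplified Python sketch of the median-of-ratios calculation that DESeq2's size factor estimation is based on. It is not DESeq2 code: the function name `size_factors` and the toy count matrix are made up for illustration, and the real implementation differs in detail.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (simplified illustration).

    counts: list of genes, each a list of raw counts, one per sample.
    Returns one scaling factor per sample.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        # Genes with a zero count have a zero geometric mean; skip them.
        if any(k == 0 for k in gene):
            continue
        log_geo_mean = sum(math.log(k) for k in gene) / n_samples
        for j, k in enumerate(gene):
            log_ratios[j].append(math.log(k) - log_geo_mean)
    # One factor per sample: the median ratio to the per-gene geometric mean.
    return [math.exp(median(r)) for r in log_ratios]


# Hypothetical toy data: 4 genes x 2 samples; sample 2 was sequenced
# twice as deeply as sample 1.
counts = [
    [100, 200],
    [50, 100],
    [10, 20],
    [1000, 2000],
]

factors = size_factors(counts)

# Dividing by the size factor is shown only to make the effect visible.
normalized = [[k / s for k, s in zip(gene, factors)] for gene in counts]
```

In this toy matrix the second sample has exactly twice the counts of the first, so the two size factors differ by a factor of two and the scaled counts agree across samples. Every gene in a sample is divided by the same factor, so differences between genes that come from gene length are left unchanged. Within DESeq() the size factors enter the model instead of being used to alter the counts, which is why the raw counts go in.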
If you think that there is differential isoform usage across samples, I would recommend the tximport pipeline before DESeq2 (I recommend this in general, for a number of reasons discussed in the workflow). This entails first running software such as Salmon, Sailfish or kallisto on your reads, then reading the resulting quantification files into R with tximport, as described in the tximport vignette.