Background: I would like to identify differentially expressed gene orthologs across multiple related organisms. I predicted a set of single-copy orthologs from de novo transcriptome assemblies and quantified the expression of these transcripts with RSEM.
# I imported the RSEM quantifications with tximport
txi.rsem <- tximport(files, type = "rsem", txIn = FALSE, txOut = FALSE)
# Made a DESeq2 dataset
dds <- DESeqDataSetFromTximport(txi.rsem, colData = samples, design = ~ condition)
# ...and ran DESeq2
dds <- DESeq(dds)
Question: The orthologs for which I want to perform differential expression analysis are not all of the same sequence length. Does the procedure above "normalize" the counts with respect to the different transcript lengths? I presume it does, given that the documentation of DESeq2::plotCounts says that "the counts should be normalized by size factor (default is TRUE)", but I'm not entirely sure. Thanks!