Hi,
I am a new DESeq2 user, interested in Translation Efficiency (TE) and the Log2FoldChange (L2FC) of TE. To the best of my understanding, before DESeq2 calculates TE (RiboSeq/RNASeq), the counts are normalized with the appropriate SizeFactor. The SizeFactors take into account the geometric mean of each gene's counts across all samples (and therefore across all conditions).
If the above is correct, then the standalone TE values for each condition depend on the counts of the other conditions, and may not be consistent when the set of conditions in the dataset changes. Would TPM be a better measure for standalone TE values?
Thanks! Nathan
I think I see your point that it doesn't matter whether the SizeFactors for the ctrlRNASeq counts (SFctrlRNA) and ctrlRiboSeq counts (SFctrlRibo) normalize the ctrl counts to 1M or to some other number. However, it seems to me critical that the SizeFactor ratio (SFctrlRNA)/(SFctrlRibo) remains independent of the other treatments in the dataset. To the best of my understanding, the ratio (SFctrlRNA)/(SFctrlRibo) can change if the count distributions of treat1RNA, treat1Ribo are different from those of treat2RNA, treat2Ribo. A 2-fold increase in R = (SFctrlRNA)/(SFctrlRibo) would lead to a 2-fold increase in the ctrl-TE of every gene.
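To make the last sentence concrete, here is a minimal Python sketch of the arithmetic, assuming TE is computed as (Ribo count / SFctrlRibo) / (RNA count / SFctrlRNA). The gene names, counts, and the helper translation_efficiency are made up for illustration and are not part of DESeq2.

```python
# Hypothetical control-condition counts for three genes (illustrative values only).
ctrl_rna = {"geneA": 200.0, "geneB": 50.0, "geneC": 1000.0}
ctrl_ribo = {"geneA": 100.0, "geneB": 100.0, "geneC": 250.0}


def translation_efficiency(ribo, rna, sf_ribo, sf_rna):
    """Per-gene TE = (ribo / sf_ribo) / (rna / sf_rna) = (ribo / rna) * (sf_rna / sf_ribo)."""
    return {g: (ribo[g] / sf_ribo) / (rna[g] / sf_rna) for g in ribo}


# Baseline: R = SFctrlRNA / SFctrlRibo = 1.
te_base = translation_efficiency(ctrl_ribo, ctrl_rna, sf_ribo=1.0, sf_rna=1.0)
# Same counts, but R doubled to 2.
te_doubled = translation_efficiency(ctrl_ribo, ctrl_rna, sf_ribo=1.0, sf_rna=2.0)

# Every gene's TE is multiplied by the same factor (the change in R).
scale = {g: te_doubled[g] / te_base[g] for g in te_base}

# Ratios between genes within the condition are unaffected by R.
rel_base = te_base["geneA"] / te_base["geneB"]
rel_doubled = te_doubled["geneA"] / te_doubled["geneB"]

print(scale)
print(rel_base, rel_doubled)
```

So under this definition a change in R shifts all ctrl-TE values by a common factor, while the ranking of genes within the condition stays the same.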
Should I just divide all the ctrl-TE values by the ctrl-TE of an "anchor gene" (e.g. ACTB)?
You need to correct for library size somehow, because the RNA and Ribo libraries are all separate sequencing experiments, and library size is a technical artifact that tells you nothing about the ratios for individual genes. Picking a single gene is not robust; this is why the median ratio method (Anders and Huber 2010) uses the center of the distribution of each sample's ratios to a pseudoreference. If you want to pick a set of genes you believe are good for calculating the library size, you can pass this set to the controlGenes argument of estimateSizeFactors, but I would pick hundreds of genes for which you have a prior that they are not changing much across the experiments, rather than a single gene.
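For intuition, the median ratio idea restricted to a chosen gene set can be sketched in a few lines of Python. This is a simplified illustration, not DESeq2's code: the function size_factors, its control_genes parameter, and the toy counts are assumptions made for the example, and in practice you would use DESeq2 itself.

```python
import math
from statistics import median


def size_factors(counts, control_genes=None):
    """Median-of-ratios size factors.

    counts: dict mapping gene -> list of counts, one entry per sample.
    control_genes: optional iterable of genes to restrict the estimate to.
    """
    genes = list(counts) if control_genes is None else [g for g in control_genes if g in counts]
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for g in genes:
        row = counts[g]
        if any(c <= 0 for c in row):
            continue  # a zero count makes the geometric mean zero, so skip the gene
        # Log of the pseudoreference: the gene's geometric mean across samples.
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # Size factor per sample = median ratio to the pseudoreference.
    return [math.exp(median(r)) for r in log_ratios]


# Toy data: sample 2 is sequenced twice as deeply; g4 genuinely changes.
toy = {"g1": [10, 20], "g2": [100, 200], "g3": [30, 60], "g4": [5, 500]}
sf = size_factors(toy, control_genes=["g1", "g2", "g3"])
print(sf[1] / sf[0])
```

With only one control gene, the median is just that gene's ratio, so any real change in that gene goes straight into the size factor. With hundreds of genes, the median is much less sensitive to any individual gene.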