Question

Filtering Orthologous Genes Before or After TMM Normalization?

e.g.w.miller ▴ 10

@650a9366

Last seen 2.6 years ago

United States

Hello,

I am looking at differential expression (DE) across multiple species of mammals. To compare multiple species (and do some phylogenetic analyses), I am restricting the analysis to orthologous genes shared among species. I want to use edgeR for some pairwise comparisons.

I aligned using bowtie and quantified using htseq to get raw counts. I am thinking of the following workflow, starting with the raw counts:

filter lowly expressed genes in raw counts
TMM normalize
filter only orthologs shared between species
maybe convert to TPM to account for different gene lengths between species, though probably not, since this isn't recommended based on what I've read
use the limma-voom pipeline for DE; may also incorporate phytools

One paper [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4668955/] recommends finding orthologs first and then running edgeR, because these are the transcripts of interest. However, TMM normalizes library sizes, so if the original library sizes are different, will TMM normalization still be as accurate? Thank you in advance!

RNASeq Normalization TMM edgeR • 1.2k views

updated 2.6 years ago by Gordon Smyth 52k • written 2.6 years ago by e.g.w.miller ▴ 10

score 0 · Answer 1 · 2022-08-02

In general, TMM can be used just fine before or after filtering. It really makes little difference provided you still have large-scale genomic gene coverage after filtering.
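To illustrate why filtering order matters so little, here is a minimal Python sketch of a simplified, unweighted trimmed mean of log ratios. It is not edgeR's `calcNormFactors`, which additionally weights genes by precision, trims on average abundance, and chooses a reference sample automatically. The function name `simple_tmm_factor` and the toy counts are made up for illustration.

```python
import math

def simple_tmm_factor(sample, reference, trim=0.3):
    """Simplified, unweighted TMM-style factor (illustration only, not edgeR's method)."""
    n_s = sum(sample)      # library size of the sample
    n_r = sum(reference)   # library size of the reference
    m_values = []
    for s, r in zip(sample, reference):
        if s > 0 and r > 0:  # a zero count has no finite log ratio
            m_values.append(math.log2((s / n_s) / (r / n_r)))
    m_values.sort()
    k = int(len(m_values) * trim)          # number trimmed from each tail
    kept = m_values[k:len(m_values) - k] if k > 0 else m_values
    return 2 ** (sum(kept) / len(kept))

# Hypothetical toy data: 16 well-expressed genes plus 4 noisy low-count genes.
reference = [100] * 16 + [1, 1, 2, 2]
sample = [200] * 16 + [3, 1, 1, 5]

factor_all = simple_tmm_factor(sample, reference)

# Drop the low-count genes (here: reference count below 10), then recompute.
keep = [i for i, r in enumerate(reference) if r >= 10]
factor_filtered = simple_tmm_factor([sample[i] for i in keep],
                                    [reference[i] for i in keep])
```

In this toy case the noisy low-count genes fall in the trimmed tails, so `factor_all` and `factor_filtered` are both very close to 1. The trimming is what makes the factor robust to whether such genes are removed first.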

However, in your case, I don't see how your proposed workflow is even viable. How could you TMM normalize before subsetting to orthologs? Before you subset to orthologs, you will have different genes for different species, so multi-species normalization would not even be possible.
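The point about needing a common gene set first can be sketched in Python as follows. This is only an illustration under assumed inputs: the ortholog table format, the gene IDs, the species names, and the function name `shared_ortholog_matrix` are all hypothetical, and each species is reduced to a single sample for brevity.

```python
def shared_ortholog_matrix(counts_by_species, ortholog_table):
    """Align per-species counts on ortholog groups present in every species.

    counts_by_species: {species: {gene_id: raw_count}}
    ortholog_table:    {ortholog_group: {species: gene_id}}
    """
    species = sorted(counts_by_species)
    groups = []
    for group_id in sorted(ortholog_table):
        members = ortholog_table[group_id]
        # Keep the group only if every species has a member gene with a count.
        if all(sp in members and members[sp] in counts_by_species[sp]
               for sp in species):
            groups.append(group_id)
    # One column of counts per species, with rows in the same ortholog order.
    matrix = {
        sp: [counts_by_species[sp][ortholog_table[g][sp]] for g in groups]
        for sp in species
    }
    return groups, matrix

# Hypothetical toy input: species-specific gene IDs do not match each other.
counts_by_species = {
    "mouse": {"mGeneA": 120, "mGeneB": 30, "mGeneC": 7},
    "rat": {"rGeneA": 95, "rGeneB": 41, "rGeneD": 12},
}
ortholog_table = {
    "OG1": {"mouse": "mGeneA", "rat": "rGeneA"},
    "OG2": {"mouse": "mGeneB", "rat": "rGeneB"},
    "OG3": {"mouse": "mGeneC"},  # no rat ortholog, so it is dropped
}

groups, matrix = shared_ortholog_matrix(counts_by_species, ortholog_table)
```

Only after this step do the species share the same rows, so a single count matrix exists that could be passed to a cross-sample normalization such as TMM.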