gowtham
@gowtham-5301
Hi Everyone,
I have been using edgeR for quite some time. Most of our RNA-seq data
comes from infectious organisms such as malaria and Tryps. Our libraries
generally have 10 to 20% of their reads coming from rRNA genes (I am not
sure whether this is typical for other organisms or protocols). Until
now, I have been leaving these reads out when doing the DE analysis
with edgeR.

I am NOT interested in differential expression of the rRNA genes, but I
worry that excluding them from edgeR might bias the library size
calculations. On the other hand, including them might introduce a bunch
of outliers (these rRNA genes have very high read counts). I could not
intuitively choose one over the other, so I am asking the experts for
help.

Does the answer change if the libraries have varying amounts of rRNA
contamination? Say, one set of libraries has 20% rRNA and another has
40%.
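
To make the concern concrete, here is a minimal sketch in plain Python
(it does not call edgeR or reproduce its normalization). All numbers
and names are made up for illustration: two libraries of one million
reads each, with 20% and 40% rRNA, and one hypothetical gene that makes
up 1% of the non-rRNA reads in both. It compares counts per million
(CPM) computed against the full library size with CPM computed against
the library size after rRNA reads are removed.

```python
def cpm(count, library_size):
    """Counts per million for one gene, given a library size."""
    return count * 1_000_000 / library_size


# Hypothetical toy numbers, not taken from any real data set.
total_reads = 1_000_000
gene_share_of_non_rrna = 0.01  # same relative abundance in both libraries

# Library name -> assumed fraction of reads that map to rRNA genes.
libraries = {"lib_20pct_rRNA": 0.20, "lib_40pct_rRNA": 0.40}

results = {}
for name, rrna_fraction in libraries.items():
    rrna_reads = round(total_reads * rrna_fraction)
    non_rrna_reads = total_reads - rrna_reads
    gene_count = round(non_rrna_reads * gene_share_of_non_rrna)
    results[name] = {
        # Library size counts every read, rRNA included.
        "cpm_rrna_included": cpm(gene_count, total_reads),
        # Library size counts only the non-rRNA reads.
        "cpm_rrna_excluded": cpm(gene_count, non_rrna_reads),
    }

for name, values in results.items():
    print(name, values)
```

Under these assumptions the gene looks different between the two
libraries when the rRNA reads stay in the library size, and identical
when they are taken out. Whether edgeR's own normalization already
compensates for this is exactly what I am unsure about.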
Thanks a bunch in advance,
Gowthaman
--
Gowthaman
Bioinformatics Systems Programmer.
SBRI, 307 Westlake Ave N, Suite 500
Seattle, WA. 98109-5219
Phone : LAB 206-256-7188 (direct).