Kevin Lee
Hello,
My name is Kevin Lee, and I am a PhD candidate in Bioinformatics at
Georgia Institute of Technology. I have been trying to decide between
normalization techniques for RNA-seq data from a parasite infection
time-course study. The RNA-seq data will come from infected blood and
will contain RNA from both the host and the parasite. I believe that it
is best to analyze these two RNA subsets separately for the purposes of
normalization.
I have been using DESeq because I consider its normalization more
robust than FPKM-based scaling. Recently, however, I have become
intrigued by the new Cuffdiff 2 software.
As I have weighed the two methods (DESeq versus Cuffdiff 2), I see the
benefits of each. Cuffdiff 2 has the advantage that it quantifies
isoform abundance, but it uses FPKM to "normalize" expression levels
across samples. DESeq, however, estimates library size much more
robustly than RPKM/FPKM. Since my study will be looking at the immune
response to a parasite over time, I expect that at least a few genes
will be VERY differentially expressed in one or more of the
conditions/time points, and such genes can dominate the total read
count that FPKM scales by. Consequently, I believe that a DESeq
normalization approach will yield a much more accurate analysis. Does
this seem like a reasonable assessment?
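To make the normalization concern concrete, here is a minimal Python
sketch of the median-of-ratios idea behind DESeq's size factors,
compared with plain total-count scaling. It is not the package's own
code. The function names and the toy counts are made up for
illustration, and the total-count function only stands in for the
library-size part of FPKM (it ignores transcript length).

```python
import math
from statistics import median


def total_count_factors(counts):
    """Scale factor per sample: its total read count over the mean total."""
    n_samples = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_samples)]
    mean_total = sum(totals) / n_samples
    return [t / mean_total for t in totals]


def median_ratio_factors(counts):
    """Size factor per sample: the median over genes of
    count / (geometric mean of that gene across samples)."""
    n_samples = len(counts[0])
    # Genes with a zero count in any sample have geometric mean 0; skip them.
    usable = [row for row in counts if all(c > 0 for c in row)]
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Toy data (hypothetical): rows are genes, columns are two samples.
# The first four genes are unchanged; the last is strongly induced in sample 2.
counts = [
    [100, 100],
    [200, 200],
    [50, 50],
    [400, 400],
    [250, 10000],
]

print("total-count factors: ", total_count_factors(counts))
print("median-ratio factors:", median_ratio_factors(counts))
```

In this toy case the single induced gene makes sample 2 look more than
ten times "deeper" under total-count scaling, so the four unchanged
genes would appear down-regulated after scaling. The median-of-ratios
factors stay at 1 for both samples because the median ignores the one
extreme gene.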
If I do choose to use DESeq, on what level should I quantify
transcription: exon, gene, or isoform? Exon-level counting seems the
most straightforward, but it has the drawback that each exon covers a
much smaller "area", so an individual exon will have far fewer mapped
reads than the gene of which it is a part. If quantifying by gene, how
is a gene defined: as all exons from all isoforms? I don't know of any
way to quantify isoforms with DESeq as Cuffdiff does, and this is the
main reason I am hesitant about using DESeq.
One possible approach is to count reads that fall within any annotated
exon of a gene as belonging to that gene, and to use those gene-level
counts to normalize and test for differential expression. In parallel,
I could use DEXSeq to test for differential exon usage. Does this seem
like a reasonable approach? Any further advice?
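The gene-level counting I have in mind could be sketched roughly as
follows, in Python. This is only an illustration under simplifying
assumptions: the exon coordinates, isoform names, and read positions
are hypothetical, intervals are half-open (start, end), and each read
is reduced to a single position, so spliced reads, strand, and reads
overlapping more than one gene are ignored.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def count_reads_in_gene(isoform_exons, read_positions):
    """Count reads whose position falls in the union of all isoforms' exons."""
    union = merge_intervals(
        exon for exons in isoform_exons.values() for exon in exons
    )
    return sum(
        any(start <= pos < end for start, end in union)
        for pos in read_positions
    )


# Hypothetical gene with two annotated isoforms sharing their first exon.
isoforms = {
    "isoform_A": [(100, 200), (300, 400)],
    "isoform_B": [(100, 200), (350, 500)],
}
reads = [120, 250, 360, 450, 600]

union_exons = merge_intervals(
    exon for exons in isoforms.values() for exon in exons
)
print(union_exons)                           # [(100, 200), (300, 500)]
print(count_reads_in_gene(isoforms, reads))  # 3
```

Here the gene is defined as the union of exons from all isoforms, so a
read is counted once for the gene regardless of which isoform it came
from.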
Cheers,
Kevin
--
Kevin Lee
Georgia Institute of Technology
Department of Biology
PhD candidate in Bioinformatics