Alvaro J. González
But Gordon,
Isn't it the case that if you feed logs of ratios into limma, you
automatically lose the statistical significance of those ratios, as
well as the absolute expression in each condition, which can be
relevant?
For instance, define "t" as one of Gowthaman's transcripts. As far as I
understand, he has four RNAseq libraries measuring the activity of this
transcript:
1) Transcripts from normal mRNA:
1.1) Life stage 1, his transcript gets t_1.1 = 4 reads.
1.2) Life stage 2, his transcript gets t_1.2 = 2 reads.
2) Transcripts from ribosome-bound RNAs:
2.1) Life stage 1, his transcript gets t_2.1 = 100 reads.
2.2) Life stage 2, his transcript gets t_2.2 = 50 reads.
Let's say edgeR applied to the two 1) conditions produces:
log2(t_1.1/t_1.2) = log2(4/2) = 1 with adjP = 0.5, meaning it seems
like the transcript was overexpressed in life stage 1, but with no
statistical significance, so we're not really sure.
Then you do the same with the two 2) conditions:
log2(t_2.1/t_2.2) = log2(100/50) = 1 with adjP < 0.01, so you really
believe the transcript was overexpressed in life stage 1.
Now you feed those two logFCs into limma (1 and 1), and of course you
get nothing out in terms of differential behavior. But the reality is
that there was a huge change between normal and ribosomal RNAs, which
was diluted by the use of the ratios.
What do you think?
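
To make the arithmetic in the example above concrete, here is a minimal
Python sketch. It is only an illustration: the function names are made
up for this post, and the standard error assumes independent Poisson
counts (delta method), which is not the model edgeR actually fits. It
shows that both pairs of counts give the same logFC of 1, while the
low-count pair is far less precise.

```python
import math


def log2_fold_change(a, b):
    """log2 ratio of two read counts (stage 1 over stage 2)."""
    return math.log2(a / b)


def approx_se_log2_ratio(a, b):
    """Rough standard error of log2(a/b).

    ASSUMPTION: independent Poisson counts, so Var[ln(count)] ~ 1/count
    (delta method). This is an illustration, not edgeR's model.
    """
    return math.sqrt(1.0 / a + 1.0 / b) / math.log(2)


# Read counts for transcript "t" taken from the example above.
normal = (4, 2)        # 1) normal mRNA: life stage 1, life stage 2
ribosome = (100, 50)   # 2) ribosome-bound RNA: life stage 1, life stage 2

lfc_normal = log2_fold_change(*normal)
lfc_ribo = log2_fold_change(*ribosome)
se_normal = approx_se_log2_ratio(*normal)
se_ribo = approx_se_log2_ratio(*ribosome)

print(f"normal RNA:    logFC = {lfc_normal:.2f}, approx SE = {se_normal:.2f}")
print(f"ribosomal RNA: logFC = {lfc_ribo:.2f}, approx SE = {se_ribo:.2f}")
```

Under that assumption the two logFCs are identical, but the standard
error of the low-count ratio is about five times larger, which is the
information that is lost once only the ratios are passed on.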
My suggestion, just to start, would be to produce a scatter plot of
logFC(normal RNA) vs logFC(ribosomal RNA), and to encode the adjP
values of both axes in the points: for instance, use color for the
x-axis comparison (red is significant, green is not) and dot shape for
the y-axis comparison (star is significant, dot is not).
This plot should show you those transcripts in which interesting stuff
is going on.
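
A rough sketch of the plot described above, in Python with matplotlib.
Everything here is an assumption made for illustration: the function
names, the record keys, the output file name and the 0.05 cutoff are
hypothetical, and the toy records are invented (the first one mirrors
transcript "t" from the example, with adjP < 0.01 represented as 0.005).

```python
def point_style(adjp_x, adjp_y, alpha=0.05):
    """Map the two adjusted p-values to a (color, marker) pair.

    Color encodes the x-axis comparison (red = significant, green = not);
    marker encodes the y-axis comparison (star = significant, dot = not).
    The alpha cutoff of 0.05 is an assumed default.
    """
    color = "red" if adjp_x < alpha else "green"
    marker = "*" if adjp_y < alpha else "."
    return color, marker


def plot_logfc_scatter(records, alpha=0.05, path="logfc_scatter.png"):
    """Scatter logFC(normal RNA) against logFC(ribosomal RNA) and save a PNG."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for r in records:
        color, marker = point_style(r["adjp_normal"], r["adjp_ribo"], alpha)
        ax.scatter(r["lfc_normal"], r["lfc_ribo"], color=color, marker=marker)
    ax.set_xlabel("logFC (normal RNA)")
    ax.set_ylabel("logFC (ribosomal RNA)")
    fig.savefig(path)
    plt.close(fig)


# Invented toy records, for illustration only.
records = [
    {"lfc_normal": 1.0, "lfc_ribo": 1.0, "adjp_normal": 0.5, "adjp_ribo": 0.005},
    {"lfc_normal": -2.0, "lfc_ribo": 0.1, "adjp_normal": 0.001, "adjp_ribo": 0.8},
]
styles = [point_style(r["adjp_normal"], r["adjp_ribo"]) for r in records]
```

Calling `plot_logfc_scatter(records)` writes the figure to disk. In
practice the records would be built from the two edgeR result tables,
matched by transcript.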
Regards,
- Alvaro.
> Dear Gowthaman,
>
> I'm not quite sure what translational efficiencies are. Do you have a
> different efficiency value for each gene and each RNA sample? If you do,
> why not take logs of the ratios (offsetting counts by 1/2 or 1 to avoid
> zeros) and feed them into limma?
>
> Best wishes
> Gordon
>
>>
>> Hi Everyone,
>> I have been using edgeR for the last couple of years with great success.
>> Thanks very much. Now I have a slightly unconventional dataset to try. We
>> have two groups to compare (life stages), each with three replicates. But
>> for each sample in each group, we made two different RNAseq libraries:
>> 1) one from fragmented mRNA (classical RNAseq) and
>> 2) another from ribosome-bound RNA fragments. This library would indicate
>> how much of the RNA is actively being translated.
>>
>> I have used edgeR to analyse the data from each of these separately
>> (classical RNAseq or ribosome-bound). This lets us study the
>> differentially transcribed genes or the differentially translated genes,
>> and we got really nice results.
>>
>> The next step is to compare the translational efficiencies between them.
>> In each sample, the ratio between the read counts of ribosome-bound mRNAs
>> and fragmented mRNA would give us the translational efficiency of that
>> gene. We can generate these efficiencies (ratios) for each of the three
>> replicates in each group. Can I feed this data to edgeR to find out which
>> genes have 'differential efficiencies' between groups?
>>
>> I understand that edgeR insists on NOT normalizing the read counts, and
>> that all the further statistics depend on the total library size. By
>> using ratios, I completely throw edgeR off. But I am not sure what the
>> best alternative is.
>>
>> Any ideas?
>>
>> Much thanks in advance,
>> Gowthaman