edgeR: Using ratios (translational efficiencies) as input


Alvaro J. González ▴ 80

@alvaro-j-gonzalez-5813

Last seen 9.6 years ago

Dear Gowthaman,

My recommendation in a previous message (below) was directed at you. Sorry that my message reads as if it were being directed at Gordon.

Your experimental scenario is an interesting one and it would be nice if you keep us posted on how you end up tackling it. I think the scatter plot that I suggested and the interaction test that Gordon suggested are worth pursuing.

Best wishes,

Alvaro J. Gonzalez, PhD
Computational Biology
Memorial Sloan-Kettering Cancer Center
New York, NY

#### original message ####
[BioC] edgeR: Using ratios (translational efficiencies) as input
Gordon K Smyth smyth at wehi.EDU.AU
Wed May 1 02:36:02 CEST 2013

Dear Alvaro,

Well, honestly I don't know. My naive and literal reading of the original email was that the interest was in "differential efficiencies between groups". In your example, the ratio of ribosomal-bound RNA to normal mRNA is identical for the two groups, hence my naive interpretation is that there is no evidence for "differential efficiencies" between life stages 1 and 2.

Now that I see your interpretation of the data, I see that one could test for "differential efficiency" simply by using an interaction test in edgeR.

However I have no experience of this type of analysis, and I don't know what is scientifically sensible. Making good plots is always a good way to go, but send suggestions to the original poster. It's his problem, not mine!

Best wishes
Gordon

--------------- original message --------------
[BioC] edgeR: Using ratios (translational efficiencies) as input
Alvaro J. González alvaro.gonzalez4 at gmail.com
Mon Apr 29 15:34:47 CEST 2013

But Gordon,

Isn't it the case that if you feed logs of ratios into limma you're automatically losing the statistical significance of those ratios, as well as the absolute expression in each condition, which can be relevant?

For instance, define "t" as one of Gowthaman's transcripts. As far as I understand, he has four RNAseq libraries measuring the activity of this transcript:

1) Transcripts from normal mRNA:
   1.1) Life stage 1, his transcript gets t_1.1 = 4 reads.
   1.2) Life stage 2, his transcript gets t_1.2 = 2 reads.
2) Transcripts from ribosome-bound RNAs:
   2.1) Life stage 1, his transcript gets t_2.1 = 100 reads.
   2.2) Life stage 2, his transcript gets t_2.2 = 50 reads.

Let's say edgeR being applied to the two 1) conditions produces:

log2(t_1.1/t_1.2) = log2(4/2) = 1 with adjP = 0.5, meaning it seems like the transcript was overexpressed in life stage 1, but with no statistical significance, so we're not really sure.

Then you do the same with the two 2) conditions:

log2(t_2.1/t_2.2) = log2(100/50) = 1 with adjP < 0.01, so you really believe the transcript was overexpressed in life stage 1.

Now you feed those two logFCs into limma (1 and 1), and of course you get nothing out in terms of differential behavior. But the reality is that there was a huge change between normal and ribosomal RNAs which was diluted by the use of the ratios.

What do you think?

My suggestion, just to start, would be to produce a scatter plot of logFC(normal RNA) vs logFC(ribosomal RNA), and to encode the adjP values of both axes: say, for instance, by using colors for the x-axis (red is significant, green is not) and dot shapes for the y-axis (star is significant, dot is not).

This plot should show you those transcripts in which interesting stuff is going on.

Regards,

- Alvaro.

> Dear Gowthaman,
>
> I'm not quite sure what translational efficiencies are. Do you have a
> different efficiency value for each gene and each RNA sample? If you
> do, why not take logs of the ratios (offsetting counts by 1/2 or 1 to
> avoid zeros) and feed them into limma?
>
> Best wishes
> Gordon
>
>> Hi Everyone,
>> I have been using edgeR for the last couple of years with great success.
>> Thanks very much. Now I have a slightly unconventional dataset to try. We
>> have two groups to compare (life stages), each with three replicates.
>> But, for each sample in each group, we made two different RNAseq libraries:
>> 1) one from fragmented mRNA (classical RNAseq) and
>> 2) another from ribosome-bound RNA fragments. This library would
>> indicate how much of the RNA is actively being translated.
>>
>> I have used edgeR to analyse data from each of these separately (data
>> from classical RNAseq or ribosome-bound). So this let us study the
>> differentially transcribed genes or differentially translated genes.
>> And got really nice results.
>>
>> The next step is to compare the translational efficiencies between
>> them. In each sample, the ratio between read counts of ribosome-bound
>> mRNAs and fragmented mRNA would give us the translational efficiency of
>> that gene. We can generate these efficiencies (ratios) for each of the
>> three replicates in each group. Can I feed this data to edgeR to find
>> out which genes have 'differential efficiencies' between groups?
>>
>> I understand edgeR insists on NOT normalizing the read counts, and all
>> the further statistics depend on the total library size. By
>> using ratios, I completely throw edgeR off. But I am not sure what
>> the best alternative is.
>>
>> Any ideas?
>>
>> Much thanks in advance,
>> Gowthaman
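The arithmetic in Alvaro's example can be checked by hand. The following is a minimal sketch in plain Python, not edgeR or limma code; the function and variable names are made up for illustration, and the four counts are the hypothetical ones from the example above. It shows why the two readings of the data differ: the change in log translational efficiency between life stages is the same quantity as the difference between the two log fold changes, which is what an interaction term would estimate.

```python
from math import log2

# Hypothetical read counts for transcript "t", taken from Alvaro's example.
mrna = {"stage1": 4, "stage2": 2}      # classical RNAseq library
ribo = {"stage1": 100, "stage2": 50}   # ribosome-bound library


def log_fc(counts):
    """log2 fold change, life stage 1 over life stage 2, within one library type."""
    return log2(counts["stage1"] / counts["stage2"])


def log_te(stage):
    """log2 translational efficiency (ribosome-bound over mRNA) for one life stage."""
    return log2(ribo[stage] / mrna[stage])


logfc_mrna = log_fc(mrna)   # log2(4/2)    = 1.0
logfc_ribo = log_fc(ribo)   # log2(100/50) = 1.0

# Interaction: does the stage effect differ between the two library types?
interaction = logfc_ribo - logfc_mrna

# Same quantity, written as a change in efficiency between life stages.
te_change = log_te("stage1") - log_te("stage2")

print(logfc_mrna, logfc_ribo, interaction, te_change)
```

Both `interaction` and `te_change` come out as zero here, which matches Gordon's reading that this transcript shows no differential efficiency. The sketch ignores library sizes and dispersion entirely, so it only illustrates the contrast being tested, not how significance would be assessed on real counts.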

RNASeq GO Cancer limma edgeR • 1.1k views

updated 11.0 years ago by gowtham ▴ 210 • written 11.0 years ago by Alvaro J. González ▴ 80


gowtham ▴ 210

@gowtham-5301

Last seen 9.6 years ago

Hi Alvaro/Gordon,

Please accept my sincere apologies for the late reply. I did not see this email until midnight today, when I searched for it because I was suspicious about not getting a reply from this very active mailing list. Not that I take that for granted, but it seemed a bit unusual. I am still not sure how it escaped my eyes.

Thanks very much for the suggestions.

Gordon: Yes, we expect the same RNA to be translated with different efficiencies at different life-cycle stages. Of course, the effect may not be as big as differential gene expression itself (we hope). I was trying something along the lines of what Gordon suggested, but without taking logs. For each of the three replicates in each group, I calculate ratios (efficiencies). Then I just do a t-test to see whether the ratios are significantly different between the groups. I was simply using rowttests() from the genefilter package. I will try limma (with logs of ratios rather than just ratios).

Alvaro: I am still trying to understand the plots you are suggesting. I will get back to you after giving it a bit of thought.

I really appreciate your efforts.

Thanks,
Gowthaman

--
Gowthaman
Bioinformatics Systems Programmer
SBRI, Seattle, WA
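The log-ratio approach discussed in this reply can be sketched for a single gene as follows. This is a rough illustration in plain Python, not the genefilter or limma implementation: the counts are made up, the helper names `log_efficiency` and `pooled_t` are hypothetical, and the 0.5 offset follows Gordon's suggestion of adding 1/2 or 1 to avoid zeros. The t statistic is the ordinary equal-variance two-sample one.

```python
from math import log2, sqrt
from statistics import mean, variance


def log_efficiency(ribo, mrna, offset=0.5):
    """log2 of the ribosome-bound / mRNA count ratio, offset to avoid zeros."""
    return log2((ribo + offset) / (mrna + offset))


def pooled_t(x, y):
    """Equal-variance two-sample t statistic for two lists of values."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))


# Made-up (ribosome-bound, mRNA) counts for one gene, three replicates per life stage.
stage1 = [(100, 4), (120, 5), (90, 3)]
stage2 = [(50, 2), (40, 2), (60, 3)]

log_te_1 = [log_efficiency(r, m) for r, m in stage1]
log_te_2 = [log_efficiency(r, m) for r, m in stage2]

t_stat = pooled_t(log_te_1, log_te_2)
print(t_stat)
```

The raw counts here are not adjusted for library size, which real data would need before ratios are comparable across samples. With only three replicates per group, a per-gene t-test also has very little power; that is the usual argument for a moderated approach such as limma, or for modelling the counts directly with an interaction term as Gordon suggested.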

written 11.0 years ago by gowtham ▴ 210
