Question: Difference between gene-level test and transcript-level test with RATs (Relative Abundance of Transcripts)
3 months ago, UserAnonyme wrote:

Hi

I'm trying to perform a Differential Transcript Usage (DTU) analysis to find genes that are differentially spliced between my two conditions.

The RATs package seems to support this, and I saw that it uses two methods to find DTU genes:

At the gene level, RATs compares the set of each gene’s isoform abundances between the two conditions to identify if the abundance ratios have changed. At the transcript level, RATs compares the abundance of each individual transcript against the pooled abundance of its sibling isoforms to identify changes in the proportion of the gene’s expression attributable to that specific transcript.

I don't understand the difference between the two tests. I made a diagram to illustrate what I understood of the transcript-level test:

https://i.ibb.co/pPx609Y/Capture-d-cran-2019-07-09-20-35-39.png

Is this right?

And what, then, is the difference from the gene-level test?

Thanks in advance.


Cross posted: https://www.biostars.org/p/388913/

written 3 months ago by swbarnes2330

I have deleted the post on Biostars.

written 3 months ago by UserAnonyme
Answer: Difference between gene-level test and transcript-level test with RATs (Relative
3 months ago, fruce_ki (Austria/Vienna/Research Institute for Molecular Pathology) wrote:

Hi,

Thank you for taking an interest in RATs.

Say you have two conditions A and B and a gene with isoforms 1, 2, 3, etc...

The gene level test compares the full set of abundances (A1, A2, A3, ...) against the full set of abundances (B1, B2, B3, ...). This can tell you in which genes the ratios change, but it can't tell you which isoforms are responsible for the change.

For the transcript level test all sets have only two components, the abundance of the isoform in question and the abundance of all the other isoforms together. So it compares (A1, A2+A3+...) vs (B1, B2+B3+...) for transcript 1, (A2, A1+A3+...) vs (B2, B1+B3+...) for transcript 2, (A3, A1+A2+...) vs (B3, B1+B2+...) for transcript 3, etc. This can tell you when the abundance of that specific isoform changes.
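
To make the two set-ups concrete, here is a minimal base-R sketch of the tables each test would compare. The abundance values and the names `A`, `B`, `gene_table` and `transcript_tables` are illustrative assumptions, not RATs objects; the sketch only shows how the sets are arranged, not how RATs stores or tests them internally.

```r
# Hypothetical scaled abundances for one gene with three isoforms.
A <- c(iso1 = 100, iso2 = 200, iso3 = 300)
B <- c(iso1 = 5000, iso2 = 200, iso3 = 300)

# Gene-level set-up: the full abundance vectors, one row per condition.
gene_table <- rbind(A = A, B = B)

# Transcript-level set-up: one 2x2 table per isoform, holding the isoform
# in question and the pooled abundance of all its sibling isoforms.
transcript_tables <- lapply(names(A), function(iso) {
  rbind(A = c(target = A[[iso]], siblings = sum(A) - A[[iso]]),
        B = c(target = B[[iso]], siblings = sum(B) - B[[iso]]))
})
names(transcript_tables) <- names(A)

print(gene_table)
print(transcript_tables$iso1)
```

The gene-level table has one column per isoform, so a significant result cannot be pinned on any single column. Each transcript-level table has exactly two columns, so a significant result refers to that one isoform.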

Does that make sense?

PS. I am a little surprised to find this question on bioconductor, as I have not yet submitted the package to bioconductor.

EDIT: I should probably note that when I say abundances, I mean the scaled TPM values, not the proportions. The size of the counts is important in determining significance; proportions lose that information.
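
A small sketch of why the size of the counts matters. It uses a textbook G-test of independence written from scratch; `g_stat` is a hypothetical helper for illustration and should not be read as the implementation of RATs' own `g.test.2`, which may differ in its details. The counts are made up.

```r
# Textbook G-test of independence on a 2 x k table (illustrative helper).
g_stat <- function(x, y) {
  obs <- rbind(x, y)
  # Expected counts under independence: row total * column total / grand total.
  expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)
  # Cells with zero observations contribute nothing to the statistic.
  g <- 2 * sum(ifelse(obs > 0, obs * log(obs / expected), 0))
  df <- (nrow(obs) - 1) * (ncol(obs) - 1)
  c(G = g, p = pchisq(g, df = df, lower.tail = FALSE))
}

# Same proportions (0.2/0.8 vs 0.3/0.7) at two different scales.
low  <- g_stat(c(2, 8), c(3, 7))
high <- g_stat(c(200, 800), c(300, 700))

print(low)
print(high)
```

The proportions are identical in both calls, yet the statistic grows in step with the counts, so the same shift in proportions is far more convincing at the larger scale. Feeding proportions to the test would discard exactly that information.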


Thanks for this explanation, it is clearer now!

I have one last question. For example, suppose that in a gene-level test we have these abundances for the 3 transcripts in the 2 conditions:

g.test.2(obsx= c(100, 200, 300), obsy= c(5000, 200, 300))

Will g.test.2 be significant only because the abundance of the first transcript increased greatly, even though the other transcripts have not changed?

So will this gene be DTU only because the gene is also a DEG?

I am probably missing something, and I apologize if my question seems silly.

EDIT: So, if I understand correctly, using the scaled TPM values solves the problem.

modified 3 months ago • written 3 months ago by UserAnonyme

You can have any combination of DGE and DTE and DTU going on at the same time, and they are all valid.

In your example, you have one DTE isoform (from 100 to 5000), the whole gene being DGE (from 600 to 5500), and the gene also being DTU (proportions going from 0.17/0.33/0.50 to 0.91/0.04/0.05). That isoform will probably be flagged as DTU (0.17 to 0.91), but the other two may also be flagged as DTU (0.33 to 0.04, 0.50 to 0.05), even though they are not DTE. You also have a primary isoform switch going on.
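
The numbers quoted for this example can be reproduced with a few lines of base R. The variable names are taken from the `g.test.2` call in the question; nothing here calls RATs itself.

```r
# Abundances from the example in the question.
obsx <- c(100, 200, 300)
obsy <- c(5000, 200, 300)

# Gene-level totals: the DGE view (600 vs 5500).
total_x <- sum(obsx)
total_y <- sum(obsy)

# Isoform proportions: the DTU view.
prop_x <- round(obsx / total_x, 2)
prop_y <- round(obsy / total_y, 2)
print(prop_x)  # 0.17 0.33 0.50
print(prop_y)  # 0.91 0.04 0.05

# The dominant isoform changes from the third to the first:
# this is the primary isoform switch.
primary_x <- which.max(obsx)
primary_y <- which.max(obsy)
print(c(primary_x, primary_y))  # 3 1
```

Isoforms 2 and 3 keep the same abundance (200 and 300) in both conditions, so they are not DTE, yet their share of the gene's expression drops sharply because the total grew.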

I would not say the gene is DTU because it is DGE. There are many regulatory mechanisms acting at various levels, so unless you actually know what regulatory change occurred, you can't say that one observation is causing the other. They are all just different views of the same complex event.

written 3 months ago by fruce_ki

OK, thanks! :-)

I compared my list of DEGs with my list of DTU genes, and the overlap is small, so the example I used ( obsx= c(100, 200, 300), obsy= c(5000, 200, 300) ) is maybe rare, biologically speaking.

When I compare the results with rMATS, the overlap with RATs is much better (not surprising, as both approaches study splicing).

written 3 months ago by UserAnonyme

DGE and DTU are different things, regulated by different mechanisms. You can have one or the other or both. I would not normally expect a high overlap between DGE and DTU genes.

written 3 months ago by fruce_ki

Also, I'm not sure why you are using the g.test.2 function directly. Do you only have one gene to test?

You should use TPMs scaled to the size of the sample's library, but I'm not sure what "problem" you think this will fix. You didn't mention a problem; it was just a question about test design.

written 3 months ago by fruce_ki

No, it was just to see an example with a single gene. I have the scaled TPMs for all of the genes.

written 3 months ago by UserAnonyme