Question

stageR following DEXSeq two-stage analysis

rbenel ▴ 40

@rbenel-13642

Israel

After running dex.padj <- getAdjustedPValues(stageRObj, order=FALSE, onlySignificantGenes=TRUE) and keeping only the transcripts in the final table with an adjusted p-value below 0.05, I obtained a list of genes in which only one transcript was significant.

I was curious to see how the expression of the other transcripts in such a gene compared with the one transcript that was found to be significant. Using the scaledTPM counts from tximport, I was surprised to see that in many cases there were other transcripts that perhaps should have been called DTU as well.

Anecdotal Example:

         geneID            txID         gene   transcript
ENSG00000006468 ENST00000485475 7.875795e-06 1.402172e-06
ENSG00000006468 ENST00000483075 7.875795e-06 3.333497e-01
ENSG00000006468 ENST00000405192 7.875795e-06 6.832075e-01
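
The filtering step described above could look roughly like the following in base R. This is only a sketch: the data frame is rebuilt by hand from the three example rows (in a real analysis it would be the output of getAdjustedPValues), and the names alpha, sig, n_sig_per_gene and single_tx_genes are made up for illustration.

```r
# Example rows copied from the table above; in practice dex.padj would be
# the data frame returned by stageR's getAdjustedPValues().
dex.padj <- data.frame(
  geneID     = rep("ENSG00000006468", 3),
  txID       = c("ENST00000485475", "ENST00000483075", "ENST00000405192"),
  gene       = rep(7.875795e-06, 3),
  transcript = c(1.402172e-06, 3.333497e-01, 6.832075e-01)
)

alpha <- 0.05  # threshold on the transcript-level adjusted p-value

# Keep transcripts that pass the confirmation stage
sig <- dex.padj[dex.padj$transcript < alpha, ]

# Count significant transcripts per gene and list genes with exactly one
n_sig_per_gene  <- table(sig$geneID)
single_tx_genes <- names(n_sig_per_gene)[n_sig_per_gene == 1]

print(sig)
print(single_tx_genes)
```

For this gene, only ENST00000485475 falls below the threshold, which is the situation the question is about.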

A plot of expression can be found here: https://www.dropbox.com/s/52vpnqroxk88h3y/ENSG00000006468.pdf?dl=0

As a side note, what would be considered the minimum read depth for DTU detection?

Thank You

Rina

rnaseqdtu • 1.2k views

rbenel, 5.2 years ago

score 0 · Answer 1 · 2019-01-27

Michael Love 41k

@mikelove

United States

This is not a clear case to me. Across the days, the pattern of isoform expression looks roughly similar, especially considering the error bars you have drawn (are these SE or SD?).

The only filters applied are the ones we use and discuss in the article, namely the filter function from DRIMSeq.

Michael Love, 5.2 years ago

I can look for a clearer case, but if the expression looks roughly similar, then shouldn't I have gotten no significant transcripts at all? The error bars are SD.

I only "filtered" to make my own script clearer. From the workflow: "The final table with adjusted p-values summarizes the information from the two-stage analysis. Only genes that passed the filter are included in the table, so the table already represents screened genes. The transcripts with values in the column, transcript, less than 0.05 pass the confirmation stage on a target 5% overall false discovery rate, or OFDR."

Maybe a clearer case:

         geneID            txID       gene transcript
ENSG00000125812 ENST00000424216 0.04715227 0.01595182
ENSG00000125812 ENST00000338121 0.04715227 0.09605123
ENSG00000125812 ENST00000461789 0.04715227 0.87313556

https://www.dropbox.com/s/tc7mfp9zzspor2o/ENSG00000125812.pdf?dl=0

Thanks :)

rbenel, 5.2 years ago

Also, as a side note, what would be considered the minimum read depth for DTU detection?

rbenel, 5.2 years ago

I don’t have an answer for this, but one can sometimes assess this by exploring one or more real datasets.

Michael Love, 5.2 years ago

So what’s the question exactly? Is it: how is it possible that there is evidence for one transcript but not the other? The answer is that we make an arbitrary decision on “significance”, and the transcripts have different power depending on their distributions, on top of simple sampling variance. So it’s expected that some genes will end up with one transcript passing the arbitrary threshold while another does not. It is true that for one transcript to actually have participated in DTU, another one must have as well, but that is a statement about the actual underlying proportions, whereas statistical testing is a different matter.
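
To make the point about unequal power concrete, here is a toy simulation in base R. It is not the DEXSeq/stageR procedure: it uses plain multinomial counts and per-transcript Welch t-tests on isoform proportions, with no multiple-testing correction, and every setting (replicates, depth, proportions, number of simulations) is an assumption chosen for illustration. All three transcripts truly change proportion, but the minor isoform A changes by more, relative to its noise, than B or C.

```r
# Toy simulation (NOT DEXSeq/stageR): Welch t-tests on isoform proportions
# with purely multinomial noise. All settings below are assumptions.
set.seed(1)

n_rep  <- 4      # replicates per condition
depth  <- 1000   # reads assigned to the gene in each sample
n_sim  <- 500    # number of simulated experiments
prop_1 <- c(A = 0.10, B = 0.45,  C = 0.45)    # proportions in condition 1
prop_2 <- c(A = 0.13, B = 0.435, C = 0.435)   # condition 2: all three change

simulate_pvalues <- function() {
  # rmultinom returns a (transcripts x replicates) matrix of counts
  p1 <- rmultinom(n_rep, depth, prop_1) / depth
  p2 <- rmultinom(n_rep, depth, prop_2) / depth
  sapply(seq_along(prop_1), function(i) t.test(p1[i, ], p2[i, ])$p.value)
}

pvals <- replicate(n_sim, simulate_pvalues())  # 3 x n_sim matrix of p-values
rownames(pvals) <- names(prop_1)

# Fraction of simulated experiments in which each transcript has p < 0.05
detection_rate <- rowMeans(pvals < 0.05)
print(detection_rate)

# Fraction of experiments in which exactly one transcript crosses 0.05
only_one <- mean(colSums(pvals < 0.05) == 1)
print(only_one)
```

Under these assumed settings, A crosses the threshold more often than B or C even though all three proportions shift, so experiments in which only one transcript is called are not surprising.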

Michael Love, 5.2 years ago

Yes, my question was how there can be evidence for a single transcript when, by definition, DTU requires another transcript to be involved. I understand that there is a distinction between the underlying proportions and statistical testing.

So, what would you suggest one do with these types of examples? Should we disregard cases where only one transcript is identified, since there does not seem to be enough statistical evidence that two transcripts are involved in DTU? Or should we analyze these on a case-by-case basis?

rbenel, 5.2 years ago

For the genes where only one transcript passes the threshold for DTU, I would assume that, if the gene and transcript are true positives (and not among the false positives allowed by the FDR), then that one transcript has the strongest signal, and the other transcript(s) must also participate in DTU but didn't show as strong a signal. I would not disregard these cases.

Michael Love, 5.2 years ago

OK.

Thank you for the detailed answer.

rbenel, 5.2 years ago