Question

Inconsistent Dirichlet-multinomial and beta-binomial coefficients for a transcript in DTU analysis

wangziwei0010 ▴ 10

Singapore

Dear rnaseqDTU & DRIMSeq community,

I am following the "Swimming downstream" (rnaseqDTU) workflow, and I want to identify the transcripts that are overused in condition A compared with condition B. I went looking for a metric similar to the log fold change in DESeq2 and found that calling the coefficients() function on the dmDStest object gives a 'regression coefficient' for each transcript.

I'm not a statistician, and I hope I understand this correctly:

coefficients(dmDStest) gives transcript-level coefficients from the Dirichlet-multinomial (DM) model,
coefficients(dmDStest, level = "feature") gives transcript-level coefficients from the beta-binomial (BB) model

I then extracted all transcripts with padj < 0.05 (following this post) and plotted the DM coefficient on the x-axis against the BB coefficient on the y-axis. I found that:

many transcripts have a DM coefficient equal to 0
some transcripts have DM and BB coefficients with inconsistent signs.

I would like to ask:

Is it reasonable to use the sign of the coefficient to define the direction of DTU between conditions (just like log2FC in a DESeq2 analysis)?
If so, which coefficient should I use to determine the direction of DTU?
How should I deal with transcripts whose DM and BB coefficients are inconsistent?
Could I say that the higher the coefficient, the higher the overusage?

Best regards,

Wang

rnaseqDTU DRIMSeq • 1.0k views

score 2 · Answer 1 · 2022-07-08

Yes, you are correct AFAIK about the two coef tables. In the DRIMSeq vignette, they have:

Transcript-level analysis based on the beta-binomial model. In this case, each transcript ratio is modeled separately assuming it follows the beta-binomial distribution which is a one-dimensional version of the Dirichlet-multinomial distribution.

The way a DM model works, one level of the multinomial serves as a reference. This is why you get the 0 coefficient for one of the transcripts -- it is serving as the reference level. In fact the last level in whatever order they are provided to DRIMSeq is the one that is chosen if you check the source code:

https://github.com/gosianow/DRIMSeq/blob/master/R/dmDS_fit.R#L81

However, the beta-binomial model tests each level against all the others collapsed into one level, e.g. transcript T vs. not T. So, except in the special case of a two-transcript gene, the DM and BB coefficients will not be equal.
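To make the difference concrete, here is a small numeric sketch. It does not call DRIMSeq; it only mimics the two parameterizations with plain arithmetic. The proportions are made up, and I am assuming a log-ratio to the last transcript for the DM-like coefficient and a logit of "T vs. not T" for the BB-like coefficient, so the numbers illustrate the sign behaviour rather than reproduce DRIMSeq's fitted values.

```python
import math


def logit(p):
    """Log-odds of a proportion p in (0, 1)."""
    return math.log(p / (1.0 - p))


# Hypothetical transcript proportions for one three-transcript gene.
prop_a = [0.6, 0.3, 0.1]  # condition A (baseline)
prop_b = [0.3, 0.4, 0.3]  # condition B

# Assumption: the last transcript is the reference level.
ref = len(prop_a) - 1

# DM-like effect: change in log(p_k / p_ref) from A to B.
dm_like = [
    math.log(b / prop_b[ref]) - math.log(a / prop_a[ref])
    for a, b in zip(prop_a, prop_b)
]

# BB-like effect: change in logit(p_k), i.e. transcript k vs. all others.
bb_like = [logit(b) - logit(a) for a, b in zip(prop_a, prop_b)]

for k, (dm, bb) in enumerate(zip(dm_like, bb_like), start=1):
    print(f"T{k}: DM-like = {dm:+.3f}, BB-like = {bb:+.3f}")
```

In this toy gene the reference transcript (T3) gets a DM-like coefficient of exactly 0 by construction. T2 rises from 0.3 to 0.4, so its BB-like coefficient is positive, but its DM-like coefficient is negative because the reference transcript rose proportionally more. That is the kind of sign disagreement described in the question.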

For simplicity, I would use the BB coefficients to interpret the change in proportion per transcript.