Question: Unusual skewed MA plot at lower expression
1
2.5 years ago by

Hello,

I have RNA seq data (paired end) isolated from exosomes from healthy and disease and have performed diff express using DESeq2. However, I have unusual skweing in my MA plot. I have not come accorss this before in any of my other experiments (or in the microvesicles from the same donors). There are 2984 DE genes. I was wondering whether this could be a consequence of uneven amounts of mRNA per vesicle? If so, what would be the best way to handle data of this type?

I noticed that someone else had an issue similar to this: DESeq2 MA plot of encode data

normalization deseq2 • 510 views
modified 2.5 years ago • written 2.5 years ago by adam.cribbs10

I would recommend re-running DESeq2 without the beta prior and then re-generating the MA plot in order to get a better picture of the skew. It's likely that the upward skew goes all the way to the left of the plot in actuality, but the left edge has been squeezed out by the beta prior.

Those triangles indicate simply that the LFC goes off the plot as set by ylim, it's not the beta prior that squeezed them down.

That said, it would be interesting to at least plot the MLE fold changes in addition, which can be done simply with:

res <- results(dds, addMLE=TRUE)
​plotMA(res, MLE=TRUE)

Oh, I wasn't referring to the triangles, I was referring to the left side of the plot, where it seems that the skew goes back down to zero. I think the apparent hump shape of the skew is probably a product of the beta prior rather than a property of the data.

1

Oh gotcha, yeah those are going to be positive count in the non-reference group and ~0 in the reference group, leading to very large or Inf LFC.

Which is why I'm suspecting there could be confounding of the library sizes with the condition, which is problematic for doing any kind of inference on the low count genes.

Thanks for the comments. I re-analysed the data without the beta prior and have the following MA plot:

The summary stats for read depth are as follows:
Disease:
Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
18.16   18.98   21.58   21.32   23.47   24.37
Healthy:
Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
20.84   21.56   22.82   23.70   25.20   28.08 

So it looks like the sequencing depth is not confounded.

I don't know what it is about the many genes that have 4x, 16x, 64x counts in disease vs healthy. You can pick them out and do something like GO analysis, or simply look at some example genes using plotCounts() and then figuring out what these genes may have in common.

To test for genes with large fold change, you can do, e.g.:

res <- results(dds, lfcThreshold=4)
Answer: Unusual skewed MA plot at lower expression
0
2.5 years ago by
Michael Love22k
United States
Michael Love22k wrote:

Can you summarize the sequencing depth across healthy and disease?

sapply(split(colSums(counts(dds))/1e6, condition), summary)

Content
Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.