Question

DESeq2 PCA plot on fitted values

0

Entering edit mode

Sachin Pundhir • 0

@sachin-pundhir-6921

Last seen 9.5 years ago

Denmark

I am analyzing a RNA-seq data having some batch effect using DESeq2.

To account for the effect, I add it as a covariate to the regression model.

Next, to determine, how well the effect is corrected, I plot
two PCA plots

1. plotMDS(assays(dds)[["counts"]

] - raw count

2. plotMDS(assays(dds)[["mu"]] - fitted values.

My question is: Is it the correct way to check how well the regression
model worked on accounting for the batch effect?

deseq • 3.1k views

ADD COMMENT • link updated 9.5 years ago by Michael Love 41k • written 9.5 years ago by Sachin Pundhir • 0

0

Entering edit mode

*moved it down

ADD REPLY • link 5.9 years ago C T ▴ 140

score 1 · Answer 1 · 2014-10-26

1

Entering edit mode

Michael Love 41k

@mikelove

Last seen 6 hours ago

United States

No the "mu" here is not the right matrix to look at, as it contains the batch effect terms. In ?DESeq, you can rearrange the formula to get:

mu_ij = s_j 2^(beta_intercept + beta_batch + ...)

It's not easy to observe how the model corrects for batch effect, as it happens inside the model, in the balance of the coefficients from batch and the other variables. The model accounts for mean shift (in log common scale counts) which can be explained by batch.

You might look at the difference between the counts and fitted values on the log common scale to see if they associate with batch:

log(counts(dds, normalized=TRUE) + 1) - log(t( t(assays(dds)[["mu"]]) / sizeFactors(dds) ) + 1)

ADD COMMENT • link 9.5 years ago Michael Love 41k

0

Entering edit mode

Thanks, based on the suggestion, I made two PCA plots (https://www.dropbox.com/s/nk55yr1cgvux6hr/DESeq2_pca_plot.pdf?dl=0)

a) on normalized read counts

plotMDS(log(counts(dds, normalized=TRUE) + 1))

b) on the difference between the counts and fitted values

plotMDS(log(counts(dds, normalized=TRUE) + 1) - log(t( t(assays(dds)[["mu"]]) / sizeFactors(dds) ) + 1))

By looking at them, can we infer that the batch effect on WT_1 and KO_1 has been accounted for? or else like in Limma where we can make PCA plots to see how well the batch effect has been removed, is there a way to visualize or quantify how well the batch effect has been accounted for by the model in DESeq2.

ADD REPLY • link 9.5 years ago Sachin Pundhir • 0

0

Entering edit mode

With only 2 samples per batch, it's hard to see how much of a batch effect there is in the first place. The batch 1 samples are different from the others, but not in the same direction, and the batch 2 and 3 samples do not tightly cluster by batch before or after. This looks ok to me.

ADD REPLY • link 9.5 years ago Michael Love 41k

0

Entering edit mode

Sorry to revive this old thread. But, I just want to make sure.

So, if I do

plotMDS(log(t( t(assays(dds)[["mu"]]) / sizeFactors(dds) ) + 1)

I am using the fitted values that include the batch effect, correct? So, if samples are closer together in this MDS plot, it should represents how similar the samples are after the variance associated with batch effect is accounted for, correct?

ADD REPLY • link 5.9 years ago C T ▴ 140

0

Entering edit mode

Can you make a new post and explain more about your dataset (e.g how many samples, what is your design, etc.) and your goal?

ADD REPLY • link 5.9 years ago Michael Love 41k