Question: Extracting VST matrix with batch effects removed
A40 wrote:

Hi all,

I was wondering whether somebody could clarify something for me about variance stabilising transformation (VST) and batch correction, and about subsequently extracting a matrix of batch-corrected VST counts.

I have run an experiment as follows:

library(DESeq2)
dds <- DESeqDataSetFromMatrix(countData = countdata, colData = sampledata, design = ~ Organ + Extraction + Age)
dds <- DESeq(dds, reduced = ~ Organ + Extraction, test = "LRT", parallel = TRUE)

Extraction is the batch (for this run there are only two batches, 1 and 2), and I only want to see DE genes as a result of Age. Organ and batch (Extraction) effects are therefore included in the reduced model. I am happy with the inclusion of Extraction in the reduced model: I cannot see any clear batch-related effects when plotting the PCA, and the batches are well mixed.

However, I would like to do further downstream analysis outside DESeq2, so I need a log- or VST-transformed counts table for that analysis. These effects are modelled within DESeq2, but if I run:

vsd <- vst(dds)

and then extract the counts from this, are the batch effects across samples taken into account? Or does the variance stabilisation automatically take care of this?

If not, is there a way to extract a VST of counts with batch effects accounted for?

My downstream analysis is machine learning classification. Until now I have been using a VST counts matrix, which has caused no real issues during classification across all samples, ages, etc. However, I will soon have an additional 3 batches, so I want to make sure these effects are minimised as far as possible in future.

Many thanks!!

deseq2 batchcorrection • 71 views
written 6 weeks ago by A40

Hi,

Have a look at this post: https://support.bioconductor.org/p/62954/

written 6 weeks ago by andres.firrincieli30

Thank you! So by running removeBatchEffect in limma, the mean shifts are removed in the same way they would be when including batch in the reduced model? Is this a correct interpretation of Michael's comment?

So then, would:

library(limma)
newCounts <- removeBatchEffect(assay(vsd), vsd$Extraction)  # returns a plain matrix
write.csv(newCounts, file = "batch corrected.csv")  # so no assay() call is needed here

produce the counts matrix I am after?

written 6 weeks ago by A40

Yes, we actually have it in the FAQ now:

http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#why-after-vst-are-there-still-batches-in-the-pca-plot

written 6 weeks ago by Michael Love26k

Brilliant! Thank you so much! I will proceed like this.

Is the code above for generating the new counts matrix OK?

Thanks!

written 6 weeks ago by A40

I said yes and it's also what's listed in the link I sent, no?

written 6 weeks ago by Michael Love26k

You did, apologies. I wasn't sure whether the "yes" referred to the whole statement or only to the interpretation about mean shift removal. Thanks again!

written 6 weeks ago by A40