Using DESeq2 with batch effects and classification as downstream analysis
Maryam • 0
@maryam-17785

I'm using DESeq2 for differential gene expression analysis, and for downstream analysis I'm going to classify healthy vs. disease states using a support vector machine. The data comprise healthy and disease samples that were sequenced in 13 batches. I have two questions:

1) I used design = ~ batch + condition, and when I ran resultsNames(dds) I found that the results only compare each batch against the first batch, whereas I'm looking for all differentially expressed genes between healthy and disease states across all samples.

How can I find all differentially expressed genes between healthy and disease states? Should I ignore batch effects?

 

2) DESeq2 doesn't remove batch effects, it only models them, so how can I use this in my classification? I'm using the FPKM values of the differentially expressed genes as input to the support vector machine.

@mikelove

1) The fact that one batch is the reference for the coefficients doesn't affect your results at all. All of these batch_2_vs_1,...,batch_13_vs_1 coefficients are just controlling for differences across batch. One has to be chosen as the reference. The healthy vs. disease comparison is the condition coefficient, which results(dds) returns by default because condition is the last variable in the design. In short, don't worry about the way that these look, this is normal, and ~batch + condition is the correct approach.
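To make point 1 concrete, here is a minimal sketch on simulated data. The three-batch layout, the level names, and the object names (dds_ref3, res_ref3) are assumptions for illustration only, not the original poster's data. It fits ~ batch + condition twice with different reference batches and extracts the disease vs. healthy comparison from each fit.

```r
library(DESeq2)

set.seed(1)
# Simulated counts (1000 genes x 12 samples) from DESeq2's example generator
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Hypothetical layout: 3 batches, each containing healthy and disease samples
dds$batch <- factor(rep(c("1", "2", "3"), each = 4))
dds$condition <- factor(rep(c("healthy", "disease"), times = 6),
                        levels = c("healthy", "disease"))
design(dds) <- ~ batch + condition

# Same data, but with batch 3 as the reference level instead of batch 1
dds_ref3 <- dds
dds_ref3$batch <- relevel(dds_ref3$batch, ref = "3")

dds <- DESeq(dds)
dds_ref3 <- DESeq(dds_ref3)

# Lists the batch coefficients plus one coefficient for condition
resultsNames(dds)

# Disease vs. healthy, controlling for batch, from each fit
res <- results(dds, contrast = c("condition", "disease", "healthy"))
res_ref3 <- results(dds_ref3, contrast = c("condition", "disease", "healthy"))

# Largest difference in the condition log2 fold change between the two fits;
# this is expected to be close to zero
max(abs(res$log2FoldChange - res_ref3$log2FoldChange), na.rm = TRUE)
```

The batch coefficients change with the choice of reference, but the condition comparison should not, which is why the batch_*_vs_1 names can be ignored when the question is healthy vs. disease.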

2) I've just added a FAQ to the development branch, as I am getting this question quite a lot:

https://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#why-after-vst-are-there-still-batches-in-the-pca-plot
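For readers who do not follow the link: the FAQ describes removing batch variation from variance-stabilized data with limma's removeBatchEffect, mainly for visualization. The sketch below follows that pattern on simulated data. The batch layout and the names mat and mat_adj are assumptions for illustration, and the thread does not say whether the adjusted values are appropriate as classifier input, so treat that as a separate decision.

```r
library(DESeq2)
library(limma)

set.seed(1)
# Simulated counts; 2000 genes so that vst() has enough rows to subsample
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)

# Hypothetical layout: 3 batches, each containing healthy and disease samples
dds$batch <- factor(rep(c("1", "2", "3"), each = 4))
dds$condition <- factor(rep(c("healthy", "disease"), times = 6),
                        levels = c("healthy", "disease"))
design(dds) <- ~ batch + condition

# Variance-stabilizing transformation; blind = FALSE makes it use the design
vsd <- vst(dds, blind = FALSE)

# Remove variation associated with batch while protecting the condition effect
mat <- assay(vsd)
mm <- model.matrix(~ condition, colData(vsd))
mat_adj <- limma::removeBatchEffect(mat, batch = vsd$batch, design = mm)

# Store the adjusted matrix back, e.g. for plotPCA(vsd, intgroup = "batch")
assay(vsd) <- mat_adj
```

Passing the condition design matrix to removeBatchEffect keeps the healthy vs. disease signal from being removed along with the batch differences.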

Many thanks
