Question: Using DESeq2 with batch effects and classification as downstream analysis
5 weeks ago by
Maryam0 wrote:

I'm using DESeq2 for differential gene expression analysis, and for downstream analysis I'm going to classify healthy vs. disease states using a support vector machine. The data comprise healthy and disease samples, which were sequenced in 13 batches. I have 2 questions:

1) I used design = ~ batch + condition, and when I ran resultsNames(dds), the results listed only comparisons of each batch against the first batch, whereas I'm looking for all differentially expressed genes between healthy and disease states across all samples.

How can I find all differentially expressed genes between healthy and disease states? Should I ignore batch effects?


2) DESeq2 doesn't remove batch effects; it only models them. So how can I use this in my classification? I'm using the FPKM values of the differentially expressed genes as input to the support vector machine.

5 weeks ago by
Michael Love20k wrote:

1) The fact that one batch is the reference for the coefficients doesn't affect your results at all. All of these batch_2_vs_1,...,batch_13_vs_1 coefficients are just controlling for differences across batch. One has to be chosen as the reference. In short, don't worry about the way that these look, this is normal, and ~batch + condition is the correct approach.
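To make the point above concrete, here is a minimal sketch of pulling out the healthy vs. disease comparison from a ~ batch + condition design. It runs on simulated data from makeExampleDESeqDataSet, uses 3 batches instead of 13, and assumes the condition levels are named "healthy" and "disease"; those names and the batch layout are illustrative assumptions, not taken from the original data.

```r
library(DESeq2)

set.seed(1)
# Simulated example data (12 samples, conditions "A" and "B"); illustration only
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Assumed level names: relabel A/B as healthy/disease, with healthy as reference
dds$condition <- factor(ifelse(dds$condition == "A", "healthy", "disease"),
                        levels = c("healthy", "disease"))

# Hypothetical batch layout: 3 batches, each containing both conditions
dds$batch <- factor(rep(c("1", "2", "3"), times = 4))

design(dds) <- ~ batch + condition
dds <- DESeq(dds)

# Lists the batch coefficients (each batch vs. the reference batch)
# and the condition coefficient, condition_disease_vs_healthy
resultsNames(dds)

# Disease vs. healthy across all samples, controlling for batch
res <- results(dds, contrast = c("condition", "disease", "healthy"))
head(res[order(res$padj), ])
```

With condition as the last variable in the design, calling results(dds) without arguments should return the same comparison; spelling out the contrast just makes the choice explicit.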

2) I've just added a FAQ to the development branch, as I am getting this question quite a lot:
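The FAQ text itself is not reproduced in this thread. As a rough sketch of one approach that is often used in this situation (an assumption here, not a quote from the FAQ), batch-associated variation can be removed from variance-stabilized data with limma's removeBatchEffect before passing the values to a classifier. The simulated data, the 3-batch layout, and the level names below are hypothetical.

```r
library(DESeq2)
library(limma)

set.seed(1)
# Simulated example data; the batch layout and level names are assumptions
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
dds$condition <- factor(ifelse(dds$condition == "A", "healthy", "disease"),
                        levels = c("healthy", "disease"))
dds$batch <- factor(rep(c("1", "2", "3"), times = 4))
design(dds) <- ~ batch + condition

# Variance-stabilized values; blind = FALSE lets the transformation use the design
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)
mat <- assay(vsd)

# Protect the condition effect while removing the batch effect
keep <- model.matrix(~ condition, data = as.data.frame(colData(vsd)))
mat_adj <- limma::removeBatchEffect(mat, batch = vsd$batch, design = keep)

# Samples in rows, genes in columns: the usual layout for classifier input
x <- t(mat_adj)
dim(x)
```

The adjusted matrix is meant for visualization or downstream tools such as a classifier; the differential expression test itself still uses the raw counts with ~ batch + condition.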


Many thanks

written 5 weeks ago by Maryam0