DESeq2 - Batch effect correction in 2 conditions + 2 genotypes + interaction term design
1
0
Entering edit mode
@barbaramariotti-8083
Last seen 10 days ago
Italy

Dear all, I am analyzing RNAseq data with DESeq2 for a dataset that resemble the type "2 conditions, 2 genotypes and interaction term"; specifically I have healthy donors and patients for both male and female population. I am interested in obtaining:

1- genes modulated in male patients as compared to male controls 2- genes modulated in female patients as compared to female controls 3- genes for which the disease effect is different across sex

Here the code used for this analysis

dds<- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~SEX+DISEASE+SEX:DISEASE)
dds$DISEASE <- relevel(dds$DISEASE, ref="HD")
dds$SEX <- relevel(dds$SEX, ref="M")
dds <- DESeq(dds,fitType="local")
results(dds, name="SEXF.DISEASEDISEASE", pAdjustMethod = "fdr")  #DISEASE effect is different across SEX
results(dds, list( c("DISEASE_DISEASE_vs_HD","SEXF.DISEASEDISEASE") ), pAdjustMethod = "fdr") #effect of DISEASE on F
results(dds, contrast=c("DISEASE","DISEASE","HD"), pAdjustMethod = "fdr") #effecto of DISEASE on M


At this point I decide to control some biological variable known to influence my results. I considered them as batch effect. Here the code

dds1<- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~AGE+COMORBIDITY+SMOKING+DISEASE_SEVERITY_GOLD+SEX+DISEASE+SEX:DISEASE)
dds1$DISEASE <- relevel(dds1$DISEASE, ref="HD")
dds1$SEX <- relevel(dds1$SEX, ref="M")
dds1 <- DESeq(dds1,fitType="local")
results(dds1, name="SEXF.DISEASEDISEASE", pAdjustMethod = "fdr")  #DISEASE effect is different across SEX
results(dds1, list( c("DISEASE_DISEASE_vs_HD","SEXF.DISEASEDISEASE") ), pAdjustMethod = "fdr") #effect of DISEASE on F
results(dds1, contrast=c("DISEASE","DISEASE","HD"), pAdjustMethod = "fdr") #effecto of DISEASE on M


But when I compared the results from dds and dds1 I show that they are exactly the same.

Considering that we know that variable such as Age, disease severity and comorbidity affect our data I was wondering: where is the mistake in my code?

Should be better to work using LRT with something like the folowing?

dds1<- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~AGE+COMORBIDITY+SMOKING+DISEASE_SEVERITY_GOLD+SEX+DISEASE+SEX:DISEASE)
dds1$DISEASE <- relevel(dds1$DISEASE, ref="HD")
dds1$SEX <- relevel(dds1$SEX, ref="M")
dds1 <- DESeq(dds1,fitType="local", test="LRT", reduced=~AGE+COMORBIDITY+SMOKING+DISEASE_SEVERITY_GOLD)


Thanks a lot for all your suggestiona and explanation.

Barbara

DESeq2 • 164 views
0
Entering edit mode
@mikelove
Last seen 2 days ago
United States

For questions about statistical analysis and interpretation, I recommend working with a local statistician or someone familiar with linear models in R.