DESeq2 modeling different organisms as batch effects
mjldehoon • 0

Hi all,

We have gene expression data for 12 cell types both in human and mouse, and we would like to use DESeq2 to find genes that are differentially expressed between human and mouse in the same cell type. As a simplified example, suppose for a particular gene we have these expression levels:

celltype human mouse
A 100 200
B 80 160
C 120 240
D 100 200
E 60 120
F 40 400
G 0 0
H 200 400
I 120 240
J 150 300
K 180 360
L 20 40

That is, the expression level is generally twice as high in mouse as in human, but in cell type F the expression in mouse is ten times higher than in human. We are primarily interested in the interaction effect: we want to identify this gene as differentially expressed in cell type F, and we do not care that its expression in mouse is in general twice its expression in human (which we consider a batch effect).
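To make the arithmetic of this example concrete, here is a minimal sketch in plain Python (not DESeq2 code, and not a statistical test). It computes the mouse/human log2 fold change per cell type, treats the median across cell types as the common organism shift, and reports the deviation from that shift. The names `expression`, `common_shift` and `deviation` are made up for illustration, and skipping cell types with zero counts is an assumption made here so that the ratio is defined.

```python
import math
from statistics import median

# Example values copied from the table above: celltype -> (human, mouse).
expression = {
    "A": (100, 200), "B": (80, 160), "C": (120, 240), "D": (100, 200),
    "E": (60, 120), "F": (40, 400), "G": (0, 0), "H": (200, 400),
    "I": (120, 240), "J": (150, 300), "K": (180, 360), "L": (20, 40),
}

# Per-cell-type log2 fold change (mouse over human).
# Assumption: cell types with a zero count are skipped, since the ratio is undefined.
lfc = {
    celltype: math.log2(mouse / human)
    for celltype, (human, mouse) in expression.items()
    if human > 0 and mouse > 0
}

# The organism-wide shift (the "batch effect" in the question), taken here as the median.
common_shift = median(lfc.values())

# What remains after removing the common shift is the cell-type-specific part,
# which is what an interaction term is meant to capture.
deviation = {celltype: value - common_shift for celltype, value in lfc.items()}

outlier = max(deviation, key=lambda celltype: abs(deviation[celltype]))
print(f"common shift (log2): {common_shift:.2f}")
print(f"largest deviation: cell type {outlier}, {deviation[outlier]:.2f} log2 units")
```

For this gene the common shift is one log2 unit (a factor of two), and only cell type F deviates from it, by log2(5), i.e. a further factor of five.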

When calling DESeqDataSetFromMatrix, should the design include the main effects only (i.e. "~ organism + celltype"), or should we also include the interaction effect (i.e. "~ organism + celltype + organism:celltype")? And when calling the results function, how should we specify the interaction effect? I guess it should be something like c("organism:celltype", "human:celltypeX", "mouse:celltypeX"), where celltypeX loops from A to L.


Thank you,



deseq2 batch effect across species • 380 views

The hardest part of a cross-species comparison is quantification, because each species has a different genome and transcriptome. There have been some posts here about that, as well as papers from e.g. Yoav Gilad's group. I won't go into it, because it is not related to the DESeq2 software, but I want to mention that it seems to me to be the most critical part of this analysis, and a pitfall that could lead to spurious results if you aren't careful.

You didn't mention having any replicates. It's not possible to find cell-type-specific differences without replicates in this case.

If you did have replicates, you could use the design ~species + celltype + species:celltype and perform a likelihood ratio test (LRT) with the reduced model ~species + celltype. This would find genes where the difference across species is cell-type specific. Then you could perform a stage-wise analysis using the stageR package to detect in which cell types there are differences.
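A rough way to see both points (why replicates are needed, and what the LRT compares) is to count model coefficients. The sketch below is plain Python, not DESeq2 code. It assumes the usual treatment-contrast coding with one reference level per factor, 2 species and 12 cell types as in the question; the function names are hypothetical.

```python
N_SPECIES = 2     # human, mouse
N_CELLTYPES = 12  # cell types A to L


def n_coefficients(include_interaction):
    """Number of coefficients under treatment-contrast coding (assumed here)."""
    # intercept + species main effect + cell type main effects
    n = 1 + (N_SPECIES - 1) + (N_CELLTYPES - 1)
    if include_interaction:
        # one interaction term per non-reference species/cell-type combination
        n += (N_SPECIES - 1) * (N_CELLTYPES - 1)
    return n


def residual_df(replicates, include_interaction=True):
    """Samples minus coefficients, for a balanced design with the given replicates."""
    n_samples = N_SPECIES * N_CELLTYPES * replicates
    return n_samples - n_coefficients(include_interaction)


full = n_coefficients(include_interaction=True)      # ~species + celltype + species:celltype
reduced = n_coefficients(include_interaction=False)  # ~species + celltype

print(f"full model: {full} coefficients, reduced model: {reduced}")
print(f"coefficients tested by the LRT: {full - reduced}")
for replicates in (1, 2, 3):
    print(f"{replicates} replicate(s): residual df = {residual_df(replicates)}")
```

With a single sample per species and cell type, the full model has as many coefficients as there are samples, so no residual degrees of freedom are left to estimate variability. The LRT itself asks whether the 11 interaction coefficients are jointly needed.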

