Question: Pair-wise DESeq Analysis with 3 Factor Levels - RNA-Seq
Entering edit mode
Last seen 3.9 years ago

[ Crossposted from Biostars: ]

Hello, very new to RNA-Sequencing, R and DESeq2 so the question might be too simple or repeated so I apologize in advance.

I am working with 24 RNA-Sequencing samples, 8 conditions with 3 biological replicates for each. A part of my data matrix is shown below.

image of count matrix

Each sample represents one of three phenotypes. A wild-type phenotype, and two altered phenotypes. I would like to compare the wild-type phenotype with each of the other phenotypes, and the other two phenotypes with each other. I am confused as to what design formula I should use to generate my DESeq object. I have read the vignettes and beginner's guide but I lack the background to fully understand how the design formulas work. I have included part of my coldata below.

I would like to compare NB and CW, and NB and CCW. I aim to find genes that are differentially expressed as phenotype changes from NB to CW and from NB to CCW. Then I would like to compare CW and CCW. I would like to do this pair-wise for each sample pair and retain that information in the output. So far, I have been doing this by splitting my matrix into pairs of samples and running each pair individually but that is incredibly tedious and was wondering if I could accomplish the same results by using the complete expression matrix and coldata. I have tried only using chirality in the design formula (design = ~ Chirality) and set NB as the reference lever using the relevel() function, but upon using the resultsNames(dds) function I get,

"Intercept" "Chirality_CW_vs_CCW" "Chirality_NB_vs_CCW"

However, I believe I am looking for NB vs CW and NB vs CCW. Also, I have not included the Cell.Type in the design formula but I assume it should be included since I want to compare the samples pair-wise. Any help regarding design formulae for this particular case, understanding design formulae in R, in general, or other comments are greatly appreciated.


Tasnif Rahman

deseq2 design model rnaseq multiple factor levels • 1.3k views
Entering edit mode
Last seen 1 day ago
United States

You can use a design of ~condition (or however you wish to name this variable), and then:

res <- results(dds, contrast=c("condition", "X", "Y"))

Where you can fill in X and Y to be the levels you want to compare.

If you want to include cell type, I need to know more information about how cell type and chirality overlap. You can just paste the entire, this is what I typically recommend users to do to completely describe their experiment.

Entering edit mode

Hi Michael, thank you for your response. I have included the full coldata object.

    Chirality           Cell.Type
S11        NB               RUES2
S12        NB               RUES2
S13        NB               RUES2
S22        CW D5 Cardiac Mesoderm
S23        CW D5 Cardiac Mesoderm
S24        CW D5 Cardiac Mesoderm
S31       CCW D5 Neural Induction
S32       CCW D5 Neural Induction
S33       CCW D5 Neural Induction
S41       CCW        D3 Endoderm 
S42       CCW        D3 Endoderm 
S43       CCW        D3 Endoderm 
S52        CW       SB431542 24hr
S53        CW       SB431542 24hr
S54        CW       SB431542 24hr
S63       CCW         Wnt3a 24hr 
S64       CCW         Wnt3a 24hr 
S65       CCW         Wnt3a 24hr 
S71       CCW               D7 MH
S72       CCW               D7 MH
S73       CCW               D7 MH
S82        CW           IWP2 24hr
S83        CW           IWP2 24hr
S84        CW           IWP2 24hr

There is no correlation between the cell type and the Chirality. The only reason to include cell type would be to get pairwise data for NB vs CW and NB vs CCW for each sample vs the REUS2 cell type, and then further to identify DE genes that overlap between samples for NB vs CW and NB vs CCW respectively. We are trying to identify gene targets that are differentially expressed that might lead to the two different phenotypes (Chirality: CW and CWW) vs the control phenotype (Chirality: NB).


Thank you! 

Tasnif Rahman

Entering edit mode

The cell type looks nested within each chirality. From scanning the column, I don't see anywhere where you could do a blocked or paired analysis. Can you explain what kind of pairing you are imagining?

Entering edit mode

Yeah, they are basically different treatment conditions that pluripotent stem cells are treated with, and each condition leads to a CW or CWW bias. The stem cells themselves have no bias. I would like to compare the gene expression of groups with CW bias to the group with no bias. Do you reckon I should just use "NB" and "CW" as the ~condition and then do another comparison with "NB" and "CCW"? I was thinking of identifying genes differentially expressed between each CW cell type and the NB cell type, and then looking at genes that show similar differential expression between each of those comparisons to identify genes differentially expressed between the NB and CW conditions. So the pairs for NB vs CW would be REUS2 vs Cardiac Mesoderm, REUS2 vs SB431542 and REUS2 vs IWP2 24hr, and similar for the CCW groups. Since the different cell types are cells treated with drugs targetting different pathways, or induction of different lineage differentiations, I am a bit concerned about grouping all CW groups together for the comparison, since each would have its own gene expression profile for genes not involved in the phenotypic bias of interest. The goal however, is to identify gene expression patterns that are common between all biased samples with so I guess that should not matter. Any insights would be appreciated, and thank you for your time, Michael. 



Entering edit mode

To generalize, let me call NB, CW, and CWW just A, B and C, levels of "condition". You can compare C to A and B to A in one design, you don't need to rerun DESeq(), you just use the "contrast" argument of results().

That doesn't deal with the correlations among the cell type though. You can't account for these using fixed effects. The method for doing this (to compare across groups of samples, but account for various levels that are nested within) is the duplicateCorrelation() function in limma.

As far as doing post-hoc tests to find which cell type, and looking at more than just C vs A and B vs A, you should make sure that whatever post-hoc test you do, you control the false positives. We do not have functionality for this in DESeq2, so you may want to discuss with a statistician.

Entering edit mode

I understand. How should I set up the single design for C vs A and B vs A? Can I have more than one contrast comparison in one command? Also, I only used "design = ~ Chirality" to generate the dds, should that be okay with such comparisons? Thank you again, really appreciate the help. 


Entering edit mode

My answer is that this experiment, with the questions you want to answer, can’t really be analyzed with fixed effects and so you need to use limma.


Login before adding your answer.

Traffic: 356 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6