How to set contrasts for a design with two 2-level factors and an interaction term
@solgakarbitechnionacil-6453
Last seen 7.7 years ago
European Union

Hi,

I have a question about how to define contrasts when the design includes two 2-level factors and an interaction term: design = ~ genome + condition + condition:genome.

The resultsNames(dds):

"Intercept"               "genome_yb1_vs_v252"      "condition_mice_vs_log"   "genomeyb1.conditionmice"

I need the following comparisons:

1. mice vs log in all the samples

2. mice vs log only in v252 samples

3. mice vs log only in yb1 samples

The way I defined the contrast for each comparison:

1. contrast = c("condition","mice","log")

2. contrast = list("condition_mice_vs_log")

3. contrast = list(c("condition_mice_vs_log","genomeyb1.conditionmice"))

I get the same results for the first 2 comparisons. For which of the two comparisons is that contrast correct, and how should I define the contrast for the other one?

Thank you,

Karen

deseq2 deseq rna-seq • 2.3k views
@mikelove
Last seen 11 hours ago
United States
Hi,

See this similar post: "deseq2: coding 2x2 design".

Your desired contrast #1 (mice vs log in all the samples) might be described as the average effect across the two groups, and you can use a numeric contrast as described in the post above.
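To make the numeric contrast concrete, here is a minimal base R sketch that derives the contrast vectors for the three comparisons from the model matrix. It assumes the coefficient order shown in resultsNames(dds) above (Intercept, genome, condition, interaction) and that v252 and log are the reference levels; the `samples` table and the `cell` helper are hypothetical and only illustrate the arithmetic.

```r
# Hypothetical 2x2 layout using the factor levels from the question;
# v252 and log are taken as the reference levels.
samples <- expand.grid(
  genome    = factor(c("v252", "yb1"), levels = c("v252", "yb1")),
  condition = factor(c("log", "mice"), levels = c("log", "mice"))
)

# Columns: Intercept, genome (yb1), condition (mice), interaction
X <- model.matrix(~ genome + condition + genome:condition, data = samples)

# Hypothetical helper: model matrix row for one genome/condition cell
cell <- function(g, cond) {
  X[samples$genome == g & samples$condition == cond, ]
}

# mice vs log within each genome, as differences of model matrix rows
mice_vs_log_v252 <- unname(cell("v252", "mice") - cell("v252", "log"))
mice_vs_log_yb1  <- unname(cell("yb1",  "mice") - cell("yb1",  "log"))

# Average condition effect across the two genomes
mice_vs_log_avg <- (mice_vs_log_v252 + mice_vs_log_yb1) / 2

print(mice_vs_log_v252)  # condition main effect only
print(mice_vs_log_yb1)   # main effect plus interaction
print(mice_vs_log_avg)   # main effect plus half the interaction

# With a fitted object `dds` (assumed, not created here), the averaged
# effect could then be requested as:
#   results(dds, contrast = mice_vs_log_avg)
```

Under these assumptions, the main effect alone (list("condition_mice_vs_log")) corresponds to mice vs log in the reference genome v252, and adding the interaction term gives the effect in yb1. The averaged effect puts a weight of 0.5 on the interaction term.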
@solgakarbitechnionacil-6453
Last seen 7.7 years ago
European Union

Hi Michael,

Thanks for the help, and fast reply.  

I ran all the comparisons and have another question.

When comparing mice vs log only in the yb1 samples, I get fold change and adjusted p-value statistics for genes that have no expression in any of these samples. I set contrast = list(c("condition_mice_vs_log","genomeyb1.conditionmice")).

These genes are expressed in other samples in the dataset that are not part of this comparison. Can you explain why those samples affect the results? It is confusing to get such output.

For example, for gene VV2026 all 4 samples in this differential expression test had no expression (normalized counts of 0), yet baseMean = 84.13789 and the statistics were: log2FoldChange -0.47838; pvalue 0.619603; padj 0.714385.

Thanks a lot,

Karen

 


Hi Karen,

The nonzero LFC here arises because models with an interaction term include shrinkage on the interaction term but not on the main effects. The inference is borrowing strength from the other group and from the other genes. The interaction effects were found to be small over all genes, and the condition effect was found to be large for this gene in the other group, so the model is essentially predicting that if the counts for yb1 were to rise above zero, a negative LFC would be likely.

To avoid such situations, you can either run a model with a single factor, "~ group", where group encodes the combinations, for example "mice_yb1", etc.; or you can set betaPrior=FALSE to turn off the shrinkage of interaction terms. Then the LFCs for your contrast will be closer to zero.
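The two options can be sketched roughly as follows. This assumes a raw count matrix `counts` and a sample table `coldata` with `genome` and `condition` columns; the example `coldata`, the function names, and the level naming scheme ("mice_yb1", "log_yb1", ...) are hypothetical. The DESeq2 calls are wrapped in functions so that nothing is fitted until you call them with your own data.

```r
# Hypothetical sample table; in practice this would be your own colData.
coldata <- data.frame(
  genome    = factor(rep(c("v252", "yb1"), each = 4)),
  condition = factor(rep(c("log", "mice"), times = 4))
)

# One combined factor with a level per condition/genome combination
coldata$group <- factor(paste(coldata$condition, coldata$genome, sep = "_"))
print(levels(coldata$group))

# Option 1: single-factor design, comparing two cells directly.
# `counts` is assumed to be a raw count matrix whose columns match coldata rows.
mice_vs_log_in_genome <- function(counts, coldata, genome) {
  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = counts, colData = coldata, design = ~ group
  )
  dds <- DESeq2::DESeq(dds)
  DESeq2::results(
    dds,
    contrast = c("group", paste0("mice_", genome), paste0("log_", genome))
  )
}

# Option 2: keep the interaction design but turn off the beta prior,
# so interaction terms are not shrunk.
fit_without_beta_prior <- function(counts, coldata) {
  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = counts, colData = coldata,
    design = ~ genome + condition + genome:condition
  )
  DESeq2::DESeq(dds, betaPrior = FALSE)
}
```

For instance, `mice_vs_log_in_genome(counts, coldata, "yb1")` would give mice vs log using only the yb1 cells of the design, which avoids combining a main effect with a shrunken interaction term.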
