DESeq2: 5 Conditions (B vs C) vs (D vs B)
1
0
Entering edit mode
@fischer-philipp-18490
Last seen 23 months ago
Austria

DESeq2: 5 Conditions (B vs C) vs (D vs B)

Consider a setup with 5 Conditions A-E, where E is the untreated condition and hence the reference label.

I want to know which genes are upregulated in B vs C but are downregulated in D vs B. All samples should be compared to E (the untreated condition).

coldata
        condition
    A_1 "A"      
    A_2 "A"      
    A_3 "A"      
    B_1 "B"      
    B_2 "B"      
    B_3 "B"      
    C_1 "C"      
    C_2 "C"      
    C_3 "C"      
    D_1 "D"      
    D_2 "D"      
    D_3 "D"      
    E_1 "E"      
    E_2 "E"      
    E_3 "E"

    dds_f <- DESeqDataSetFromMatrix(countData = data_f, 
                                    colData = coldata,
                                    design = ~ condition)

    dds_f$condition <- relevel(dds_f$condition, ref = "E")
    dds_f <- DESeq(dds_f)
    resultsNames(dds_f)

    [1] "Intercept"        "condition_A_vs_E" "condition_B_vs_E" "condition_C_vs_E" "condition_D_vs_E"

    upd <- results(dds_f, listValues = c(1, -1/2), 
                           contrast = list(c("condition_B_vs_E"), c("condition_C_vs_E", "condition_D_vs_E")))
    rn <- rownames(upd[!is.na(upd$padj) & upd$padj <= 0.05 & upd$log2FoldChange >= 1, ])

My questions are:

1.) I want to find genes which are downregulated in (condition_D_vs_E vs condition_B_vs_E) but upregulated in (condition_B_vs_E vs condition_C_vs_E)

2.) Is this the right comparison (condition_B_vs_E vs condition_C_vs_E) vs (condition_D_vs_E vs condition_B_vs_E)?

3.) Does the contrast I choose answer this question?

*2.) or does one has to interpret this contrast like this: On average condition_B_vs_E is smaller or higher than condition_C_vs_E and condition_D_vs_E? Meaning that it could happen that this could result in a positive log2 fold change even though the gene expression levels of condition_D_vs_E are actually higher than in condition_B_vs_E? Which is because of the average of condition_D_vs_E with condition_C_vs_E, which in this case contains very low expressed genes.

deseq2 contrast ratio of ratio • 1.8k views
ADD COMMENT
3
Entering edit mode
@mikelove
Last seen 6 hours ago
United States

This comes up a few times on the support forum. Comparing X vs Y, with both compared to Z, is equivalent to X vs Y alone. The algebra on the log scale is:

(X - Z) - (Y - Z) = X - Y

DESeq2 can form the contrasts for you if you just do:

results(dds, contrast=c("condition","D","B"))

You don't have to bother with the list here.

If you want downregulated in D vs B but upregulated in B vs C, you should build two results tables, the first one for D vs B and the second one for B vs C, and then just take the intersection. You can specify altHypothesis="greaterAbs" or "lessAbs" to perform one sided testing.

ADD COMMENT
0
Entering edit mode

Thank you for your answer.

Just for confirmation: The code would be like this (with lfcThreshold = 0). And the results would just be a list.

m1 <- results(dds_f, contrast = c(0, 0, 1, -1, 0)) #B vs C
m2 <- results(dds_f, contrast = c(0, 0, -1, 0, 1)) #D vs B
rn.m1 <- rownames(m1)[m1$log2FoldChange >= 1 & m1$padj  <=  0.05 & !is.na(m1$padj)]
rn.m2 <- rownames(m2)[m2$log2FoldChange <= -1 & m2$padj  <=  0.05 & !is.na(m2$padj)]

un <- intersect(rn.m1, rn.m2)
head(un)
[1] "Klf6"   "Tbx2"   "Trim25" "Pparg"  "Slc5a5" "Ccl3"  

Is there also a way to compare resulting in a log2 fold change? Or am I getting something wrong?

Thank you in advance.

ADD REPLY
0
Entering edit mode

Sorry I mistyped above. To get a one sided test, above 1 or below -1, you should do lfcThreshold=1 and altHypothesis="greaterAbs" or altHypothesis="lessAbs". You can use a numeric contrast, but it's safer to use the code I had above, because it will always be correct even if you happened to relevel the variables.

You can intersect on the rownames and then index the two results objects with the rownames vector:

res[idx,]

ADD REPLY
0
Entering edit mode

Closely related I stuck on another question Is it possible to compare (A vs B) vs (C vs D) So imagine 2 cell types (CT) and 2 conditions (CD) (CT1CD1 vs CT1CD2) vs ( CT2CD1 vs CT2CD2)

Thank you for answering.

ADD REPLY
0
Entering edit mode

Sure, you can do that with the list style on contrast. Just rearrange so that A and D are in the numerator and B and C in the denominator.

ADD REPLY
0
Entering edit mode

Hello I made some typos but I corrected them now - sorry for that. What I want is: (CT1CD1 vs CT1CD2) vs ( CT2CD1 vs CT2CD2) How I got it is:

dds <- DESeqDataSetFromMatrix(countData = cnt_merge_all,
                              colData = coldata, 
                              design = ~ batch + condition)
dds$condition <- factor(dds$condition, levels = c("CT1_CD1", "CT1_CD2", "CT2_CD1", "CT2_CD2"))
dds$condition <- relevel(dds$condition, ref = "CT1_CD1")
dds <- DESeq(dds)
res <- results(dds, contrast = list(c("CT1_CD1", "CT1_CD2"), c("CT2_CD1", "CT2_CD2")))

Which results in:

Fehler in checkContrast(contrast, resNames) : all elements of the contrast as a list of length 2 should be elements of 'resultsNames(object)'

within resultsNames(dds) is just one desired contrast, which is: condition_CT1CD1 vs CT1CD2 but condition_CT2CD1 vs CT2CD2 is missing.

What is it i did not get? Again thank you.

ADD REPLY
3
Entering edit mode

Be careful about the terms here, and read carefully over the help pages. I said above to put A and D in the numerator. You have put A and B in the numerator and C and D in the denominator.

See the help page for ?results:

a list of 2 character vectors: the names of the fold changes for the numerator, and the names of the fold changes for the denominator.

Just write out what you want on paper:

(A / B) / (C / D)

that means A and D are in the top and B and C go in the bottom:

(A * D) / (B * C)

So the first element of the list should be c("A", "D") and the second element of the list should be c("B", "C").

For getting all the levels of condition in the resultsNames(dds), you can add a 0 to the design. Also, because you have batch, this is one of the rare cases where you want to put condition before batch to get the coefficients correct. Usually it's' recommended to put condition at the end, but here you need to put it after the 0, so you will get the per-group coding of the coefficients.

~ 0 + condition + batch

ADD REPLY

Login before adding your answer.

Traffic: 519 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6