Multifactorial experimental design in DESeq2
1
0
Entering edit mode
@upendrakumardevisetty-7478
Last seen 3.8 years ago
United States

I have a question about the design that i am using for a question that i am interested in. Basically i have three genotypes (LL, OP and M), each of which have been subjected to 2 treatments (5 and 33) with 6 biological replicates per treatment. Here is the sample dataframe for the experiment

    sample   trt gt rep
1   5_OP_1    5  OP   1
2   5_OP_2    5  OP   2
3   5_OP_3    5  OP   3
4   5_OP_4    5  OP   4
5   5_OP_5    5  OP   5
6   5_OP_6    5  OP   6
7  33_OP_1   33  OP   1
8  33_OP_2   33  OP   2
9  33_OP_3   33  OP   3
10 33_OP_4   33  OP   4
11 33_OP_5   33  OP   5
12 33_OP_6   33  OP   6
13   5_M_1    5   M   1
14   5_M_2    5   M   2
15   5_M_3    5   M   3
16   5_M_4    5   M   4
17   5_M_5    5   M   5
18   5_M_6    5   M   6
19  33_M_1   33   M   1
20  33_M_2   33   M   2
21  33_M_3   33   M   3
22  33_M_4   33   M   4
23  33_M_5   33   M   5
24  33_M_6   33   M   6
25  5_LL_1    5  LL   1
26  5_LL_2    5  LL   2
27  5_LL_3    5  LL   3
28  5_LL_4    5  LL   4
29  5_LL_5    5  LL   5
30  5_LL_6    5  LL   6
31 33_LL_1   33  LL   1
32 33_LL_2   33  LL   2
33 33_LL_3   33  LL   3
34 33_LL_4   33  LL   4
35 33_LL_5   33  LL   5
36 33_LL_6   33  LL   6

Now i am interested in transcripts that are differentially expressed in genotype LL at either 5 or 33 treatment with respect to the other two genotypes (OP and M). Here is what i did and i want to make sure that i did correct.

# Load the DESeq2 package
library("DESeq2")

# Load the file
data <- read.csv("data_51_full.txt", sep = "\t")

# Create a df with sample, treatment, genotype and replicate information
samples <- data.frame(
  sample = names(data),
  trt = sub("(5|33)_(OP|M|LL)_([1-6])","\\1",names(data)),
  gt  = sub("(5|33)_(OP|M|LL)_([1-6])","\\2",names(data)),
  rep  = sub("(5|33)_(OP|M|LL)_([1-6])","\\3",names(data))
)
samples

# Count matrix input
dds <- DESeqDataSetFromMatrix(countData = data,
                              colData = samples,
                              design = ~ 1)

# Filtering to remove rows with 0 reads
dds <- dds[ rowSums(counts(dds)) > 1, ]

# design
design(dds) <- ~ trt + gt + trt:gt

# Run DESeq2
dds <- DESeq(dds)

# Print the coefficients
resultsNames(dds)
#[1] "Intercept"    "trt_5_vs_33" "gt_M_vs_LL"  "gt_OP_vs_LL" "trt5.gtM"   "trt5.gtOP" 

# The treatment effect for genotype I (LL) (the main effect)
res <- results(dds, contrast=c("trt","5","33"))  

mcols(res)$description
#[1] "mean of normalized counts for all samples" "log2 fold change (MLE): trt 5 vs 33"     
#[3] "standard error: trt 5 vs 33"              "Wald statistic: trt 5 vs 33"             
#[5] "Wald test p-value: trt 5 vs 33"           "BH adjusted p-values"                     

 

 

 

 

deseq2 • 1.8k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 12 hours ago
United States
Let me ask a few more questions about your goal. The interaction design may not be the right approach. "i am interested in transcripts that are differentially expressed in genotype LL at either 5 or 33 treatment with respect to the other two populations (OP and M)" Note that, with your experiment, you can't measure the 5 or 33 treatment effect because there are no control samples. But you can contrast the two treatments. The interaction design helps to answer if the difference between 33 and 5 treatment changes across population. Is this your goal, or if not can you reformulate what your goal is?
ADD COMMENT
0
Entering edit mode

Thanks Michael for your response. There are several questions that i want to ask

  1. The first is obviously the one i asked before - transcripts that are differentially expressed in genotype LL at either 5 or 33 treatment with respect to the other two populations (OP and M). Do you mean to say that the code that i have above is for testing the difference between 33 and 5 treatments across population. Then how do i get this for LL vs OP + M genotypes?
  2. Which has the greatest effect, is the the genotype acting differently? or is it the treatment that causes the effect? or the effect being how LL genotype handle hypoosmotic stress (treatment).

Please let me know if this is not clear and i will try to elaborate further...

Here is the whole code that i have written to answer the above questions in my capacity. Please find the same....

ADD REPLY
0
Entering edit mode

hi,

So first, let me recommend that you update to the latest version of DESeq2 which is 1.10. You can check with this:

packageVersion("DESeq2")

I think the best design for you, to address the different questions you'd like to answer is:

~ trt + trt:gt

This is essentially the same design as the one you have, but makes it easier to perform the contrasts that I think you have described.

There is not a single contrast which gives the answer of DE of e.g. (C vs A) or (B vs A), but you can accomplish this by taking the union of the FDR limited sets from two pairwise comparisons. You can compare OP vs LL for treatment 5 with the following:

results(dds, name="trt5.gtOP")

And you can compare M vs LL for treatment 5 with:

results(dds, name="trt5.gtM")

Likewise, you would swap in trt33 in the name for the same question in treatment 33.

If you want to ask if the OP vs LL effect is different in treatment 33 vs treatment 5, you form the contrast:

results(dds, contrast=list("trt33.gtOP", "trt5.gtOP"))

I'm not exactly sure how to answer some of your questions in 2. Remember, as I said previously, you can't estimate the treatment effect alone (only compare across treatments) because you do not have control samples.

ADD REPLY
0
Entering edit mode

Thanks Micheal. I have a question/clarification. Isn't the results(dds, name="trt5.gtOP") and results(dds, name="trt5.gtM") the same? The only difference being the log fold changes? (Similarly for the 33 treatment). In addition, i am getting the same set of genes in results(dds, contrast=list("trt33.gtOP", "trt5.gtOP"))and results(dds, contrast=list("trt33.gtM", "trt5.gtM")). I have set up this model dds <- nbinomLRT(dds, full = design(dds), reduced = ~ cond) for the above all .. Please correct me if anything i am missing/misinterpreting. Thanks

ADD REPLY
0
Entering edit mode

If you use DESeq() then the results() calls I posted will work.

If you have run nbinomLRT() instead, you'll have to specify that you want Wald tests for these contrasts by setting test="Wald" in results().

ADD REPLY
0
Entering edit mode

Thanks Michael again. I have incorporated the changes you recommended and here is the code for the same. Can you please look at it and let me know what you think. 

ADD REPLY
0
Entering edit mode
I skimmed the relevant parts and it follows my recommendation.
ADD REPLY
0
Entering edit mode

Thanks Michael. Very much appreciated for your help!

ADD REPLY

Login before adding your answer.

Traffic: 509 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6