Question

Multifactorial experimental design in DESeq2

0

Entering edit mode

upendrakumar.devisetty • 0

@upendrakumardevisetty-7478

Last seen 3.8 years ago

United States

I have a question about the design that i am using for a question that i am interested in. Basically i have three genotypes (LL, OP and M), each of which have been subjected to 2 treatments (5 and 33) with 6 biological replicates per treatment. Here is the sample dataframe for the experiment

sample trt gt rep
1 5_OP_1 5 OP 1
2 5_OP_2 5 OP 2
3 5_OP_3 5 OP 3
4 5_OP_4 5 OP 4
5 5_OP_5 5 OP 5
6 5_OP_6 5 OP 6
7 33_OP_1 33 OP 1
8 33_OP_2 33 OP 2
9 33_OP_3 33 OP 3
10 33_OP_4 33 OP 4
11 33_OP_5 33 OP 5
12 33_OP_6 33 OP 6
13 5_M_1 5 M 1
14 5_M_2 5 M 2
15 5_M_3 5 M 3
16 5_M_4 5 M 4
17 5_M_5 5 M 5
18 5_M_6 5 M 6
19 33_M_1 33 M 1
20 33_M_2 33 M 2
21 33_M_3 33 M 3
22 33_M_4 33 M 4
23 33_M_5 33 M 5
24 33_M_6 33 M 6
25 5_LL_1 5 LL 1
26 5_LL_2 5 LL 2
27 5_LL_3 5 LL 3
28 5_LL_4 5 LL 4
29 5_LL_5 5 LL 5
30 5_LL_6 5 LL 6
31 33_LL_1 33 LL 1
32 33_LL_2 33 LL 2
33 33_LL_3 33 LL 3
34 33_LL_4 33 LL 4
35 33_LL_5 33 LL 5
36 33_LL_6 33 LL 6

Now i am interested in transcripts that are differentially expressed in genotype LL at either 5 or 33 treatment with respect to the other two genotypes (OP and M). Here is what i did and i want to make sure that i did correct.

# Load the DESeq2 package
library("DESeq2")

# Load the file
data <- read.csv("data_51_full.txt", sep = "\t")

# Create a df with sample, treatment, genotype and replicate information
samples <- data.frame(
  sample = names(data),
  trt = sub("(5|33)_(OP|M|LL)_([1-6])","\\1",names(data)),
  gt  = sub("(5|33)_(OP|M|LL)_([1-6])","\\2",names(data)),
  rep  = sub("(5|33)_(OP|M|LL)_([1-6])","\\3",names(data))
)
samples

# Count matrix input
dds <- DESeqDataSetFromMatrix(countData = data,
                              colData = samples,
                              design = ~ 1)

# Filtering to remove rows with 0 reads
dds <- dds[ rowSums(counts(dds)) > 1, ]

# design
design(dds) <- ~ trt + gt + trt:gt

# Run DESeq2
dds <- DESeq(dds)

# Print the coefficients
resultsNames(dds)
#[1] "Intercept"    "trt_5_vs_33" "gt_M_vs_LL"  "gt_OP_vs_LL" "trt5.gtM"   "trt5.gtOP" 

# The treatment effect for genotype I (LL) (the main effect)
res <- results(dds, contrast=c("trt","5","33"))  

mcols(res)$description
#[1] "mean of normalized counts for all samples" "log2 fold change (MLE): trt 5 vs 33"     
#[3] "standard error: trt 5 vs 33"              "Wald statistic: trt 5 vs 33"             
#[5] "Wald test p-value: trt 5 vs 33"           "BH adjusted p-values"

deseq2 • 1.8k views

ADD COMMENT • link updated 8.2 years ago by Michael Love 41k • written 8.2 years ago by upendrakumar.devisetty • 0

score 0 · Answer 1 · 2016-02-10

0

Entering edit mode

Michael Love 41k

@mikelove

Last seen 12 hours ago

United States

Let me ask a few more questions about your goal. The interaction design may not be the right approach. "i am interested in transcripts that are differentially expressed in genotype LL at either 5 or 33 treatment with respect to the other two populations (OP and M)" Note that, with your experiment, you can't measure the 5 or 33 treatment effect because there are no control samples. But you can contrast the two treatments. The interaction design helps to answer if the difference between 33 and 5 treatment changes across population. Is this your goal, or if not can you reformulate what your goal is?

ADD COMMENT • link 8.2 years ago Michael Love 41k

0

Entering edit mode

Thanks Michael for your response. There are several questions that i want to ask

The first is obviously the one i asked before - transcripts that are differentially expressed in genotype LL at either 5 or 33 treatment with respect to the other two populations (OP and M). Do you mean to say that the code that i have above is for testing the difference between 33 and 5 treatments across population. Then how do i get this for LL vs OP + M genotypes?
Which has the greatest effect, is the the genotype acting differently? or is it the treatment that causes the effect? or the effect being how LL genotype handle hypoosmotic stress (treatment).

Please let me know if this is not clear and i will try to elaborate further...

Here is the whole code that i have written to answer the above questions in my capacity. Please find the same....

ADD REPLY • link 8.2 years ago upendrakumar.devisetty • 0

0

Entering edit mode

hi,

So first, let me recommend that you update to the latest version of DESeq2 which is 1.10. You can check with this:

packageVersion("DESeq2")

I think the best design for you, to address the different questions you'd like to answer is:

~ trt + trt:gt

This is essentially the same design as the one you have, but makes it easier to perform the contrasts that I think you have described.

There is not a single contrast which gives the answer of DE of e.g. (C vs A) or (B vs A), but you can accomplish this by taking the union of the FDR limited sets from two pairwise comparisons. You can compare OP vs LL for treatment 5 with the following:

results(dds, name="trt5.gtOP")

And you can compare M vs LL for treatment 5 with:

results(dds, name="trt5.gtM")

Likewise, you would swap in trt33 in the name for the same question in treatment 33.

If you want to ask if the OP vs LL effect is different in treatment 33 vs treatment 5, you form the contrast:

results(dds, contrast=list("trt33.gtOP", "trt5.gtOP"))

I'm not exactly sure how to answer some of your questions in 2. Remember, as I said previously, you can't estimate the treatment effect alone (only compare across treatments) because you do not have control samples.

ADD REPLY • link 8.2 years ago Michael Love 41k

0

Entering edit mode

Thanks Micheal. I have a question/clarification. Isn't the results(dds, name="trt5.gtOP") and results(dds, name="trt5.gtM") the same? The only difference being the log fold changes? (Similarly for the 33 treatment). In addition, i am getting the same set of genes in results(dds, contrast=list("trt33.gtOP", "trt5.gtOP"))and results(dds, contrast=list("trt33.gtM", "trt5.gtM")). I have set up this model dds <- nbinomLRT(dds, full = design(dds), reduced = ~ cond) for the above all .. Please correct me if anything i am missing/misinterpreting. Thanks

ADD REPLY • link 8.2 years ago upendrakumar.devisetty • 0

0

Entering edit mode

If you use DESeq() then the results() calls I posted will work.

If you have run nbinomLRT() instead, you'll have to specify that you want Wald tests for these contrasts by setting test="Wald" in results().

ADD REPLY • link 8.2 years ago Michael Love 41k

0

Entering edit mode

Thanks Michael again. I have incorporated the changes you recommended and here is the code for the same. Can you please look at it and let me know what you think.

ADD REPLY • link 8.2 years ago upendrakumar.devisetty • 0

0

Entering edit mode

I skimmed the relevant parts and it follows my recommendation.

ADD REPLY • link 8.2 years ago Michael Love 41k

0

Entering edit mode

Thanks Michael. Very much appreciated for your help!

ADD REPLY • link 8.2 years ago upendrakumar.devisetty • 0