DESeq2 multifactor and paired sample analysis
3
0
Entering edit mode
amy • 0
@amy-10733
Last seen 7.9 years ago

I am using DESeq2_1.12.0  on R (version 3.3.0) to analyze RNASeq. I have some questions on how to set up the design formula and have looked at a number of multifactor analysis but can't quite find what I am looking for.

I have a total of 24 samples from twelve subjects.

  • These twelve subjects are divided into two groups (group A and group B).
  • In group A, samples were taken at day 0 and day 1.
  • In group B, samples were taken at day 0 and day 5.
  • The different sampling time between the groups is due to the destructive sampling procedure
  • Within each group, half of the subject received treatment, while the other half didn't.

Here is how I set up the metadata table, with an added column for paired analysis (paired.subject)

Group

treatment time subject paired.subject

A

control day0 one one

A

control day1 one one

A

control day0 two two

A

control day1 two two

A

control day0 three three

A

control day1 three three

A

treated day0 four one

A

treated day1 four one

A

treated day0 five two

A

treated day1 five two

A

treated day0 six three

A

treated day1 six three

B

control day0 seven four

B

control day5 seven four

B

control day0 eight five

B

control day5 eight five

B

control day0 nine six

B

control day5 nine six

B

treated day0 ten four

B

treated day5 ten four

B

treated day0 eleven five

B

treated day5 eleven five

B

treated day0 twelve six

B

treated day5 twelve six

To figure out the differences between treated vs control within each group between the different time points, I subset the data based on groups and analyzed them separately.

dds <- DESeqDataSetFromMatrix(countData = countdata,
                              colData = coldata,
                              design = ~ treatment)

dds$combined <- factor(paste0(dds$treatment, dds$time))

m1 <- model.matrix(~paired.subject + combined, colData(dds))

  • For example, for group A, difference between treated and control on day 1 was extracted by:

resdds <- results(dds, contrast = list("combined.treated.day1", "combined.control.day1")

 

My questions are:

  • Is it possible to compare the differences between (combined.treated.day1-combined.treated.day0) to (combined.control.day1-combined.control.day0)?

 

  • I know in the vignette and also in other posts that adding extra samples is typically better for the dispersion estimate DESeq2 design matrix for 6 groups.
  • However, for this dataset, I don't get any DE genes when I combined groupA and groupB together, which is why I subset them. Any suggestion on what to do if I want to compare between day 1 and day 5?

 

Thanks for your feedback.

 

 

deseq2 • 1.7k views
ADD COMMENT
3
Entering edit mode
@mikelove
Last seen 5 hours ago
United States

hi Amy,

The "factor(paste0(" approach is easier when you only want to compare individual pairs of groups (defined by unique combinations of variables), but if you want to compare fold changes, it makes more sense to use a design with interaction terms. You can also extract pairwise comparisons with the interaction design, it's just that typically users find it a bit more challenging to know which terms to extract, and this is why I have the simpler approach described in the vignette. If you have difficulty interpreting the terms in my following suggestion, I'd recommend you try to find a local statistician or person with experience in linear modeling who can help explain the interpretation of these.

You should change the day1 and day5 levels to just dayX, which will represent the post-day0 time point, and is just for convenience in building the design and model matrix. You will nevertheless get separate results for groups A and B.

levels(coldata$time)[2] <- "dayX"
levels(coldata$time)[3] <- "dayX"

Next you will add a column, "nested.subj" which will help us deal with the correlation of samples from the same subject within unique Group x Treatment combinations.

coldata$nested.subj <- factor(rep(rep(1:3, each=2),4))

Finally, you will use a design which controls for the subject differences within Group x Treatment, and let's you pull out the interaction of treatment and time for each group:

~ group + group:treatment + group:treatment:nested.subj + group:time + group:treatment:time

The two terms "groupA:treatmenttreated:timedayX" and "groupB:treatmenttreated:timedayX" are for testing whether the LFC across day is different for treatment vs control, and this is split for group A and group B.

ADD COMMENT
0
Entering edit mode
amy • 0
@amy-10733
Last seen 7.9 years ago

Hi Michael,

Thank you for the clear explanation.

I tried out the design as you specified. However, I encountered the error: "model matrix is not full rank, one or more variables in the design formula are linear combinations of the others".

I didn't think that the nested.subj was the a linear combination of the others?

Thanks,

Amy

ADD COMMENT
0
Entering edit mode

Can you post colData, so I can see if it matches what I was suggesting?

ADD REPLY
0
Entering edit mode
amy • 0
@amy-10733
Last seen 7.9 years ago

Group

treatment time nested.subject
A control day0 1
A control dayX 1
A control day0 2
A control dayX 2
A control day0 3
A control dayX 3
A treated day0 1
A treated dayX 1
A treated day0 2
A treated dayX 2
A treated day0 3
A treated dayX 3
B control day0 1
B control dayX 1
B control day0 2
B control dayX 2
B control day0 3
B control dayX 3
B treated day0 1
B treated dayX 1
B treated day0 2
B treated dayX 2
B treated day0 3
B treated dayX 3

 

 

Thank you Mike. Sorry for not providing enough information - just edited this post.

coldata = with(samples, data.frame(shortname = I(shortname), Group = Group, treatment = treatment, time = time, nested.subject = nested.subject))

countdata = counts

dds <- DESeqDataSetFromMatrix(countData = countdata,
                              colData = coldata,
                              design = ~ treatment)

dds$treatment <- relevel(dds$treatment, "control")
dds$time <- relevel(dds$time, "day0")

m1 <- model.matrix(~ Group + Group:treatment + Group:treatment:nested.subject + Group:time + Group:treatment:time, colData(dds))

dds = DESeq(dds, full = m1, betaPrior = FALSE, parallel = TRUE)

 

Let me know if it would be easier to send you the script via email.

 

What I ended up doing was to remove the nested.subject and really simplify my design matrix to:

m1 <- model.matrix( ~Group + Group:treatment + Group:treatment:Time, colData(dds))

With this new design matrix, I did not observe the model matrix error.

 

Thanks for your time!

 

 

 

 

ADD COMMENT
0
Entering edit mode

And what design are you using where you get the "model matrix is not full rank" error? Can you post the code?

ADD REPLY

Login before adding your answer.

Traffic: 615 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6