Repeated Measures mRNA expression analysis II
@charles-determan-jr-5949
United States

I apologize for a second post, but I want to bring this question back up as I still cannot find a definitive answer on my own (Moderator: the previous post was Repeated Measures mRNA expression analysis I). In brief, I am wondering about the design matrix when testing for differential expression between two groups in which each sample has been measured at consecutive timepoints (repeated measures). If my interpretation is correct, I need a two-way analysis that recognizes the dependence between consecutive measurements. I am familiar with limma, edgeR and DESeq but am uncertain how to construct an appropriate design matrix for these comparisons. My best guess is to add a 'Subject' factor to the design matrix, with one level per unique sample, to account for the dependence. Is this correct?

My sincere regards,
Charles

--
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

 

limma edgeR DESeq • 2.4k views
@james-w-macdonald-5106
United States

Hi Charles,

It depends on how sophisticated you want to get, or alternatively what assumptions you are willing to make. The simplest thing to do would be to block on subject (see the blocking portion of the limma User's guide, starting on p. 42). This makes very simple assumptions about the data, namely that the differences between subjects can be accounted for by the mean of each subject.

Best,
Jim

@gordon-smyth
WEHI, Melbourne, Australia

Hi Charles,

Yes, you're on the right track now, but this is not a simple design and it requires care.  As James says, it depends on what assumptions you want to make.  I would add that it also depends on what questions you want to answer.  In my previous two posts (see Repeated Measures mRNA expression analysis I ), I tried to prompt you to state what questions you want to answer, but you haven't taken the bait yet.  A statistical analysis is always designed to test certain scientific questions -- there isn't a "correct" analysis for a given design independent of what your hypotheses are.

Have you looked at Section 3.5 "Comparisons Both Between and Within Subjects" in the edgeR User's Guide?  The design discussed in this section is the same as your experiment, except that you have 3 repeated measures per subject instead of 2.

The analysis given in the edgeR user's guide allows you to find genes that are different over time for (i) treated subjects and (ii) control subjects, and it allows you to find genes that respond differently to time in the treated vs control subject.
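A minimal sketch of how a design of this shape could be built with base R's `model.matrix`. The sample sheet is invented for illustration (3 subjects per group, 3 time points), and the names `targets`, `group`, `subject` and `time` are assumptions, not taken from the original data. Subjects are numbered within each group so that the nested subject terms are estimable.

```r
# Hypothetical sample sheet: 2 groups x 3 subjects per group x 3 time points
targets <- expand.grid(time    = c("time1", "time2", "time3"),
                       subject = 1:3,
                       group   = c("Control", "Treatment"))

# subject is numbered within group and must be a factor, not a number
targets$subject <- factor(targets$subject)

# Group baseline, subject effects nested in group, time effects nested in group
design <- model.matrix(~ group + group:subject + group:time, data = targets)

colnames(design)   # subject columns first, group-specific time columns last
qr(design)$rank    # equals ncol(design), so every coefficient is estimable
```

The group-specific time columns at the end of the matrix are the ones that carry the within-subject time effects; the subject columns only absorb each subject's baseline level.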

However it does not allow you to test for a baseline difference between treated and control subjects at time 0.  If you need to do this, then a quite different analysis is needed (discussed in Section 9.7 "Multi-level Experiments" of the limma User's Guide).
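One way to see why a baseline group comparison is not available when subjects are fitted as fixed effects: each subject belongs to exactly one group, so a group main effect is a linear combination of the subject columns. The sketch below uses an invented layout (6 subjects, 3 per group, 3 time points) and illustrative variable names; it is not the original data.

```r
# Hypothetical layout: 6 uniquely labelled subjects, the first 3 are controls
targets <- expand.grid(time = c("time1", "time2", "time3"), id = 1:6)
targets$subject <- factor(targets$id)
targets$group   <- factor(ifelse(targets$id <= 3, "Control", "Treatment"))

# Ask for subject effects AND a group main effect in the same additive model
design <- model.matrix(~ subject + group + time, data = targets)

ncol(design)      # number of coefficients requested
qr(design)$rank   # number that can be estimated; one fewer than requested
```

The rank falls one short of the column count because the treatment indicator is the sum of the indicators for the treated subjects, so the group baseline difference cannot be separated from the subject effects in this kind of model.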

Best wishes
Gordon


Thank you Gordon,

I apologize for not being more straightforward with a specific question.  I continue to be reminded of my ignorance in approaching statistical problems, but my education continues.  I had previously been under the impression that different experimental designs have specific methods that are most appropriate.  I previously read section 3.5 in the edgeR guide but glossed over it because it didn't have time points explicitly included.  I feel a little silly that the idea of between and within subjects escaped me, but that should serve my purposes.  I hope you will indulge one further question concerning that very example.

The design matrix I generate looks like this:

> colnames(design)
[1] "(Intercept)"                "group.Treatment"
[3] "group.control:subject"      "group.Treatment:subject"
[5] "group.control:times.time2"  "group.Treatment:times.time2"
[7] "group.control:times.time3"  "group.Treatment:times.time3"

You said "The analysis given in the edgeR user's guide allows you to find genes that are different over time for (i) treated subjects and (ii) control subjects, and it allows you to find genes that respond differently to time in the treated vs control subject."  If I am not mistaken, coefficients 5-8 correspond to points (i) and (ii).  However, I don't see how I can determine which genes respond differently to time in the treated vs. control subjects.  I apologize if I seem obtuse, but these interactions have always been difficult for me to conceptualize.  Any explanation or direction that helps me understand how these interactions relate to your points would be sincerely appreciated.

Best regards,
Charles


I made an error in my last response: my subject variable was not set as a factor. The corrected design matrix is shown below.  Perhaps a shorter question that will satisfy me is what to do with all the 'subject' coefficients. If I am only interested in the genes that respond differently to time in treated vs. control subjects, do I simply ignore the subject coefficients? Lastly, am I correct that to determine genes that respond differently to time in treated vs. control subjects I simply test the contrasts between the last coefficients (i.e. 40-39 and 38-37)?  Apologies for turning this into such a long post; I hope it is helpful for others as well.

Thanks, Charles

> colnames(design)
[1] "(Intercept)" "group.Treatment"
[3] "group.Control:subject.2" "group.Treatment:subject.2"
[5] "group.Control:subject.3" "group.Treatment:subject.3"
[7] "group.Control:subject.4" "group.Treatment:subject.4"
[9] "group.Control:subject.5" "group.Treatment:subject.5"
[11] "group.Control:subject.6" "group.Treatment:subject.6"
[13] "group.Control:subject.7" "group.Treatment:subject.7"
[15] "group.Control:subject.8" "group.Treatment:subject.8"
[17] "group.Control:subject.9" "group.Treatment:subject.9"
[19] "group.Control:subject.10" "group.Treatment:subject.10"
[21] "group.Control:subject.11" "group.Treatment:subject.11"
[23] "group.Control:subject.12" "group.Treatment:subject.12"
[25] "group.Control:subject.13" "group.Treatment:subject.13"
[27] "group.Control:subject.14" "group.Treatment:subject.14"
[29] "group.Control:subject.15" "group.Treatment:subject.15"
[31] "group.Control:subject.16" "group.Treatment:subject.16"
[33] "group.Control:subject.17" "group.Treatment:subject.17"
[35] "group.Control:subject.18" "group.Treatment:subject.18"
[37] "group.Control:times.time2" "group.Treatment:times.time2"
[39] "group.Control:times.time3" "group.Treatment:times.time3"

 


> I made an error in my last response, my subject was not set as a factor.
> The design matrix looks like the matrix below.  Perhaps a shorter
> question that will satisfy me is what I do with all the 'subject'
> coefficients.  If I am only interested in the genes that respond
> differently to time in treated vs. control subjects do I simply
> ignore the subject coefficients?

Yes, you ignore them.  They take care of the subject baseline effects, but
these are not of interest to you.

> Lastly, am I correct that to determine genes that respond differently to
> time in treated vs. control subjects I simply conduct the contrasts
> between the last coefficients (i.e. 40-39 and 38-37).

Yes.

> Apologies for turning this into such a long post, I hope it is helpful
> for others as well.

To find genes DE at time2 in the controls:
   coef = "group.Control:times.time2"

To find genes DE at time2 in the treated subjects:
   coef = "group.Treatment:times.time2"

To find genes DE at time3 in the controls:
   coef = "group.Control:times.time3"

To find genes DE at time3 in the treated subjects:
   coef = "group.Treatment:times.time3"
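The "treated vs. control" contrasts confirmed above (40-39 and 38-37 in the full design) can be written out as contrast vectors. The following is a minimal sketch on an invented 3-subjects-per-group layout; the variable names are assumptions, and the column names R generates here (for example `groupTreatment:timetime2`) differ slightly from those in the posted design. Each time coefficient is a change relative to the first time point within its group, so their difference asks whether that change differs between groups.

```r
# Hypothetical sample sheet: 2 groups x 3 subjects per group x 3 time points
targets <- expand.grid(time    = c("time1", "time2", "time3"),
                       subject = 1:3,
                       group   = c("Control", "Treatment"))
targets$subject <- factor(targets$subject)
design <- model.matrix(~ group + group:subject + group:time, data = targets)

# Helper (illustrative): +1 on one coefficient, -1 on another, 0 elsewhere
make_contrast <- function(design, plus, minus) {
  stopifnot(plus %in% colnames(design), minus %in% colnames(design))
  con <- setNames(numeric(ncol(design)), colnames(design))
  con[plus]  <-  1
  con[minus] <- -1
  con
}

# Does the time2 (or time3) change from baseline differ between groups?
con.time2 <- make_contrast(design, "groupTreatment:timetime2", "groupControl:timetime2")
con.time3 <- make_contrast(design, "groupTreatment:timetime3", "groupControl:timetime3")

con.time2[con.time2 != 0]   # subject coefficients all get weight 0
```

A vector like this could then be supplied as the `contrast` argument of edgeR's `glmLRT`, or to `contrasts.fit` in limma, in place of a single `coef`. The subject coefficients all receive zero weight, which is what "ignoring" them amounts to in practice.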

Best wishes
Gordon


On Tue, 2 Jul 2013, Charles Determan Jr wrote:

> Thank you Gordon,
> I apologize for not being more straightforward with a specific question.
> I continue to be reminded by my ignorance in approaching statistical
> problems but my education continues.  I had previously been under the
> impression that different experimental designs have specific methods
> that are most appropriate.

This is a common misconception.  The appropriate analysis is not determined solely by the experimental layout.

Gordon
