timecourse + factorial + replicates in LIMMA
@aaronjmackeygskcom-1706
I have an experimental setup in which four strains (A, B, C and D) are given a treatment or control mock treatment, and observed (by Affy) over a post-treatment timecourse (4 timepoints); each strain/treatment/timepoint observation is performed in replicate.

At the end of the day, I'd like to answer two scientific questions:

1) which probesets are consistently (across all four strains) differentially expressed (treatment vs. control) at timepoints 2, 3 and 4?

2) which treatment-responsive probesets are consistently responsive within (but differentially responsive between) A&B and C&D strain groupings?

My target matrix looks like this:

array  strain  treatment  time
1      A       mock       1
2      A       mock       1
3      A       mock       1
4      A       mock       2
5      A       mock       2
6      A       mock       2
...
13     A       treated    1
14     A       treated    1
15     A       treated    1
16     A       treated    2
...
25     B       mock       1
26     B       mock       1
...
96     D       treated    4

I built my design matrix like so:

    strain <- factor(target$strain); # etc. for treatment, time
    design <- model.matrix(~0+strain*treatment*time)

And my "replicates" array looks like:

    c(1,1,1, 2,2,2, 3,3,3, 4,4,4, 5,5,5, ..., 32,32,32)

Yet when I run duplicateCorrelation() to handle the replicates, I get a consensus correlation of 1, and "Inf" values for each correlation.

What have I done wrong?

(I haven't even gotten to building the contrast matrices to answer my questions of actual interest ...)

Thanks,

-Aaron
Naomi Altman ★ 6.0k
@naomi-altman-380
United States
Why would you want to use duplicateCorrelation? This is for error correlation. Presumably your replicates are biologically distinct, and required for the test statistic denominator.

However, to answer your question, this is due to removing the intercept. With no intercept, the correlation is computed without removing the mean, and this pretty much makes all the correlations 1.

--Naomi

At 04:38 PM 9/11/2007, aaron.j.mackey at gsk.com wrote:
>Yet when I run duplicateCorrelation() to handle the replicates, I get a
>consensus correlation of 1, and "Inf" values for each correlation.
>
>What have I done wrong?

Naomi S. Altman                    814-865-3791 (voice)
Associate Professor
Dept. of Statistics                814-863-7114 (fax)
Penn State University              814-865-1348 (Statistics)
University Park, PA 16802-2111
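When diagnosing a result like this, it can help to inspect the design itself before calling duplicateCorrelation(). The sketch below uses base R only and rebuilds the layout from the question under stated assumptions: a balanced 4 x 2 x 4 layout with three arrays per combination, in the row order shown in the target matrix. The object names (`targets`, `design0`, `design1`, `cell`) are illustrative and not from the original post. It compares the rank of the design with and without an intercept and checks whether each replicate block coincides with exactly one strain/treatment/time cell.

```r
# Assumed layout: 4 strains x 2 treatments x 4 timepoints x 3 replicate arrays.
# expand.grid varies the first argument fastest, matching the posted row order.
targets <- expand.grid(
  rep       = 1:3,
  time      = factor(1:4),
  treatment = factor(c("mock", "treated")),
  strain    = factor(c("A", "B", "C", "D"))
)

# The "replicates" vector from the post: c(1,1,1, 2,2,2, ..., 32,32,32)
block <- rep(1:32, each = 3)

# The design from the post (no intercept) and the same model with an intercept
design0 <- model.matrix(~ 0 + strain * treatment * time, data = targets)
design1 <- model.matrix(~ strain * treatment * time, data = targets)

rank0 <- qr(design0)$rank
rank1 <- qr(design1)$rank
# If stacking both designs adds no rank, they span the same column space
rank_both <- qr(cbind(design0, design1))$rank

# Does every replicate block sit inside exactly one design cell?
cell <- interaction(targets$strain, targets$treatment, targets$time, drop = TRUE)
one_cell_per_block <- tapply(as.integer(cell), block,
                             function(x) length(unique(x)) == 1)
block_is_cell <- all(one_cell_per_block) &&
  nlevels(cell) == length(unique(block))

# If block indicators add no rank, the block effect is already in the design
rank_with_block <- qr(cbind(design1, model.matrix(~ 0 + factor(block))))$rank

print(c(rank0 = rank0, rank1 = rank1,
        rank_both = rank_both, rank_with_block = rank_with_block))
print(block_is_cell)
```

For the balanced layout assumed here, all four ranks coincide at 32, the number of strain/treatment/time cells. When each block is exactly one cell, the block indicators lie inside the column space of the design, so a between-block component cannot be separated from the fitted cell means; that is worth ruling out alongside the intercept question.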
"Naomi Altman" <naomi at stat.psu.edu> wrote on 09/11/2007 05:23:59 PM:

> Why would you want to use duplicateCorrelation? This is for error
> correlation. Presumably your replicates are biologically distinct,
> and required for the test statistic denominator.

Sorry, I didn't explain myself very well. The replicates are technical replicates - same biological organism, not distinct (there were four distinct organisms, from strains A, B, C and D). I guess that, since I only have one biological replicate per strain, the distinction between technical and biological replicates might not matter in this case.

> However, to answer your question, this is due to removing the
> intercept. With no intercept, the correlation is computed without
> removing the mean and this pretty much makes all the correlation 1.

Thanks. I removed the intercept because I wanted to be able to model each strain independently (with the intercept, I only get strains B, C and D as factors; A is subsumed by the intercept).

-Aaron
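One common way to keep every strain visible in the model is a cell-means parameterization: a single factor with one level per strain/treatment/time combination, from which the comparisons of interest are written as contrasts. The sketch below is a base-R illustration under the same assumed balanced layout as the question. The helper names `avg_treatment_effect` and `ab_vs_cd` are hypothetical, and the contrast weights are one possible reading of the two questions, not the posters' own solution.

```r
# Assumed layout: 4 strains x 2 treatments x 4 timepoints x 3 replicate arrays
targets <- expand.grid(rep = 1:3, time = 1:4,
                       treatment = c("mock", "treated"),
                       strain = c("A", "B", "C", "D"),
                       stringsAsFactors = FALSE)

# One level per strain/treatment/time combination, e.g. "A.treated.2"
group <- factor(paste(targets$strain, targets$treatment, targets$time, sep = "."))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Hypothetical helper: treated-minus-mock at timepoint tp, averaged over strains
avg_treatment_effect <- function(tp, strains = c("A", "B", "C", "D")) {
  w <- setNames(numeric(ncol(design)), colnames(design))
  w[paste(strains, "treated", tp, sep = ".")] <-  1 / length(strains)
  w[paste(strains, "mock",    tp, sep = ".")] <- -1 / length(strains)
  w
}

# Hypothetical helper: treatment effect in A&B minus treatment effect in C&D
ab_vs_cd <- function(tp) {
  avg_treatment_effect(tp, c("A", "B")) - avg_treatment_effect(tp, c("C", "D"))
}

timepoints <- c(T2 = 2, T3 = 3, T4 = 4)
contrasts_q1 <- sapply(timepoints, avg_treatment_effect)  # question 1
contrasts_q2 <- sapply(timepoints, ab_vs_cd)              # question 2

# Each matrix has one row per design column and one column per timepoint,
# the shape that limma's contrasts.fit() takes as its contrasts argument.
print(dim(contrasts_q1))
print(contrasts_q2[contrasts_q2[, "T2"] != 0, "T2"])
```

An averaged contrast only tests the mean effect across strains. "Consistently" responsive in every strain, or within A&B and within C&D, would also need the per-strain treatment contrasts to be examined, for example by requiring each to be significant in the same direction.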
Naomi Altman ★ 6.0k
@naomi-altman-380
United States
In this situation (all technical replicates) you cannot really draw biological conclusions about differences among strains. You can only draw conclusions about differences among organisms, in which case the technical replicates are treated as though independent (but they should come from independent samples from the organisms, i.e. independent RNA extractions).

In general, even if you want to get the means for each strain, you need to use the model with intercept for computing the correlations.

--Naomi

At 11:57 AM 9/12/2007, aaron.j.mackey at gsk.com wrote:
>Sorry, I didn't explain myself very well. The replicates are technical
>replicates - same biological organism, not distinct (there were four
>distinct organisms, from strains A, B, C and D). I guess since I only
>have one biological replicate per strain, that the distinction between
>technical and biological replicates might not matter in this case.
>
>Thanks. I removed the intercept because I wanted to be able to model each
>strain independently (with the intercept, I only get strains B, C and D as
>factors; A is subsumed by the intercept).
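Fitting with an intercept does not lose the per-strain means: strain A is the reference level, and each remaining strain mean is the intercept plus that strain's coefficient. The sketch below illustrates this on simulated values for a single probeset, using base R's lm() as a stand-in for a per-gene fit. The data, the seed, and the matrix name `L` are made up for illustration and do not come from the experiment discussed above.

```r
set.seed(1)

# Simulated log-expression for one probeset: 6 arrays per strain (made-up values)
strain <- factor(rep(c("A", "B", "C", "D"), each = 6))
true_means <- c(A = 8, B = 9, C = 7, D = 10)
y <- rnorm(length(strain), mean = true_means[as.character(strain)], sd = 0.3)

# Model with intercept: coefficients are (Intercept), strainB, strainC, strainD
fit <- lm(y ~ strain)
b <- coef(fit)

# Each row of L picks out one strain mean as a combination of the coefficients:
# A = intercept, B = intercept + strainB, and so on.
L <- cbind(1, rbind(0, diag(3)))
rownames(L) <- levels(strain)
strain_means <- drop(L %*% b)

# These agree with the plain per-strain averages
print(round(strain_means, 3))
print(round(tapply(y, strain, mean), 3))
```

The same idea carries over to a full strain/treatment/time model: any cell mean is a linear combination of the with-intercept coefficients, so the choice of parameterization for estimating the correlation need not limit which means or contrasts are reported afterwards.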