Limma; a kind of extended paired analyses with or without treatment

0

Entering edit mode

john herbert ▴ 560

@john-herbert-4612

Last seen 11.4 years ago

Dear all. I have been pondering about constructing a design matrix based on the Limma user guide, where I combine a time course with a paired analyses. The targets file looks like; Sample treatment time 1 control 24 1 control 72 1 control 0 1 treatment 24 1 treatment 72 2 control 24 2 control 72 2 control 0 2 treatment 24 2 treatment 72 3 control 24 3 control 72 3 control 0 3 treatment 24 3 treatment 72 Sample number refers to an individuals cancer cells, treatment refers to added drug or not and numbers are in hours (time elapsed). So it is a kind of paired, as patient variability is to be considered. The control sample at 0 is the same as treatment at time 0 as these are the same cells without any time/treatment. Please could someone help me understand how I can construct a design matrix and to understand how I can extract differently expressed genes that come about due to time, due to treatment and interaction of them both. Any pointers appreciated, though I am trying to see if the examples in the manual can be applied to this scenario. Thank you. John.

Cancer Cancer • 2.5k views

ADD COMMENT • link 13.3 years ago john herbert ▴ 560

0

Entering edit mode

john herbert ▴ 560

@john-herbert-4612

Last seen 11.4 years ago

Dear all. I have been pondering about constructing a design matrix based on the Limma user guide, where I combine a time course with a paired analyses. The targets file looks like; Sample treatment time 1 control 24 1 control 72 1 control 0 1 treatment 24 1 treatment 72 2 control 24 2 control 72 2 control 0 2 treatment 24 2 treatment 72 3 control 24 3 control 72 3 control 0 3 treatment 24 3 treatment 72 Sample number refers to an individuals cancer cells, treatment refers to added drug or not and numbers are in hours (time elapsed). So it is a kind of paired, as patient variability is to be considered. The control sample at 0 is the same as treatment at time 0 as these are the same cells without any time/treatment. Please could someone help me understand how I can construct a design matrix and to understand how I can extract differently expressed genes that come about due to time, due to treatment and interaction of them both. Any pointers appreciated, though I am trying to see if the examples in the manual can be applied to this scenario. Thank you. John.

ADD COMMENT • link 13.3 years ago john herbert ▴ 560

0

Entering edit mode

Hi John, On 10/11/2012 2:15 PM, john herbert wrote: > Dear all. > I have been pondering about constructing a design matrix based on the > Limma user guide, where I combine a time course with a paired > analyses. The targets file looks like; > > Sample treatment time > 1 control 24 > 1 control 72 > 1 control 0 > 1 treatment 24 > 1 treatment 72 > 2 control 24 > 2 control 72 > 2 control 0 > 2 treatment 24 > 2 treatment 72 > 3 control 24 > 3 control 72 > 3 control 0 > 3 treatment 24 > 3 treatment 72 > > Sample number refers to an individuals cancer cells, treatment refers > to added drug or not and numbers are in hours (time elapsed). So it is > a kind of paired, as patient variability is to be considered. The > control sample at 0 is the same as treatment at time 0 as these are > the same cells without any time/treatment. > > Please could someone help me understand how I can construct a design > matrix and to understand how I can extract differently expressed genes > that come about due to time, due to treatment and interaction of them > both. > > Any pointers appreciated, though I am trying to see if the examples in > the manual can be applied to this scenario. See the multi-level experiment example in the user guide, starting on p. 47. Best, Jim > > Thank you. > > John. > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099

ADD REPLY • link 13.3 years ago James W. MacDonald 68k

0

Entering edit mode

Hi John, On 10/11/2012 2:15 PM, john herbert wrote: > Dear all. > I have been pondering about constructing a design matrix based on the > Limma user guide, where I combine a time course with a paired > analyses. The targets file looks like; > > Sample treatment time > 1 control 24 > 1 control 72 > 1 control 0 > 1 treatment 24 > 1 treatment 72 > 2 control 24 > 2 control 72 > 2 control 0 > 2 treatment 24 > 2 treatment 72 > 3 control 24 > 3 control 72 > 3 control 0 > 3 treatment 24 > 3 treatment 72 > > Sample number refers to an individuals cancer cells, treatment refers > to added drug or not and numbers are in hours (time elapsed). So it is > a kind of paired, as patient variability is to be considered. The > control sample at 0 is the same as treatment at time 0 as these are > the same cells without any time/treatment. > > Please could someone help me understand how I can construct a design > matrix and to understand how I can extract differently expressed genes > that come about due to time, due to treatment and interaction of them > both. > > Any pointers appreciated, though I am trying to see if the examples in > the manual can be applied to this scenario. See the multi-level experiment example in the user guide, starting on p. 47. Best, Jim > > Thank you. > > John. > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099

ADD REPLY • link 13.3 years ago James W. MacDonald 68k

0

Entering edit mode

Thanks James, This does not have time course but judging by your answer, I can just add this in, in place of, say, tissue. Kind regards, John. On Thu, Oct 11, 2012 at 7:23 PM, James W. MacDonald <jmacdon at="" uw.edu=""> wrote: > Hi John, > > > On 10/11/2012 2:15 PM, john herbert wrote: >> >> Dear all. >> I have been pondering about constructing a design matrix based on the >> Limma user guide, where I combine a time course with a paired >> analyses. The targets file looks like; >> >> Sample treatment time >> 1 control 24 >> 1 control 72 >> 1 control 0 >> 1 treatment 24 >> 1 treatment 72 >> 2 control 24 >> 2 control 72 >> 2 control 0 >> 2 treatment 24 >> 2 treatment 72 >> 3 control 24 >> 3 control 72 >> 3 control 0 >> 3 treatment 24 >> 3 treatment 72 >> >> Sample number refers to an individuals cancer cells, treatment refers >> to added drug or not and numbers are in hours (time elapsed). So it is >> a kind of paired, as patient variability is to be considered. The >> control sample at 0 is the same as treatment at time 0 as these are >> the same cells without any time/treatment. >> >> Please could someone help me understand how I can construct a design >> matrix and to understand how I can extract differently expressed genes >> that come about due to time, due to treatment and interaction of them >> both. >> >> Any pointers appreciated, though I am trying to see if the examples in >> the manual can be applied to this scenario. > > > See the multi-level experiment example in the user guide, starting on p. 47. > > Best, > > Jim > >> >> Thank you. >> >> John. >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 >

ADD REPLY • link 13.3 years ago john herbert ▴ 560

0

Entering edit mode

Thanks James, This does not have time course but judging by your answer, I can just add this in, in place of, say, tissue. Kind regards, John. On Thu, Oct 11, 2012 at 7:23 PM, James W. MacDonald <jmacdon at="" uw.edu=""> wrote: > Hi John, > > > On 10/11/2012 2:15 PM, john herbert wrote: >> >> Dear all. >> I have been pondering about constructing a design matrix based on the >> Limma user guide, where I combine a time course with a paired >> analyses. The targets file looks like; >> >> Sample treatment time >> 1 control 24 >> 1 control 72 >> 1 control 0 >> 1 treatment 24 >> 1 treatment 72 >> 2 control 24 >> 2 control 72 >> 2 control 0 >> 2 treatment 24 >> 2 treatment 72 >> 3 control 24 >> 3 control 72 >> 3 control 0 >> 3 treatment 24 >> 3 treatment 72 >> >> Sample number refers to an individuals cancer cells, treatment refers >> to added drug or not and numbers are in hours (time elapsed). So it is >> a kind of paired, as patient variability is to be considered. The >> control sample at 0 is the same as treatment at time 0 as these are >> the same cells without any time/treatment. >> >> Please could someone help me understand how I can construct a design >> matrix and to understand how I can extract differently expressed genes >> that come about due to time, due to treatment and interaction of them >> both. >> >> Any pointers appreciated, though I am trying to see if the examples in >> the manual can be applied to this scenario. > > > See the multi-level experiment example in the user guide, starting on p. 47. > > Best, > > Jim > >> >> Thank you. >> >> John. >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 >

ADD REPLY • link 13.3 years ago john herbert ▴ 560

0

Entering edit mode

Ugh. Jumped the gun. This does *not* require you to fit a random effects model, as you have done every treatment to cells from each patient. You can just block on Sample and then make your comparisons. In other words, if you add Sample to your design matrix, you will in effect be removing the patient-specific effect. Something like design <- model.matrix(~0+treatment*time+sample) Best, Jim On 10/11/2012 2:27 PM, john herbert wrote: > Thanks James, > This does not have time course but judging by your answer, I can just > add this in, in place of, say, tissue. > > Kind regards, > > John. > > On Thu, Oct 11, 2012 at 7:23 PM, James W. MacDonald<jmacdon at="" uw.edu=""> wrote: >> Hi John, >> >> >> On 10/11/2012 2:15 PM, john herbert wrote: >>> Dear all. >>> I have been pondering about constructing a design matrix based on the >>> Limma user guide, where I combine a time course with a paired >>> analyses. The targets file looks like; >>> >>> Sample treatment time >>> 1 control 24 >>> 1 control 72 >>> 1 control 0 >>> 1 treatment 24 >>> 1 treatment 72 >>> 2 control 24 >>> 2 control 72 >>> 2 control 0 >>> 2 treatment 24 >>> 2 treatment 72 >>> 3 control 24 >>> 3 control 72 >>> 3 control 0 >>> 3 treatment 24 >>> 3 treatment 72 >>> >>> Sample number refers to an individuals cancer cells, treatment refers >>> to added drug or not and numbers are in hours (time elapsed). So it is >>> a kind of paired, as patient variability is to be considered. The >>> control sample at 0 is the same as treatment at time 0 as these are >>> the same cells without any time/treatment. >>> >>> Please could someone help me understand how I can construct a design >>> matrix and to understand how I can extract differently expressed genes >>> that come about due to time, due to treatment and interaction of them >>> both. >>> >>> Any pointers appreciated, though I am trying to see if the examples in >>> the manual can be applied to this scenario. >> >> See the multi-level experiment example in the user guide, starting on p. 47. >> >> Best, >> >> Jim >> >>> Thank you. >>> >>> John. >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >> -- >> James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099

ADD REPLY • link 13.3 years ago James W. MacDonald 68k

0

Entering edit mode

Thanks James, appreciated as you have saved me a lot of time. John. On Thu, Oct 11, 2012 at 7:48 PM, James W. MacDonald <jmacdon at="" uw.edu=""> wrote: > Ugh. Jumped the gun. This does *not* require you to fit a random effects > model, as you have done every treatment to cells from each patient. You can > just block on Sample and then make your comparisons. > > In other words, if you add Sample to your design matrix, you will in effect > be removing the patient-specific effect. Something like > > design <- model.matrix(~0+treatment*time+sample) > > Best, > > Jim > > > > > On 10/11/2012 2:27 PM, john herbert wrote: >> >> Thanks James, >> This does not have time course but judging by your answer, I can just >> add this in, in place of, say, tissue. >> >> Kind regards, >> >> John. >> >> On Thu, Oct 11, 2012 at 7:23 PM, James W. MacDonald<jmacdon at="" uw.edu=""> >> wrote: >>> >>> Hi John, >>> >>> >>> On 10/11/2012 2:15 PM, john herbert wrote: >>>> >>>> Dear all. >>>> I have been pondering about constructing a design matrix based on the >>>> Limma user guide, where I combine a time course with a paired >>>> analyses. The targets file looks like; >>>> >>>> Sample treatment time >>>> 1 control 24 >>>> 1 control 72 >>>> 1 control 0 >>>> 1 treatment 24 >>>> 1 treatment 72 >>>> 2 control 24 >>>> 2 control 72 >>>> 2 control 0 >>>> 2 treatment 24 >>>> 2 treatment 72 >>>> 3 control 24 >>>> 3 control 72 >>>> 3 control 0 >>>> 3 treatment 24 >>>> 3 treatment 72 >>>> >>>> Sample number refers to an individuals cancer cells, treatment refers >>>> to added drug or not and numbers are in hours (time elapsed). So it is >>>> a kind of paired, as patient variability is to be considered. The >>>> control sample at 0 is the same as treatment at time 0 as these are >>>> the same cells without any time/treatment. >>>> >>>> Please could someone help me understand how I can construct a design >>>> matrix and to understand how I can extract differently expressed genes >>>> that come about due to time, due to treatment and interaction of them >>>> both. >>>> >>>> Any pointers appreciated, though I am trying to see if the examples in >>>> the manual can be applied to this scenario. >>> >>> >>> See the multi-level experiment example in the user guide, starting on p. >>> 47. >>> >>> Best, >>> >>> Jim >>> >>>> Thank you. >>>> >>>> John. >>>> >>>> _______________________________________________ >>>> Bioconductor mailing list >>>> Bioconductor at r-project.org >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>> Search the archives: >>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >>> >>> -- >>> James W. MacDonald, M.S. >>> Biostatistician >>> University of Washington >>> Environmental and Occupational Health Sciences >>> 4225 Roosevelt Way NE, # 100 >>> Seattle WA 98105-6099 >>> > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 >

ADD REPLY • link 13.3 years ago john herbert ▴ 560

0

Entering edit mode

Dear James, I attempted "design <- model.matrix(~0+treatment*time+sample)"; the design matrix did not make sense to me, unfortunately. So I stepped back and did your first suggestion. patient <- factor(targets$patient) treat <- factor(targets$treat, levels=c("N","Y")) time <- factor(targets$time) treatment <- factor(paste(treat,time,sep=".")) design <- model.matrix(~0+treatment) colnames(design) <- levels(treatment) # > head(design) # N.0 N.24 N.72 Y.24 Y.72 # 1 0 0 0 1 0 # 2 0 0 0 0 1 corfit <- duplicateCorrelation(y_norm_ave,design,block=targets$patient) fit <- lmFit(y_norm_ave, design, block=targets$patient, correlation=corfit$consensus) cm <- makeContrasts( Treat24_vs_Normal24 = Y.24-N.24, Treat72_vs_Normal72 = Y.72-N.72, Treat24_vs_Normal0 = Y.24-N.0, Treat72_vs_Normal0 = Y.72-N.0, Normal24_vs_Normal0 = N.24-N.0, Normal72_vs_Normal0 = N.72-N.0, Diff=(Y.72-N.0)-(N.72-N.0), levels=design) fit2 <- contrasts.fit(fit, cm) fit2 <- eBayes(fit2) topTable(fit2, coef="Treat24_vs_Normal24") topTable(fit2, coef="Treat72_vs_Normal72")....... A look at some top diff genes, looks reasonably encouraging based on the biology but take your observation that there is no need for a random effect model. This style of code and matrix I can make sense of; is there a way of integrating your second suggestion into this code, so that the design and contrast matrices make sense to me? design <- model.matrix(~0+treatment) # (above) and design <- model.matrix(~0+treatment*time+sample) produce very different looking design matrices. Kind regards, John. On Fri, Oct 12, 2012 at 8:23 AM, john herbert <arraystruggles at="" gmail.com=""> wrote: > Thanks James, > appreciated as you have saved me a lot of time. > > John. > > On Thu, Oct 11, 2012 at 7:48 PM, James W. MacDonald <jmacdon at="" uw.edu=""> wrote: >> Ugh. Jumped the gun. This does *not* require you to fit a random effects >> model, as you have done every treatment to cells from each patient. You can >> just block on Sample and then make your comparisons. >> >> In other words, if you add Sample to your design matrix, you will in effect >> be removing the patient-specific effect. Something like >> >> design <- model.matrix(~0+treatment*time+sample) >> >> Best, >> >> Jim >> >> >> >> >> On 10/11/2012 2:27 PM, john herbert wrote: >>> >>> Thanks James, >>> This does not have time course but judging by your answer, I can just >>> add this in, in place of, say, tissue. >>> >>> Kind regards, >>> >>> John. >>> >>> On Thu, Oct 11, 2012 at 7:23 PM, James W. MacDonald<jmacdon at="" uw.edu=""> >>> wrote: >>>> >>>> Hi John, >>>> >>>> >>>> On 10/11/2012 2:15 PM, john herbert wrote: >>>>> >>>>> Dear all. >>>>> I have been pondering about constructing a design matrix based on the >>>>> Limma user guide, where I combine a time course with a paired >>>>> analyses. The targets file looks like; >>>>> >>>>> Sample treatment time >>>>> 1 control 24 >>>>> 1 control 72 >>>>> 1 control 0 >>>>> 1 treatment 24 >>>>> 1 treatment 72 >>>>> 2 control 24 >>>>> 2 control 72 >>>>> 2 control 0 >>>>> 2 treatment 24 >>>>> 2 treatment 72 >>>>> 3 control 24 >>>>> 3 control 72 >>>>> 3 control 0 >>>>> 3 treatment 24 >>>>> 3 treatment 72 >>>>> >>>>> Sample number refers to an individuals cancer cells, treatment refers >>>>> to added drug or not and numbers are in hours (time elapsed). So it is >>>>> a kind of paired, as patient variability is to be considered. The >>>>> control sample at 0 is the same as treatment at time 0 as these are >>>>> the same cells without any time/treatment. >>>>> >>>>> Please could someone help me understand how I can construct a design >>>>> matrix and to understand how I can extract differently expressed genes >>>>> that come about due to time, due to treatment and interaction of them >>>>> both. >>>>> >>>>> Any pointers appreciated, though I am trying to see if the examples in >>>>> the manual can be applied to this scenario. >>>> >>>> >>>> See the multi-level experiment example in the user guide, starting on p. >>>> 47. >>>> >>>> Best, >>>> >>>> Jim >>>> >>>>> Thank you. >>>>> >>>>> John. >>>>> >>>>> _______________________________________________ >>>>> Bioconductor mailing list >>>>> Bioconductor at r-project.org >>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>> Search the archives: >>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>> >>>> >>>> -- >>>> James W. MacDonald, M.S. >>>> Biostatistician >>>> University of Washington >>>> Environmental and Occupational Health Sciences >>>> 4225 Roosevelt Way NE, # 100 >>>> Seattle WA 98105-6099 >>>> >> >> -- >> James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >>

ADD REPLY • link 13.3 years ago john herbert ▴ 560

0

Entering edit mode

Dear James, I attempted "design <- model.matrix(~0+treatment*time+sample)"; the design matrix did not make sense to me, unfortunately. So I stepped back and did your first suggestion. patient <- factor(targets$patient) treat <- factor(targets$treat, levels=c("N","Y")) time <- factor(targets$time) treatment <- factor(paste(treat,time,sep=".")) design <- model.matrix(~0+treatment) colnames(design) <- levels(treatment) # > head(design) # N.0 N.24 N.72 Y.24 Y.72 # 1 0 0 0 1 0 # 2 0 0 0 0 1 corfit <- duplicateCorrelation(y_norm_ave,design,block=targets$patient) fit <- lmFit(y_norm_ave, design, block=targets$patient, correlation=corfit$consensus) cm <- makeContrasts( Treat24_vs_Normal24 = Y.24-N.24, Treat72_vs_Normal72 = Y.72-N.72, Treat24_vs_Normal0 = Y.24-N.0, Treat72_vs_Normal0 = Y.72-N.0, Normal24_vs_Normal0 = N.24-N.0, Normal72_vs_Normal0 = N.72-N.0, Diff=(Y.72-N.0)-(N.72-N.0), levels=design) fit2 <- contrasts.fit(fit, cm) fit2 <- eBayes(fit2) topTable(fit2, coef="Treat24_vs_Normal24") topTable(fit2, coef="Treat72_vs_Normal72")....... A look at some top diff genes, looks reasonably encouraging based on the biology but take your observation that there is no need for a random effect model. This style of code and matrix I can make sense of; is there a way of integrating your second suggestion into this code, so that the design and contrast matrices make sense to me? design <- model.matrix(~0+treatment) # (above) and design <- model.matrix(~0+treatment*time+sample) produce very different looking design matrices. Kind regards, John. On Fri, Oct 12, 2012 at 8:23 AM, john herbert <arraystruggles at="" gmail.com=""> wrote: > Thanks James, > appreciated as you have saved me a lot of time. > > John. > > On Thu, Oct 11, 2012 at 7:48 PM, James W. MacDonald <jmacdon at="" uw.edu=""> wrote: >> Ugh. Jumped the gun. This does *not* require you to fit a random effects >> model, as you have done every treatment to cells from each patient. You can >> just block on Sample and then make your comparisons. >> >> In other words, if you add Sample to your design matrix, you will in effect >> be removing the patient-specific effect. Something like >> >> design <- model.matrix(~0+treatment*time+sample) >> >> Best, >> >> Jim >> >> >> >> >> On 10/11/2012 2:27 PM, john herbert wrote: >>> >>> Thanks James, >>> This does not have time course but judging by your answer, I can just >>> add this in, in place of, say, tissue. >>> >>> Kind regards, >>> >>> John. >>> >>> On Thu, Oct 11, 2012 at 7:23 PM, James W. MacDonald<jmacdon at="" uw.edu=""> >>> wrote: >>>> >>>> Hi John, >>>> >>>> >>>> On 10/11/2012 2:15 PM, john herbert wrote: >>>>> >>>>> Dear all. >>>>> I have been pondering about constructing a design matrix based on the >>>>> Limma user guide, where I combine a time course with a paired >>>>> analyses. The targets file looks like; >>>>> >>>>> Sample treatment time >>>>> 1 control 24 >>>>> 1 control 72 >>>>> 1 control 0 >>>>> 1 treatment 24 >>>>> 1 treatment 72 >>>>> 2 control 24 >>>>> 2 control 72 >>>>> 2 control 0 >>>>> 2 treatment 24 >>>>> 2 treatment 72 >>>>> 3 control 24 >>>>> 3 control 72 >>>>> 3 control 0 >>>>> 3 treatment 24 >>>>> 3 treatment 72 >>>>> >>>>> Sample number refers to an individuals cancer cells, treatment refers >>>>> to added drug or not and numbers are in hours (time elapsed). So it is >>>>> a kind of paired, as patient variability is to be considered. The >>>>> control sample at 0 is the same as treatment at time 0 as these are >>>>> the same cells without any time/treatment. >>>>> >>>>> Please could someone help me understand how I can construct a design >>>>> matrix and to understand how I can extract differently expressed genes >>>>> that come about due to time, due to treatment and interaction of them >>>>> both. >>>>> >>>>> Any pointers appreciated, though I am trying to see if the examples in >>>>> the manual can be applied to this scenario. >>>> >>>> >>>> See the multi-level experiment example in the user guide, starting on p. >>>> 47. >>>> >>>> Best, >>>> >>>> Jim >>>> >>>>> Thank you. >>>>> >>>>> John. >>>>> >>>>> _______________________________________________ >>>>> Bioconductor mailing list >>>>> Bioconductor at r-project.org >>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>> Search the archives: >>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>> >>>> >>>> -- >>>> James W. MacDonald, M.S. >>>> Biostatistician >>>> University of Washington >>>> Environmental and Occupational Health Sciences >>>> 4225 Roosevelt Way NE, # 100 >>>> Seattle WA 98105-6099 >>>> >> >> -- >> James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >>

ADD REPLY • link 13.3 years ago john herbert ▴ 560

0

Entering edit mode

Hi John, On 10/14/2012 6:51 AM, john herbert wrote: > Dear James, > I attempted "design<- model.matrix(~0+treatment*time+sample)"; the > design matrix did not make sense to me, unfortunately. So I stepped > back and did your first suggestion. > > patient<- factor(targets$patient) > treat<- factor(targets$treat, levels=c("N","Y")) > time<- factor(targets$time) > > treatment<- factor(paste(treat,time,sep=".")) design <- model.matrix(~0+treatment+patient) Then continue as you have, omitting the call to duplicateCorrelation(), and no block or correlation args for lmFit(). I don't think what you have done is technically wrong, you are just making different assumptions as compared to fitting patient as a random effect. If you fit patient as a fixed effect, you are assuming that you can model the inter-patient differences as a shift in mean expression. In other words, you are assuming that for a given gene, patient 1 has on average a higher expression than patient 2, and you can account for that by subtracting a constant from the expression of patient 1 and then making comparisons. This is a pretty simple assumption, and not likely to be questioned by a reviewer, in addition to being easy to explain. If you fit patient as a random effect, you are fitting a different type of model that incorporates the correlation structure between related samples. In this case you are estimating the intra-patient correlation and then assuming that a.) the intra-patient correlation is the same for all patients, and b.) that the intra-patient correlation is the same for all samples for a given patient. So for this model you are making additional assumptions. Usually you want to go with the simplest model possible, and since you are not precluded from fitting everything as a fixed effect, that is probably the safest way to go. Best, Jim > design<- model.matrix(~0+treatment) > colnames(design)<- levels(treatment) > > #> head(design) > # N.0 N.24 N.72 Y.24 Y.72 > # 1 0 0 0 1 0 > # 2 0 0 0 0 1 > > corfit<- duplicateCorrelation(y_norm_ave,design,block=targets$patient) > > fit<- lmFit(y_norm_ave, design, block=targets$patient, > correlation=corfit$consensus) > > cm<- makeContrasts( > Treat24_vs_Normal24 = Y.24-N.24, > Treat72_vs_Normal72 = Y.72-N.72, > Treat24_vs_Normal0 = Y.24-N.0, > Treat72_vs_Normal0 = Y.72-N.0, > Normal24_vs_Normal0 = N.24-N.0, > Normal72_vs_Normal0 = N.72-N.0, > Diff=(Y.72-N.0)-(N.72-N.0), > levels=design) > > fit2<- contrasts.fit(fit, cm) > fit2<- eBayes(fit2) > > topTable(fit2, coef="Treat24_vs_Normal24") > > topTable(fit2, coef="Treat72_vs_Normal72")....... > > A look at some top diff genes, looks reasonably encouraging based on > the biology but take your observation that there is no need for a > random effect model. This style of code and matrix I can make sense > of; is there a way of integrating your second suggestion into this > code, so that the design and contrast matrices make sense to me? > > design<- model.matrix(~0+treatment) # (above) > > and > > design<- model.matrix(~0+treatment*time+sample) > > produce very different looking design matrices. > > Kind regards, > > John. > > On Fri, Oct 12, 2012 at 8:23 AM, john herbert<arraystruggles at="" gmail.com=""> wrote: >> Thanks James, >> appreciated as you have saved me a lot of time. >> >> John. >> >> On Thu, Oct 11, 2012 at 7:48 PM, James W. MacDonald<jmacdon at="" uw.edu=""> wrote: >>> Ugh. Jumped the gun. This does *not* require you to fit a random effects >>> model, as you have done every treatment to cells from each patient. You can >>> just block on Sample and then make your comparisons. >>> >>> In other words, if you add Sample to your design matrix, you will in effect >>> be removing the patient-specific effect. Something like >>> >>> design<- model.matrix(~0+treatment*time+sample) >>> >>> Best, >>> >>> Jim >>> >>> >>> >>> >>> On 10/11/2012 2:27 PM, john herbert wrote: >>>> Thanks James, >>>> This does not have time course but judging by your answer, I can just >>>> add this in, in place of, say, tissue. >>>> >>>> Kind regards, >>>> >>>> John. >>>> >>>> On Thu, Oct 11, 2012 at 7:23 PM, James W. MacDonald<jmacdon at="" uw.edu=""> >>>> wrote: >>>>> Hi John, >>>>> >>>>> >>>>> On 10/11/2012 2:15 PM, john herbert wrote: >>>>>> Dear all. >>>>>> I have been pondering about constructing a design matrix based on the >>>>>> Limma user guide, where I combine a time course with a paired >>>>>> analyses. The targets file looks like; >>>>>> >>>>>> Sample treatment time >>>>>> 1 control 24 >>>>>> 1 control 72 >>>>>> 1 control 0 >>>>>> 1 treatment 24 >>>>>> 1 treatment 72 >>>>>> 2 control 24 >>>>>> 2 control 72 >>>>>> 2 control 0 >>>>>> 2 treatment 24 >>>>>> 2 treatment 72 >>>>>> 3 control 24 >>>>>> 3 control 72 >>>>>> 3 control 0 >>>>>> 3 treatment 24 >>>>>> 3 treatment 72 >>>>>> >>>>>> Sample number refers to an individuals cancer cells, treatment refers >>>>>> to added drug or not and numbers are in hours (time elapsed). So it is >>>>>> a kind of paired, as patient variability is to be considered. The >>>>>> control sample at 0 is the same as treatment at time 0 as these are >>>>>> the same cells without any time/treatment. >>>>>> >>>>>> Please could someone help me understand how I can construct a design >>>>>> matrix and to understand how I can extract differently expressed genes >>>>>> that come about due to time, due to treatment and interaction of them >>>>>> both. >>>>>> >>>>>> Any pointers appreciated, though I am trying to see if the examples in >>>>>> the manual can be applied to this scenario. >>>>> >>>>> See the multi-level experiment example in the user guide, starting on p. >>>>> 47. >>>>> >>>>> Best, >>>>> >>>>> Jim >>>>> >>>>>> Thank you. >>>>>> >>>>>> John. >>>>>> >>>>>> _______________________________________________ >>>>>> Bioconductor mailing list >>>>>> Bioconductor at r-project.org >>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>>> Search the archives: >>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>>> >>>>> -- >>>>> James W. MacDonald, M.S. >>>>> Biostatistician >>>>> University of Washington >>>>> Environmental and Occupational Health Sciences >>>>> 4225 Roosevelt Way NE, # 100 >>>>> Seattle WA 98105-6099 >>>>> >>> -- >>> James W. MacDonald, M.S. >>> Biostatistician >>> University of Washington >>> Environmental and Occupational Health Sciences >>> 4225 Roosevelt Way NE, # 100 >>> Seattle WA 98105-6099 >>> -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099

ADD REPLY • link 13.3 years ago James W. MacDonald 68k

0

Entering edit mode

Hi John, On 10/14/2012 6:51 AM, john herbert wrote: > Dear James, > I attempted "design<- model.matrix(~0+treatment*time+sample)"; the > design matrix did not make sense to me, unfortunately. So I stepped > back and did your first suggestion. > > patient<- factor(targets$patient) > treat<- factor(targets$treat, levels=c("N","Y")) > time<- factor(targets$time) > > treatment<- factor(paste(treat,time,sep=".")) design <- model.matrix(~0+treatment+patient) Then continue as you have, omitting the call to duplicateCorrelation(), and no block or correlation args for lmFit(). I don't think what you have done is technically wrong, you are just making different assumptions as compared to fitting patient as a random effect. If you fit patient as a fixed effect, you are assuming that you can model the inter-patient differences as a shift in mean expression. In other words, you are assuming that for a given gene, patient 1 has on average a higher expression than patient 2, and you can account for that by subtracting a constant from the expression of patient 1 and then making comparisons. This is a pretty simple assumption, and not likely to be questioned by a reviewer, in addition to being easy to explain. If you fit patient as a random effect, you are fitting a different type of model that incorporates the correlation structure between related samples. In this case you are estimating the intra-patient correlation and then assuming that a.) the intra-patient correlation is the same for all patients, and b.) that the intra-patient correlation is the same for all samples for a given patient. So for this model you are making additional assumptions. Usually you want to go with the simplest model possible, and since you are not precluded from fitting everything as a fixed effect, that is probably the safest way to go. Best, Jim > design<- model.matrix(~0+treatment) > colnames(design)<- levels(treatment) > > #> head(design) > # N.0 N.24 N.72 Y.24 Y.72 > # 1 0 0 0 1 0 > # 2 0 0 0 0 1 > > corfit<- duplicateCorrelation(y_norm_ave,design,block=targets$patient) > > fit<- lmFit(y_norm_ave, design, block=targets$patient, > correlation=corfit$consensus) > > cm<- makeContrasts( > Treat24_vs_Normal24 = Y.24-N.24, > Treat72_vs_Normal72 = Y.72-N.72, > Treat24_vs_Normal0 = Y.24-N.0, > Treat72_vs_Normal0 = Y.72-N.0, > Normal24_vs_Normal0 = N.24-N.0, > Normal72_vs_Normal0 = N.72-N.0, > Diff=(Y.72-N.0)-(N.72-N.0), > levels=design) > > fit2<- contrasts.fit(fit, cm) > fit2<- eBayes(fit2) > > topTable(fit2, coef="Treat24_vs_Normal24") > > topTable(fit2, coef="Treat72_vs_Normal72")....... > > A look at some top diff genes, looks reasonably encouraging based on > the biology but take your observation that there is no need for a > random effect model. This style of code and matrix I can make sense > of; is there a way of integrating your second suggestion into this > code, so that the design and contrast matrices make sense to me? > > design<- model.matrix(~0+treatment) # (above) > > and > > design<- model.matrix(~0+treatment*time+sample) > > produce very different looking design matrices. > > Kind regards, > > John. > > On Fri, Oct 12, 2012 at 8:23 AM, john herbert<arraystruggles at="" gmail.com=""> wrote: >> Thanks James, >> appreciated as you have saved me a lot of time. >> >> John. >> >> On Thu, Oct 11, 2012 at 7:48 PM, James W. MacDonald<jmacdon at="" uw.edu=""> wrote: >>> Ugh. Jumped the gun. This does *not* require you to fit a random effects >>> model, as you have done every treatment to cells from each patient. You can >>> just block on Sample and then make your comparisons. >>> >>> In other words, if you add Sample to your design matrix, you will in effect >>> be removing the patient-specific effect. Something like >>> >>> design<- model.matrix(~0+treatment*time+sample) >>> >>> Best, >>> >>> Jim >>> >>> >>> >>> >>> On 10/11/2012 2:27 PM, john herbert wrote: >>>> Thanks James, >>>> This does not have time course but judging by your answer, I can just >>>> add this in, in place of, say, tissue. >>>> >>>> Kind regards, >>>> >>>> John. >>>> >>>> On Thu, Oct 11, 2012 at 7:23 PM, James W. MacDonald<jmacdon at="" uw.edu=""> >>>> wrote: >>>>> Hi John, >>>>> >>>>> >>>>> On 10/11/2012 2:15 PM, john herbert wrote: >>>>>> Dear all. >>>>>> I have been pondering about constructing a design matrix based on the >>>>>> Limma user guide, where I combine a time course with a paired >>>>>> analyses. The targets file looks like; >>>>>> >>>>>> Sample treatment time >>>>>> 1 control 24 >>>>>> 1 control 72 >>>>>> 1 control 0 >>>>>> 1 treatment 24 >>>>>> 1 treatment 72 >>>>>> 2 control 24 >>>>>> 2 control 72 >>>>>> 2 control 0 >>>>>> 2 treatment 24 >>>>>> 2 treatment 72 >>>>>> 3 control 24 >>>>>> 3 control 72 >>>>>> 3 control 0 >>>>>> 3 treatment 24 >>>>>> 3 treatment 72 >>>>>> >>>>>> Sample number refers to an individuals cancer cells, treatment refers >>>>>> to added drug or not and numbers are in hours (time elapsed). So it is >>>>>> a kind of paired, as patient variability is to be considered. The >>>>>> control sample at 0 is the same as treatment at time 0 as these are >>>>>> the same cells without any time/treatment. >>>>>> >>>>>> Please could someone help me understand how I can construct a design >>>>>> matrix and to understand how I can extract differently expressed genes >>>>>> that come about due to time, due to treatment and interaction of them >>>>>> both. >>>>>> >>>>>> Any pointers appreciated, though I am trying to see if the examples in >>>>>> the manual can be applied to this scenario. >>>>> >>>>> See the multi-level experiment example in the user guide, starting on p. >>>>> 47. >>>>> >>>>> Best, >>>>> >>>>> Jim >>>>> >>>>>> Thank you. >>>>>> >>>>>> John. >>>>>> >>>>>> _______________________________________________ >>>>>> Bioconductor mailing list >>>>>> Bioconductor at r-project.org >>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>>> Search the archives: >>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>>> >>>>> -- >>>>> James W. MacDonald, M.S. >>>>> Biostatistician >>>>> University of Washington >>>>> Environmental and Occupational Health Sciences >>>>> 4225 Roosevelt Way NE, # 100 >>>>> Seattle WA 98105-6099 >>>>> >>> -- >>> James W. MacDonald, M.S. >>> Biostatistician >>> University of Washington >>> Environmental and Occupational Health Sciences >>> 4225 Roosevelt Way NE, # 100 >>> Seattle WA 98105-6099 >>> -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099

ADD REPLY • link 13.3 years ago James W. MacDonald 68k

0

Entering edit mode

Thanks James, appreciated as you have saved me a lot of time. John. On Thu, Oct 11, 2012 at 7:48 PM, James W. MacDonald <jmacdon at="" uw.edu=""> wrote: > Ugh. Jumped the gun. This does *not* require you to fit a random effects > model, as you have done every treatment to cells from each patient. You can > just block on Sample and then make your comparisons. > > In other words, if you add Sample to your design matrix, you will in effect > be removing the patient-specific effect. Something like > > design <- model.matrix(~0+treatment*time+sample) > > Best, > > Jim > > > > > On 10/11/2012 2:27 PM, john herbert wrote: >> >> Thanks James, >> This does not have time course but judging by your answer, I can just >> add this in, in place of, say, tissue. >> >> Kind regards, >> >> John. >> >> On Thu, Oct 11, 2012 at 7:23 PM, James W. MacDonald<jmacdon at="" uw.edu=""> >> wrote: >>> >>> Hi John, >>> >>> >>> On 10/11/2012 2:15 PM, john herbert wrote: >>>> >>>> Dear all. >>>> I have been pondering about constructing a design matrix based on the >>>> Limma user guide, where I combine a time course with a paired >>>> analyses. The targets file looks like; >>>> >>>> Sample treatment time >>>> 1 control 24 >>>> 1 control 72 >>>> 1 control 0 >>>> 1 treatment 24 >>>> 1 treatment 72 >>>> 2 control 24 >>>> 2 control 72 >>>> 2 control 0 >>>> 2 treatment 24 >>>> 2 treatment 72 >>>> 3 control 24 >>>> 3 control 72 >>>> 3 control 0 >>>> 3 treatment 24 >>>> 3 treatment 72 >>>> >>>> Sample number refers to an individuals cancer cells, treatment refers >>>> to added drug or not and numbers are in hours (time elapsed). So it is >>>> a kind of paired, as patient variability is to be considered. The >>>> control sample at 0 is the same as treatment at time 0 as these are >>>> the same cells without any time/treatment. >>>> >>>> Please could someone help me understand how I can construct a design >>>> matrix and to understand how I can extract differently expressed genes >>>> that come about due to time, due to treatment and interaction of them >>>> both. >>>> >>>> Any pointers appreciated, though I am trying to see if the examples in >>>> the manual can be applied to this scenario. >>> >>> >>> See the multi-level experiment example in the user guide, starting on p. >>> 47. >>> >>> Best, >>> >>> Jim >>> >>>> Thank you. >>>> >>>> John. >>>> >>>> _______________________________________________ >>>> Bioconductor mailing list >>>> Bioconductor at r-project.org >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>> Search the archives: >>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >>> >>> -- >>> James W. MacDonald, M.S. >>> Biostatistician >>> University of Washington >>> Environmental and Occupational Health Sciences >>> 4225 Roosevelt Way NE, # 100 >>> Seattle WA 98105-6099 >>> > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 >

ADD REPLY • link 13.3 years ago john herbert ▴ 560

0

Entering edit mode

Ugh. Jumped the gun. This does *not* require you to fit a random effects model, as you have done every treatment to cells from each patient. You can just block on Sample and then make your comparisons. In other words, if you add Sample to your design matrix, you will in effect be removing the patient-specific effect. Something like design <- model.matrix(~0+treatment*time+sample) Best, Jim On 10/11/2012 2:27 PM, john herbert wrote: > Thanks James, > This does not have time course but judging by your answer, I can just > add this in, in place of, say, tissue. > > Kind regards, > > John. > > On Thu, Oct 11, 2012 at 7:23 PM, James W. MacDonald<jmacdon at="" uw.edu=""> wrote: >> Hi John, >> >> >> On 10/11/2012 2:15 PM, john herbert wrote: >>> Dear all. >>> I have been pondering about constructing a design matrix based on the >>> Limma user guide, where I combine a time course with a paired >>> analyses. The targets file looks like; >>> >>> Sample treatment time >>> 1 control 24 >>> 1 control 72 >>> 1 control 0 >>> 1 treatment 24 >>> 1 treatment 72 >>> 2 control 24 >>> 2 control 72 >>> 2 control 0 >>> 2 treatment 24 >>> 2 treatment 72 >>> 3 control 24 >>> 3 control 72 >>> 3 control 0 >>> 3 treatment 24 >>> 3 treatment 72 >>> >>> Sample number refers to an individuals cancer cells, treatment refers >>> to added drug or not and numbers are in hours (time elapsed). So it is >>> a kind of paired, as patient variability is to be considered. The >>> control sample at 0 is the same as treatment at time 0 as these are >>> the same cells without any time/treatment. >>> >>> Please could someone help me understand how I can construct a design >>> matrix and to understand how I can extract differently expressed genes >>> that come about due to time, due to treatment and interaction of them >>> both. >>> >>> Any pointers appreciated, though I am trying to see if the examples in >>> the manual can be applied to this scenario. >> >> See the multi-level experiment example in the user guide, starting on p. 47. >> >> Best, >> >> Jim >> >>> Thank you. >>> >>> John. >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >> -- >> James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099

ADD REPLY • link 13.3 years ago James W. MacDonald 68k

Login before adding your answer.