ANCOVA microarray time-course continuous & categorical variables / Limma extensions
2
0
Entering edit mode
Michael Breen ▴ 370
@michael-breen-5999
Last seen 9.6 years ago
Hi all, Our lab analyzes gene-expression from microarray and RNAseq platforms. Currently, I am looking for a package to test differential expression (DE) while considering continuous variables that may alter gene-expression profiles. In other words, an ANCOVA type tool. I am quite familiar with Limma (ANOVA) but including continuous variables is not very well described. Specifically, we have a project were two groups can be modeled over the same 2 time points. One group starts healthy and ends in a disease state. The other group starts healthy and remains healthy. We are interested in identifying genes uniquely responding within one group and not in the other. Thus, we have implemented a longitudinal contrast with linear modeling through Limma. However, we are also interested in adding one or two continuous variables (tumor size, time spent meditating, the amount of drinks one consumes etc..) to check if gene expression differences or similarities may be due to these factors instead of due to belonging to a certain class. Limma seems to test categorical variables, but I don't think it is capable of either correlating gene-expression to continuous variables. If not, can someone recommend a tool which may be appropriate for such a situation? Yours, Michael [[alternative HTML version deleted]]
RNASeq Microarray limma RNASeq Microarray limma • 2.6k views
ADD COMMENT
0
Entering edit mode
Michael Breen ▴ 370
@michael-breen-5999
Last seen 9.6 years ago
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https: stat.ethz.ch="" pipermail="" bioconductor="" attachments="" 20131022="" e7f00ea2="" attachment-0001.pl="">
ADD COMMENT
0
Entering edit mode
On Oct 22, 2013, at 5:30 PM, Michael Breen wrote: > Hi all, > > Our lab analyzes gene-expression from microarray and RNAseq platforms. > Currently, I am looking for a package to test differential expression (DE) > while considering continuous variables that may alter gene- expression > profiles. In other words, an ANCOVA type tool. I am quite familiar with > Limma (ANOVA) but including continuous variables is not very well described. > > Specifically, we have a project were two groups can be modeled over the > same 2 time points. One group starts healthy and ends in a disease state. > The other group starts healthy and remains healthy. > > We are interested in identifying genes uniquely responding within one group > and not in the other. Thus, we have implemented a longitudinal contrast > with linear modeling through Limma. However, we are also interested in > adding one or two continuous variables (tumor size, time spent meditating, > the amount of drinks one consumes etc..) to check if gene expression > differences or similarities may be due to these factors instead of due to > belonging to a certain class. Limma seems to test categorical variables, > but I don't think it is capable of either correlating gene- expression to > continuous variables. > > If not, can someone recommend a tool which may be appropriate for such a > situation? > > Yours, > > Michael > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor Dear Michael and list, I think that you write the design and contrast matrices exactly as you would for an ANCOVA in R only you do the fit and Bayesian correction in Limma. Perhaps someone who has had experience doing this kind of analysis can comment. Best wishes, Rich Richard A. Friedman, PhD Associate Research Scientist, Biomedical Informatics Shared Resource Herbert Irving Comprehensive Cancer Center (HICCC) Lecturer, Department of Biomedical Informatics (DBMI) Educational Coordinator, Center for Computational Biology and Bioinformatics (C2B2)/ National Center for Multiscale Analysis of Genomic Networks (MAGNet)/ Columbia Department of Systems Biology Room 824 Irving Cancer Research Center Columbia University 1130 St. Nicholas Ave New York, NY 10032 (212)851-4765 (voice) friedman at cancercenter.columbia.edu http://cancercenter.columbia.edu/~friedman/ In memoriam, Frederik Pohl
ADD REPLY
0
Entering edit mode
On Oct 22, 2013, at 5:30 PM, Michael Breen wrote: > Hi all, > > Our lab analyzes gene-expression from microarray and RNAseq platforms. > Currently, I am looking for a package to test differential expression (DE) > while considering continuous variables that may alter gene- expression > profiles. In other words, an ANCOVA type tool. I am quite familiar with > Limma (ANOVA) but including continuous variables is not very well described. > > Specifically, we have a project were two groups can be modeled over the > same 2 time points. One group starts healthy and ends in a disease state. > The other group starts healthy and remains healthy. > > We are interested in identifying genes uniquely responding within one group > and not in the other. Thus, we have implemented a longitudinal contrast > with linear modeling through Limma. However, we are also interested in > adding one or two continuous variables (tumor size, time spent meditating, > the amount of drinks one consumes etc..) to check if gene expression > differences or similarities may be due to these factors instead of due to > belonging to a certain class. Limma seems to test categorical variables, > but I don't think it is capable of either correlating gene- expression to > continuous variables. > > If not, can someone recommend a tool which may be appropriate for such a > situation? > > Yours, > > Michael > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor Dear Michael and list, I think that you write the design and contrast matrices exactly as you would for an ANCOVA in R only you do the fit and Bayesian correction in Limma. Perhaps someone who has had experience doing this kind of analysis can comment. Best wishes, Rich Richard A. Friedman, PhD Associate Research Scientist, Biomedical Informatics Shared Resource Herbert Irving Comprehensive Cancer Center (HICCC) Lecturer, Department of Biomedical Informatics (DBMI) Educational Coordinator, Center for Computational Biology and Bioinformatics (C2B2)/ National Center for Multiscale Analysis of Genomic Networks (MAGNet)/ Columbia Department of Systems Biology Room 824 Irving Cancer Research Center Columbia University 1130 St. Nicholas Ave New York, NY 10032 (212)851-4765 (voice) friedman at cancercenter.columbia.edu http://cancercenter.columbia.edu/~friedman/ In memoriam, Frederik Pohl
ADD REPLY
0
Entering edit mode
Including continuous covariates in design matrices is R is just as easy as including categorical ones. Instead of creating a column for each degree of freedom in the categorical variable, you just end up with a single column that simply contains the values of the continuous variable. Try using the model.matrix function with a combination of your categorical variables and continuous ones to see what it does. -Ryan On Tue Oct 22 14:36:38 2013, Richard Friedman wrote: > > On Oct 22, 2013, at 5:30 PM, Michael Breen wrote: > >> Hi all, >> >> Our lab analyzes gene-expression from microarray and RNAseq platforms. >> Currently, I am looking for a package to test differential expression (DE) >> while considering continuous variables that may alter gene- expression >> profiles. In other words, an ANCOVA type tool. I am quite familiar with >> Limma (ANOVA) but including continuous variables is not very well described. >> >> Specifically, we have a project were two groups can be modeled over the >> same 2 time points. One group starts healthy and ends in a disease state. >> The other group starts healthy and remains healthy. >> >> We are interested in identifying genes uniquely responding within one group >> and not in the other. Thus, we have implemented a longitudinal contrast >> with linear modeling through Limma. However, we are also interested in >> adding one or two continuous variables (tumor size, time spent meditating, >> the amount of drinks one consumes etc..) to check if gene expression >> differences or similarities may be due to these factors instead of due to >> belonging to a certain class. Limma seems to test categorical variables, >> but I don't think it is capable of either correlating gene- expression to >> continuous variables. >> >> If not, can someone recommend a tool which may be appropriate for such a >> situation? >> >> Yours, >> >> Michael >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > Dear Michael and list, > > I think that you write the design and contrast matrices > exactly as you would for an ANCOVA in R only you do the > fit and Bayesian correction in Limma. > > Perhaps someone who has had experience doing this > kind of analysis can comment. > > Best wishes, > Rich > > Richard A. Friedman, PhD > Associate Research Scientist, > Biomedical Informatics Shared Resource > Herbert Irving Comprehensive Cancer Center (HICCC) > Lecturer, > Department of Biomedical Informatics (DBMI) > Educational Coordinator, > Center for Computational Biology and Bioinformatics (C2B2)/ > National Center for Multiscale Analysis of Genomic Networks (MAGNet)/ > Columbia Department of Systems Biology > Room 824 > Irving Cancer Research Center > Columbia University > 1130 St. Nicholas Ave > New York, NY 10032 > (212)851-4765 (voice) > friedman at cancercenter.columbia.edu > http://cancercenter.columbia.edu/~friedman/ > > In memoriam, Frederik Pohl > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLY
0
Entering edit mode
Including continuous covariates in design matrices is R is just as easy as including categorical ones. Instead of creating a column for each degree of freedom in the categorical variable, you just end up with a single column that simply contains the values of the continuous variable. Try using the model.matrix function with a combination of your categorical variables and continuous ones to see what it does. -Ryan On Tue Oct 22 14:36:38 2013, Richard Friedman wrote: > > On Oct 22, 2013, at 5:30 PM, Michael Breen wrote: > >> Hi all, >> >> Our lab analyzes gene-expression from microarray and RNAseq platforms. >> Currently, I am looking for a package to test differential expression (DE) >> while considering continuous variables that may alter gene- expression >> profiles. In other words, an ANCOVA type tool. I am quite familiar with >> Limma (ANOVA) but including continuous variables is not very well described. >> >> Specifically, we have a project were two groups can be modeled over the >> same 2 time points. One group starts healthy and ends in a disease state. >> The other group starts healthy and remains healthy. >> >> We are interested in identifying genes uniquely responding within one group >> and not in the other. Thus, we have implemented a longitudinal contrast >> with linear modeling through Limma. However, we are also interested in >> adding one or two continuous variables (tumor size, time spent meditating, >> the amount of drinks one consumes etc..) to check if gene expression >> differences or similarities may be due to these factors instead of due to >> belonging to a certain class. Limma seems to test categorical variables, >> but I don't think it is capable of either correlating gene- expression to >> continuous variables. >> >> If not, can someone recommend a tool which may be appropriate for such a >> situation? >> >> Yours, >> >> Michael >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > Dear Michael and list, > > I think that you write the design and contrast matrices > exactly as you would for an ANCOVA in R only you do the > fit and Bayesian correction in Limma. > > Perhaps someone who has had experience doing this > kind of analysis can comment. > > Best wishes, > Rich > > Richard A. Friedman, PhD > Associate Research Scientist, > Biomedical Informatics Shared Resource > Herbert Irving Comprehensive Cancer Center (HICCC) > Lecturer, > Department of Biomedical Informatics (DBMI) > Educational Coordinator, > Center for Computational Biology and Bioinformatics (C2B2)/ > National Center for Multiscale Analysis of Genomic Networks (MAGNet)/ > Columbia Department of Systems Biology > Room 824 > Irving Cancer Research Center > Columbia University > 1130 St. Nicholas Ave > New York, NY 10032 > (212)851-4765 (voice) > friedman at cancercenter.columbia.edu > http://cancercenter.columbia.edu/~friedman/ > > In memoriam, Frederik Pohl > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLY
0
Entering edit mode
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https: stat.ethz.ch="" pipermail="" bioconductor="" attachments="" 20131023="" e2caa970="" attachment-0001.pl="">
ADD REPLY
0
Entering edit mode
I have done this in the past, although I did not have much faith in it, maybe because of the way I create the design matrix. For example I do it this way: Treat <- factor(paste(targets$Condition,targets$Time,sep=".")) design <- model.matrix(~0+Treat) colnames(design) <- levels(Treat) colnames(design) [1] "Case.Time2" "Case.Time1" "Control.Time2" "Control.Time1" #introduce continuous variable, add to design CES <- factor(targets$CES) design2 <- cbind(design, CES) colnames(design2) [1] "Case.Time2" "Case.Time1" "Control.Time2" "Control.Time1" "CES" I go on to duplicate correlation and lmfit then make these contrasts to substract away the effect of the continuous variable from our observations. cm <- makeContrasts( CaseEffect = Case.Time2-Case.Time1, ControlEffect = Control.Time2-Control.Time1, ContrastEffect = (Case.Time2-Case.Time1)-(Control.Time2-Control.Time1), CaseEffectNoCov = (Case.Time2-Case.Time1)-(CES), ControlEffectNoCov = (Case.Time2-Case.Time1)-(CES), ContrastEffectNoCov = (Case.Time2-Case.Time1)-(CES) - (Case.Time2-Time1)-(CES), levels=designnew) etc... Our results are 'different' so it would be useful to have another opinion regarding this set-up. Michael On Tue, Oct 22, 2013 at 11:03 PM, Ryan <rct@thompsonclan.org> wrote: > Including continuous covariates in design matrices is R is just as easy as > including categorical ones. Instead of creating a column for each degree of > freedom in the categorical variable, you just end up with a single column > that simply contains the values of the continuous variable. Try using the > model.matrix function with a combination of your categorical variables and > continuous ones to see what it does. > > -Ryan > > > On Tue Oct 22 14:36:38 2013, Richard Friedman wrote: > >> >> On Oct 22, 2013, at 5:30 PM, Michael Breen wrote: >> >> Hi all, >>> >>> Our lab analyzes gene-expression from microarray and RNAseq platforms. >>> Currently, I am looking for a package to test differential expression >>> (DE) >>> while considering continuous variables that may alter gene- expression >>> profiles. In other words, an ANCOVA type tool. I am quite familiar with >>> Limma (ANOVA) but including continuous variables is not very well >>> described. >>> >>> Specifically, we have a project were two groups can be modeled over the >>> same 2 time points. One group starts healthy and ends in a disease state. >>> The other group starts healthy and remains healthy. >>> >>> We are interested in identifying genes uniquely responding within one >>> group >>> and not in the other. Thus, we have implemented a longitudinal contrast >>> with linear modeling through Limma. However, we are also interested in >>> adding one or two continuous variables (tumor size, time spent >>> meditating, >>> the amount of drinks one consumes etc..) to check if gene expression >>> differences or similarities may be due to these factors instead of due to >>> belonging to a certain class. Limma seems to test categorical variables, >>> but I don't think it is capable of either correlating gene- expression to >>> continuous variables. >>> >>> If not, can someone recommend a tool which may be appropriate for such a >>> situation? >>> >>> Yours, >>> >>> Michael >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________**_________________ >>> Bioconductor mailing list >>> Bioconductor@r-project.org >>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.="" ethz.ch="" mailman="" listinfo="" bioconductor=""> >>> Search the archives: http://news.gmane.org/gmane.** >>> science.biology.informatics.**conductor<http: news.gmane.org="" gman="" e.science.biology.informatics.conductor=""> >>> >> Dear Michael and list, >> >> I think that you write the design and contrast matrices >> exactly as you would for an ANCOVA in R only you do the >> fit and Bayesian correction in Limma. >> >> Perhaps someone who has had experience doing this >> kind of analysis can comment. >> >> Best wishes, >> Rich >> >> Richard A. Friedman, PhD >> Associate Research Scientist, >> Biomedical Informatics Shared Resource >> Herbert Irving Comprehensive Cancer Center (HICCC) >> Lecturer, >> Department of Biomedical Informatics (DBMI) >> Educational Coordinator, >> Center for Computational Biology and Bioinformatics (C2B2)/ >> National Center for Multiscale Analysis of Genomic Networks (MAGNet)/ >> Columbia Department of Systems Biology >> Room 824 >> Irving Cancer Research Center >> Columbia University >> 1130 St. Nicholas Ave >> New York, NY 10032 >> (212)851-4765 (voice) >> friedman@cancercenter.**columbia.edu <friedman@cancercenter.columbia.edu> >> http://cancercenter.columbia.**edu/~friedman/<http: cancercenter.c="" olumbia.edu="" ~friedman=""/> >> >> In memoriam, Frederik Pohl >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.e="" thz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: http://news.gmane.org/gmane.** >> science.biology.informatics.**conductor<http: news.gmane.org="" gmane="" .science.biology.informatics.conductor=""> >> > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
@gordon-smyth
Last seen 6 hours ago
WEHI, Melbourne, Australia
> Date: Wed, 23 Oct 2013 08:03:19 +0100 > From: Michael Breen <breenbioinformatics at="" gmail.com=""> > To: Ryan <rct at="" thompsonclan.org=""> > Cc: Bioconductor Mailing List <bioconductor at="" stat.math.ethz.ch="">, > "bioconductor at r-project.org" <bioconductor at="" r-project.org=""> > Subject: Re: [BioC] ANCOVA microarray time-course continuous & > categorical variables / Limma extensions > > I have done this in the past, although I did not have much faith in it, > maybe because of the way I create the design matrix. > > For example I do it this way: > > Treat <- factor(paste(targets$Condition,targets$Time,sep=".")) > design <- model.matrix(~0+Treat) > colnames(design) <- levels(Treat) > colnames(design) > [1] "Case.Time2" "Case.Time1" "Control.Time2" "Control.Time1" > > > #introduce continuous variable, add to design > CES <- factor(targets$CES) Here is the problem. You are declaring your continuous variable to be categorical, which is presumably not want you want to do. In R, a "factor" means a categorical variable. Anything that is not a factor is considered to be continuous. You could either remove the factor() command here, or else simply use design <- model.matrix(~0+Treat+CES) Best wishes Gordon > design2 <- cbind(design, CES) > colnames(design2) > [1] "Case.Time2" "Case.Time1" "Control.Time2" "Control.Time1" > "CES" > > I go on to duplicate correlation and lmfit then make these contrasts to > substract away the effect of the continuous variable from our observations. > > cm <- makeContrasts( > CaseEffect = Case.Time2-Case.Time1, > ControlEffect = Control.Time2-Control.Time1, > ContrastEffect = (Case.Time2-Case.Time1)-(Control.Time2-Control.Time1), > CaseEffectNoCov = (Case.Time2-Case.Time1)-(CES), > ControlEffectNoCov = (Case.Time2-Case.Time1)-(CES), > ContrastEffectNoCov = (Case.Time2-Case.Time1)-(CES) - > (Case.Time2-Time1)-(CES), > levels=designnew) > > etc... > > Our results are 'different' so it would be useful to have another opinion > regarding this set-up. > > Michael > > > > > > > > > On Tue, Oct 22, 2013 at 11:03 PM, Ryan <rct at="" thompsonclan.org=""> wrote: > >> Including continuous covariates in design matrices is R is just as easy as >> including categorical ones. Instead of creating a column for each degree of >> freedom in the categorical variable, you just end up with a single column >> that simply contains the values of the continuous variable. Try using the >> model.matrix function with a combination of your categorical variables and >> continuous ones to see what it does. >> >> -Ryan >> >> >> On Tue Oct 22 14:36:38 2013, Richard Friedman wrote: >> >>> >>> On Oct 22, 2013, at 5:30 PM, Michael Breen wrote: >>> >>> Hi all, >>>> >>>> Our lab analyzes gene-expression from microarray and RNAseq platforms. >>>> Currently, I am looking for a package to test differential expression >>>> (DE) >>>> while considering continuous variables that may alter gene- expression >>>> profiles. In other words, an ANCOVA type tool. I am quite familiar with >>>> Limma (ANOVA) but including continuous variables is not very well >>>> described. >>>> >>>> Specifically, we have a project were two groups can be modeled over the >>>> same 2 time points. One group starts healthy and ends in a disease state. >>>> The other group starts healthy and remains healthy. >>>> >>>> We are interested in identifying genes uniquely responding within one >>>> group >>>> and not in the other. Thus, we have implemented a longitudinal contrast >>>> with linear modeling through Limma. However, we are also interested in >>>> adding one or two continuous variables (tumor size, time spent >>>> meditating, >>>> the amount of drinks one consumes etc..) to check if gene expression >>>> differences or similarities may be due to these factors instead of due to >>>> belonging to a certain class. Limma seems to test categorical variables, >>>> but I don't think it is capable of either correlating gene- expression to >>>> continuous variables. >>>> >>>> If not, can someone recommend a tool which may be appropriate for such a >>>> situation? >>>> >>>> Yours, >>>> >>>> Michael >>>> >>> Dear Michael and list, >>> >>> I think that you write the design and contrast matrices >>> exactly as you would for an ANCOVA in R only you do the >>> fit and Bayesian correction in Limma. >>> >>> Perhaps someone who has had experience doing this >>> kind of analysis can comment. >>> >>> Best wishes, >>> Rich >>> >>> Richard A. Friedman, PhD >>> Associate Research Scientist, >>> Biomedical Informatics Shared Resource >>> Herbert Irving Comprehensive Cancer Center (HICCC) >>> Lecturer, >>> Department of Biomedical Informatics (DBMI) >>> Educational Coordinator, >>> Center for Computational Biology and Bioinformatics (C2B2)/ >>> National Center for Multiscale Analysis of Genomic Networks (MAGNet)/ >>> Columbia Department of Systems Biology >>> Room 824 >>> Irving Cancer Research Center >>> Columbia University >>> 1130 St. Nicholas Ave >>> New York, NY 10032 >>> (212)851-4765 (voice) >>> friedman at cancercenter.**columbia.edu <friedman at="" cancercenter.columbia.edu=""> >>> http://cancercenter.columbia.**edu/~friedman/<http: cancercenter.="" columbia.edu="" ~friedman=""/> >>> >>> In memoriam, Frederik Pohl ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD COMMENT
0
Entering edit mode
Hi all, Recently, there has been considerable interest in these types of studies, i.e. those outlined in section 8.7 Multi-level Experiments (in the limma manual) while including continuous variables that may have overall effects on gene-expression outcomes. Not only is it informative to correlate gene-expression to continuous variables, (for example gene-expression with tumor size) it is also informative to be able to subtract away any erroneous influences on gene-expression patters (for example, conducting a time-course study over 2 time points consisting of 2 groups, 1 disease and 1 healthy, while incorporating the number of cigarettes your cohorts smoke to the model, only to subtract that measure away in order to measure purely the main-effects- in this vague example = time and condition). >From this banter, I have gathered that these techniques are applicable to Limma, although not apparently so from the manual itself. Could Limma and the community with such scenarios benefit from an example added to the Limma manual incorporating continuous variables? For now, we believe to have the model set-up properly to take into consideration a continuous variable over time between two groups. We want to test the main-effect of Time and Condition while removing additive effects of a continuous variable - here I have labeled it CES. Given that the syntax below is logical, I hope it can bring clarity to others with similar scenarios. #Main effects of time and Condition Treat <- factor(paste(targets$Condition,targets$Time,sep=".")) #Continuous covariable for Cases. CES <- (targets$CES_Case) design <- model.matrix(~0+Treat+CES) colnames(design) [1] "Case.Time2" "Case.Time1" "Control.Time2" [4] "Control.Time1" "CES" corfit <- duplicateCorrelation(exprs,design,block=targets$Subject) corfit$consensus fit <- lmFit(exprs,design,block=targets$Subject,correlation=corfit$consensus) #As an example cm <- makeContrasts( Cases = Case.Time2-Case.Time1, Controls = Control.Time2-Control.Time1, Contrast= (Case.Time2-Case.Time1)-(Control.Time2-Control.Time1), aContinuousEffect = CES, Cases_no_ContinuousEffect=Case.Time2-Case.Time1-CES, levels=design) #Cases = Measures longitudinal effects overtime of cases #Controls =Measures longitudinal effects overtime of controls #Contrast=Measures longitudinal contrasts between cases and controls #aContinuousEffect= the effects of our continuous covariable #Cases_no_ContinuousEffect = Measures longitudinal effects overtime of cases without erroneous effects of the continuous variable fit2 <- contrasts.fit(fit, cm) fit2 <- eBayes(fit2) colnames(fit2) Best, Michael On Wed, Oct 23, 2013 at 11:16 PM, Gordon K Smyth <smyth@wehi.edu.au> wrote: > > Date: Wed, 23 Oct 2013 08:03:19 +0100 >> From: Michael Breen <breenbioinformatics@gmail.com**> >> To: Ryan <rct@thompsonclan.org> >> Cc: Bioconductor Mailing List <bioconductor@stat.math.ethz.**ch<bioconductor@stat.math.ethz.ch> >> >, >> "bioconductor@r-project.org" <bioconductor@r-project.org> >> Subject: Re: [BioC] ANCOVA microarray time-course continuous & >> categorical variables / Limma extensions >> >> I have done this in the past, although I did not have much faith in it, >> maybe because of the way I create the design matrix. >> >> For example I do it this way: >> >> Treat <- factor(paste(targets$**Condition,targets$Time,sep="."**)) >> design <- model.matrix(~0+Treat) >> colnames(design) <- levels(Treat) >> colnames(design) >> [1] "Case.Time2" "Case.Time1" "Control.Time2" "Control.Time1" >> >> >> #introduce continuous variable, add to design >> CES <- factor(targets$CES) >> > > Here is the problem. You are declaring your continuous variable to be > categorical, which is presumably not want you want to do. > > In R, a "factor" means a categorical variable. Anything that is not a > factor is considered to be continuous. > > You could either remove the factor() command here, or else simply use > > design <- model.matrix(~0+Treat+CES) > > Best wishes > Gordon > > design2 <- cbind(design, CES) >> colnames(design2) >> [1] "Case.Time2" "Case.Time1" "Control.Time2" "Control.Time1" >> "CES" >> >> I go on to duplicate correlation and lmfit then make these contrasts to >> substract away the effect of the continuous variable from our >> observations. >> >> cm <- makeContrasts( >> CaseEffect = Case.Time2-Case.Time1, >> ControlEffect = Control.Time2-Control.Time1, >> ContrastEffect = (Case.Time2-Case.Time1)-(**Control.Time2-Control.Time1), >> CaseEffectNoCov = (Case.Time2-Case.Time1)-(CES), >> ControlEffectNoCov = (Case.Time2-Case.Time1)-(CES), >> ContrastEffectNoCov = (Case.Time2-Case.Time1)-(CES) - >> (Case.Time2-Time1)-(CES), >> levels=designnew) >> >> etc... >> >> Our results are 'different' so it would be useful to have another opinion >> regarding this set-up. >> >> Michael >> >> >> >> >> >> >> >> >> On Tue, Oct 22, 2013 at 11:03 PM, Ryan <rct@thompsonclan.org> wrote: >> >> Including continuous covariates in design matrices is R is just as easy >>> as >>> including categorical ones. Instead of creating a column for each degree >>> of >>> freedom in the categorical variable, you just end up with a single column >>> that simply contains the values of the continuous variable. Try using the >>> model.matrix function with a combination of your categorical variables >>> and >>> continuous ones to see what it does. >>> >>> -Ryan >>> >>> >>> On Tue Oct 22 14:36:38 2013, Richard Friedman wrote: >>> >>> >>>> On Oct 22, 2013, at 5:30 PM, Michael Breen wrote: >>>> >>>> Hi all, >>>> >>>>> >>>>> Our lab analyzes gene-expression from microarray and RNAseq platforms. >>>>> Currently, I am looking for a package to test differential expression >>>>> (DE) >>>>> while considering continuous variables that may alter gene- expression >>>>> profiles. In other words, an ANCOVA type tool. I am quite familiar with >>>>> Limma (ANOVA) but including continuous variables is not very well >>>>> described. >>>>> >>>>> Specifically, we have a project were two groups can be modeled over the >>>>> same 2 time points. One group starts healthy and ends in a disease >>>>> state. >>>>> The other group starts healthy and remains healthy. >>>>> >>>>> We are interested in identifying genes uniquely responding within one >>>>> group >>>>> and not in the other. Thus, we have implemented a longitudinal contrast >>>>> with linear modeling through Limma. However, we are also interested in >>>>> adding one or two continuous variables (tumor size, time spent >>>>> meditating, >>>>> the amount of drinks one consumes etc..) to check if gene expression >>>>> differences or similarities may be due to these factors instead of due >>>>> to >>>>> belonging to a certain class. Limma seems to test categorical >>>>> variables, >>>>> but I don't think it is capable of either correlating gene- expression >>>>> to >>>>> continuous variables. >>>>> >>>>> If not, can someone recommend a tool which may be appropriate for >>>>> such a >>>>> situation? >>>>> >>>>> Yours, >>>>> >>>>> Michael >>>>> >>>>> Dear Michael and list, >>>> >>>> I think that you write the design and contrast matrices >>>> exactly as you would for an ANCOVA in R only you do the >>>> fit and Bayesian correction in Limma. >>>> >>>> Perhaps someone who has had experience doing this >>>> kind of analysis can comment. >>>> >>>> Best wishes, >>>> Rich >>>> >>>> Richard A. Friedman, PhD >>>> Associate Research Scientist, >>>> Biomedical Informatics Shared Resource >>>> Herbert Irving Comprehensive Cancer Center (HICCC) >>>> Lecturer, >>>> Department of Biomedical Informatics (DBMI) >>>> Educational Coordinator, >>>> Center for Computational Biology and Bioinformatics (C2B2)/ >>>> National Center for Multiscale Analysis of Genomic Networks (MAGNet)/ >>>> Columbia Department of Systems Biology >>>> Room 824 >>>> Irving Cancer Research Center >>>> Columbia University >>>> 1130 St. Nicholas Ave >>>> New York, NY 10032 >>>> (212)851-4765 (voice) >>>> friedman@cancercenter.**columb**ia.edu <http: columbia.edu=""> < >>>> friedman@cancercenter.**columbia.edu<friedman@cancercenter.columb ia.edu=""> >>>> > >>>> http://cancercenter.columbia.****edu/~friedman/<http: **="">>>> cancercenter.columbia.edu/~**friedman/<http: cancercenter.columb="" ia.edu="" ~friedman=""/> >>>> > >>>> >>>> In memoriam, Frederik Pohl >>>> >>> > ______________________________**______________________________**____ ______ > The information in this email is confidential and inte...{{dropped:10}}
ADD REPLY
0
Entering edit mode
Dear Michael, On Thu, 24 Oct 2013, Michael Breen wrote: > Hi all, > > Recently, there has been considerable interest in these types of studies, > i.e. those outlined in section 8.7 Multi-level Experiments (in the limma > manual) while including continuous variables that may have overall effects > on gene-expression outcomes. Not only is it informative to correlate > gene-expression to continuous variables, (for example gene- expression with > tumor size) it is also informative to be able to subtract away any > erroneous influences on gene-expression patters (for example, conducting a > time-course study over 2 time points consisting of 2 groups, 1 disease and > 1 healthy, while incorporating the number of cigarettes your cohorts smoke > to the model, only to subtract that measure away in order to measure purely > the main-effects- in this vague example = time and condition). > > From this banter, I have gathered that these techniques are applicable to > Limma, although not apparently so from the manual itself. Could Limma and > the community with such scenarios benefit from an example added to the > Limma manual incorporating continuous variables? Limma can handle any linear model. Just form the design matrix and proceed. > For now, we believe to have the model set-up properly to take into > consideration a continuous variable over time between two groups. We want > to test the main-effect of Time and Condition while removing additive > effects of a continuous variable - here I have labeled it CES. Given that > the syntax below is logical, I hope it can bring clarity to others with > similar scenarios. > > #Main effects of time and Condition > Treat <- factor(paste(targets$Condition,targets$Time,sep=".")) > #Continuous covariable for Cases. > CES <- (targets$CES_Case) > design <- model.matrix(~0+Treat+CES) > colnames(design) > [1] "Case.Time2" "Case.Time1" "Control.Time2" > [4] "Control.Time1" "CES" > > corfit <- duplicateCorrelation(exprs,design,block=targets$Subject) > corfit$consensus > fit <- > lmFit(exprs,design,block=targets$Subject,correlation=corfit$consensus) > > #As an example > cm <- makeContrasts( > Cases = Case.Time2-Case.Time1, > Controls = Control.Time2-Control.Time1, > Contrast= (Case.Time2-Case.Time1)-(Control.Time2-Control.Time1), > aContinuousEffect = CES, > Cases_no_ContinuousEffect=Case.Time2-Case.Time1-CES, > levels=design) This last contrast is unnecessary (and incorrect). All the treatment effects have already been adjusted for the continuous covariate by including the covariate in the linear model. The other contrasts already do what you want. There is no need to manually subtract the covariate coefficient from the interaction. Moreover the subtracted contrast has no meaning because the continuous coefficient and the interaction are different scales. Best wishes Gordon > #Cases = Measures longitudinal effects overtime of cases > #Controls =Measures longitudinal effects overtime of controls > #Contrast=Measures longitudinal contrasts between cases and controls > #aContinuousEffect= the effects of our continuous covariable > #Cases_no_ContinuousEffect = Measures longitudinal effects overtime of > cases without erroneous effects of the continuous variable > > fit2 <- contrasts.fit(fit, cm) > fit2 <- eBayes(fit2) > colnames(fit2) > > > Best, > > Michael > > > > > > > > > > > > On Wed, Oct 23, 2013 at 11:16 PM, Gordon K Smyth <smyth at="" wehi.edu.au=""> wrote: > >> >> Date: Wed, 23 Oct 2013 08:03:19 +0100 >>> From: Michael Breen <breenbioinformatics at="" gmail.com**=""> >>> To: Ryan <rct at="" thompsonclan.org=""> >>> Cc: Bioconductor Mailing List <bioconductor at="" stat.math.ethz.**ch<bioconductor="" at="" stat.math.ethz.ch=""> >>>> , >>> "bioconductor at r-project.org" <bioconductor at="" r-project.org=""> >>> Subject: Re: [BioC] ANCOVA microarray time-course continuous & >>> categorical variables / Limma extensions >>> >>> I have done this in the past, although I did not have much faith in it, >>> maybe because of the way I create the design matrix. >>> >>> For example I do it this way: >>> >>> Treat <- factor(paste(targets$**Condition,targets$Time,sep="."**)) >>> design <- model.matrix(~0+Treat) >>> colnames(design) <- levels(Treat) >>> colnames(design) >>> [1] "Case.Time2" "Case.Time1" "Control.Time2" "Control.Time1" >>> >>> >>> #introduce continuous variable, add to design >>> CES <- factor(targets$CES) >>> >> >> Here is the problem. You are declaring your continuous variable to be >> categorical, which is presumably not want you want to do. >> >> In R, a "factor" means a categorical variable. Anything that is not a >> factor is considered to be continuous. >> >> You could either remove the factor() command here, or else simply use >> >> design <- model.matrix(~0+Treat+CES) >> >> Best wishes >> Gordon >> >> design2 <- cbind(design, CES) >>> colnames(design2) >>> [1] "Case.Time2" "Case.Time1" "Control.Time2" "Control.Time1" >>> "CES" >>> >>> I go on to duplicate correlation and lmfit then make these contrasts to >>> substract away the effect of the continuous variable from our >>> observations. >>> >>> cm <- makeContrasts( >>> CaseEffect = Case.Time2-Case.Time1, >>> ControlEffect = Control.Time2-Control.Time1, >>> ContrastEffect = (Case.Time2-Case.Time1)-(**Control.Time2-Control.Time1), >>> CaseEffectNoCov = (Case.Time2-Case.Time1)-(CES), >>> ControlEffectNoCov = (Case.Time2-Case.Time1)-(CES), >>> ContrastEffectNoCov = (Case.Time2-Case.Time1)-(CES) - >>> (Case.Time2-Time1)-(CES), >>> levels=designnew) >>> >>> etc... >>> >>> Our results are 'different' so it would be useful to have another opinion >>> regarding this set-up. >>> >>> Michael >>> >>> >>> >>> >>> >>> >>> >>> >>> On Tue, Oct 22, 2013 at 11:03 PM, Ryan <rct at="" thompsonclan.org=""> wrote: >>> >>> Including continuous covariates in design matrices is R is just as easy >>>> as >>>> including categorical ones. Instead of creating a column for each degree >>>> of >>>> freedom in the categorical variable, you just end up with a single column >>>> that simply contains the values of the continuous variable. Try using the >>>> model.matrix function with a combination of your categorical variables >>>> and >>>> continuous ones to see what it does. >>>> >>>> -Ryan >>>> >>>> >>>> On Tue Oct 22 14:36:38 2013, Richard Friedman wrote: >>>> >>>> >>>>> On Oct 22, 2013, at 5:30 PM, Michael Breen wrote: >>>>> >>>>> Hi all, >>>>> >>>>>> >>>>>> Our lab analyzes gene-expression from microarray and RNAseq >>>>>> platforms. Currently, I am looking for a package to test >>>>>> differential expression (DE) while considering continuous variables >>>>>> that may alter gene-expression profiles. In other words, an ANCOVA >>>>>> type tool. I am quite familiar with Limma (ANOVA) but including >>>>>> continuous variables is not very well described. >>>>>> >>>>>> Specifically, we have a project were two groups can be modeled over >>>>>> the same 2 time points. One group starts healthy and ends in a >>>>>> disease state. The other group starts healthy and remains healthy. >>>>>> >>>>>> We are interested in identifying genes uniquely responding within >>>>>> one group and not in the other. Thus, we have implemented a >>>>>> longitudinal contrast with linear modeling through Limma. However, >>>>>> we are also interested in adding one or two continuous variables >>>>>> (tumor size, time spent meditating, the amount of drinks one >>>>>> consumes etc..) to check if gene expression differences or >>>>>> similarities may be due to these factors instead of due to >>>>>> belonging to a certain class. Limma seems to test categorical >>>>>> variables, but I don't think it is capable of either correlating >>>>>> gene-expression to continuous variables. >>>>>> >>>>>> If not, can someone recommend a tool which may be appropriate for >>>>>> such a situation? >>>>>> >>>>>> Yours, >>>>>> >>>>>> Michael >>>>>> >>>>>> Dear Michael and list, >>>>> >>>>> I think that you write the design and contrast matrices >>>>> exactly as you would for an ANCOVA in R only you do the >>>>> fit and Bayesian correction in Limma. >>>>> >>>>> Perhaps someone who has had experience doing this >>>>> kind of analysis can comment. >>>>> >>>>> Best wishes, >>>>> Rich >>>>> >>>>> Richard A. Friedman, PhD >>>>> Associate Research Scientist, >>>>> Biomedical Informatics Shared Resource >>>>> Herbert Irving Comprehensive Cancer Center (HICCC) >>>>> Lecturer, >>>>> Department of Biomedical Informatics (DBMI) >>>>> Educational Coordinator, >>>>> Center for Computational Biology and Bioinformatics (C2B2)/ >>>>> National Center for Multiscale Analysis of Genomic Networks (MAGNet)/ >>>>> Columbia Department of Systems Biology >>>>> Room 824 >>>>> Irving Cancer Research Center >>>>> Columbia University >>>>> 1130 St. Nicholas Ave >>>>> New York, NY 10032 >>>>> (212)851-4765 (voice) >>>>> friedman at cancercenter.**columb**ia.edu <http: columbia.edu=""> < >>>>> friedman at cancercenter.**columbia.edu<friedman at="" cancercenter.columbia.edu=""> >>>>>> >>>>> http://cancercenter.columbia.****edu/~friedman/<http: **="">>>>> cancercenter.columbia.edu/~**friedman/<http: cancercenter.colum="" bia.edu="" ~friedman=""/> >>>>>> >>>>> >>>>> In memoriam, Frederik Pohl ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD REPLY

Login before adding your answer.

Traffic: 978 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6