A question about paired data and how to set up the design matrix in LIMMA
1
0
Entering edit mode
@johan-lindberg-815
Last seen 10.2 years ago
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20050711/ 8f5b283e/attachment.pl
• 667 views
ADD COMMENT
0
Entering edit mode
@johan-lindberg-815
Last seen 10.2 years ago
Dear Bioconductor users. I posted a question a while ago that did not get an answer. I'm sorry to have to nag and ask the same question again but I really need an answer. The question that I'm asking is how to do the paired analysis in LIMMA? If you don?t have the time to look at my previous post here is a shorter version. I have tried to read previous posts on this matter but I found the answers kind of vague (perhaps I missed something due to my limited knowledge in statistics, in that case please guide me to an explaining post and I apologize for asking the same question again). I have affydata on 4 patients (paired) before and after treatment (unbalanced due to different number of biopsies from each patient). I want to fit the patients as a fixed effect. I am interested in the effect due to treatment. In previous posts I have seen Gordon give directions about two ways to fit patients as a fixed effect with LIMMA. http://files.protsuggest.org/biocond/html/8297.html and http://files.protsuggest.org/biocond/html/8175.html My situation look like this: ######################################################### #Fitting the patients as a fixed effect, I am not totally shure of my #code, please correct me if I am wrong eset<-rma(Data) tmp<-pData(eset) tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0)) tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1)) tmp<-cbind(tmp,Patient=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) pData(eset)<-tmp design <- model.matrix(~(After - Before) + Patient, data=as.data.frame(tmp)) fit<-lmFit(eset,design) fitModel <- eBayes(fit) ######################################################### #Using the block argument to ?block? patient. eset<-rma(Data) tmp<-pData(eset) tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0)) tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1)) pData(eset) <- tmp design <- model.matrix(~(After - Before), data=pData(eset)) cor<-duplicateCorrelation(eset, design,block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) cor$cor # [1] 0.2739284 fit<-lmFit(eset, design, block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) fitcor <- eBayes(fit) #I also try this model using the default value, cor=0.75, when using lmFit. eset<-rma(Data) tmp<-pData(eset) tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0)) tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1)) pData(eset) <- tmp design <- model.matrix(~(After - Before), data=pData(eset)) fit<-lmFit(eset, design, block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) fitCorDefault <- eBayes(fit) fitCorDefault$cor #[1] 0.75 ######################################################### So my questions are, if my code is right, which way is the right way to fit patients as a fixed effect since the result differ (the lod-scores differ between the different models)? If my code is wrong, please correct it. And last, may I be bald enough to suggest (since there seem to be a lot of post of how to fit fixed effect in LIMMA) that it would be great if there were an example of how to fit fixed effects in the LIMMA users guide? Thank you for your time Best regards Johan Lindberg -----Original Message----- From: bioconductor-bounces@stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Johan Lindberg Sent: Monday, July 11, 2005 11:35 AM To: bioconductor at stat.math.ethz.ch Subject: [BioC] A question about paired data and how to set up the designmatrix in LIMMA Hi all. I have tried to read previous posts on this matter but I found the answers kind of vague (perhaps I missed something due to my limited knowledge in statistics, in that case please guide me to an explaining post and I apologize for asking the same question again). I have affydata on 4 patients (paired) before and after treatment (unbalance due to different number of biopsies from each patient). I want to fit the patients as a fixed effect. I am interested in the effect due to treatment. In previous posts I have seen Gordon give directions about two ways to fit patients as a fixed effect with LIMMA. http://files.protsuggest.org/biocond/html/8297.html and http://files.protsuggest.org/biocond/html/8175.html In the first post the t-statistic is according to Gordon suppose to be the corresponding paired t-statistic. But in the second mail he states ?All forms of blocking are much the same from a mathematical point of view, whether they are technical replicates or blocking on patient. The direct extension of a paired t-test would actually arise from fitting patient as a fixed effect. If you did that, limma would compute for you paired t-tests with moderated denominators. If you use the 'block' argument instead of using a fixed effect, I would generally recommend that you estimate the common correction rather than take a preset value. However I suspect that, in this case, if you use a large correlation like 0.75 or higher the results might be very similar to using a fixed patient effect.? My situation look like this: ######################################################### #Fitting the patients as a fixed effect, I am not totally shure of my code, please correct me if I am #wrong eset<-rma(Data) tmp<-pData(eset) tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0)) tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1)) tmp<-cbind(tmp,Patient=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) pData(eset)<-tmp design <- model.matrix(~(After - Before) + Patient, data=as.data.frame(tmp)) fit<-lmFit(eset,design) fitModel <- eBayes(fit) ######################################################### #Using the block argument to ?block? patient. eset<-rma(Data) tmp<-pData(eset) tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0)) tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1)) pData(eset) <- tmp design <- model.matrix(~(After - Before), data=pData(eset)) cor<-duplicateCorrelation(eset, design,block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) cor$cor # [1] 0.2739284 fit<-lmFit(eset, design, block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) fitcor <- eBayes(fit) # I also try this model using the default value, cor=0.75, when using lmFit. eset<-rma(Data) tmp<-pData(eset) tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0)) tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1)) pData(eset) <- tmp design <- model.matrix(~(After - Before), data=pData(eset)) fit<-lmFit(eset, design, block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) fitCorDefault <- eBayes(fit) fitCorDefault$cor [1] 0.75 ######################################################### First of all, is my code right? I am kind of sceptic since Gordon states that ?if you use a large correlation like 0.75 or higher the results might be very similar to using a fixed patient effect?. When I plot the B-scores of both models they differ somewhat in the ranking and in the number of genes with B-score >0. They look kind of the same but the B-scores is higher when I use the block argument to ?block? patient using the default correlation. colnames(fitModel$lods) #[1] "(Intercept)" "After" "Patient" length(which(fitModel$lods[,2] > 0)) #[1] 0 fitcor$cor #[1] 0.2739284 colnames(fitcor$lods) #[1] "(Intercept)" "After" length(which(fitcor$lods[,2] > 0)) #[1] 4 fitCorDefault$cor #[1] 0.75 colnames(fitCorDefault$lods) #[1] "(Intercept)" "After" length(which(fitCorDefault$lods[,2] > 0)) #[1] 109 ################# #To get a feeling of similarity between the ranking of the genes cor(fitcor$lods[,2], fitCorDefault$lods[,2], use="pairwise.complete.obs") [1] 0.9781474 cor(fitcor$lods[,2], fitModel$lods[,2], use="pairwise.complete.obs") [1] 0.9753385 cor(fitCorDefault$lods[,2], fitModel$lods[,2], use="pairwise.complete.obs") [1] 0.9610673 ################ So my questions are, if my code is right, which way is the right way to fit patients as a fixed effect since the result differ? If my code is wrong, please guide me of how to get it right. Thank you for your time Best regards Johan Lindberg ********************************************************************** ** ******************* Johan Lindberg Royal Institute of Technology AlbaNova University Center Stockholm Center for Physics, Astronomy and Biotechnology Department of Molecular Biotechnology 106 91 Stockholm, Sweden Phone (office) +46 8 553 783 44 Fax + 46 8 553 784 81 Visiting adress Roslagstullsbacken 21, Floor 3 Delivery adress Roslagsv?gen 30B http://www.biotech.kth.se/molbio/microarray/index.html ********************************************************************** ** ******************* [[alternative HTML version deleted]]
ADD COMMENT
0
Entering edit mode
Hi Johan, The simple answer is that your data do not quite fit the paired t-test model. You probably want some form of random effects model as can be fit by lme (nlme pacakge) or lmer (Matrix package). Where patients are treated as a random effect and before/after as a fixed effect. I do not believe that limma fits these models, although it may be extended at some time. And neither of those pieces of software (lme or lmer) fits an empirical Bayes model to lots of arrays although they too could be extended. Hence, there is no easy solution. Nor is fitting and interpreting mixed effects models simple - if you have little statistical experience, the best advice is to find someone with a lot more who can help, as the analysis is not likely to be simple or straightforward. You can always take the expedient of discarding some of the data, so that you are reduced to one before and one after sample per patient. Then standard paired t-tests can be used. But you have lost data. Best wishes, Robert Johan Lindberg wrote: > Dear Bioconductor users. > I posted a question a while ago that did not get an answer. I'm sorry to > have to nag and ask the same question again but I really need an answer. > The question that I'm asking is how to do the paired analysis in LIMMA? > If you don?t have the time to look at my previous post here is a shorter > version. > > I have tried to read previous posts on this matter but I found the > answers kind of vague (perhaps I missed something due to my limited > knowledge in statistics, in that case please guide me to an explaining > post and I apologize for asking the same question again). > > I have affydata on 4 patients (paired) before and after treatment > (unbalanced due to different number of biopsies from each patient). I > want to fit the patients as a fixed effect. I am interested in the > effect due to treatment. > > In previous posts I have seen Gordon give directions about two ways to > fit patients as a fixed effect with LIMMA. > > http://files.protsuggest.org/biocond/html/8297.html > > and > > http://files.protsuggest.org/biocond/html/8175.html > > My situation look like this: > ######################################################### > #Fitting the patients as a fixed effect, I am not totally shure of my > #code, please correct me if I am wrong > eset<-rma(Data) > tmp<-pData(eset) > tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0)) > tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1)) > tmp<-cbind(tmp,Patient=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) > pData(eset)<-tmp > design <- model.matrix(~(After - Before) + Patient, > data=as.data.frame(tmp)) > fit<-lmFit(eset,design) > fitModel <- eBayes(fit) > > ######################################################### > #Using the block argument to ?block? patient. > eset<-rma(Data) > tmp<-pData(eset) > tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0)) > tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1)) > pData(eset) <- tmp > design <- model.matrix(~(After - Before), data=pData(eset)) > cor<-duplicateCorrelation(eset, > design,block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) > cor$cor > # [1] 0.2739284 > fit<-lmFit(eset, design, block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) > fitcor <- eBayes(fit) > > #I also try this model using the default value, cor=0.75, when using > lmFit. > eset<-rma(Data) > tmp<-pData(eset) > tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0)) > tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1)) > pData(eset) <- tmp > design <- model.matrix(~(After - Before), data=pData(eset)) > fit<-lmFit(eset, design, block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) > fitCorDefault <- eBayes(fit) > fitCorDefault$cor > #[1] 0.75 > ######################################################### > > So my questions are, if my code is right, which way is the right way to > fit patients as a fixed effect since the result differ (the lod- scores > differ between the different models)? If my code is wrong, please > correct it. And last, may I be bald enough to suggest (since there seem > to be a lot of post of how to fit fixed effect in LIMMA) that it would > be great if there were an example of how to fit fixed effects in the > LIMMA users guide? > > Thank you for your time > > Best regards > > Johan Lindberg > > > -----Original Message----- > From: bioconductor-bounces at stat.math.ethz.ch > [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Johan > Lindberg > Sent: Monday, July 11, 2005 11:35 AM > To: bioconductor at stat.math.ethz.ch > Subject: [BioC] A question about paired data and how to set up the > designmatrix in LIMMA > > Hi all. > > I have tried to read previous posts on this matter but I found the > answers kind of vague (perhaps I missed something due to my limited > knowledge in statistics, in that case please guide me to an explaining > post and I apologize for asking the same question again). > > I have affydata on 4 patients (paired) before and after treatment > (unbalance due to different number of biopsies from each patient). I > want to fit the patients as a fixed effect. I am interested in the > effect due to treatment. > > In previous posts I have seen Gordon give directions about two ways to > fit patients as a fixed effect with LIMMA. > > http://files.protsuggest.org/biocond/html/8297.html > > and > > http://files.protsuggest.org/biocond/html/8175.html > > In the first post the t-statistic is according to Gordon suppose to be > the corresponding paired t-statistic. > But in the second mail he states ?All forms of blocking are much the > same from a mathematical point of view, whether they are technical > replicates or blocking on patient. > The direct extension of a paired t-test would actually arise from > fitting patient as a fixed effect. If you did that, limma would compute > for you paired t-tests with moderated denominators. If you use the > 'block' argument instead of using a fixed effect, I would generally > recommend that you estimate the common correction rather than take a > preset value. However I suspect that, in this case, if you use a large > correlation like 0.75 or higher the results might be very similar to > using a fixed patient effect.? > > My situation look like this: > ######################################################### > #Fitting the patients as a fixed effect, I am not totally shure of my > code, please correct me if I am #wrong > eset<-rma(Data) > tmp<-pData(eset) > tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0)) > tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1)) > tmp<-cbind(tmp,Patient=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) > pData(eset)<-tmp > design <- model.matrix(~(After - Before) + Patient, > data=as.data.frame(tmp)) > fit<-lmFit(eset,design) > fitModel <- eBayes(fit) > > ######################################################### > #Using the block argument to ?block? patient. > eset<-rma(Data) > tmp<-pData(eset) > tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0)) > tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1)) > pData(eset) <- tmp > design <- model.matrix(~(After - Before), data=pData(eset)) > cor<-duplicateCorrelation(eset, > design,block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) > cor$cor > # [1] 0.2739284 > fit<-lmFit(eset, design, block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) > fitcor <- eBayes(fit) > > # I also try this model using the default value, cor=0.75, when using > lmFit. > eset<-rma(Data) > tmp<-pData(eset) > tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0)) > tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1)) > pData(eset) <- tmp > design <- model.matrix(~(After - Before), data=pData(eset)) > fit<-lmFit(eset, design, block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4)) > fitCorDefault <- eBayes(fit) > fitCorDefault$cor > [1] 0.75 > ######################################################### > > First of all, is my code right? I am kind of sceptic since Gordon states > that ?if you use a large correlation like 0.75 or higher the results > might be very similar to using a fixed patient effect?. When I plot the > B-scores of both models they differ somewhat in the ranking and in the > number of genes with B-score >0. They look kind of the same but the > B-scores is higher when I use the block argument to ?block? patient > using the default correlation. > > colnames(fitModel$lods) > #[1] "(Intercept)" "After" "Patient" > length(which(fitModel$lods[,2] > 0)) > #[1] 0 > > fitcor$cor > #[1] 0.2739284 > colnames(fitcor$lods) > #[1] "(Intercept)" "After" > length(which(fitcor$lods[,2] > 0)) > #[1] 4 > > fitCorDefault$cor > #[1] 0.75 > colnames(fitCorDefault$lods) > #[1] "(Intercept)" "After" > length(which(fitCorDefault$lods[,2] > 0)) > #[1] 109 > > ################# > #To get a feeling of similarity between the ranking of the genes > cor(fitcor$lods[,2], fitCorDefault$lods[,2], > use="pairwise.complete.obs") > [1] 0.9781474 > cor(fitcor$lods[,2], fitModel$lods[,2], use="pairwise.complete.obs") > [1] 0.9753385 > cor(fitCorDefault$lods[,2], fitModel$lods[,2], > use="pairwise.complete.obs") > [1] 0.9610673 > ################ > > So my questions are, if my code is right, which way is the right way to > fit patients as a fixed effect since the result differ? > > If my code is wrong, please guide me of how to get it right. > > Thank you for your time > > Best regards > > Johan Lindberg > > > ******************************************************************** **** > ******************* > Johan Lindberg > Royal Institute of Technology > AlbaNova University Center > Stockholm Center for Physics, Astronomy and Biotechnology > Department of Molecular Biotechnology > 106 91 Stockholm, Sweden > > Phone (office) +46 8 553 783 44 > Fax + 46 8 553 784 81 > Visiting adress Roslagstullsbacken 21, Floor 3 > Delivery adress Roslagsv?gen 30B > http://www.biotech.kth.se/molbio/microarray/index.html > ******************************************************************** **** > ******************* > > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor >
ADD REPLY

Login before adding your answer.

Traffic: 645 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6