Re: limma and affy (fwd)

0

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 2 hours ago

WEHI, Melbourne, Australia

>Date: Sat, 21 Jun 2003 00:37:37 EDT >From: Phguardiol@aol.com >Subject: limma and affy > >Sir, >I m trying to use limma after having extract background and normalize my data >(HU133A chips from Affy). then I have 3 different groups with duplicates ou >triplicates and I d like to know what are the genes differentially expressed >between these 3 groups. >here is what I tried to do so: >library(affy) >library(limma) >data <- ReadAffy() >data2<- rma(data) >design <- c(1,1,1,2,2,3,3,3) >fit<-lm.series(exprs(data2), design, weights=1/se.exprs(data2)^2) A couple of problems here. First 'design' has to be the actual design matrix, not just a vector of group identifiers. You can calculate the right matrix using design <- model.matrix(~ -1+factor(c(1,1,1,2,2,3,3,3))) colnames(design) <- c("group1", "group2", "group3") Secondly, 'rma' does not always calculate standard errors. In this case you should skip the weights and simply use fit <- lm.series(exprs(data2), design) To make all pairwise comparisons between your three groups you can estimate the appropriate contrasts by contrast.matrix <- cbind(group2vs1=c(-1,1,0), group3vs2=c(0,-1,1), group3vs1=c(-1,0,1)) fit2 <- contrasts.fit(fit, contrast.matrix) eb <- ebayes(fit2) If you download LIMMA version 1.1.2 then there are some further possibilities, for example clas <- classifyTests(eb$t, design=design, contrasts=contrast.matrix) will use F-tests to classify which of the pair-wise differences between the groups are significant for each gene. Gordon >eb<-ebayes(fit) >and then I got this message: Error in if (out$df.prior == Inf) out$s2.post <- >rep(out$s2.prior, length(sigma)) else out$s2.post <- (ifelse(df.residual == >: > missing value where TRUE/FALSE needed >I wonder if my design object is correctly define. the way it is constructed >is that my first 3 chips are representing group 1, the 2 thereafter are >belonging to group 2, and the last 3 to group 3. should it be something else ? >thanks for your help >yours sincerely >Philippe Guardiola

limma limma • 1.5k views

ADD COMMENT • link 20.9 years ago Gordon Smyth 50k

0

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 2 hours ago

WEHI, Melbourne, Australia

At 09:50 PM 21/06/2003, Phguardiol@aol.com wrote: >Gordon, >thank you very much for these precious information. >I have few additional questions... >once you have done this: >library(affy) >library(limma) ----> last version as suggested >data<-ReadAffy() >data2<-rma(data) >design <- model.matrix(~ -1+factor(c(1,1,1,2,2,3,3,3))) >colnames(design) <- c("group1", "group2", "group3") >fit <- lm.series(exprs(data2), design) >contrast.matrix <- cbind(group2vs1=c(-1,1,0), group3vs2=c(0,-1,1), >group3vs1=c(-1,0,1)) >fit2 <- contrasts.fit(fit, contrast.matrix) >eb <- ebayes(fit2) > > >>> how do you get the list that you can put for example in Excel...? can > I save eb in a way that I can open it later in Excel ? surely a stupid > question ! (same for fit2) > >>> then I tried: >clas <- classifyTests(eb$t, design=design, contrasts=contrast.matrix) > >>> and I have the same question again ! in addition is there a way to > just get the genes with significant variation: 1 / -1 I guess ? > >>> then I ran: >options(digits=3) >toptable(number=30, genelist=gal, fit=fit, eb=eb, adjust="fdr") > >>> and of course genelist is not recognized, so I have 2 options: either > delete the genelist : toptable(number=30, fit=fit, eb=eb, adjust="fdr") > but I d like to have the identification ! how can I include the names > from the U133A ? You can't use 'gal' because you don't have a GenePix Allocation List! You need instead tab <- toptable(coef="group2vs1",genelist=geneNames(data2), fit=fit, eb=eb, adjust="fdr") Then you can write the results to disk using 'write.table', for example write.table(tab,file="toptable2vs1.txt",quote=FALSE,row.names=FALSE ,sep="\t") > >>> then finally I ran >ord <- order(eb$lods, decreasing=TRUE) >top30 <- ord[1:30] >M <- fit$coef >A <- apply(exprs(data2), 1, mean) >plot(A, M, pch=16, cex=0.1) > >>> but here it says: Error in xy.coords(x, y, xlabel, ylabel, log) : x > and y lengths differ Well yes, because fit$coef contains three columns corresponding to your three contrasts while A is only one column. You need M <- fit$coef[,"group2vs1"] etc >I think I m going to stop here for this night ! >thanks for your help >Philippe Gordon

ADD COMMENT • link 20.9 years ago Gordon Smyth 50k

0

Entering edit mode

Rafael A. Irizarry ★ 2.3k

@rafael-a-irizarry-205

Last seen 9.6 years ago

if you want to use weights coming from standard errors i would write a summary function (for expresso) based on rlm as opposed to median polish. it will take much much longer but the resutls will be more meaningful (in my opinion) On Sat, 21 Jun 2003, Gordon Smyth wrote: > > >Date: Sat, 21 Jun 2003 00:37:37 EDT > >From: Phguardiol@aol.com > >Subject: limma and affy > > > >Sir, > >I m trying to use limma after having extract background and normalize my data > >(HU133A chips from Affy). then I have 3 different groups with duplicates ou > >triplicates and I d like to know what are the genes differentially expressed > >between these 3 groups. > >here is what I tried to do so: > >library(affy) > >library(limma) > >data <- ReadAffy() > >data2<- rma(data) > >design <- c(1,1,1,2,2,3,3,3) > >fit<-lm.series(exprs(data2), design, weights=1/se.exprs(data2)^2) > > A couple of problems here. First 'design' has to be the actual design > matrix, not just a vector of group identifiers. You can calculate the right > matrix using > > design <- model.matrix(~ -1+factor(c(1,1,1,2,2,3,3,3))) > colnames(design) <- c("group1", "group2", "group3") > > Secondly, 'rma' does not always calculate standard errors. In this case you > should skip the weights and simply use > > fit <- lm.series(exprs(data2), design) > > To make all pairwise comparisons between your three groups you can estimate > the appropriate contrasts by > > contrast.matrix <- cbind(group2vs1=c(-1,1,0), group3vs2=c(0,-1,1), > group3vs1=c(-1,0,1)) > fit2 <- contrasts.fit(fit, contrast.matrix) > eb <- ebayes(fit2) > > If you download LIMMA version 1.1.2 then there are some further > possibilities, for example > > clas <- classifyTests(eb$t, design=design, contrasts=contrast.matrix) > > will use F-tests to classify which of the pair-wise differences between the > groups are significant for each gene. > > Gordon > > >eb<-ebayes(fit) > >and then I got this message: Error in if (out$df.prior == Inf) out$s2.post <- > >rep(out$s2.prior, length(sigma)) else out$s2.post <- (ifelse(df.residual == > >: > > missing value where TRUE/FALSE needed > >I wonder if my design object is correctly define. the way it is constructed > >is that my first 3 chips are representing group 1, the 2 thereafter are > >belonging to group 2, and the last 3 to group 3. should it be something else ? > >thanks for your help > >yours sincerely > >Philippe Guardiola > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor >

ADD COMMENT • link 20.9 years ago Rafael A. Irizarry ★ 2.3k

0

Entering edit mode

Phguardiol@aol.com ▴ 720

@phguardiolaolcom-152

Last seen 9.6 years ago

Hi Gordon, here I am now : with one potential problem in line 15 that maybe I solved ? 1 library(affy) 2 library(limma) 3 data <- ReadAffy() 4 data2 <- rma(data) 5 design <- model.matrix(~ -1+factor(c(1,1,2,2,3,3))) 6 colnames(design) <- c("group1", "group2", "group3") 7 fit <- lm.series(exprs(data2), design) 8 contrast.matrix <- cbind(group2vs1=c(-1,1,0), group3vs2=c(0,-1,1), group3vs1=c(-1,0,1)) 9 fit2 <- contrasts.fit(fit, contrast.matrix) 10 eb <- ebayes(fit2) 11 clas <- classifyTests(eb$t, design=design, contrasts=contrast.matrix) 12 qqt(eb$t, df=3+eb$df, pch=16, cex=0.1) 13 abline(0,1) 14 options(digits=3) #then if I do: 15 tab <- toptable(coef="group2vs1", number=30, genelist=geneNames(data2), fit=fit,eb=eb, adjust="fdr") #I get: Error in toptable(coef = "group2vs1", number = 30, genelist = geneNames(data2), : subscript out of #bounds #I wonder if it should not be : 16 tab <- toptable(coef="group2vs1", number=30, genelist=geneNames(data2), fit=fit2,eb=eb, adjust="fdr") #the more so that it works with fit=fit2 ! If so I guess then after I have to use fit2 instead of fit in lines 19, 24 ? 17 ord <- order(eb$lods, decreasing=TRUE) 18 top30 <- ord[1:30] 19 M <- fit2$coef 20 A <- apply(exprs(data2), 1, mean) 21 plot(A, M, pch=16, cex=0.1) 22 text(A[top30], M[top30], cex=0.8, col="blue") 23 abline(0,0,col="blue") 24 plot(fit2$coef[,2], eb$lods[,2], pch=16, cex=0.1, xlab="Log Fold Change", ylab="Log Odds", main="group1 vs group2") 25 write.table(clas,file="clastable2vs1b.txt",quote=TRUE,row.names=TRUE,s ep="\t") there if I open the file clastable2vs1b.txt it is like in affy, first line has to be moved one cell to the right. No cure for this "bug" ? I d like to have instead of 1,2,3 in the first column, the probe set id like in the tab. how can I do that ? I tried: clas <- classifyTests(eb$t, design=design, genelist=geneNames(data2), contrasts=contrast.matrix) but obtained: Error in classifyTests(eb$t, design = design, genelist = geneNames(data2), : unused argument(s) (genelist ...) I tried other options but same result and I ll not detail ! thanks Philippe [[alternative HTML version deleted]]

ADD COMMENT • link 20.9 years ago Phguardiol@aol.com ▴ 720

0

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 2 hours ago

WEHI, Melbourne, Australia

At 09:20 AM 24/06/2003, Phguardiol@aol.com wrote: >Gordon, > >if I have 4 groups instead of 3 and no missing values shall I change >df=3+eb... into df= 4+eb... ? ie: df=number of groups+... ? of course I d >have to change the contrast matrix If you have no missing values, then the degrees of freedom contributed from the fit will be (number of arrays) - (number of groups). But please use df = fit$df + eb$df as this is always correct. fit$df is the usual residual degrees of freedom from one-way analysis of variance model for each gene. eb$df is the degrees of freedom contributed by Bayesian smoothing of the residual standard deviations. >>Do you have any missing values? If you do, df = fit$df + eb$df would be >>better. >>-> I guess you mean df = fit2.... ? It doesn't matter, 'fit' and 'fit2' have the same residual degrees of freedom. Changing the contrasts or changing the parametrization doesn't change the residual standard error estimates. >>As we discussed in a previous email, lines 19-21 just can't go together. >>M is a matrix, A is just a vector. >>-> yes that was a mistake but I did correct it in my last command file > >>Now you are in the realm of the R language itself rather than LIMMA >>specifically. > >-> yes and I need to learn a lot there ! > >thanks a lot >Philippe Gordon

ADD COMMENT • link 20.9 years ago Gordon Smyth 50k

Login before adding your answer.