Understanding the columns in the limma results output
1
1
Entering edit mode
Jorge Miró ▴ 160
@jorge-miro-5469
Last seen 10.3 years ago
Hi, I have run the commands below to get an analysis of differential expressions in my Affymetrix arrays #Prepare the design and contrast matrices for my comparisons of the three groups in a loop-manner. > design<- model.matrix(~0+factor(c(1,1,1,2,2,2,3,3,3))) > colnames(design)<- c('GroupA', 'GroupB', 'GroupC') > contrast.matrix<- makeContrasts(GroupB-GroupA, GroupC-GroupA, GroupC-GroupB, levels=design) #Check design and contrast matrices > design GroupA GroupB GroupC 1 1 0 0 2 1 0 0 3 1 0 0 4 0 1 0 5 0 1 0 6 0 1 0 7 0 0 1 8 0 0 1 9 0 0 1 attr(,"assign") [1] 1 1 1 attr(,"contrasts") attr(,"contrasts")$`factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3))` [1] "contr.treatment" > contrast.matrix Contrasts Levels GroupB - GroupA GroupC - GroupA GroupC - GroupB GroupA -1 -1 0 GroupB 1 0 -1 GroupC 0 1 1 #Fitting the eset to to the design and contrast > fit <- lmFit(exprs, design) > fit2 <- contrasts.fit(fit, contrast.matrix) #Computing the statistics > fit2 <- eBayes(fit2) Then I check the results with topTable and get the following columns in the output > topTable(fit2) GroupB...GroupA GroupC...GroupA GroupC...GroupB AveExpr F P.Value adj.P.Val 25031 2.3602203 2.4273830 0.06716267 5.021412 29.06509 7.844834e-05 0.9587773 12902 -0.4572897 -0.5680943 -0.11080467 7.516681 25.41608 1.365021e-04 0.9587773 7158 -0.4478660 -0.4296077 0.01825833 7.057833 23.48871 1.880100e-04 0.9587773 18358 -0.1002647 0.3304903 0.43075500 7.352807 22.78417 2.125096e-04 0.9587773 28768 -0.7695883 -1.3837750 -0.61418667 3.983044 22.47514 2.244612e-04 0.9587773 28820 -0.1708800 -0.9939680 -0.82308800 5.470826 18.25071 5.081473e-04 0.9587773 15238 -0.4850297 -0.4658157 0.01921400 7.071662 17.15191 6.440979e-04 0.9587773 24681 -0.3759717 -0.3486450 0.02732667 9.281578 16.47813 7.493077e-04 0.9587773 19246 -0.8675393 -0.5082140 0.35932533 8.123538 16.27776 7.845150e-04 0.9587773 28808 0.2601277 0.6909140 0.43078633 4.814602 16.21283 7.963487e-04 0.9587773 I want to export my results and write > results <- decideTests(fit2) > write.fit(fit2, results, "limma_results.txt", adjust="BH") Now don't get the same columns as when using topTable which is quite confusing. Why don't I get the FC for the comparisons between the different groups as if I run topTable with the coef parameter ( "topTable(fit2, coef=1)" )? The columns I get are the following A Coef.GroupB - GroupA Coef.GroupC - GroupA Coef.GroupC - GroupB t.GroupB - GroupA t.GroupC - GroupA t.GroupC - GroupB p.value.GroupB - GroupA p.value.GroupC - GroupA p.value.GroupC - GroupB p.value.adj.GroupB - GroupA p.value.adj.GroupC - GroupA p.value.adj.GroupC - GroupB F F.p.value Res.GroupB - GroupA Res.GroupC - GroupA Res.GroupC - GroupB Could some body please try to explain what do the columns A, Coef, F, F.p.value and Res mean? #Session info > sessionInfo() R version 2.15.0 (2012-03-30) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=Swedish_Sweden.1252 LC_CTYPE=Swedish_Sweden.1252 LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C LC_TIME=Swedish_Sweden.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] limma_3.12.1 Biobase_2.16.0 BiocGenerics_0.2.0 loaded via a namespace (and not attached): [1] affylmGUI_1.30.0 IRanges_1.14.4 oneChannelGUI_1.22.10 stats4_2.15.0 tcltk_2.15.0 > Best regards JMA [[alternative HTML version deleted]]
• 5.7k views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 3 hours ago
United States
Hi Jorge, On 8/28/2012 11:20 AM, Jorge Mir? wrote: > Hi, > > I have run the commands below to get an analysis of differential > expressions in my Affymetrix arrays > > #Prepare the design and contrast matrices for my comparisons of the three > groups in a loop-manner. >> design<- model.matrix(~0+factor(c(1,1,1,2,2,2,3,3,3))) >> colnames(design)<- c('GroupA', 'GroupB', 'GroupC') >> contrast.matrix<- makeContrasts(GroupB-GroupA, GroupC-GroupA, > GroupC-GroupB, levels=design) > > #Check design and contrast matrices >> design > GroupA GroupB GroupC > 1 1 0 0 > 2 1 0 0 > 3 1 0 0 > 4 0 1 0 > 5 0 1 0 > 6 0 1 0 > 7 0 0 1 > 8 0 0 1 > 9 0 0 1 > attr(,"assign") > [1] 1 1 1 > attr(,"contrasts") > attr(,"contrasts")$`factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3))` > [1] "contr.treatment" > >> contrast.matrix > Contrasts > Levels GroupB - GroupA GroupC - GroupA GroupC - GroupB > GroupA -1 -1 0 > GroupB 1 0 -1 > GroupC 0 1 1 > > #Fitting the eset to to the design and contrast >> fit<- lmFit(exprs, design) >> fit2<- contrasts.fit(fit, contrast.matrix) > #Computing the statistics >> fit2<- eBayes(fit2) > > Then I check the results with topTable and get the following columns in the > output >> topTable(fit2) > GroupB...GroupA GroupC...GroupA GroupC...GroupB AveExpr F > P.Value adj.P.Val > 25031 2.3602203 2.4273830 0.06716267 5.021412 29.06509 > 7.844834e-05 0.9587773 > 12902 -0.4572897 -0.5680943 -0.11080467 7.516681 25.41608 > 1.365021e-04 0.9587773 > 7158 -0.4478660 -0.4296077 0.01825833 7.057833 23.48871 > 1.880100e-04 0.9587773 > 18358 -0.1002647 0.3304903 0.43075500 7.352807 22.78417 > 2.125096e-04 0.9587773 > 28768 -0.7695883 -1.3837750 -0.61418667 3.983044 22.47514 > 2.244612e-04 0.9587773 > 28820 -0.1708800 -0.9939680 -0.82308800 5.470826 18.25071 > 5.081473e-04 0.9587773 > 15238 -0.4850297 -0.4658157 0.01921400 7.071662 17.15191 > 6.440979e-04 0.9587773 > 24681 -0.3759717 -0.3486450 0.02732667 9.281578 16.47813 > 7.493077e-04 0.9587773 > 19246 -0.8675393 -0.5082140 0.35932533 8.123538 16.27776 > 7.845150e-04 0.9587773 > 28808 0.2601277 0.6909140 0.43078633 4.814602 16.21283 > 7.963487e-04 0.9587773 > > I want to export my results and write > >> results<- decideTests(fit2) >> write.fit(fit2, results, "limma_results.txt", adjust="BH") > Now don't get the same columns as when using topTable which is quite > confusing. Why don't I get the FC for the comparisons between the different > groups as if I run topTable with the coef parameter ( "topTable(fit2, > coef=1)" )? The columns I get are the following The simple answer is that they are two different functions with different goals. But note that you do get the same information. > > A > > Coef.GroupB - GroupA > Coef.GroupC - GroupA > Coef.GroupC - GroupB > > t.GroupB - GroupA > t.GroupC - GroupA > t.GroupC - GroupB > > p.value.GroupB - GroupA > p.value.GroupC - GroupA > p.value.GroupC - GroupB > > p.value.adj.GroupB - GroupA > p.value.adj.GroupC - GroupA > p.value.adj.GroupC - GroupB > > F > F.p.value > > Res.GroupB - GroupA > Res.GroupC - GroupA > Res.GroupC - GroupB > > > Could some body please try to explain what do the columns A, Coef, F, > F.p.value and Res mean? A - are your log fold change values Coef - are your coefficients (you set up a cell means model, so these are the sample means) F - is an F-statistic, which tests the null hypothesis that none of the sample means are different F.p.value - is the p-value for the F-statistic Res - is the results matrix you passed into write.fit(), showing which contrast(s) were significant Best, Jim > > > > #Session info >> sessionInfo() > R version 2.15.0 (2012-03-30) > Platform: i386-pc-mingw32/i386 (32-bit) > > locale: > [1] LC_COLLATE=Swedish_Sweden.1252 LC_CTYPE=Swedish_Sweden.1252 > LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C > LC_TIME=Swedish_Sweden.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] limma_3.12.1 Biobase_2.16.0 BiocGenerics_0.2.0 > > loaded via a namespace (and not attached): > [1] affylmGUI_1.30.0 IRanges_1.14.4 oneChannelGUI_1.22.10 > stats4_2.15.0 tcltk_2.15.0 > Best regards > JMA > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD COMMENT
0
Entering edit mode
Hi James, thanks for the explanation. I do not really understand the columns yet. Shouldn't the FC be printed for every comparison as is done for the Coef-columns? I just get one A-column. Is there any way of printing the results to a file with the same columns as in topTable()? What I'm really want to have in the output is a p-value column, an FC column and an adjusted p-value column. Best regards On Tue, Aug 28, 2012 at 5:41 PM, James W. MacDonald <jmacdon@uw.edu> wrote: > Hi Jorge, > > > On 8/28/2012 11:20 AM, Jorge Miró wrote: > >> Hi, >> >> I have run the commands below to get an analysis of differential >> expressions in my Affymetrix arrays >> >> #Prepare the design and contrast matrices for my comparisons of the three >> groups in a loop-manner. >> >>> design<- model.matrix(~0+factor(c(1,1,**1,2,2,2,3,3,3))) >>> colnames(design)<- c('GroupA', 'GroupB', 'GroupC') >>> contrast.matrix<- makeContrasts(GroupB-GroupA, GroupC-GroupA, >>> >> GroupC-GroupB, levels=design) >> >> #Check design and contrast matrices >> >>> design >>> >> GroupA GroupB GroupC >> 1 1 0 0 >> 2 1 0 0 >> 3 1 0 0 >> 4 0 1 0 >> 5 0 1 0 >> 6 0 1 0 >> 7 0 0 1 >> 8 0 0 1 >> 9 0 0 1 >> attr(,"assign") >> [1] 1 1 1 >> attr(,"contrasts") >> attr(,"contrasts")$`factor(c(**1, 1, 1, 2, 2, 2, 3, 3, 3))` >> [1] "contr.treatment" >> >> contrast.matrix >>> >> Contrasts >> Levels GroupB - GroupA GroupC - GroupA GroupC - GroupB >> GroupA -1 -1 0 >> GroupB 1 0 -1 >> GroupC 0 1 1 >> >> #Fitting the eset to to the design and contrast >> >>> fit<- lmFit(exprs, design) >>> fit2<- contrasts.fit(fit, contrast.matrix) >>> >> #Computing the statistics >> >>> fit2<- eBayes(fit2) >>> >> >> Then I check the results with topTable and get the following columns in >> the >> output >> >>> topTable(fit2) >>> >> GroupB...GroupA GroupC...GroupA GroupC...GroupB AveExpr F >> P.Value adj.P.Val >> 25031 2.3602203 2.4273830 0.06716267 5.021412 29.06509 >> 7.844834e-05 0.9587773 >> 12902 -0.4572897 -0.5680943 -0.11080467 7.516681 25.41608 >> 1.365021e-04 0.9587773 >> 7158 -0.4478660 -0.4296077 0.01825833 7.057833 23.48871 >> 1.880100e-04 0.9587773 >> 18358 -0.1002647 0.3304903 0.43075500 7.352807 22.78417 >> 2.125096e-04 0.9587773 >> 28768 -0.7695883 -1.3837750 -0.61418667 3.983044 22.47514 >> 2.244612e-04 0.9587773 >> 28820 -0.1708800 -0.9939680 -0.82308800 5.470826 18.25071 >> 5.081473e-04 0.9587773 >> 15238 -0.4850297 -0.4658157 0.01921400 7.071662 17.15191 >> 6.440979e-04 0.9587773 >> 24681 -0.3759717 -0.3486450 0.02732667 9.281578 16.47813 >> 7.493077e-04 0.9587773 >> 19246 -0.8675393 -0.5082140 0.35932533 8.123538 16.27776 >> 7.845150e-04 0.9587773 >> 28808 0.2601277 0.6909140 0.43078633 4.814602 16.21283 >> 7.963487e-04 0.9587773 >> >> I want to export my results and write >> >> results<- decideTests(fit2) >>> write.fit(fit2, results, "limma_results.txt", adjust="BH") >>> >> Now don't get the same columns as when using topTable which is quite >> confusing. Why don't I get the FC for the comparisons between the >> different >> groups as if I run topTable with the coef parameter ( "topTable(fit2, >> coef=1)" )? The columns I get are the following >> > > The simple answer is that they are two different functions with different > goals. But note that you do get the same information. > > > >> A >> >> Coef.GroupB - GroupA >> Coef.GroupC - GroupA >> Coef.GroupC - GroupB >> >> t.GroupB - GroupA >> t.GroupC - GroupA >> t.GroupC - GroupB >> >> p.value.GroupB - GroupA >> p.value.GroupC - GroupA >> p.value.GroupC - GroupB >> >> p.value.adj.GroupB - GroupA >> p.value.adj.GroupC - GroupA >> p.value.adj.GroupC - GroupB >> >> F >> F.p.value >> >> Res.GroupB - GroupA >> Res.GroupC - GroupA >> Res.GroupC - GroupB >> >> >> Could some body please try to explain what do the columns A, Coef, F, >> F.p.value and Res mean? >> > > A - are your log fold change values > Coef - are your coefficients (you set up a cell means model, so these are > the sample means) > F - is an F-statistic, which tests the null hypothesis that none of the > sample means are different > F.p.value - is the p-value for the F-statistic > Res - is the results matrix you passed into write.fit(), showing which > contrast(s) were significant > > Best, > > Jim > > > >> >> >> #Session info >> >>> sessionInfo() >>> >> R version 2.15.0 (2012-03-30) >> Platform: i386-pc-mingw32/i386 (32-bit) >> >> locale: >> [1] LC_COLLATE=Swedish_Sweden.1252 LC_CTYPE=Swedish_Sweden.1252 >> LC_MONETARY=Swedish_Sweden.**1252 LC_NUMERIC=C >> LC_TIME=Swedish_Sweden.1252 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] limma_3.12.1 Biobase_2.16.0 BiocGenerics_0.2.0 >> >> loaded via a namespace (and not attached): >> [1] affylmGUI_1.30.0 IRanges_1.14.4 oneChannelGUI_1.22.10 >> stats4_2.15.0 tcltk_2.15.0 >> Best regards >> JMA >> >> [[alternative HTML version deleted]] >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.e="" thz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: http://news.gmane.org/gmane.** >> science.biology.informatics.**conductor<http: news.gmane.org="" gmane="" .science.biology.informatics.conductor=""> >> > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
Hi Jorge, My bad. The coefficients for write.fit() are the fold changes, and you do get a column for each. As an example, with fake data. > library(limma) > mat <- matrix(rnorm(12e5), ncol = 12) > design <- model.matrix(~0+factor(rep(1:4, each = 3))) > contrast <- matrix(c(1,-1,0,0,1,0,-1,0,1,0,0,-1), ncol = 3) > fit2 <- contrasts.fit(fit, contrast) > fit2 <- eBayes(fit2) > rslt <- decideTests(fit2) > write.fit(fit2, rslt, "tmp.txt") > dat <- read.table("tmp.txt", header=T) > head(dat) Coef.1 Coef.2 Coef.3 t.1 t.2 t.3 p.value.1 p.value.2 p.value.3 F 1 -0.837 -0.352 -0.430 -1.02 -0.43 -0.53 0.30621 0.66699 0.59932 0.35 2 -1.200 -0.599 0.524 -1.47 -0.73 0.64 0.14148 0.46259 0.52034 1.67 3 -1.014 -1.533 -1.515 -1.24 -1.87 -1.85 0.21627 0.06176 0.06492 1.54 4 0.520 0.308 1.059 0.64 0.38 1.30 0.52390 0.70573 0.19449 0.60 5 -0.848 0.269 -0.611 -1.04 0.33 -0.75 0.29880 0.74132 0.45410 0.81 6 -0.094 -0.245 -0.474 -0.11 -0.30 -0.58 0.90847 0.76396 0.56088 0.13 F.p.value Res.1 Res.2 Res.3 1 0.78697 0 0 0 2 0.17130 0 0 0 3 0.20359 0 0 0 4 0.61671 0 0 0 5 0.48698 0 0 0 6 0.94300 0 0 0 > nrow(mat) [1] 100000 > nrow(dat) [1] 100000 An alternative you might consider is the writeFit() function from my affycoretools package, which outputs the data in a slightly different format. Best, Jim On 8/29/2012 2:55 AM, Jorge Mir? wrote: > Hi James, > > thanks for the explanation. I do not really understand the columns > yet. Shouldn't the FC be printed for every comparison as is done for > the Coef-columns? I just get one A-column. > Is there any way of printing the results to a file with the same > columns as in topTable()? > What I'm really want to have in the output is a p-value column, an FC > column and an adjusted p-value column. > > Best regards > > > On Tue, Aug 28, 2012 at 5:41 PM, James W. MacDonald <jmacdon at="" uw.edu=""> <mailto:jmacdon at="" uw.edu="">> wrote: > > Hi Jorge, > > > On 8/28/2012 11:20 AM, Jorge Mir? wrote: > > Hi, > > I have run the commands below to get an analysis of differential > expressions in my Affymetrix arrays > > #Prepare the design and contrast matrices for my comparisons > of the three > groups in a loop-manner. > > design<- model.matrix(~0+factor(c(1,1,1,2,2,2,3,3,3))) > colnames(design)<- c('GroupA', 'GroupB', 'GroupC') > contrast.matrix<- makeContrasts(GroupB-GroupA, GroupC- GroupA, > > GroupC-GroupB, levels=design) > > #Check design and contrast matrices > > design > > GroupA GroupB GroupC > 1 1 0 0 > 2 1 0 0 > 3 1 0 0 > 4 0 1 0 > 5 0 1 0 > 6 0 1 0 > 7 0 0 1 > 8 0 0 1 > 9 0 0 1 > attr(,"assign") > [1] 1 1 1 > attr(,"contrasts") > attr(,"contrasts")$`factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3))` > [1] "contr.treatment" > > contrast.matrix > > Contrasts > Levels GroupB - GroupA GroupC - GroupA GroupC - GroupB > GroupA -1 -1 0 > GroupB 1 0 -1 > GroupC 0 1 1 > > #Fitting the eset to to the design and contrast > > fit<- lmFit(exprs, design) > fit2<- contrasts.fit(fit, contrast.matrix) > > #Computing the statistics > > fit2<- eBayes(fit2) > > > Then I check the results with topTable and get the following > columns in the > output > > topTable(fit2) > > GroupB...GroupA GroupC...GroupA GroupC...GroupB > AveExpr F > P.Value adj.P.Val > 25031 2.3602203 2.4273830 0.06716267 5.021412 > 29.06509 > 7.844834e-05 0.9587773 > 12902 -0.4572897 -0.5680943 -0.11080467 7.516681 > 25.41608 > 1.365021e-04 0.9587773 > 7158 -0.4478660 -0.4296077 0.01825833 7.057833 > 23.48871 > 1.880100e-04 0.9587773 > 18358 -0.1002647 0.3304903 0.43075500 7.352807 > 22.78417 > 2.125096e-04 0.9587773 > 28768 -0.7695883 -1.3837750 -0.61418667 3.983044 > 22.47514 > 2.244612e-04 0.9587773 > 28820 -0.1708800 -0.9939680 -0.82308800 5.470826 > 18.25071 > 5.081473e-04 0.9587773 > 15238 -0.4850297 -0.4658157 0.01921400 7.071662 > 17.15191 > 6.440979e-04 0.9587773 > 24681 -0.3759717 -0.3486450 0.02732667 9.281578 > 16.47813 > 7.493077e-04 0.9587773 > 19246 -0.8675393 -0.5082140 0.35932533 8.123538 > 16.27776 > 7.845150e-04 0.9587773 > 28808 0.2601277 0.6909140 0.43078633 4.814602 > 16.21283 > 7.963487e-04 0.9587773 > > I want to export my results and write > > results<- decideTests(fit2) > write.fit(fit2, results, "limma_results.txt", adjust="BH") > > Now don't get the same columns as when using topTable which is > quite > confusing. Why don't I get the FC for the comparisons between > the different > groups as if I run topTable with the coef parameter ( > "topTable(fit2, > coef=1)" )? The columns I get are the following > > > The simple answer is that they are two different functions with > different goals. But note that you do get the same information. > > > > A > > Coef.GroupB - GroupA > Coef.GroupC - GroupA > Coef.GroupC - GroupB > > t.GroupB - GroupA > t.GroupC - GroupA > t.GroupC - GroupB > > p.value.GroupB - GroupA > p.value.GroupC - GroupA > p.value.GroupC - GroupB > > p.value.adj.GroupB - GroupA > p.value.adj.GroupC - GroupA > p.value.adj.GroupC - GroupB > > F > F.p.value > > Res.GroupB - GroupA > Res.GroupC - GroupA > Res.GroupC - GroupB > > > Could some body please try to explain what do the columns A, > Coef, F, > F.p.value and Res mean? > > > A - are your log fold change values > Coef - are your coefficients (you set up a cell means model, so > these are the sample means) > F - is an F-statistic, which tests the null hypothesis that none > of the sample means are different > F.p.value - is the p-value for the F-statistic > Res - is the results matrix you passed into write.fit(), showing > which contrast(s) were significant > > Best, > > Jim > > > > > > #Session info > > sessionInfo() > > R version 2.15.0 (2012-03-30) > Platform: i386-pc-mingw32/i386 (32-bit) > > locale: > [1] LC_COLLATE=Swedish_Sweden.1252 LC_CTYPE=Swedish_Sweden.1252 > LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C > LC_TIME=Swedish_Sweden.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods > base > > other attached packages: > [1] limma_3.12.1 Biobase_2.16.0 BiocGenerics_0.2.0 > > loaded via a namespace (and not attached): > [1] affylmGUI_1.30.0 IRanges_1.14.4 > oneChannelGUI_1.22.10 > stats4_2.15.0 tcltk_2.15.0 > Best regards > JMA > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD REPLY
0
Entering edit mode
Hi, My bad too :) What I really meant with fold change was the fold ratio change between the groups. I though that was what was in topTable but now I think I get it. The fold change that is printed in limma is the difference between the means in log2 and not the ratio of the mean of the gene expressions in their linear form. The data I'm analysing is from RMA and from PLIER so I have linear values for the gene expressions summarized with PLIER. Should I log2-transform them before passing them to the functions in limma? Also, if I now wanted to calculated the ratios between the RMA gene expressions, do you recommend that I transform the expressions to linear values before calculating the ratio or is it OK to take the ratio in log2? Best regards Jorge On Wed, Aug 29, 2012 at 4:56 PM, James W. MacDonald <jmacdon@uw.edu> wrote: > Hi Jorge, > > My bad. The coefficients for write.fit() are the fold changes, and you do > get a column for each. As an example, with fake data. > > > library(limma) > > mat <- matrix(rnorm(12e5), ncol = 12) > > design <- model.matrix(~0+factor(rep(1:**4, each = 3))) > > contrast <- matrix(c(1,-1,0,0,1,0,-1,0,1,**0,0,-1), ncol = 3) > > fit2 <- contrasts.fit(fit, contrast) > > fit2 <- eBayes(fit2) > > rslt <- decideTests(fit2) > > write.fit(fit2, rslt, "tmp.txt") > > dat <- read.table("tmp.txt", header=T) > > head(dat) > Coef.1 Coef.2 Coef.3 t.1 t.2 t.3 p.value.1 p.value.2 p.value.3 F > 1 -0.837 -0.352 -0.430 -1.02 -0.43 -0.53 0.30621 0.66699 0.59932 0.35 > 2 -1.200 -0.599 0.524 -1.47 -0.73 0.64 0.14148 0.46259 0.52034 1.67 > 3 -1.014 -1.533 -1.515 -1.24 -1.87 -1.85 0.21627 0.06176 0.06492 1.54 > 4 0.520 0.308 1.059 0.64 0.38 1.30 0.52390 0.70573 0.19449 0.60 > 5 -0.848 0.269 -0.611 -1.04 0.33 -0.75 0.29880 0.74132 0.45410 0.81 > 6 -0.094 -0.245 -0.474 -0.11 -0.30 -0.58 0.90847 0.76396 0.56088 0.13 > F.p.value Res.1 Res.2 Res.3 > 1 0.78697 0 0 0 > 2 0.17130 0 0 0 > 3 0.20359 0 0 0 > 4 0.61671 0 0 0 > 5 0.48698 0 0 0 > 6 0.94300 0 0 0 > > > nrow(mat) > [1] 100000 > > nrow(dat) > [1] 100000 > > An alternative you might consider is the writeFit() function from my > affycoretools package, which outputs the data in a slightly different > format. > > Best, > > Jim > > > > > On 8/29/2012 2:55 AM, Jorge Miró wrote: > >> Hi James, >> >> thanks for the explanation. I do not really understand the columns yet. >> Shouldn't the FC be printed for every comparison as is done for the >> Coef-columns? I just get one A-column. >> Is there any way of printing the results to a file with the same columns >> as in topTable()? >> What I'm really want to have in the output is a p-value column, an FC >> column and an adjusted p-value column. >> >> Best regards >> >> >> On Tue, Aug 28, 2012 at 5:41 PM, James W. MacDonald <jmacdon@uw.edu<mailto:>> jmacdon@uw.edu>> wrote: >> >> Hi Jorge, >> >> >> On 8/28/2012 11:20 AM, Jorge Miró wrote: >> >> Hi, >> >> I have run the commands below to get an analysis of differential >> expressions in my Affymetrix arrays >> >> #Prepare the design and contrast matrices for my comparisons >> of the three >> groups in a loop-manner. >> >> design<- model.matrix(~0+factor(c(1,1,**1,2,2,2,3,3,3))) >> colnames(design)<- c('GroupA', 'GroupB', 'GroupC') >> contrast.matrix<- makeContrasts(GroupB-GroupA, GroupC- GroupA, >> >> GroupC-GroupB, levels=design) >> >> #Check design and contrast matrices >> >> design >> >> GroupA GroupB GroupC >> 1 1 0 0 >> 2 1 0 0 >> 3 1 0 0 >> 4 0 1 0 >> 5 0 1 0 >> 6 0 1 0 >> 7 0 0 1 >> 8 0 0 1 >> 9 0 0 1 >> attr(,"assign") >> [1] 1 1 1 >> attr(,"contrasts") >> attr(,"contrasts")$`factor(c(**1, 1, 1, 2, 2, 2, 3, 3, 3))` >> [1] "contr.treatment" >> >> contrast.matrix >> >> Contrasts >> Levels GroupB - GroupA GroupC - GroupA GroupC - GroupB >> GroupA -1 -1 0 >> GroupB 1 0 -1 >> GroupC 0 1 1 >> >> #Fitting the eset to to the design and contrast >> >> fit<- lmFit(exprs, design) >> fit2<- contrasts.fit(fit, contrast.matrix) >> >> #Computing the statistics >> >> fit2<- eBayes(fit2) >> >> >> Then I check the results with topTable and get the following >> columns in the >> output >> >> topTable(fit2) >> >> GroupB...GroupA GroupC...GroupA GroupC...GroupB >> AveExpr F >> P.Value adj.P.Val >> 25031 2.3602203 2.4273830 0.06716267 5.021412 >> 29.06509 >> 7.844834e-05 0.9587773 >> 12902 -0.4572897 -0.5680943 -0.11080467 7.516681 >> 25.41608 >> 1.365021e-04 0.9587773 >> 7158 -0.4478660 -0.4296077 0.01825833 7.057833 >> 23.48871 >> 1.880100e-04 0.9587773 >> 18358 -0.1002647 0.3304903 0.43075500 7.352807 >> 22.78417 >> 2.125096e-04 0.9587773 >> 28768 -0.7695883 -1.3837750 -0.61418667 3.983044 >> 22.47514 >> 2.244612e-04 0.9587773 >> 28820 -0.1708800 -0.9939680 -0.82308800 5.470826 >> 18.25071 >> 5.081473e-04 0.9587773 >> 15238 -0.4850297 -0.4658157 0.01921400 7.071662 >> 17.15191 >> 6.440979e-04 0.9587773 >> 24681 -0.3759717 -0.3486450 0.02732667 9.281578 >> 16.47813 >> 7.493077e-04 0.9587773 >> 19246 -0.8675393 -0.5082140 0.35932533 8.123538 >> 16.27776 >> 7.845150e-04 0.9587773 >> 28808 0.2601277 0.6909140 0.43078633 4.814602 >> 16.21283 >> 7.963487e-04 0.9587773 >> >> I want to export my results and write >> >> results<- decideTests(fit2) >> write.fit(fit2, results, "limma_results.txt", adjust="BH") >> >> Now don't get the same columns as when using topTable which is >> quite >> confusing. Why don't I get the FC for the comparisons between >> the different >> groups as if I run topTable with the coef parameter ( >> "topTable(fit2, >> coef=1)" )? The columns I get are the following >> >> >> The simple answer is that they are two different functions with >> different goals. But note that you do get the same information. >> >> >> >> A >> >> Coef.GroupB - GroupA >> Coef.GroupC - GroupA >> Coef.GroupC - GroupB >> >> t.GroupB - GroupA >> t.GroupC - GroupA >> t.GroupC - GroupB >> >> p.value.GroupB - GroupA >> p.value.GroupC - GroupA >> p.value.GroupC - GroupB >> >> p.value.adj.GroupB - GroupA >> p.value.adj.GroupC - GroupA >> p.value.adj.GroupC - GroupB >> >> F >> F.p.value >> >> Res.GroupB - GroupA >> Res.GroupC - GroupA >> Res.GroupC - GroupB >> >> >> Could some body please try to explain what do the columns A, >> Coef, F, >> F.p.value and Res mean? >> >> >> A - are your log fold change values >> Coef - are your coefficients (you set up a cell means model, so >> these are the sample means) >> F - is an F-statistic, which tests the null hypothesis that none >> of the sample means are different >> F.p.value - is the p-value for the F-statistic >> Res - is the results matrix you passed into write.fit(), showing >> which contrast(s) were significant >> >> Best, >> >> Jim >> >> >> >> >> >> #Session info >> >> sessionInfo() >> >> R version 2.15.0 (2012-03-30) >> Platform: i386-pc-mingw32/i386 (32-bit) >> >> locale: >> [1] LC_COLLATE=Swedish_Sweden.1252 LC_CTYPE=Swedish_Sweden.1252 >> LC_MONETARY=Swedish_Sweden.**1252 LC_NUMERIC=C >> LC_TIME=Swedish_Sweden.1252 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods >> base >> >> other attached packages: >> [1] limma_3.12.1 Biobase_2.16.0 BiocGenerics_0.2.0 >> >> loaded via a namespace (and not attached): >> [1] affylmGUI_1.30.0 IRanges_1.14.4 >> oneChannelGUI_1.22.10 >> stats4_2.15.0 tcltk_2.15.0 >> Best regards >> JMA >> >> [[alternative HTML version deleted]] >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org <mailto:bioconductor@r-**project.org<bioconductor@r-project.org> >> > >> >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: >> http://news.gmane.org/gmane.**science.biology.informatics.** >> conductor<http: news.gmane.org="" gmane.science.biology.informatics.c="" onductor=""> >> >> >> -- James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> >> > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
Hi Jorge, On 8/29/2012 4:15 PM, Jorge Mir? wrote: > Hi, > > My bad too :) What I really meant with fold change was the fold ratio > change between the groups. I though that was what was in topTable but > now I think I get it. The fold change that is printed in limma is the > difference between the means in log2 and not the ratio of the mean of > the gene expressions in their linear form. > > The data I'm analysing is from RMA and from PLIER so I have linear > values for the gene expressions summarized with PLIER. Should I > log2-transform them before passing them to the functions in limma? Yes. > > Also, if I now wanted to calculated the ratios between the RMA gene > expressions, do you recommend that I transform the expressions to > linear values before calculating the ratio or is it OK to take the > ratio in log2? You don't take the ratio, you take the difference. Remember that log(x/y) = log(x) - log(y). Best, Jim > > Best regards > Jorge > > On Wed, Aug 29, 2012 at 4:56 PM, James W. MacDonald <jmacdon at="" uw.edu=""> <mailto:jmacdon at="" uw.edu="">> wrote: > > Hi Jorge, > > My bad. The coefficients for write.fit() are the fold changes, and > you do get a column for each. As an example, with fake data. > > > library(limma) > > mat <- matrix(rnorm(12e5), ncol = 12) > > design <- model.matrix(~0+factor(rep(1:4, each = 3))) > > contrast <- matrix(c(1,-1,0,0,1,0,-1,0,1,0,0,-1), ncol = 3) > > fit2 <- contrasts.fit(fit, contrast) > > fit2 <- eBayes(fit2) > > rslt <- decideTests(fit2) > > write.fit(fit2, rslt, "tmp.txt") > > dat <- read.table("tmp.txt", header=T) > > head(dat) > Coef.1 Coef.2 Coef.3 t.1 t.2 t.3 p.value.1 p.value.2 > p.value.3 F > 1 -0.837 -0.352 -0.430 -1.02 -0.43 -0.53 0.30621 0.66699 > 0.59932 0.35 > 2 -1.200 -0.599 0.524 -1.47 -0.73 0.64 0.14148 0.46259 > 0.52034 1.67 > 3 -1.014 -1.533 -1.515 -1.24 -1.87 -1.85 0.21627 0.06176 > 0.06492 1.54 > 4 0.520 0.308 1.059 0.64 0.38 1.30 0.52390 0.70573 > 0.19449 0.60 > 5 -0.848 0.269 -0.611 -1.04 0.33 -0.75 0.29880 0.74132 > 0.45410 0.81 > 6 -0.094 -0.245 -0.474 -0.11 -0.30 -0.58 0.90847 0.76396 > 0.56088 0.13 > F.p.value Res.1 Res.2 Res.3 > 1 0.78697 0 0 0 > 2 0.17130 0 0 0 > 3 0.20359 0 0 0 > 4 0.61671 0 0 0 > 5 0.48698 0 0 0 > 6 0.94300 0 0 0 > > > nrow(mat) > [1] 100000 > > nrow(dat) > [1] 100000 > > An alternative you might consider is the writeFit() function from > my affycoretools package, which outputs the data in a slightly > different format. > > Best, > > Jim > > > > > On 8/29/2012 2:55 AM, Jorge Mir? wrote: > > Hi James, > > thanks for the explanation. I do not really understand the > columns yet. Shouldn't the FC be printed for every comparison > as is done for the Coef-columns? I just get one A-column. > Is there any way of printing the results to a file with the > same columns as in topTable()? > What I'm really want to have in the output is a p-value > column, an FC column and an adjusted p-value column. > > Best regards > > > On Tue, Aug 28, 2012 at 5:41 PM, James W. MacDonald > <jmacdon at="" uw.edu="" <mailto:jmacdon="" at="" uw.edu=""> <mailto:jmacdon at="" uw.edu=""> <mailto:jmacdon at="" uw.edu="">>> wrote: > > Hi Jorge, > > > On 8/28/2012 11:20 AM, Jorge Mir? wrote: > > Hi, > > I have run the commands below to get an analysis of > differential > expressions in my Affymetrix arrays > > #Prepare the design and contrast matrices for my > comparisons > of the three > groups in a loop-manner. > > design<- model.matrix(~0+factor(c(1,1,1,2,2,2,3,3,3))) > colnames(design)<- c('GroupA', 'GroupB', 'GroupC') > contrast.matrix<- makeContrasts(GroupB-GroupA, > GroupC-GroupA, > > GroupC-GroupB, levels=design) > > #Check design and contrast matrices > > design > > GroupA GroupB GroupC > 1 1 0 0 > 2 1 0 0 > 3 1 0 0 > 4 0 1 0 > 5 0 1 0 > 6 0 1 0 > 7 0 0 1 > 8 0 0 1 > 9 0 0 1 > attr(,"assign") > [1] 1 1 1 > attr(,"contrasts") > attr(,"contrasts")$`factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3))` > [1] "contr.treatment" > > contrast.matrix > > Contrasts > Levels GroupB - GroupA GroupC - GroupA GroupC - GroupB > GroupA -1 -1 0 > GroupB 1 0 -1 > GroupC 0 1 1 > > #Fitting the eset to to the design and contrast > > fit<- lmFit(exprs, design) > fit2<- contrasts.fit(fit, contrast.matrix) > > #Computing the statistics > > fit2<- eBayes(fit2) > > > Then I check the results with topTable and get the > following > columns in the > output > > topTable(fit2) > > GroupB...GroupA GroupC...GroupA GroupC...GroupB > AveExpr F > P.Value adj.P.Val > 25031 2.3602203 2.4273830 0.06716267 > 5.021412 > 29.06509 > 7.844834e-05 0.9587773 > 12902 -0.4572897 -0.5680943 -0.11080467 > 7.516681 > 25.41608 > 1.365021e-04 0.9587773 > 7158 -0.4478660 -0.4296077 0.01825833 > 7.057833 > 23.48871 > 1.880100e-04 0.9587773 > 18358 -0.1002647 0.3304903 0.43075500 > 7.352807 > 22.78417 > 2.125096e-04 0.9587773 > 28768 -0.7695883 -1.3837750 -0.61418667 > 3.983044 > 22.47514 > 2.244612e-04 0.9587773 > 28820 -0.1708800 -0.9939680 -0.82308800 > 5.470826 > 18.25071 > 5.081473e-04 0.9587773 > 15238 -0.4850297 -0.4658157 0.01921400 > 7.071662 > 17.15191 > 6.440979e-04 0.9587773 > 24681 -0.3759717 -0.3486450 0.02732667 > 9.281578 > 16.47813 > 7.493077e-04 0.9587773 > 19246 -0.8675393 -0.5082140 0.35932533 > 8.123538 > 16.27776 > 7.845150e-04 0.9587773 > 28808 0.2601277 0.6909140 0.43078633 > 4.814602 > 16.21283 > 7.963487e-04 0.9587773 > > I want to export my results and write > > results<- decideTests(fit2) > write.fit(fit2, results, "limma_results.txt", > adjust="BH") > > Now don't get the same columns as when using topTable > which is > quite > confusing. Why don't I get the FC for the comparisons > between > the different > groups as if I run topTable with the coef parameter ( > "topTable(fit2, > coef=1)" )? The columns I get are the following > > > The simple answer is that they are two different functions > with > different goals. But note that you do get the same > information. > > > > A > > Coef.GroupB - GroupA > Coef.GroupC - GroupA > Coef.GroupC - GroupB > > t.GroupB - GroupA > t.GroupC - GroupA > t.GroupC - GroupB > > p.value.GroupB - GroupA > p.value.GroupC - GroupA > p.value.GroupC - GroupB > > p.value.adj.GroupB - GroupA > p.value.adj.GroupC - GroupA > p.value.adj.GroupC - GroupB > > F > F.p.value > > Res.GroupB - GroupA > Res.GroupC - GroupA > Res.GroupC - GroupB > > > Could some body please try to explain what do the > columns A, > Coef, F, > F.p.value and Res mean? > > > A - are your log fold change values > Coef - are your coefficients (you set up a cell means > model, so > these are the sample means) > F - is an F-statistic, which tests the null hypothesis > that none > of the sample means are different > F.p.value - is the p-value for the F-statistic > Res - is the results matrix you passed into write.fit(), > showing > which contrast(s) were significant > > Best, > > Jim > > > > > > #Session info > > sessionInfo() > > R version 2.15.0 (2012-03-30) > Platform: i386-pc-mingw32/i386 (32-bit) > > locale: > [1] LC_COLLATE=Swedish_Sweden.1252 > LC_CTYPE=Swedish_Sweden.1252 > LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C > LC_TIME=Swedish_Sweden.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets > methods > base > > other attached packages: > [1] limma_3.12.1 Biobase_2.16.0 > BiocGenerics_0.2.0 > > loaded via a namespace (and not attached): > [1] affylmGUI_1.30.0 IRanges_1.14.4 > oneChannelGUI_1.22.10 > stats4_2.15.0 tcltk_2.15.0 > Best regards > JMA > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> > <mailto:bioconductor at="" r-project.org=""> <mailto:bioconductor at="" r-project.org="">> > > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD REPLY
0
Entering edit mode
Hi, Thanks :) Then I will transform the PLIER data. I have seen people that takes the ratio instead of the FC ( http://wiki.c2b2.columbia.edu/workbench/index.php/Fold_Change) and that's why I wondered if I really should tranform my RMA data to their linear form. I think I should but I am a little unsure. Best regards Jorge On Wed, Aug 29, 2012 at 10:18 PM, James W. MacDonald <jmacdon at="" uw.edu=""> wrote: > Hi Jorge, > > > On 8/29/2012 4:15 PM, Jorge Mir? wrote: >> >> Hi, >> >> My bad too :) What I really meant with fold change was the fold ratio >> change between the groups. I though that was what was in topTable but now I >> think I get it. The fold change that is printed in limma is the difference >> between the means in log2 and not the ratio of the mean of the gene >> expressions in their linear form. >> >> The data I'm analysing is from RMA and from PLIER so I have linear values >> for the gene expressions summarized with PLIER. Should I log2-transform them >> before passing them to the functions in limma? > > > Yes. > > >> >> Also, if I now wanted to calculated the ratios between the RMA gene >> expressions, do you recommend that I transform the expressions to linear >> values before calculating the ratio or is it OK to take the ratio in log2? > > > You don't take the ratio, you take the difference. Remember that log(x/y) = > log(x) - log(y). > > Best, > > Jim > > >> >> Best regards >> Jorge >> >> >> On Wed, Aug 29, 2012 at 4:56 PM, James W. MacDonald <jmacdon at="" uw.edu="">> <mailto:jmacdon at="" uw.edu="">> wrote: >> >> Hi Jorge, >> >> My bad. The coefficients for write.fit() are the fold changes, and >> you do get a column for each. As an example, with fake data. >> >> > library(limma) >> > mat <- matrix(rnorm(12e5), ncol = 12) >> > design <- model.matrix(~0+factor(rep(1:4, each = 3))) >> > contrast <- matrix(c(1,-1,0,0,1,0,-1,0,1,0,0,-1), ncol = 3) >> > fit2 <- contrasts.fit(fit, contrast) >> > fit2 <- eBayes(fit2) >> > rslt <- decideTests(fit2) >> > write.fit(fit2, rslt, "tmp.txt") >> > dat <- read.table("tmp.txt", header=T) >> > head(dat) >> Coef.1 Coef.2 Coef.3 t.1 t.2 t.3 p.value.1 p.value.2 >> p.value.3 F >> 1 -0.837 -0.352 -0.430 -1.02 -0.43 -0.53 0.30621 0.66699 >> 0.59932 0.35 >> 2 -1.200 -0.599 0.524 -1.47 -0.73 0.64 0.14148 0.46259 >> 0.52034 1.67 >> 3 -1.014 -1.533 -1.515 -1.24 -1.87 -1.85 0.21627 0.06176 >> 0.06492 1.54 >> 4 0.520 0.308 1.059 0.64 0.38 1.30 0.52390 0.70573 >> 0.19449 0.60 >> 5 -0.848 0.269 -0.611 -1.04 0.33 -0.75 0.29880 0.74132 >> 0.45410 0.81 >> 6 -0.094 -0.245 -0.474 -0.11 -0.30 -0.58 0.90847 0.76396 >> 0.56088 0.13 >> F.p.value Res.1 Res.2 Res.3 >> 1 0.78697 0 0 0 >> 2 0.17130 0 0 0 >> 3 0.20359 0 0 0 >> 4 0.61671 0 0 0 >> 5 0.48698 0 0 0 >> 6 0.94300 0 0 0 >> >> > nrow(mat) >> [1] 100000 >> > nrow(dat) >> [1] 100000 >> >> An alternative you might consider is the writeFit() function from >> my affycoretools package, which outputs the data in a slightly >> different format. >> >> Best, >> >> Jim >> >> >> >> >> On 8/29/2012 2:55 AM, Jorge Mir? wrote: >> >> Hi James, >> >> thanks for the explanation. I do not really understand the >> columns yet. Shouldn't the FC be printed for every comparison >> as is done for the Coef-columns? I just get one A-column. >> Is there any way of printing the results to a file with the >> same columns as in topTable()? >> What I'm really want to have in the output is a p-value >> column, an FC column and an adjusted p-value column. >> >> Best regards >> >> >> On Tue, Aug 28, 2012 at 5:41 PM, James W. MacDonald >> <jmacdon at="" uw.edu="" <mailto:jmacdon="" at="" uw.edu=""> <mailto:jmacdon at="" uw.edu="">> >> <mailto:jmacdon at="" uw.edu="">>> wrote: >> >> Hi Jorge, >> >> >> On 8/28/2012 11:20 AM, Jorge Mir? wrote: >> >> Hi, >> >> I have run the commands below to get an analysis of >> differential >> expressions in my Affymetrix arrays >> >> #Prepare the design and contrast matrices for my >> comparisons >> of the three >> groups in a loop-manner. >> >> design<- model.matrix(~0+factor(c(1,1,1,2,2,2,3,3,3))) >> colnames(design)<- c('GroupA', 'GroupB', 'GroupC') >> contrast.matrix<- makeContrasts(GroupB-GroupA, >> GroupC-GroupA, >> >> GroupC-GroupB, levels=design) >> >> #Check design and contrast matrices >> >> design >> >> GroupA GroupB GroupC >> 1 1 0 0 >> 2 1 0 0 >> 3 1 0 0 >> 4 0 1 0 >> 5 0 1 0 >> 6 0 1 0 >> 7 0 0 1 >> 8 0 0 1 >> 9 0 0 1 >> attr(,"assign") >> [1] 1 1 1 >> attr(,"contrasts") >> attr(,"contrasts")$`factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3))` >> [1] "contr.treatment" >> >> contrast.matrix >> >> Contrasts >> Levels GroupB - GroupA GroupC - GroupA GroupC - GroupB >> GroupA -1 -1 0 >> GroupB 1 0 -1 >> GroupC 0 1 1 >> >> #Fitting the eset to to the design and contrast >> >> fit<- lmFit(exprs, design) >> fit2<- contrasts.fit(fit, contrast.matrix) >> >> #Computing the statistics >> >> fit2<- eBayes(fit2) >> >> >> Then I check the results with topTable and get the >> following >> columns in the >> output >> >> topTable(fit2) >> >> GroupB...GroupA GroupC...GroupA GroupC...GroupB >> AveExpr F >> P.Value adj.P.Val >> 25031 2.3602203 2.4273830 0.06716267 >> 5.021412 >> 29.06509 >> 7.844834e-05 0.9587773 >> 12902 -0.4572897 -0.5680943 -0.11080467 >> 7.516681 >> 25.41608 >> 1.365021e-04 0.9587773 >> 7158 -0.4478660 -0.4296077 0.01825833 >> 7.057833 >> 23.48871 >> 1.880100e-04 0.9587773 >> 18358 -0.1002647 0.3304903 0.43075500 >> 7.352807 >> 22.78417 >> 2.125096e-04 0.9587773 >> 28768 -0.7695883 -1.3837750 -0.61418667 >> 3.983044 >> 22.47514 >> 2.244612e-04 0.9587773 >> 28820 -0.1708800 -0.9939680 -0.82308800 >> 5.470826 >> 18.25071 >> 5.081473e-04 0.9587773 >> 15238 -0.4850297 -0.4658157 0.01921400 >> 7.071662 >> 17.15191 >> 6.440979e-04 0.9587773 >> 24681 -0.3759717 -0.3486450 0.02732667 >> 9.281578 >> 16.47813 >> 7.493077e-04 0.9587773 >> 19246 -0.8675393 -0.5082140 0.35932533 >> 8.123538 >> 16.27776 >> 7.845150e-04 0.9587773 >> 28808 0.2601277 0.6909140 0.43078633 >> 4.814602 >> 16.21283 >> 7.963487e-04 0.9587773 >> >> I want to export my results and write >> >> results<- decideTests(fit2) >> write.fit(fit2, results, "limma_results.txt", >> adjust="BH") >> >> Now don't get the same columns as when using topTable >> which is >> quite >> confusing. Why don't I get the FC for the comparisons >> between >> the different >> groups as if I run topTable with the coef parameter ( >> "topTable(fit2, >> coef=1)" )? The columns I get are the following >> >> >> The simple answer is that they are two different functions >> with >> different goals. But note that you do get the same >> information. >> >> >> >> A >> >> Coef.GroupB - GroupA >> Coef.GroupC - GroupA >> Coef.GroupC - GroupB >> >> t.GroupB - GroupA >> t.GroupC - GroupA >> t.GroupC - GroupB >> >> p.value.GroupB - GroupA >> p.value.GroupC - GroupA >> p.value.GroupC - GroupB >> >> p.value.adj.GroupB - GroupA >> p.value.adj.GroupC - GroupA >> p.value.adj.GroupC - GroupB >> >> F >> F.p.value >> >> Res.GroupB - GroupA >> Res.GroupC - GroupA >> Res.GroupC - GroupB >> >> >> Could some body please try to explain what do the >> columns A, >> Coef, F, >> F.p.value and Res mean? >> >> >> A - are your log fold change values >> Coef - are your coefficients (you set up a cell means >> model, so >> these are the sample means) >> F - is an F-statistic, which tests the null hypothesis >> that none >> of the sample means are different >> F.p.value - is the p-value for the F-statistic >> Res - is the results matrix you passed into write.fit(), >> showing >> which contrast(s) were significant >> >> Best, >> >> Jim >> >> >> >> >> >> #Session info >> >> sessionInfo() >> >> R version 2.15.0 (2012-03-30) >> Platform: i386-pc-mingw32/i386 (32-bit) >> >> locale: >> [1] LC_COLLATE=Swedish_Sweden.1252 >> LC_CTYPE=Swedish_Sweden.1252 >> LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C >> LC_TIME=Swedish_Sweden.1252 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets >> methods >> base >> >> other attached packages: >> [1] limma_3.12.1 Biobase_2.16.0 >> BiocGenerics_0.2.0 >> >> loaded via a namespace (and not attached): >> [1] affylmGUI_1.30.0 IRanges_1.14.4 >> oneChannelGUI_1.22.10 >> stats4_2.15.0 tcltk_2.15.0 >> Best regards >> JMA >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> >> <mailto:bioconductor at="" r-project.org="">> >> <mailto:bioconductor at="" r-project.org="">> >> >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >> >> -- James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> >> >> -- James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 >
ADD REPLY
0
Entering edit mode
Hi, My bad too :) What I really meant with fold change was the fold ratio change between the groups. I though that was what was in topTable but now I think I get it. The fold change that is printed in limma is the difference between the means in log2 and not the ratio of the mean of the gene expressions in their linear form. The data I'm analysing is from RMA and from PLIER so I have linear values for the gene expressions summarized with PLIER. Should I log2-transform them before passing them to the functions in limma? Also, if I now wanted to calculated the ratios between the RMA gene expressions, do you recommend that I transform the expressions to linear values before calculating the ratio or is it OK to take the ratio in log2? Best regards Jorge On Wed, Aug 29, 2012 at 4:56 PM, James W. MacDonald <jmacdon at="" uw.edu=""> wrote: > > Hi Jorge, > > My bad. The coefficients for write.fit() are the fold changes, and you do get a column for each. As an example, with fake data. > > > library(limma) > > mat <- matrix(rnorm(12e5), ncol = 12) > > design <- model.matrix(~0+factor(rep(1:4, each = 3))) > > contrast <- matrix(c(1,-1,0,0,1,0,-1,0,1,0,0,-1), ncol = 3) > > fit2 <- contrasts.fit(fit, contrast) > > fit2 <- eBayes(fit2) > > rslt <- decideTests(fit2) > > write.fit(fit2, rslt, "tmp.txt") > > dat <- read.table("tmp.txt", header=T) > > head(dat) > Coef.1 Coef.2 Coef.3 t.1 t.2 t.3 p.value.1 p.value.2 p.value.3 F > 1 -0.837 -0.352 -0.430 -1.02 -0.43 -0.53 0.30621 0.66699 0.59932 0.35 > 2 -1.200 -0.599 0.524 -1.47 -0.73 0.64 0.14148 0.46259 0.52034 1.67 > 3 -1.014 -1.533 -1.515 -1.24 -1.87 -1.85 0.21627 0.06176 0.06492 1.54 > 4 0.520 0.308 1.059 0.64 0.38 1.30 0.52390 0.70573 0.19449 0.60 > 5 -0.848 0.269 -0.611 -1.04 0.33 -0.75 0.29880 0.74132 0.45410 0.81 > 6 -0.094 -0.245 -0.474 -0.11 -0.30 -0.58 0.90847 0.76396 0.56088 0.13 > F.p.value Res.1 Res.2 Res.3 > 1 0.78697 0 0 0 > 2 0.17130 0 0 0 > 3 0.20359 0 0 0 > 4 0.61671 0 0 0 > 5 0.48698 0 0 0 > 6 0.94300 0 0 0 > > > nrow(mat) > [1] 100000 > > nrow(dat) > [1] 100000 > > An alternative you might consider is the writeFit() function from my affycoretools package, which outputs the data in a slightly different format. > > Best, > > Jim > > > > > On 8/29/2012 2:55 AM, Jorge Mir? wrote: >> >> Hi James, >> >> thanks for the explanation. I do not really understand the columns yet. Shouldn't the FC be printed for every comparison as is done for the Coef-columns? I just get one A-column. >> Is there any way of printing the results to a file with the same columns as in topTable()? >> What I'm really want to have in the output is a p-value column, an FC column and an adjusted p-value column. >> >> Best regards >> >> >> On Tue, Aug 28, 2012 at 5:41 PM, James W. MacDonald <jmacdon at="" uw.edu="" <mailto:jmacdon="" at="" uw.edu="">> wrote: >> >> Hi Jorge, >> >> >> On 8/28/2012 11:20 AM, Jorge Mir? wrote: >> >> Hi, >> >> I have run the commands below to get an analysis of differential >> expressions in my Affymetrix arrays >> >> #Prepare the design and contrast matrices for my comparisons >> of the three >> groups in a loop-manner. >> >> design<- model.matrix(~0+factor(c(1,1,1,2,2,2,3,3,3))) >> colnames(design)<- c('GroupA', 'GroupB', 'GroupC') >> contrast.matrix<- makeContrasts(GroupB-GroupA, GroupC- GroupA, >> >> GroupC-GroupB, levels=design) >> >> #Check design and contrast matrices >> >> design >> >> GroupA GroupB GroupC >> 1 1 0 0 >> 2 1 0 0 >> 3 1 0 0 >> 4 0 1 0 >> 5 0 1 0 >> 6 0 1 0 >> 7 0 0 1 >> 8 0 0 1 >> 9 0 0 1 >> attr(,"assign") >> [1] 1 1 1 >> attr(,"contrasts") >> attr(,"contrasts")$`factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3))` >> [1] "contr.treatment" >> >> contrast.matrix >> >> Contrasts >> Levels GroupB - GroupA GroupC - GroupA GroupC - GroupB >> GroupA -1 -1 0 >> GroupB 1 0 -1 >> GroupC 0 1 1 >> >> #Fitting the eset to to the design and contrast >> >> fit<- lmFit(exprs, design) >> fit2<- contrasts.fit(fit, contrast.matrix) >> >> #Computing the statistics >> >> fit2<- eBayes(fit2) >> >> >> Then I check the results with topTable and get the following >> columns in the >> output >> >> topTable(fit2) >> >> GroupB...GroupA GroupC...GroupA GroupC...GroupB >> AveExpr F >> P.Value adj.P.Val >> 25031 2.3602203 2.4273830 0.06716267 5.021412 >> 29.06509 >> 7.844834e-05 0.9587773 >> 12902 -0.4572897 -0.5680943 -0.11080467 7.516681 >> 25.41608 >> 1.365021e-04 0.9587773 >> 7158 -0.4478660 -0.4296077 0.01825833 7.057833 >> 23.48871 >> 1.880100e-04 0.9587773 >> 18358 -0.1002647 0.3304903 0.43075500 7.352807 >> 22.78417 >> 2.125096e-04 0.9587773 >> 28768 -0.7695883 -1.3837750 -0.61418667 3.983044 >> 22.47514 >> 2.244612e-04 0.9587773 >> 28820 -0.1708800 -0.9939680 -0.82308800 5.470826 >> 18.25071 >> 5.081473e-04 0.9587773 >> 15238 -0.4850297 -0.4658157 0.01921400 7.071662 >> 17.15191 >> 6.440979e-04 0.9587773 >> 24681 -0.3759717 -0.3486450 0.02732667 9.281578 >> 16.47813 >> 7.493077e-04 0.9587773 >> 19246 -0.8675393 -0.5082140 0.35932533 8.123538 >> 16.27776 >> 7.845150e-04 0.9587773 >> 28808 0.2601277 0.6909140 0.43078633 4.814602 >> 16.21283 >> 7.963487e-04 0.9587773 >> >> I want to export my results and write >> >> results<- decideTests(fit2) >> write.fit(fit2, results, "limma_results.txt", adjust="BH") >> >> Now don't get the same columns as when using topTable which is >> quite >> confusing. Why don't I get the FC for the comparisons between >> the different >> groups as if I run topTable with the coef parameter ( >> "topTable(fit2, >> coef=1)" )? The columns I get are the following >> >> >> The simple answer is that they are two different functions with >> different goals. But note that you do get the same information. >> >> >> >> A >> >> Coef.GroupB - GroupA >> Coef.GroupC - GroupA >> Coef.GroupC - GroupB >> >> t.GroupB - GroupA >> t.GroupC - GroupA >> t.GroupC - GroupB >> >> p.value.GroupB - GroupA >> p.value.GroupC - GroupA >> p.value.GroupC - GroupB >> >> p.value.adj.GroupB - GroupA >> p.value.adj.GroupC - GroupA >> p.value.adj.GroupC - GroupB >> >> F >> F.p.value >> >> Res.GroupB - GroupA >> Res.GroupC - GroupA >> Res.GroupC - GroupB >> >> >> Could some body please try to explain what do the columns A, >> Coef, F, >> F.p.value and Res mean? >> >> >> A - are your log fold change values >> Coef - are your coefficients (you set up a cell means model, so >> these are the sample means) >> F - is an F-statistic, which tests the null hypothesis that none >> of the sample means are different >> F.p.value - is the p-value for the F-statistic >> Res - is the results matrix you passed into write.fit(), showing >> which contrast(s) were significant >> >> Best, >> >> Jim >> >> >> >> >> >> #Session info >> >> sessionInfo() >> >> R version 2.15.0 (2012-03-30) >> Platform: i386-pc-mingw32/i386 (32-bit) >> >> locale: >> [1] LC_COLLATE=Swedish_Sweden.1252 LC_CTYPE=Swedish_Sweden.1252 >> LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C >> LC_TIME=Swedish_Sweden.1252 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods >> base >> >> other attached packages: >> [1] limma_3.12.1 Biobase_2.16.0 BiocGenerics_0.2.0 >> >> loaded via a namespace (and not attached): >> [1] affylmGUI_1.30.0 IRanges_1.14.4 oneChannelGUI_1.22.10 >> stats4_2.15.0 tcltk_2.15.0 >> Best regards >> JMA >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> >> >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >> >> -- James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 >
ADD REPLY

Login before adding your answer.

Traffic: 708 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6