Problem with A mean in Limma
0
0
Entering edit mode
@gordon-smyth
Last seen 26 minutes ago
WEHI, Melbourne, Australia
The component Amean in the linear model fit output contains the mean A-values for all the arrays in the original linear model fit. It does not change when a contrast is applied to the fit object, in the same way that other quantities such as sigma do not change. This is the intended behavior and is not a bug. This issue has been discussed a few times on the list before. Try searching the archives for "Amean" or similar. I personally do not see any good reason why Amean should change with the contrast, but I remain open to a good argument. Best wishes Gordon >Date: Wed, 22 Feb 2006 14:45:07 +0100 >From: david hot <david.hot at="" pasteur-lille.fr=""> >Subject: [BioC] Problem with A mean in Limma >To: Bioconductor Mailing list <bioconductor at="" stat.math.ethz.ch=""> >Message-ID: <43FC6AE2.D7BF9597 at pasteur-lille.fr> >Content-Type: text/plain; charset=iso-8859-2 > >Dear All, > >I have posted a mail to the list 2 weeks ago with no answer so far. >My first message was probably not very clear so I try again. > >I am using Limma 2.4.7 on R 2.2.1 to analyse two-color arrays and I >have noticed something strange in the way the mean 'A' value is >calculated when one use a contrast matrix. >In my experiment I want to compare the effect of a stimulation on >Tcells. This stimulation was applied on Tcells during 6h or during >24h. The 6h stimulated Tcells and 24h stimulated Tcells were >compared to the same non stimulated cells (= reference). I have got >the following Targets Frame : > >FileName Cy3 Cy5 >CtrlCy5vsTh2_24hCy3.dav 24h Ref >CtrlCy5vsTh2_6hCy3.dav 6h Ref >Th2_24hCy5vsCtrlCy3.dav Ref 24h >Th2_6hCy5vsCtrlCy3.dav Ref 6h > >I would like to know which genes are modulated after 6h of >stimulation, which are modulated after 24h and genes modulated >between 6h and 24h. After normalisation of the data I use the >following script to create a matrix and a contrast matrix: > > > design <- cbind("6h"=c(0,-1,0,1),"24h"=c(-1,0,1,0)) > > targets <- readTargets() > > rownames(design) <- removeExt(targets$FileName) > > design > 6h 24h >CtrlCy5vsTh2_24hCy3 0 -1 >CtrlCy5vsTh2_6hCy3 -1 0 >Th2_24hCy5vsCtrlCy3 0 1 >Th2_6hCy5vsCtrlCy3 1 0 > > fit <- lmFit(MA, design) > > cont.matrix <- cbind("24h-6h"=c(-1, 1), "6h"=c(1,0), "24h"=c(0,1)) > > rownames(cont.matrix) <- cbind("6h","24h") > > cont.matrix > 24h-6h 6h 24h >6h -1 1 0 >24h 1 0 1 > > fitcont <- contrasts.fit(fit, cont.matrix) > > fitcontbayes<-eBayes(fitcont) > >If one look at the results of the 3 coef of the contrast matrix (: >24h-6h, 6h and 24h) everything seem OK : > > > topTable(fitcontbayes, coef=1, n=10, adjust="fdr") > Block Column Row ID Name > Status M A t P.Value B >21897 39 9 1 46861 PRAC Gene 2.876222 >8.631290 8.433385 0.03855747 3.506137 >20473 36 1 14 28828 PAPD1 Gene -2.383738 6.927029 >-7.473090 0.03855747 2.783536 >19040 34 8 2 78497 CLDN23 Gene -3.139618 5.859508 >-8.299092 0.03855747 2.386795 >6221 11 5 20 84193 LOC115749 Gene 3.010418 >5.792223 7.957571 0.03855747 2.193344 >19497 34 9 21 125848 CRISP3 Gene -2.376080 6.925017 >-6.742931 0.07172051 2.143568 >21099 37 3 16 147050 HCG2P7 Gene -2.724596 6.706554 >-7.202047 0.07172051 1.713956 >6196 11 4 19 64892 ETR101 Gene 1.954074 >8.169356 6.048570 0.13538457 1.453035 >6199 11 7 19 149357 #NA Gene 1.898843 >8.060294 5.954056 0.14110973 1.352464 >21132 37 12 17 155265 LOC440590 Gene -2.527218 6.638719 >-6.680307 0.10272218 1.336678 >20556 36 12 17 160410 #NA Gene 2.082918 >7.656239 5.881841 0.14362662 1.274528 > > topTable(fitcontbayes, coef=2, n=10, adjust="fdr") > Block Column Row ID Name > Status M A t P.Value B >2603 5 11 13 >128320 CA2 Gene 2.312298 8.323775 12.690864 0.0004900596 7.446470 >23250 41 18 9 83486 CYP1B1 Gene 1.365898 >9.351358 7.529896 0.0168728265 4.118037 >14667 26 3 12 89981 TPO Gene >-1.370724 9.825983 -7.441705 0.0168728265 4.018976 >17543 31 23 11 95124 TNC Gene 1.362982 >11.478777 7.450669 0.0176675806 3.633118 >27313 48 1 11 59209 CTSC Gene 1.337229 >10.259389 6.727203 0.0276792498 3.169911 >2542 5 22 10 91105 FABP5 Gene 1.180159 >10.683417 6.708199 0.0276792498 3.146168 >19497 34 9 21 >125848 CRISP3 Gene 2.254023 6.925017 7.834144 0.0168728265 3.042473 >4862 9 14 11 >158129 CAPN14 Gene 2.252979 5.170231 8.422209 0.0168728265 3.002382 >18922 33 10 21 49700 >DKFZp547F072 Gene 2.241397 6.368605 8.378910 0.0168728265 2.975528 >20210 36 2 3 >129217 MGC10854 Gene 2.155348 6.019866 8.057238 0.0176289711 2.768940 > > topTable(fitcontbayes, coef=3, n=10, adjust="fdr") > Block Column Row ID Name > Status M A t P.Value B >23257 41 1 10 >119072 CDH26 Gene 2.045561 8.618405 10.977306 0.0007520237 8.012605 >2550 5 6 11 >112799 SERPINB3 Gene 1.905378 9.951108 10.146438 0.0007520237 7.297240 >21897 39 9 1 46861 PRAC Gene 3.036560 8.631290 > 10.904529 0.0007520237 6.084763 >6221 11 5 20 84193 LOC115749 Gene 3.157852 5.792223 > 11.804853 0.0007520237 5.857994 >20473 36 1 14 28828 PAPD1 Gene -2.693871 6.927029 >-10.343418 0.0009737256 5.746556 >27313 48 1 11 59209 CTSC Gene 1.677554 >10.259389 8.439275 0.0036026318 5.575635 >17543 31 23 11 95124 TNC Gene 2.432964 >11.478777 9.404284 0.0022965417 5.110162 >18097 32 1 11 >104580 CTSC Gene 1.414608 9.663292 8.014200 0.0056176380 5.087315 >22151 39 23 11 >104788 COL6A3 Gene 1.507607 9.470752 7.775592 0.0069709347 4.801786 >8275 15 19 9 53347 TXN delta 3 Gene 1.340699 >11.341639 7.522172 0.0090096504 4.489250 > > >But when I sort the table according to the 'A' value one can see >that the genes have exactly the same 'A' values for the 1st the 2d >and the 3d coef : > > > topTable(fitcontbayes, coef=1, n=10, adjust="fdr", sort.by="A") > Block Column Row ID Name Status M A t P.Value B >4716 9 12 5 17228 CPSF4 Gene NA 15.16203 NA NA NA >2944 6 16 3 96063 RBM4 Gene NA 15.12503 NA NA NA >772 2 4 9 5198 KCNAB3 Gene NA 15.06048 NA NA NA >18041 32 17 8 101204 GABBR1 Gene NA 15.03191 NA NA NA >16817 30 17 5 15058 PRKAA1 Gene NA 14.98546 NA NA NA >7642 14 10 7 90324 GSTA3 Gene NA 14.85254 NA NA NA >5105 9 17 21 149752 LOC199800 Gene NA 14.83062 NA NA NA >20702 36 14 23 104223 #NA Gene NA 14.81930 NA NA NA >4270 8 22 10 94545 SERPINA6 Gene NA 14.79895 NA NA NA >6429 12 21 4 96944 MAD1L1 Gene NA 14.79082 NA NA NA > > topTable(fitcontbayes, coef=2, n=10, adjust="fdr", sort.by="A") > Block Column Row ID Name Status M A t P.Value B >4716 9 12 5 17228 CPSF4 Gene NA 15.16203 NA NA NA >2944 6 16 3 96063 RBM4 Gene NA 15.12503 NA NA NA >772 2 4 9 5198 KCNAB3 Gene NA 15.06048 NA NA NA >18041 32 17 8 101204 GABBR1 Gene NA 15.03191 NA NA NA >16817 30 17 5 15058 PRKAA1 Gene NA 14.98546 NA NA NA >7642 14 10 7 90324 GSTA3 Gene NA 14.85254 NA NA NA >5105 9 17 21 149752 LOC199800 Gene NA 14.83062 NA NA NA >20702 36 14 23 104223 #NA Gene NA 14.81930 NA NA NA >4270 8 22 10 94545 SERPINA6 Gene NA 14.79895 NA NA NA >6429 12 21 4 96944 MAD1L1 Gene NA 14.79082 NA NA NA > > topTable(fitcontbayes, coef=3, n=10, adjust="fdr", sort.by="A") > Block Column Row ID Name Status M A t P.Value B >4716 9 12 5 17228 CPSF4 Gene NA 15.16203 NA NA NA >2944 6 16 3 96063 RBM4 Gene NA 15.12503 NA NA NA >772 2 4 9 5198 KCNAB3 Gene NA 15.06048 NA NA NA >18041 32 17 8 101204 GABBR1 Gene NA 15.03191 NA NA NA >16817 30 17 5 15058 PRKAA1 Gene NA 14.98546 NA NA NA >7642 14 10 7 90324 GSTA3 Gene NA 14.85254 NA NA NA >5105 9 17 21 149752 LOC199800 Gene NA 14.83062 NA NA NA >20702 36 14 23 104223 #NA Gene NA 14.81930 NA NA NA >4270 8 22 10 94545 SERPINA6 Gene NA 14.79895 NA NA NA >6429 12 21 4 96944 MAD1L1 Gene NA 14.79082 NA NA NA > >In all the 3 tables the mean 'A' value is the mean of the 'A' values >from the 4 slides where I expected for the second and third table >(coef=2 and coef=3) a mean between only 2 slides (slides 2 and 4 for >6h (coef=2) and slides 1 and 3 for 24h (coef=3)). I first though it >was just a problem in the report of 'A' values in topTable but it >seems to have an effect also on the P.value and B value. > >Is that a mistake in 'lmFit' or 'contrasts.fit' functions or am I >doing something wrong in the design of my matrix and contrast matrix? > >Many thanks for any help or comments. > >David > > >###################################### >David HOT, PhD. >Laboratoire d'Etudes Transcriptomiques et G?nomiques Appliqu?es - >Plate-forme Biopuces >UMR8161 - IFR142 > >Institut Pasteur de Lille >1, rue du professeur Calmette >59 019 LILLE, Cedex. >FRANCE >T?l : +33 (0)3 20 87 72 09 >Fax : +33 (0)3 20 87 73 11
limma limma • 766 views
ADD COMMENT

Login before adding your answer.

Traffic: 967 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6