diferent pvalues for a treatment when other contrasts areremoved from targets file

0

Entering edit mode

James W. MacDonald 66k

@james-w-macdonald-5106

Last seen 1 hour ago

United States

Hi Agnieszka, Neither approach is more correct than the other, but the second is very likely to be more powerful. In either case the numerator of the contrast will be the same. However, the denominator of the contrast is based on the intra-group variability for all of the groups in your linear model. So if you only include the first treatment comparison in your model then the denominator will be based on just those two groups. If you use all 20 of your slides, then you increase the amount of data that can be used to compute the intra-group variability. Since the accuracy of a variance estimate is dependent on the amount of data (it gets more accurate as you increase the amount of data used), you will probably have more power to detect differences if you include all the data in your linear model. And more power is usually considered to be better. Best, Jim James W. MacDonald, M.S. Biostatistician Hildebrandt Lab 8220D MSRB III 1150 W. Medical Center Drive Ann Arbor MI 48109-0646 734-936-8662 >>> Agnieszka Zmienko <akisiel at="" ibch.poznan.pl=""> 02/17/09 9:05 AM >>> Hi! I am "pure" biologist so I strictly follow the limma userguide commands in my analysis but I have a problem. I have a set of microarrays with a common control channel. I have 4 biological replicates of each experiment. If I perform the simplest possible analysis for the first treatment (wt.NaCl/wt.pool) using only files 043,046,048,077 as targets ("targetsA.txt"), I obtain different adjusted p.values (and different topTable) comparing to analysis with all twenty input files ("targetsB.txt") and extracting wt.NaCl contrast. Why? Which topTable is correct? Agnieszka Here are my targets files and the codes: TargetsA.txt SlideNumber Name FileName Cy3 Cy5 43 wt.Na1 043.gpr wt.pool wt.NaCl 46 wt.Na2 046.gpr wt.pool wt.NaCl 48 wt.Na3 048.gpr wt.pool wt.NaCl 77 wt.Na4 077.gpr wt.pool wt.NaCl TargetsB.txt SlideNumber Name FileName Cy3 Cy5 43 wt.Na1 043.gpr wt.pool wt.NaCl 46 wt.Na2 046.gpr wt.pool wt.NaCl 48 wt.Na3 048.gpr wt.pool wt.NaCl 77 wt.Na4 077.gpr wt.pool wt.NaCl 44 wt.Cd1 044.gpr wt.pool wt.CdCl2 47 wt.Cd2 047.gpr wt.pool wt.CdCl2 49 wt.Cd3 049.gpr wt.pool wt.CdCl2 78 wt.Cd4 078.gpr wt.pool wt.CdCl2 75 mu1.U1 075.gpr wt.pool mu1.U 70 mu1.U2 070.gpr wt.pool mu1.U 71 mu1.U3 071.gpr wt.pool mu1.U 80 mu1.U4 080.gpr wt.pool mu1.U 67 mu2.U1 067.gpr wt.pool mu2.U 74 mu2.U2 074.gpr wt.pool mu2.U 79 mu2.U3 079.gpr wt.pool mu2.U 68 mu2.U4 068.gpr wt.pool mu2.U 72 mu3.U1 072.gpr wt.pool mu3.U 73 mu3.U2 073.gpr wt.pool mu3.U 69 mu3.U3 069.gpr wt.pool mu3.U 88 mu3.U4 088.gpr wt.pool mu3.U > targetsA=readTargets("targetsA.txt") > RGA=read.maimages(targetsA,source="genepix",wt.fun=wtflags(weight=0, cutoff=-50)) Read 043.gpr Read 046.gpr Read 048.gpr Read 077.gpr > spottypes=readSpotTypes("SpotTypes.txt") > RG$genes$Status=controlStatus(spottypes,RGA) Matching patterns for: ID Name Found 31200 cDNA Found 48 no_change Setting attributes: values Color > RGAb=backgroundCorrect(RGA, method="normexp", offset=50) Green channel Corrected array 1 Corrected array 2 Corrected array 3 Corrected array 4 Red channel Corrected array 1 Corrected array 2 Corrected array 3 Corrected array 4 > MAA=normalizeWithinArrays(RGAb) > fitA=lmFit(MAA) Warning message: In lmFit(MAA) : Some coefficients not estimable: coefficient interpretation may vary. > fitA=eBayes(fitA) > write.table(topTable(fitA, + number=100,adjust.method="BH",p.value=0.05,lfc=1,sort.by="P",resort.by ="logFC"),"topTableA.txt") > write.fit(fitA, digits=6,F.adjust="BH",file="resultsA.txt") --------------------------------- > targetsB=readTargets("targetsB.txt") > RGB=read.maimages(targetsB,source="genepix",wt.fun=wtflags(weight=0, cutoff=-50)) Read 043.gpr Read 046.gpr Read 048.gpr ReRead 074.gpr Read 079.gpr Read 068.gpr Read 072.gpr Read 073.gpr Read 069.gpr Read 088.gpr > spottypes=readSpotTypes("SpotTypes.txt") > RG$genes$Status=controlStatus(spottypes,RGB) Matching patterns for: ID Name Found 31200 cDNA Found 48 no_change Setting attributes: values Color > RGBb=backgroundCorrect(RGB, method="normexp", offset=50) Green channel Corrected array 1 Corrected array 2 Corrected array 3 Corrected array 4 Corrected array 5 Corrected array 6 Corrected array 7 Corrected array 8 Corrected array 9 Corrected array 10 Corrected array 11 Corrected array 12 Corrected array 13 Corrected array 14 Corrected array 15 Corrected array 16 Corrected array 17 Corrected array 18 Corrected array 19 Corrected array 20 Red channel Corrected array 1 Corrected array 2 Corrected array 3 Corrected array 4 Corrected array 5 Corrected array 6 Corrected array 7 Corrected array 8 Corrected array 9 Corrected array 10 Corrected array 11 Corrected array 12 Corrected array 13 Corrected array 14 Corrected array 15 Corrected array 16 Corrected array 17 Corrected array 18 Corrected array 19 Corrected array 20 > MAB=normalizeWithinArrays(RGBb) > designB=modelMatrix(targetsB,ref="wt.pool") Found unique target names: mu1.U mu2.U mu3.U wt.CdCl2 wt.NaCl wt.pool > designB mu1.U mu2.U mu3.U wt.CdCl2 wt.NaCl [1,] 0 0 0 0 1 [2,] 0 0 0 0 1 [3,] 0 0 0 0 1 [4,] 0 0 0 0 1 [5,] 0 0 0 1 0 [6,] 0 0 0 1 0 [7,] 0 0 0 1 0 [8,] 0 0 0 1 0 [9,] 1 0 0 0 0 [10,] 1 0 0 0 0 [11,] 1 0 0 0 0 [12,] 1 0 0 0 0 [13,] 0 1 0 0 0 [14,] 0 1 0 0 0 [15,] 0 1 0 0 0 [16,] 0 1 0 0 0 [17,] 0 0 1 0 0 [18,] 0 0 1 0 0 [19,] 0 0 1 0 0 [20,] 0 0 1 0 0 > fitB=lmFit(MAB,designB) Warning message: In lmFit(MAB, designB) : Some coefficients not estimable: coefficient interpretation may vary. > contrast.matrix=makeContrasts(wt.NaCl,levels=designB) > contrast.matrix Contrasts Levels wt.NaCl mu1.U 0 mu2.U 0 mu3.U 0 wt.CdCl2 0 wt.NaCl 1 > fitB=contrasts.fit(fitB,contrast.matrix) > fitB=eBayes(fitB) > write.table(topTable(fitB, + number=100,adjust.method="BH",p.value=0.05,lfc=1,sort.by="P",resort.by ="logFC"),"topTableB.txt") > write.fit(fitB, digits=6,F.adjust="BH",file="resultsB.txt") Dr Agnieszka ?mie?ko Centrum Doskonalosci CENAT Instytut Chemii Bioorganicznej Polskiej Akademii Nauk Noskowskiego 12/14 61-704 Pozna? tel. (61) 8528503 wew. 249 fax: (61) 8520532 Agnieszka Zmienko, Ph.D. CENAT Institute of Bioorganic Chemistry Polish Academy of Sciences Noskowskiego 12/14 61-704 Poznan, Poland phone (0048) 61-8528503 ext. 249 fax: (0048) 61-8520532 _______________________________________________ Bioconductor mailing list Bioconductor at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor ********************************************************** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues

• 815 views

ADD COMMENT • link updated 15.3 years ago by Agnieszka Zmienko ▴ 50 • written 15.3 years ago by James W. MacDonald 66k

0

Entering edit mode

Agnieszka Zmienko ▴ 50

@agnieszka-zmienko-3165

Last seen 9.7 years ago

Hi Jim! Thanks for explanation. Its very helpful. Agnieszka At 16:31 2009-02-17, James MacDonald wrote: >Hi Agnieszka, > >Neither approach is more correct than the other, but the second is very >likely to be more powerful. > >In either case the numerator of the contrast will be the same. However, >the denominator of the contrast is based on the intra-group variability >for all of the groups in your linear model. So if you only include the >first treatment comparison in your model then the denominator will be >based on just those two groups. If you use all 20 of your slides, then >you increase the amount of data that can be used to compute the >intra-group variability. > >Since the accuracy of a variance estimate is dependent on the amount of >data (it gets more accurate as you increase the amount of data used), >you will probably have more power to detect differences if you include >all the data in your linear model. And more power is usually considered >to be better. > >Best, > >Jim > > >James W. MacDonald, M.S. >Biostatistician >Hildebrandt Lab >8220D MSRB III >1150 W. Medical Center Drive >Ann Arbor MI 48109-0646 >734-936-8662 > >>> Agnieszka Zmienko <akisiel at="" ibch.poznan.pl=""> 02/17/09 9:05 AM >>> >Hi! > >I am "pure" biologist so I strictly follow the >limma userguide commands in my analysis but I have a problem. >I have a set of microarrays with a common control >channel. I have 4 biological replicates of each >experiment. If I perform the simplest possible >analysis for the first treatment >(wt.NaCl/wt.pool) using only files >043,046,048,077 as targets ("targetsA.txt"), I >obtain different adjusted p.values (and different >topTable) comparing to analysis with all twenty >input files ("targetsB.txt") and extracting >wt.NaCl contrast. Why? Which topTable is correct? > > >Agnieszka > > >Here are my targets files and the codes: > >TargetsA.txt > >SlideNumber Name FileName Cy3 Cy5 >43 wt.Na1 043.gpr wt.pool wt.NaCl >46 wt.Na2 046.gpr wt.pool wt.NaCl >48 wt.Na3 048.gpr wt.pool wt.NaCl >77 wt.Na4 077.gpr wt.pool wt.NaCl > >TargetsB.txt > >SlideNumber Name FileName Cy3 Cy5 >43 wt.Na1 043.gpr wt.pool wt.NaCl >46 wt.Na2 046.gpr wt.pool wt.NaCl >48 wt.Na3 048.gpr wt.pool wt.NaCl >77 wt.Na4 077.gpr wt.pool wt.NaCl >44 wt.Cd1 044.gpr wt.pool wt.CdCl2 >47 wt.Cd2 047.gpr wt.pool wt.CdCl2 >49 wt.Cd3 049.gpr wt.pool wt.CdCl2 >78 wt.Cd4 078.gpr wt.pool wt.CdCl2 >75 mu1.U1 075.gpr wt.pool mu1.U >70 mu1.U2 070.gpr wt.pool mu1.U >71 mu1.U3 071.gpr wt.pool mu1.U >80 mu1.U4 080.gpr wt.pool mu1.U >67 mu2.U1 067.gpr wt.pool mu2.U >74 mu2.U2 074.gpr wt.pool mu2.U >79 mu2.U3 079.gpr wt.pool mu2.U >68 mu2.U4 068.gpr wt.pool mu2.U >72 mu3.U1 072.gpr wt.pool mu3.U >73 mu3.U2 073.gpr wt.pool mu3.U >69 mu3.U3 069.gpr wt.pool mu3.U >88 mu3.U4 088.gpr wt.pool mu3.U > > > targetsA=readTargets("targetsA.txt") > > >RGA=read.maimages(targetsA,source="genepix",wt.fun=wtflags(weight=0, >cutoff=-50)) >Read 043.gpr >Read 046.gpr >Read 048.gpr >Read 077.gpr > > spottypes=readSpotTypes("SpotTypes.txt") > > RG$genes$Status=controlStatus(spottypes,RGA) >Matching patterns for: ID Name >Found 31200 cDNA >Found 48 no_change >Setting attributes: values Color > > RGAb=backgroundCorrect(RGA, method="normexp", offset=50) >Green channel >Corrected array 1 >Corrected array 2 >Corrected array 3 >Corrected array 4 >Red channel >Corrected array 1 >Corrected array 2 >Corrected array 3 >Corrected array 4 > > MAA=normalizeWithinArrays(RGAb) > > fitA=lmFit(MAA) >Warning message: >In lmFit(MAA) : > Some coefficients not estimable: coefficient interpretation may vary. > > fitA=eBayes(fitA) > > write.table(topTable(fitA, >+ >number=100,adjust.method="BH",p.value=0.05,lfc=1,sort.by="P",resort.b y="logFC"),"topTableA.txt") > > write.fit(fitA, digits=6,F.adjust="BH",file="resultsA.txt") > >--------------------------------- > > targetsB=readTargets("targetsB.txt") > > >RGB=read.maimages(targetsB,source="genepix",wt.fun=wtflags(weight=0, >cutoff=-50)) >Read 043.gpr >Read 046.gpr >Read 048.gpr >ReRead 074.gpr >Read 079.gpr >Read 068.gpr >Read 072.gpr >Read 073.gpr >Read 069.gpr >Read 088.gpr > > spottypes=readSpotTypes("SpotTypes.txt") > > RG$genes$Status=controlStatus(spottypes,RGB) >Matching patterns for: ID Name >Found 31200 cDNA >Found 48 no_change >Setting attributes: values Color > > RGBb=backgroundCorrect(RGB, method="normexp", offset=50) >Green channel >Corrected array 1 >Corrected array 2 >Corrected array 3 >Corrected array 4 >Corrected array 5 >Corrected array 6 >Corrected array 7 >Corrected array 8 >Corrected array 9 >Corrected array 10 >Corrected array 11 >Corrected array 12 >Corrected array 13 >Corrected array 14 >Corrected array 15 >Corrected array 16 >Corrected array 17 >Corrected array 18 >Corrected array 19 >Corrected array 20 >Red channel >Corrected array 1 >Corrected array 2 >Corrected array 3 >Corrected array 4 >Corrected array 5 >Corrected array 6 >Corrected array 7 >Corrected array 8 >Corrected array 9 >Corrected array 10 >Corrected array 11 >Corrected array 12 >Corrected array 13 >Corrected array 14 >Corrected array 15 >Corrected array 16 >Corrected array 17 >Corrected array 18 >Corrected array 19 >Corrected array 20 > > MAB=normalizeWithinArrays(RGBb) > > designB=modelMatrix(targetsB,ref="wt.pool") >Found unique target names: > mu1.U mu2.U mu3.U wt.CdCl2 wt.NaCl wt.pool > > designB > mu1.U mu2.U mu3.U wt.CdCl2 wt.NaCl > [1,] 0 0 0 0 1 > [2,] 0 0 0 0 1 > [3,] 0 0 0 0 1 > [4,] 0 0 0 0 1 > [5,] 0 0 0 1 0 > [6,] 0 0 0 1 0 > [7,] 0 0 0 1 0 > [8,] 0 0 0 1 0 > [9,] 1 0 0 0 0 >[10,] 1 0 0 0 0 >[11,] 1 0 0 0 0 >[12,] 1 0 0 0 0 >[13,] 0 1 0 0 0 >[14,] 0 1 0 0 0 >[15,] 0 1 0 0 0 >[16,] 0 1 0 0 0 >[17,] 0 0 1 0 0 >[18,] 0 0 1 0 0 >[19,] 0 0 1 0 0 >[20,] 0 0 1 0 0 > > fitB=lmFit(MAB,designB) >Warning message: >In lmFit(MAB, designB) : > Some coefficients not estimable: coefficient interpretation may vary. > > contrast.matrix=makeContrasts(wt.NaCl,levels=designB) > > contrast.matrix > Contrasts >Levels wt.NaCl > mu1.U 0 > mu2.U 0 > mu3.U 0 > wt.CdCl2 0 > wt.NaCl 1 > > fitB=contrasts.fit(fitB,contrast.matrix) > > fitB=eBayes(fitB) > > write.table(topTable(fitB, >+ >number=100,adjust.method="BH",p.value=0.05,lfc=1,sort.by="P",resort.b y="logFC"),"topTableB.txt") > > write.fit(fitB, digits=6,F.adjust="BH",file="resultsB.txt") > > > >Dr Agnieszka ??mie??ko > >Centrum Doskonalosci CENAT >Instytut Chemii Bioorganicznej Polskiej Akademii Nauk >Noskowskiego 12/14 >61-704 Pozna?? >tel. (61) 8528503 wew. 249 >fax: (61) 8520532 > >Agnieszka Zmienko, Ph.D. > >CENAT >Institute of Bioorganic Chemistry >Polish Academy of Sciences >Noskowskiego 12/14 >61-704 Poznan, Poland >phone (0048) 61-8528503 ext. 249 >fax: (0048) 61-8520532 > >_______________________________________________ >Bioconductor mailing list >Bioconductor at stat.math.ethz.ch >https://stat.ethz.ch/mailman/listinfo/bioconductor >Search the archives: >http://news.gmane.org/gmane.science.biology.informatics.conductor > >********************************************************** >Electronic Mail is not secure, may not be read every day, and should not >be used for urgent or sensitive issues Dr Agnieszka ?mie?ko Centrum Doskonalosci CENAT Instytut Chemii Bioorganicznej Polskiej Akademii Nauk Noskowskiego 12/14 61-704 Pozna? tel. (61) 8528503 wew. 249 fax: (61) 8520532 Agnieszka Zmienko, Ph.D. CENAT Institute of Bioorganic Chemistry Polish Academy of Sciences Noskowskiego 12/14 61-704 Poznan, Poland phone (0048) 61-8528503 ext. 249 fax: (0048) 61-8520532

ADD COMMENT • link 15.3 years ago Agnieszka Zmienko ▴ 50

Login before adding your answer.