limma - background correct method=none

Helen Cattan

Hi,

I have been looking at backgroundCorrect, method=none in limma and compared this to files I have manually altered so that the background values were zero, without using the backgroundCorrect method. I thought that these would produce the same results but they were very different - could anyone explain why to me please? Code is below.

Thanks,
Helen

> library(limma)
> files=dir(pattern="*\\.gpr")
> RG=read.maimages(files, columns=list(Rf="F635 Median", Gf="F532 Median", Rb="B635
> names(RG)
> RG$genes=readGAL()
> RG$printer=getLayout(RG$genes)
> samples=read.table("sampleinformationa.txt", header=TRUE, sep="\t", as.is=TRUE)
> samples
> spottypes=readSpotTypes()
> RG$genes$Status=controlStatus(spottypes, RG)
> MA1=normalizeWithinArrays(RG)
> MA2=normalizeBetweenArrays(MA1)
> design=c(1,-1)
> cor=dupcor.series(MA2$M, design, ndups=2, spacing=1)
> cor$cor
> fit=gls.series(MA2$M,design,ndups=2,correlation=0.7628949)
> eb=ebayes(fit)
> genenames=uniquegenelist(RG$genes, ndups=2)
> ord=order(eb$lods, decreasing=TRUE)
> toptable(number=30,genelist=genenames,fit=fit,eb=eb,adjust="fdr")
     Block Row Column      ID                                      Name
2986    10  24     19  209274              H63351:Hs.203509::::3:211600
5981    20  24      9  No_seq                :Data not found:::::212310
8083    27  24     13  No_seq                :Data not found:::::211255
2013     7  18     17 3846240 BE617901:In multiple clusters::::3:152735
7492    25  25      7  No_seq                :Data not found:::::223083
     Status        M        t     P.Value        B
2986   cDNA 2.032293 21.15504 0.001026994 8.082774
5981   cDNA 1.946782 17.10056 0.001115381 6.944303
8083   cDNA 1.957218 16.81269 0.001115381 6.847414
2013   cDNA 1.659813 15.65550 0.001115381 6.431781
7492   cDNA 1.739500 15.31887 0.001115381 6.302453
> top30=ord[1:30]
> plot(fit$coef,eb$lods,xlab="Log2 Fold Change", ylab="Log Odds",pch=16,cex=0.1)

I then included RG=backgroundCorrect(RG, method="none"), without manually altering the files, before the normalizations and got the following top table. The MA plots were also very different between the two tests.

     Block Row Column      ID                                      Name
4062    14  14     11  130050         R11620:Data not found::::0:130050
1033     4  12      1  124297         R02202:Data not found::::0:124297
1551     6   5      5 5195930        BI754638:Data not found::::10:39204
6900    23  25     23  214572  H73225:In multiple clusters::::0:214572
5131    18   3     13     N/A        BQ025821:Data not found::::10:33091
     Status        M         t   P.Value           B
4062   cDNA 2.421487 11.601562 0.1968434 -0.08869236
1033   cDNA 1.681192  8.298373 0.1968434 -0.58215799
1551   cDNA 1.904751  8.293908 0.1968434 -0.58312702
6900   cDNA 2.094721  8.171346 0.1968434 -0.61016473
5131   cDNA 2.314669  7.761475 0.1968434 -0.70707369

Gordon Smyth

WEHI, Melbourne, Australia

At 06:50 AM 19/04/2004, Helen Cattan wrote:

> I have been looking at backgroundCorrect, method=none in limma and
> compared this to files I have manually altered so that the background
> values were zero, without using the backgroundCorrect method. I thought
> that these would produce the same results but they were very different -
> could anyone explain why to me please?

They do produce the same results.

> Code is below.

The code example you give uses the default background correction, which is subtraction, so you have not given an example of what you claim above.

Gordon
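A minimal sketch of why the two routes would be expected to agree, using made-up intensities rather than Helen's data. The helper `m_value` is hypothetical and only mimics plain subtraction followed by a log-ratio; it is not limma's code, and it assumes that method="none" simply leaves the foreground values untouched.

```r
# Toy foreground and background intensities (made-up numbers, not Helen's data)
Rf <- c(1200, 350, 5000)
Gf <- c(800, 400, 2500)
Rb <- c(100, 90, 120)
Gb <- c(80, 95, 110)

# Hypothetical helper: log-ratio M after subtracting the supplied background
m_value <- function(Rf, Gf, Rb, Gb) log2(Rf - Rb) - log2(Gf - Gb)

# Route 1: background columns edited to zero, then default subtraction
m_zeroed <- m_value(Rf, Gf, Rb = 0, Gb = 0)

# Route 2: original foreground, background ignored (the idea behind method="none")
m_none <- log2(Rf) - log2(Gf)

# Default subtraction on the unedited background, for contrast
m_subtract <- m_value(Rf, Gf, Rb, Gb)

# Routes 1 and 2 agree; only real subtraction changes the M-values
stopifnot(isTRUE(all.equal(m_zeroed, m_none)))
print(rbind(m_zeroed, m_none, m_subtract))
```

If the two routes disagree on real data, the difference most likely comes from the inputs (edited files, column mapping, weights) rather than from the arithmetic itself.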

Helen Cattan

Hi,

The bit of code I listed uses the default background correction, which is subtraction. I ran it on my manually altered files, in which B635 and B532 were zero. A bit further down the email I (hope I) explained that I included RG=backgroundCorrect(RG, method="none") in the code before the normalizations, which should change the default background correction, and I used my original .gpr files for this.

I provided the top 5 for both of these examples. The first top table shows my results for the altered files with the default background correction, and the second top table is for the unaltered files with background correct method = none. The results are not the same.

Sorry if my last email was unclear - I hope this now makes sense.

Helen

Matthew Hannah

The first code and top table are with the default subtract (using the files with 0's for BG), and the second top table is the one obtained using RG=backgroundCorrect(RG, method="none") on the original data.

The obvious thing is to double-check that the .gpr files are correct, and that you haven't mixed medians and means by mistake when replacing BG with the 0's. A problem with these files may also explain the high number of NA's and missing values in your other post. I'm not sure about the file format, but check that the column headings correspond to the correct columns. I once had such a column shift when writing spot output to .txt files, with all sorts of 'interesting' results.

The other suggestion/question is: could the wt.fun=wtflags(0) be having any effect? Try the same without this, but I'd definitely look at the source data first.

HTH,
Matt
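The file check suggested above could be sketched roughly as follows. The two data frames are made-up stand-ins for one original and one hand-edited .gpr table, the background column names "B635 Median" and "B532 Median" are assumptions (the thread only shows the foreground names in full), and `check_altered` is a hypothetical helper, not part of limma.

```r
# Made-up stand-ins for one original and one hand-edited .gpr table
original <- data.frame(
  "F635 Median" = c(1200, 350), "B635 Median" = c(100, 90),
  "F532 Median" = c(800, 400),  "B532 Median" = c(80, 95),
  check.names = FALSE
)
altered <- original
altered[["B635 Median"]] <- 0
altered[["B532 Median"]] <- 0

# Hypothetical helper: TRUE only if the headings match, the foreground is
# untouched, and both background columns are all zero.
# The default column names are assumptions for illustration.
check_altered <- function(original, altered,
                          fg = c("F635 Median", "F532 Median"),
                          bg = c("B635 Median", "B532 Median")) {
  identical(names(original), names(altered)) &&
    isTRUE(all.equal(original[fg], altered[fg])) &&
    all(altered[bg] == 0)
}

check_altered(original, altered)
```

A shifted or renamed heading in the edited file makes the first comparison fail, which is the kind of column shift described above.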

Helen Cattan

I have checked the .gpr files and the column headings, and both are fine. I've done the same without the flagging and get (not very) different results, so does this mean I can assume my flagging is working OK? Is there a way for me to check that the data is read and flagged correctly, for example by looking at part of the RG list and the flagged values?

Also, I have no missing values in either foreground or background in the .gpr files from my other post, so I suspect there is a peculiarity in the reading somewhere.

Thanks,
Helen
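One way to look at what was read might be to inspect the components of the list directly. The sketch below uses a plain list of made-up matrices as a stand-in for the RG object, assuming foreground matrices named R and G and a weights matrix in which flagged spots get weight 0; `summarise_reading` is a hypothetical helper, not a limma function.

```r
# Toy stand-in for a two-array RG list: plain matrices (made-up numbers)
RG <- list(
  R = matrix(c(1200, 350, 5000, 900, NA, 4100), ncol = 2),
  G = matrix(c(800, 400, 2500, 700, 380, 2600), ncol = 2),
  weights = matrix(c(1, 0, 1, 1, 1, 0), ncol = 2)
)

# Hypothetical helper: per-array counts of missing values and zero-weight spots
summarise_reading <- function(RG) {
  data.frame(
    array       = seq_len(ncol(RG$R)),
    missing_R   = colSums(is.na(RG$R)),
    missing_G   = colSums(is.na(RG$G)),
    zero_weight = colSums(RG$weights == 0)
  )
}

head(RG$R)             # first rows of the red foreground, one column per array
res <- summarise_reading(RG)
print(res)
```

Comparing the zero-weight counts with the number of flagged spots in each .gpr file, and the missing-value counts with the files themselves, would show whether the reading step is where the discrepancy arises.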
