Question

Limma and Genepix


lepalmer@notes.cc.sunysb.edu ▴ 40


This is the pipeline I have been using for analysis. I just wanted people's opinions on whether things can be done better. (It's 3 sets of dye-swaps with 2 spots per ORF per chip.)

    library(limma)
    targets<-readTargets("targets.txt")
    RG<-read.maimages(targets$FileName,source="genepix",wt.fun=wtflags(0))
    RG$printer<-getLayout(RG$genes)
    RG$genes<-readGAL("Y_pestis.sorted.gal")
    spottypes<-readSpotTypes("spotTypes.txt")
    RG$genes$Status<-controlStatus(spottypes,RG)
    RGb<-backgroundCorrect(RG,method="normexp")
    MA<-normalizeWithinArrays(RGb)
    MA<-normalizeBetweenArrays(MA)
    cor<-duplicateCorrelation(MA,ndups=2,spacing=240)
    design<-c(1,-1,1,-1,-1,1)
    fit<-lmFit(MA,design,ndups=2,correlation=cor$consensus.correlation,spacing=240)
    fit<-eBayes(fit)
    tt<-topTable(fit,adjust="fdr",n=6000)
    write.table(tt,file="tmp.txt",sep="\t")

I have also recently read about the Kooperberg method for background correction. Is this a preferred method? I have been able to do this with the following commands:

    targets<-readTargets("targets.txt") #
    RG<-read.maimages(targets$FileName,source="genepix",wt.fun=wtflags(0))
    RG$printer<-getLayout(RG$genes)
    RG$genes<-readGAL("Y_pestis.sorted.gal")
    spottypes<-readSpotTypes("spotTypes.txt")
    RG$genes$Status<-controlStatus(spottypes,RG)
    read.series(targets$FileName, suffix=NULL, skip=31, sep="\t")
    RGb <- kooperberg(targets$FileName, layout=RG$printer)
    RGb$genes<-RG$genes
    RGb$printer<-RG$printer
    RGb$weights<-RG$weights
    RGb$targets<-RG$targets
    MA<-normalizeWithinArrays(RGb)
    MA<-normalizeBetweenArrays(MA)
    cor<-duplicateCorrelation(MA,ndups=2,spacing=240)
    design<-c(1,-1,1,-1,-1,1)
    fit<-lmFit(MA,design,ndups=2,correlation=cor$consensus.correlation,spacing=240)
    fit<-eBayes(fit)
    topTable(fit,adjust="fdr",n=32)
    tt<-topTable(fit,adjust="fdr",n=6000)
    write.table(tt,file="tmp.txt",sep="\t")

I recently had a small argument with an advisor who told me to do background correction by subtracting background from foreground and flagging negative numbers. This is obviously the default for limma. But when doing this approach, a lot of spots popped up that didn't make sense (i.e. non-specific DNA), while normexp fixed that problem. I recently discovered Kooperberg, which was designed for the problem of negative intensities with Genepix data. So which is the best method, and how do I convince this guy that I should use this method?

One last question I have is that these methods will give you some statistics on gene expression differences. Often people report genes that are differentially regulated by more than two-fold. It seems to me that to do this, one would need an intensity cutoff, as genes with little or no expression can easily slip into that category. How would one calculate such a cutoff? There are spots on the array that contain oligos that are definitely not found in the species being studied (bacteria vs Arabidopsis). Can this information be used?

Thanks,
Lance Palmer


Updated 19.0 years ago by Gordon Smyth • written 19.0 years ago by lepalmer@notes.cc.sunysb.edu

score 0 · Answer 1 · 2005-05-18

> Date: Tue, 17 May 2005 07:45:43 -0400
> From: lepalmer@notes.cc.sunysb.edu
> Subject: [BioC] Limma and Genepix
> To: bioconductor@stat.math.ethz.ch
>
> This is the pipeline I have been currently using for analysis. I just
> wanted people's opinions on if things can be done better. (It's 3 sets
> of dye-swaps with 2 spots per orf per chip)
>
> library(limma)
> targets<-readTargets("targets.txt")
> RG<-read.maimages(targets$FileName,source="genepix",wt.fun=wtflags(0))
> RG$printer<-getLayout(RG$genes)
> RG$genes<-readGAL("Y_pestis.sorted.gal")
> spottypes<-readSpotTypes("spotTypes.txt")
> RG$genes$Status<-controlStatus(spottypes,RG)
> RGb<-backgroundCorrect(RG,method="normexp")
> MA<-normalizeWithinArrays(RGb)
> MA<-normalizeBetweenArrays(MA)
> cor<-duplicateCorrelation(MA,ndups=2,spacing=240)
> design<-c(1,-1,1,-1,-1,1)
> fit<-lmFit(MA,design,ndups=2,correlation=cor$consensus.correlation,spacing=240)
> fit<-eBayes(fit)
> tt<-topTable(fit,adjust="fdr",n=6000)
> write.table(tt,file="tmp.txt",sep="\t")
>
> I have also recently read about the Kooperberg method for background
> correction. Is this a preferred method?
> I have been able to do this with the following commands
>
> targets<-readTargets("targets.txt") #
> RG<-read.maimages(targets$FileName,source="genepix",wt.fun=wtflags(0))
> RG$printer<-getLayout(RG$genes)
> RG$genes<-readGAL("Y_pestis.sorted.gal")
> spottypes<-readSpotTypes("spotTypes.txt")
> RG$genes$Status<-controlStatus(spottypes,RG)
> read.series(targets$FileName, suffix=NULL, skip=31, sep="\t")
> RGb <- kooperberg(targets$FileName, layout=RG$printer)
> RGb$genes<-RG$genes
> RGb$printer<-RG$printer
> RGb$weights<-RG$weights
> RGb$targets<-RG$targets
> MA<-normalizeWithinArrays(RGb)
> MA<-normalizeBetweenArrays(MA)
> cor<-duplicateCorrelation(MA,ndups=2,spacing=240)
> design<-c(1,-1,1,-1,-1,1)
> fit<-lmFit(MA,design,ndups=2,correlation=cor$consensus.correlation,spacing=240)
> fit<-eBayes(fit)
> topTable(fit,adjust="fdr",n=32)
> tt<-topTable(fit,adjust="fdr",n=6000)
> write.table(tt,file="tmp.txt",sep="\t")
>
> I recently had a small argument with an advisor who told me to do
> background correction by subtracting background from foreground and
> flagging negative numbers. This is obviously the default for limma. But
> when doing this approach, a lot of spots popped up that didn't make sense
> (ie non-specific DNA), while the normexp fixed that problem. I recently
> discovered Kooperberg, which was designed for the problem of negative
> intensities with Genepix data. So which is the best method, and how do I
> convince this guy that I should use this method?

I don't think anyone knows which is the best method, but normexp and Kooperberg are clearly better than subtracting, as you have observed.

> One last question I have is that these methods will give you some
> statistics on gene expression differences. Often people report genes that
> are differentially regulated by more than two-fold. It seems to me that
> to do this, one would need an intensity cutoff, as genes with little, or
> no expression can easily slip into that category. How would one calculate
> such a cutoff?

One of the beauties of using normexp or a similar offset background correction, together with statistical criteria for differential expression, is that an intensity cutoff is not required.

Gordon

> There are spots on the array that contain oligos that are
> definitely not found in the species being studied. (Bacteria vs
> arabidopsis). Can this information be used.
>
> Thanks,
> Lance Palmer
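
To make the point about low-intensity spots concrete, here is a small base-R simulation (no limma required). It is only a sketch under stated assumptions: the background level, noise, signal and offset values are invented for illustration, and the "floor at 0.5, then add an offset" step is a crude stand-in for an offset-style correction, not limma's normexp model. Half of the simulated spots are unexpressed and none are truly differentially expressed, so any two-fold call is spurious.

```r
set.seed(1)

# Hypothetical array: 4000 spots, half unexpressed, half clearly expressed.
n        <- 4000
low      <- rep(c(TRUE, FALSE), each = n / 2)
signal   <- ifelse(low, 0, 2000)   # same true signal in both channels
bg       <- 100                    # assumed mean local background
noise_sd <- 20                     # assumed measurement noise

# Foreground and background readings for the red and green channels.
Rf <- signal + bg + rnorm(n, sd = noise_sd)
Gf <- signal + bg + rnorm(n, sd = noise_sd)
Rb <- bg + rnorm(n, sd = noise_sd)
Gb <- bg + rnorm(n, sd = noise_sd)

# Plain subtraction: spots with a non-positive channel have no log-ratio.
R_sub <- Rf - Rb
G_sub <- Gf - Gb
lost  <- R_sub <= 0 | G_sub <= 0
M_sub <- rep(NA_real_, n)
M_sub[!lost] <- log2(R_sub[!lost] / G_sub[!lost])

# Stand-in for an offset-style correction: floor at 0.5, then add an offset.
offset <- 50                       # illustrative value only
R_off  <- pmax(R_sub, 0.5) + offset
G_off  <- pmax(G_sub, 0.5) + offset
M_off  <- log2(R_off / G_off)

# Among unexpressed spots: how many are lost, and how many look "two-fold"?
frac_lost <- mean(lost[low])
frac_sub  <- mean(abs(M_sub[low]) > 1, na.rm = TRUE)
frac_off  <- mean(abs(M_off[low]) > 1)
print(c(lost = frac_lost, twofold_subtract = frac_sub, twofold_offset = frac_off))
```

With subtraction, most unexpressed spots lose their log-ratio altogether, and many of the survivors exceed two-fold by chance. With the offset, every spot keeps a finite log-ratio and the low-intensity log-ratios are pulled toward zero. In limma itself the analogous step would be a call such as `backgroundCorrect(RG, method="normexp", offset=50)`, where the offset value is again an assumption to be tuned on real data.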