Limma and Genepix
@lepalmernotesccsunysbedu-1254
This is the pipeline I am currently using for analysis. I would like people's opinions on whether anything could be done better. (The experiment is 3 sets of dye-swaps with 2 spots per ORF per chip.)

    library(limma)
    targets<-readTargets("targets.txt")
    RG<-read.maimages(targets$FileName,source="genepix",wt.fun=wtflags(0))
    RG$printer<-getLayout(RG$genes)
    RG$genes<-readGAL("Y_pestis.sorted.gal")
    spottypes<-readSpotTypes("spotTypes.txt")
    RG$genes$Status<-controlStatus(spottypes,RG)
    RGb<-backgroundCorrect(RG,method="normexp")
    MA<-normalizeWithinArrays(RGb)
    MA<-normalizeBetweenArrays(MA)
    cor<-duplicateCorrelation(MA,ndups=2,spacing=240)
    design<-c(1,-1,1,-1,-1,1)
    fit<-lmFit(MA,design,ndups=2,correlation=cor$consensus.correlation,spacing=240)
    fit<-eBayes(fit)
    tt<-topTable(fit,adjust="fdr",n=6000)
    write.table(tt,file="tmp.txt",sep="\t")

I have also recently read about the Kooperberg method for background correction. Is this a preferred method? I have been able to do this with the following commands:

    targets<-readTargets("targets.txt") #
    RG<-read.maimages(targets$FileName,source="genepix",wt.fun=wtflags(0))
    RG$printer<-getLayout(RG$genes)
    RG$genes<-readGAL("Y_pestis.sorted.gal")
    spottypes<-readSpotTypes("spotTypes.txt")
    RG$genes$Status<-controlStatus(spottypes,RG)
    read.series(targets$FileName, suffix=NULL, skip=31, sep="\t")
    RGb <- kooperberg(targets$FileName, layout=RG$printer)
    RGb$genes<-RG$genes
    RGb$printer<-RG$printer
    RGb$weights<-RG$weights
    RGb$targets<-RG$targets
    MA<-normalizeWithinArrays(RGb)
    MA<-normalizeBetweenArrays(MA)
    cor<-duplicateCorrelation(MA,ndups=2,spacing=240)
    design<-c(1,-1,1,-1,-1,1)
    fit<-lmFit(MA,design,ndups=2,correlation=cor$consensus.correlation,spacing=240)
    fit<-eBayes(fit)
    topTable(fit,adjust="fdr",n=32)
    tt<-topTable(fit,adjust="fdr",n=6000)
    write.table(tt,file="tmp.txt",sep="\t")

I recently had a small argument with an advisor who told me to do background correction by subtracting background from foreground and flagging negative numbers. This is obviously the default for limma. But when I took this approach, a lot of spots popped up that didn't make sense (i.e. non-specific DNA), while normexp fixed that problem. I recently discovered Kooperberg, which was designed for the problem of negative intensities with GenePix data. So which is the best method, and how do I convince this guy that I should use it?

One last question: these methods will give you some statistics on gene expression differences. Often people report genes that are differentially regulated by more than two-fold. It seems to me that to do this, one would need an intensity cutoff, as genes with little or no expression can easily slip into that category. How would one calculate such a cutoff? There are spots on the array that contain oligos that are definitely not found in the species being studied (bacteria vs. Arabidopsis). Can this information be used?

Thanks,
Lance Palmer
@gordon-smyth
WEHI, Melbourne, Australia
> Date: Tue, 17 May 2005 07:45:43 -0400
> From: lepalmer@notes.cc.sunysb.edu
> Subject: [BioC] Limma and Genepix
> To: bioconductor@stat.math.ethz.ch
>
> This is the pipeline I have been currently using for analysis. I just
> wanted people's opinions on if things can be done better. (It's a 3 sets
> of dye-swaps with 2 spots per orf per chip)
>
> library(limma)
> targets<-readTargets("targets.txt")
> RG<-read.maimages(targets$FileName,source="genepix",wt.fun=wtflags(0))
> RG$printer<-getLayout(RG$genes)
> RG$genes<-readGAL("Y_pestis.sorted.gal")
> spottypes<-readSpotTypes("spotTypes.txt")
> RG$genes$Status<-controlStatus(spottypes,RG)
> RGb<-backgroundCorrect(RG,method="normexp")
> MA<-normalizeWithinArrays(RGb)
> MA<-normalizeBetweenArrays(MA)
> cor<-duplicateCorrelation(MA,ndups=2,spacing=240)
> design<-c(1,-1,1,-1,-1,1)
> fit<-lmFit(MA,design,ndups=2,correlation=cor$consensus.correlation,spacing=240)
> fit<-eBayes(fit)
> tt<-topTable(fit,adjust="fdr",n=6000)
> write.table(tt,file="tmp.txt",sep="\t")
>
> I have also recently read about the Kooperberg method for background
> correction. Is this a preferred method?
> I have been able to do this with the following commands
>
> targets<-readTargets("targets.txt") #
> RG<-read.maimages(targets$FileName,source="genepix",wt.fun=wtflags(0))
> RG$printer<-getLayout(RG$genes)
> RG$genes<-readGAL("Y_pestis.sorted.gal")
> spottypes<-readSpotTypes("spotTypes.txt")
> RG$genes$Status<-controlStatus(spottypes,RG)
> read.series(targets$FileName, suffix=NULL, skip=31, sep="\t")
> RGb <- kooperberg(targets$FileName, layout=RG$printer)
> RGb$genes<-RG$genes
> RGb$printer<-RG$printer
> RGb$weights<-RG$weights
> RGb$targets<-RG$targets
> MA<-normalizeWithinArrays(RGb)
> MA<-normalizeBetweenArrays(MA)
> cor<-duplicateCorrelation(MA,ndups=2,spacing=240)
> design<-c(1,-1,1,-1,-1,1)
> fit<-lmFit(MA,design,ndups=2,correlation=cor$consensus.correlation,spacing=240)
> fit<-eBayes(fit)
> topTable(fit,adjust="fdr",n=32)
> tt<-topTable(fit,adjust="fdr",n=6000)
> write.table(tt,file="tmp.txt",sep="\t")
>
> I recently had a small argument with an advisor who told me to do
> background correction by subtracting background from foreground and
> flagging negative numbers. This is obviously the default for limma. But
> when doing this approach, a lot of spots popped up that didn't make sense
> (ie non-specific DNA), while the normexp fixed that problem. I recently
> discovered Kooperberg, which was designed for the problem of negative
> intensities with Genepix data. So which is the best method, and how do I
> convince this guy that I should use this method?

I don't think anyone knows which is the best method, but normexp and Kooperberg are clearly better than subtracting, as you have observed.

> One last question I have is that these methods will give you some
> statistics on gene expression differences. Often people report genes that
> are differentially regulated by more than two-fold. It seems to me that
> to do this, one would need an intensity cutoff, as genes with little, or
> no expression can easily slip into that category. How would one calculate
> such a cutoff?

One of the beauties of using normexp, or a similar background correction with an offset, together with statistical criteria for differential expression, is that an intensity cutoff is not required.

Gordon

> There are spots on the array that contain oligos that are
> definitely not found in the species being studied. (Bacteria vs
> arabidopsis). Can this information be used.
>
> Thanks,
> Lance Palmer
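To illustrate the point about offsets and intensity cutoffs, here is a minimal sketch in base R using simulated intensities (all numbers are made up, not real GenePix data). It compares plain subtraction with a positive-valued correction plus an offset. Note the assumptions: flooring at zero is only a crude stand-in for normexp, which fits a normal-plus-exponential convolution model, and the offset of 50 is an arbitrary illustrative value. In limma itself the offset is supplied through the `offset` argument of `backgroundCorrect()`.

```r
# Simulated two-colour data: half the spots are unexpressed, and the true
# log-ratio is 0 for every spot (same signal in both channels).
set.seed(1)
n <- 2000
is_null <- rep(c(TRUE, FALSE), each = n / 2)
true_signal <- ifelse(is_null, 0, rexp(n, rate = 1 / 2000))
bg <- 100        # hypothetical local background level
noise_sd <- 15   # hypothetical measurement noise

Rf <- bg + true_signal + rnorm(n, sd = noise_sd)  # red foreground
Gf <- bg + true_signal + rnorm(n, sd = noise_sd)  # green foreground
Rb <- bg + rnorm(n, sd = noise_sd)                # red background
Gb <- bg + rnorm(n, sd = noise_sd)                # green background

# 1. Plain subtraction: spots with a non-positive channel must be dropped.
R_sub <- Rf - Rb
G_sub <- Gf - Gb
ok <- R_sub > 0 & G_sub > 0
M_sub <- rep(NA_real_, n)
M_sub[ok] <- log2(R_sub[ok] / G_sub[ok])

# 2. Positive-valued correction plus offset (crude stand-in for normexp).
offset <- 50
R_off <- pmax(R_sub, 0) + offset
G_off <- pmax(G_sub, 0) + offset
M_off <- log2(R_off / G_off)

# Fraction of unexpressed spots that look more than two-fold changed.
frac_sub <- mean(abs(M_sub[is_null & ok]) > 1)
frac_off <- mean(abs(M_off[is_null]) > 1)

cat("spots kept after subtraction:", sum(ok), "of", n, "\n")
cat("unexpressed spots with |M| > 1, subtraction:", frac_sub, "\n")
cat("unexpressed spots with |M| > 1, offset:     ", frac_off, "\n")
```

In this toy setting the offset damps the log-ratios of low-intensity spots toward zero, so unexpressed spots rarely cross a two-fold threshold and no separate intensity filter is applied. How large an offset is appropriate for real arrays is a judgment call that this sketch does not address.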