Question: Removing probes from AffyBatch
10.5 years ago by
lgautier@altern.org wrote:
> Hi Nathan,
>
> No, I never did get around to making a package for the remove
> probes/probe sets functions, mostly because I don't know how!
> I just used it again myself, and had to update the code slightly. The code
> below works with R 2.7.2. As for how many probes you can remove,
> there probably is no set answer.

I remember a paper showing that the number of probes in a probe set can be lowered significantly (see Antipova et al. 2002):
http://genomebiology.com/2002/3/12/research/0073

> There may be an issue with using
> different numbers of probes per probe set - I seem to recall some
> discussion on this in regards to using MBNI's custom re-mapped cdf
> files for Affy's arrays??

It cannot be excluded that some "probes -> probe set summary" algorithm might be defeated by probe sets with different numbers of probes, but I cannot think of any at the moment. Someone on the list will correct this statement if necessary.

L.

> Cheers,
> Jenny
>
>
> ### The first part is just creating two objects (ResetEnvir and RemoveProbes) originally
> ### written by Ariel Chernomoretz and modified by Jenny Drnevich to remove individual
> ### probes and/or entire probesets.
> ### Just highlight everything from here until
> ### you see STOP and paste it to R all at once
>
> ResetEnvir<-function(cleancdf){
>   cdfpackagename   <- paste(cleancdf,"cdf",sep="")
>   probepackagename <- paste(cleancdf,"probe",sep="")
>   # detach the (possibly modified) cdf and probe packages, then attach them again
>   ll<-search()
>   cdfpackagepos <- grep(cdfpackagename,ll)
>   if(length(cdfpackagepos)>0) detach(pos=cdfpackagepos)
>   ll<-search()
>   probepackagepos <- grep(probepackagename,ll)
>   if(length(probepackagepos)>0) detach(pos=probepackagepos)
>   require(cdfpackagename,character.only=TRUE)
>   require(probepackagename,character.only=TRUE)
>   require(affy)
> }
>
> RemoveProbes<-function(listOutProbes=NULL,
>                        listOutProbeSets=NULL,
>                        cleancdf,destructive=TRUE){
>
>   #default probe dataset values
>   cdfpackagename   <- paste(cleancdf,"cdf",sep="")
>   probepackagename <- paste(cleancdf,"probe",sep="")
>   require(cdfpackagename,character.only = TRUE)
>   require(probepackagename,character.only = TRUE)
>   probe.env.orig <- get(probepackagename)
>
>   if(!is.null(listOutProbes)){
>     # taking probes out from CDF env
>     # each entry is split on "at" into probe set name and row index
>     probes<- unlist(lapply(listOutProbes,function(x){
>       a<-strsplit(x,"at")
>       aux1<-paste(a[[1]][1],"at",sep="")
>       aux2<-as.integer(a[[1]][2])
>       c(aux1,aux2)
>     }))
>     n1<-as.character(probes[seq(1,(length(probes)/2))*2-1])
>     n2<-as.integer(probes[seq(1,(length(probes)/2))*2])
>     probes<-data.frame(I(n1),n2)
>     probes[,1]<-as.character(probes[,1])
>     probes[,2]<-as.integer(probes[,2])
>     pset<-unique(probes[,1])
>     for(i in seq(along=pset)){
>       # exact match; grep() would also hit names that merely contain pset[i]
>       ii  <-which(probes[,1]==pset[i])
>       iout<-probes[ii,2]
>       a<-get(pset[i],envir=get(cdfpackagename))
>       # drop=FALSE keeps a matrix even if only one probe is left
>       a<-a[-iout,,drop=FALSE]
>       assign(pset[i],a,envir=get(cdfpackagename))
>     }
>   }
>
>   # taking probesets out from CDF env
>   if(!is.null(listOutProbeSets)){
>     rm(list=listOutProbeSets,envir=get(cdfpackagename))
>   }
>
>   # setting the PROBE env accordingly (idea from gcrma compute.affinities.R)
>   tmp <- get("xy2indices",paste("package:",cdfpackagename,sep=""))
>   newAB   <- new("AffyBatch",cdfName=cleancdf)
>   pmIndex <- unlist(indexProbes(newAB,"pm"))
>   subIndex<- match(tmp(probe.env.orig$x,probe.env.orig$y,cdf=cdfpackagename),pmIndex)
>   rm(newAB)
>   iNA <- which(is.na(subIndex))
>
>   if(length(iNA)>0){
>     ipos<-grep(probepackagename,search())
>     assign(probepackagename,probe.env.orig[-iNA,],pos=ipos)
>   }
> }
>
> ### STOP HERE!!!! PASTE THE ABOVE INTO R AND CHECK TO SEE YOU HAVE THE TWO OBJECTS
> ### (ResetEnvir and RemoveProbes) IN YOUR WORKSPACE WITH ls()
>
> # All you need now is your affybatch object, and a character vector of probe set names
> # and/or another vector of individual probes that you want to remove. If your affybatch
> # object is called 'rawdata' and the vector of probesets is 'maskedprobes', all
> # you need to do is:
>
> cleancdf <- cleancdfname(rawdata@cdfName,addcdf=FALSE)
>
> # Make sure you are starting with the original cdf with all the probes and probesets.
>
> ResetEnvir(cleancdf)
>
> # Double-check to make sure all probesets are present in your affybatch by typing in
> # the name of your affybatch and looking at the output.
>
> rawdata
>
> # To remove some probe sets (but not individual probes in this example), use:
> RemoveProbes(listOutProbes=NULL, listOutProbeSets=maskedprobes, cleancdf)
>
> # The cdf file will be temporarily modified to mask the indicated probesets & probes,
> # which you can check by typing in the name of your affybatch again and seeing that
> # the number of probesets has decreased. The masking can be undone by using ResetEnvir
> # as above, or by quitting the session. However, any ExpressionSet objects created
> # while the cdf is modified will have the masked probesets removed permanently because
> # they do not refer to the cdf like an affybatch object does.
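The usage example above only removes whole probe sets. RemoveProbes splits each entry of listOutProbes on "at" and uses what follows as a row index into the probe set's matrix in the cdf environment, which suggests that individual probes are named as the probe set name immediately followed by the row number (for example "1007_s_at3"). The base-R sketch below shows one way such identifiers might be built and sanity-checked before calling RemoveProbes. The helper names make_probe_ids and parse_probe_ids and the example selections are hypothetical, not part of affy or of the original code, and the pattern assumes probe set names ending in "_at".

```r
# Hypothetical helpers (not part of affy): build and parse probe identifiers
# of the assumed form <probe set name><row index>, e.g. "1007_s_at3".

make_probe_ids <- function(probeset, rows) {
  # probeset: a single probe set name; rows: row indices within that probe set
  paste(probeset, as.integer(rows), sep = "")
}

parse_probe_ids <- function(ids) {
  # Anchor on a trailing "_at" followed by digits, so the identifier must
  # end in <name>_at<number> to be accepted
  pattern <- "^(.*_at)([0-9]+)$"
  ok <- grepl(pattern, ids)
  if (!all(ok)) {
    stop("malformed probe id(s): ", paste(ids[!ok], collapse = ", "))
  }
  data.frame(probeset = sub(pattern, "\\1", ids),
             row      = as.integer(sub(pattern, "\\2", ids)),
             stringsAsFactors = FALSE)
}

# Made-up selection: rows 3 and 7 of one probe set, row 1 of another
maskedprobeids <- c(make_probe_ids("1007_s_at", c(3, 7)),
                    make_probe_ids("1053_at", 1))
parsed <- parse_probe_ids(maskedprobeids)
print(parsed)
```

Under these assumptions, the vector could then be passed as listOutProbes=maskedprobeids in the RemoveProbes call, after checking that every row index exists in the corresponding probe set.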
>
> At 04:59 AM 9/24/2008, Nathan Harmston wrote:
>>Hi everyone,
>>
>>I'm trying to remove individual probes from an AffyBatch and have found
>>a previous post:
>>
>>https://stat.ethz.ch/pipermail/bioconductor/2006-September/014242.html
>>
>>I was wondering if this ever got put into a package?
>>
>>And also how many probes can be removed from a probeset before it
>>becomes unreliable? I am going to try to use Biostrings to remove
>>probes based on their sequences and other criteria.
>>
>>Many thanks in advance,
>>
>>Nathan
>>
>>_______________________________________________
>>Bioconductor mailing list
>>Bioconductor at stat.math.ethz.ch
>>https://stat.ethz.ch/mailman/listinfo/bioconductor
>>Search the archives:
>>http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> Jenny Drnevich, Ph.D.
>
> Functional Genomics Bioinformatics Specialist
> W.M. Keck Center for Comparative and Functional Genomics
> Roy J. Carver Biotechnology Center
> University of Illinois, Urbana-Champaign
>
> 330 ERML
> 1201 W. Gregory Dr.
> Urbana, IL 61801
> USA
>
> ph: 217-244-7355
> fax: 217-265-5066
> e-mail: drnevich at illinois.edu
cdf probe biostrings