Farms/Call.INI using oligo package?
3
0
Entering edit mode
Fraser Sim ▴ 270
@fraser-sim-3567
Last seen 9.6 years ago
Hi, I am analyzing Affymetrix hugene2.0st arrays using the oligo package and had hoped to be able to use informative/non-informative probe calling (FARMS, Tolloen et 2007) as a means on filtering data. I had used this on older 3' biased arrays analyzed with affy but the farms package only works with an affyBatch object. Is this possible with another package or should I be considering a different approach to filter array data on the newer Affy arrays? Cheers, Fraser [[alternative HTML version deleted]]
probe affy oligo farms probe affy oligo farms • 1.6k views
ADD COMMENT
1
Entering edit mode
cstrato ★ 3.9k
@cstrato-908
Last seen 5.5 years ago
Austria
Dear Fraser, You could use package xps for this purpose, which has implemented I/NI calls as function 'ini.call()'. However, Affymetrix recommends for these arrays to use DABG calls, which are also implemented in xps as function 'dabg.call()'. Best regards, Christian _._._._._._._._._._._._._._._._._._ C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a V.i.e.n.n.a A.u.s.t.r.i.a e.m.a.i.l: cstrato at aon.at _._._._._._._._._._._._._._._._._._ On 11/13/13 2:32 PM, Fraser Sim wrote: > Hi, > > > > I am analyzing Affymetrix hugene2.0st arrays using the oligo package and had > hoped to be able to use informative/non-informative probe calling (FARMS, > Tolloen et 2007) as a means on filtering data. I had used this on older 3' > biased arrays analyzed with affy but the farms package only works with an > affyBatch object. Is this possible with another package or should I be > considering a different approach to filter array data on the newer Affy > arrays? > > > > Cheers, > > Fraser > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD COMMENT
1
Entering edit mode
Djork Clevert ▴ 210
@djork-clevert-422
Last seen 9.6 years ago
Hi Fraser, thanks for reporting this issue. We are aware of this issue. Reason is that since R versions R-3.x.x AffyBatch does no longer support particular array types. We are modifying the FARMS package that array types that no longer supported by AffyBatch objects can be used via the oligo package. Enclosed you will find a workaround to use FARMS for summarization and also I/NI calls filtering. Please notice, that the I/NI call filter will not work for probe sets which contain less than three probes. Let me know, if you have any problems. Cheers, Okko The code is: library(oligo) library(farms) farms <- function(object, background=FALSE, normalize=TRUE, subset=NULL, target="core", weight=.5, mu=0, weighted.mean=FALSE, laplacian=FALSE, robust=TRUE, correction=0, centering=c("median","mean"), spuriousCorrelation=0, verbose=TRUE ) { target <- match.arg(target, c("core", "probeset")) featureInfo <- oligo:::stArrayPmInfo(object, target=target) theClass <- class(exprs(object)) pmi <- featureInfo[["fid"]] pnVec <- as.character(featureInfo[["fsetid"]]) if ("matrix" %in% theClass){ pms <- exprs(object)[pmi,, drop=FALSE] dimnames(pms) <- NULL colnames(pms) <- sampleNames(object) theExprs <- basicFARMS(pms, pnVec, normalize, background, weight=weight, mu=mu, weighted.mean=weighted.mean, laplacian=laplacian, robust=robust, correction=correction, centering=centering, spuriousCorrelation=spuriousCorrelation) rm(pms) } else{ stop("farms is currently not implemented for '", theClass, "' objects.") } colnames(theExprs$exprs) <- colnames(theExprs$se.exprs) <- sampleNames(object) rownames(theExprs$exprs) <- rownames(theExprs$se.exprs) <- unique(pnVec) out <- new("ExpressionSet", phenoData=phenoData(object), experimentData=experimentData(object), exprs=theExprs$exprs, se.exprs=theExprs$se.exprs, annotation=annotation(object), protocolData=protocolData(object)) if (validObject(out)){ return(out) }else{ stop("Resulting object is invalid.") } } basicFARMS <- function(pmMat, pnVec, normalize, background, weight, mu, weighted.mean, laplacian, robust, correction, centering, spuriousCorrelation){ pns <- unique(pnVec) nPn <- length(unique(pnVec)) pnVec <- split(1:(length(pnVec)), pnVec) narray <- ncol(pmMat) if(background){ pmMat <- oligo:::backgroundCorrect(pmMat) } if(normalize){ pmMat <- oligo:::normalize(pmMat) } ini.mat <- matrix(NA, nPn, narray ) exprs.mat <- matrix(0, nPn, narray ) message("Summarizing... ", appendLF=FALSE) for(i in 1:length(pnVec)){ if(length(pnVec[[i]])<3){ if(length(pnVec[[i]])==1){ exprs.mat[i,] <- pmMat[pnVec[[i]],] } else { exprs.mat[i,] <- colMeans(pmMat[pnVec[[i]],]) } } else{ res <- generateExprVal.method.farms(probes=pmMat[pnVec[[i]],], weight=weight, mu=mu, cyc=30, tol=0.00001, weighted.mean=weighted.mean, robust=robust, minNoise=minNoise, correction=correction, laplacian=laplacian, centering=centering, spuriousCorrelation=spuriousCorrelation) exprs.mat[i,] <- res$exprs ini.mat[i,] <- res$se.exprs } } message("OK") return(list(exprs=exprs.mat, se.exprs=ini.mat)) } -- Djork-Arné Clevert, PhD Phone: +49 30 6883 5306 Fax: +49 30 6883 5307 Email: okko@clevert.de Am 13.11.2013 um 14:32 schrieb Fraser Sim <fjsim@buffalo.edu>: > Hi, > > > > I am analyzing Affymetrix hugene2.0st arrays using the oligo package and had > hoped to be able to use informative/non-informative probe calling (FARMS, > Tolloen et 2007) as a means on filtering data. I had used this on older 3' > biased arrays analyzed with affy but the farms package only works with an > affyBatch object. Is this possible with another package or should I be > considering a different approach to filter array data on the newer Affy > arrays? > > > > Cheers, > > Fraser > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor [[alternative HTML version deleted]]
ADD COMMENT
0
Entering edit mode
Dear Okko, Thanks! This function worked and I am able to call INIcalls and getI_ProbeSets on my affyExpressionFS object. What is a little odd is that I’m only getting 4836 features that pass, out of 53617 on the hugene2.0st on the entire array. I realize you said that call filter would not work with probe sets with less than three probes. So I checked the totalprobe column of the fData derived from a call to getNetAffx. There are 51524 features with >2 totalprobes. What do you think? Is this just a feature of the current dataset? Or is this common to the newer arrays which have fewer probes per probe set spread over a longer span of the mRNA. Here’s a snip of my code (after sourcing your code): library(oligo) celFiles <- list.celfiles() affyExpressionFS <- read.celfiles(celFiles) eset.farms <- farms(affyExpressionFS) INIs = INIcalls(eset.farms) All_probes = featureNames(eset.farms) I_probes = getI_ProbeSets(INIs) eset <- oligo::rma(affyExpressionFS, target = 'core') eset # 53617 features featureData(eset) <- getNetAffx(eset,"transcript") eset[featureNames(eset) %in% I_probes,] # 4836 features length(which(fData(eset)$totalprobes > 2)) # 51524 Cheers, Fraser Fraser Sim, PhD Assistant Professor of Pharmacology and Toxicology University at Buffalo 3435 Main Street 119 Farber Hall Buffalo, NY 14214 T: (716) 829-2151 fjsim@buffalo.edu Confidentiality Notice: This message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure, or distribution is strictly prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message. From: Djork-Arné Clevert [mailto:okko@clevert.de] Sent: Friday, November 15, 2013 6:04 AM To: Fraser Sim Cc: bioconductor@r-project.org Subject: Re: [BioC] Farms/Call.INI using oligo package? Hi Fraser, thanks for reporting this issue. We are aware of this issue. Reason is that since R versions R-3.x.x AffyBatch does no longer support particular array types. We are modifying the FARMS package that array types that no longer supported by AffyBatch objects can be used via the oligo package. Enclosed you will find a workaround to use FARMS for summarization and also I/NI calls filtering. Please notice, that the I/NI call filter will not work for probe sets which contain less than three probes. Let me know, if you have any problems. Cheers, Okko The code is: library(oligo) library(farms) farms <- function(object, background=FALSE, normalize=TRUE, subset=NULL, target="core", weight=.5, mu=0, weighted.mean=FALSE, laplacian=FALSE, robust=TRUE, correction=0, centering=c("median","mean"), spuriousCorrelation=0, verbose=TRUE ) { target <- match.arg(target, c("core", "probeset")) featureInfo <- oligo:::stArrayPmInfo(object, target=target) theClass <- class(exprs(object)) pmi <- featureInfo[["fid"]] pnVec <- as.character(featureInfo[["fsetid"]]) if ("matrix" %in% theClass){ pms <- exprs(object)[pmi,, drop=FALSE] dimnames(pms) <- NULL colnames(pms) <- sampleNames(object) theExprs <- basicFARMS(pms, pnVec, normalize, background, weight=weight, mu=mu, weighted.mean=weighted.mean, laplacian=laplacian, robust=robust, correction=correction, centering=centering, spuriousCorrelation=spuriousCorrelation) rm(pms) } else{ stop("farms is currently not implemented for '", theClass, "' objects.") } colnames(theExprs$exprs) <- colnames(theExprs$se.exprs) <- sampleNames(object) rownames(theExprs$exprs) <- rownames(theExprs$se.exprs) <- unique(pnVec) out <- new("ExpressionSet", phenoData=phenoData(object), experimentData=experimentData(object), exprs=theExprs$exprs, se.exprs=theExprs$se.exprs, annotation=annotation(object), protocolData=protocolData(object)) if (validObject(out)){ return(out) }else{ stop("Resulting object is invalid.") } } basicFARMS <- function(pmMat, pnVec, normalize, background, weight, mu, weighted.mean, laplacian, robust, correction, centering, spuriousCorrelation){ pns <- unique(pnVec) nPn <- length(unique(pnVec)) pnVec <- split(1:(length(pnVec)), pnVec) narray <- ncol(pmMat) if(background){ pmMat <- oligo:::backgroundCorrect(pmMat) } if(normalize){ pmMat <- oligo:::normalize(pmMat) } ini.mat <- matrix(NA, nPn, narray ) exprs.mat <- matrix(0, nPn, narray ) message("Summarizing... ", appendLF=FALSE) for(i in 1:length(pnVec)){ if(length(pnVec[[i]])<3){ if(length(pnVec[[i]])==1){ exprs.mat[i,] <- pmMat[pnVec[[i]],] } else { exprs.mat[i,] <- colMeans(pmMat[pnVec[[i]],]) } } else{ res <- generateExprVal.method.farms(probes=pmMat[pnVec[[i]],], weight=weight, mu=mu, cyc=30, tol=0.00001, weighted.mean=weighted.mean, robust=robust, minNoise=minNoise, correction=correction, laplacian=laplacian, centering=centering, spuriousCorrelation=spuriousCorrelation) exprs.mat[i,] <- res$exprs ini.mat[i,] <- res$se.exprs } } message("OK") return(list(exprs=exprs.mat, se.exprs=ini.mat)) } -- Djork-Arné Clevert, PhD Phone: +49 30 6883 5306 Fax: +49 30 6883 5307 Email: okko@clevert.de Am 13.11.2013 um 14:32 schrieb Fraser Sim <fjsim@buffalo.edu>: Hi, I am analyzing Affymetrix hugene2.0st arrays using the oligo package and had hoped to be able to use informative/non-informative probe calling (FARMS, Tolloen et 2007) as a means on filtering data. I had used this on older 3' biased arrays analyzed with affy but the farms package only works with an affyBatch object. Is this possible with another package or should I be considering a different approach to filter array data on the newer Affy arrays? Cheers, Fraser [[alternative HTML version deleted]] _______________________________________________ Bioconductor mailing list Bioconductor@r-project.org https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor [[alternative HTML version deleted]]
ADD REPLY
1
Entering edit mode
Djork Clevert ▴ 210
@djork-clevert-422
Last seen 9.6 years ago
Hi Fraser, I just realized that my previous email was everything else than ?self- explanatory? and on top of that - the workaround contained a typo? Cheers, Okko Please find below the revised code including an example at the end: library(oligo) library(farms) farms <- function(object, background=FALSE, normalize=TRUE, subset=NULL, target="core", weight=.5, mu=0, weighted.mean=FALSE, laplacian=FALSE, robust=TRUE, correction=0, centering=c("median","mean"), spuriousCorrelation=0, verbose=TRUE ) { target <- match.arg(target, c("core", "probeset")) featureInfo <- oligo:::stArrayPmInfo(object, target=target) theClass <- class(exprs(object)) pmi <- featureInfo[["fid"]] pnVec <- as.character(featureInfo[["fsetid"]]) if ("matrix" %in% theClass){ pms <- exprs(object)[pmi,, drop=FALSE] dimnames(pms) <- NULL colnames(pms) <- sampleNames(object) theExprs <- basicFARMS(pms, pnVec, normalize, background, weight=weight, mu=mu, weighted.mean=weighted.mean, laplacian=laplacian, robust=robust, correction=correction, centering=centering, spuriousCorrelation=spuriousCorrelation) rm(pms) } else{ stop("farms is currently not implemented for '", theClass, "' objects.") } colnames(theExprs$exprs) <- colnames(theExprs$se.exprs) <- sampleNames(object) rownames(theExprs$exprs) <- rownames(theExprs$se.exprs) <- unique(pnVec) out <- new("ExpressionSet", phenoData=phenoData(object), experimentData=experimentData(object), exprs=theExprs$exprs, se.exprs=theExprs$se.exprs, annotation=annotation(object), protocolData=protocolData(object)) if (validObject(out)){ return(out) }else{ stop("Resulting object is invalid.") } } basicFARMS <- function(pmMat, pnVec, normalize, background, weight, mu, weighted.mean, laplacian, robust, correction, centering, spuriousCorrelation){ pns <- unique(pnVec) nPn <- length(unique(pnVec)) pnVec <- split(1:(length(pnVec)), pnVec) narray <- ncol(pmMat) if(background){ pmMat <- oligo:::backgroundCorrect(pmMat) } if(normalize){ pmMat <- oligo:::normalize(pmMat) } ini.mat <- matrix(0.5, nPn, narray ) exprs.mat <- matrix(0, nPn, narray ) message("Summarizing... ", appendLF=FALSE) for(i in 1:length(pnVec)){ if(length(pnVec[[i]])<2){ exprs.mat[i,] <- log2(pmMat[pnVec[[i]],]) } else{ res <- generateExprVal.method.farms(probes=pmMat[pnVec[[i]],], weight=weight, mu=mu, cyc=30, tol=0.00001, weighted.mean=weighted.mean, robust=robust, minNoise=minNoise, correction=correction, laplacian=laplacian, centering=centering, spuriousCorrelation=spuriousCorrelation) exprs.mat[i,] <- res$exprs ini.mat[i,] <- res$se.exprs } } message("OK") return(list(exprs=exprs.mat, se.exprs=ini.mat)) } ### example to run # library(oligo) # library(farms) # celFiles <- list.celfiles("/path/to/celfiles/", full.name=TRUE) # raw <- read.celfiles(celFiles) # eset <- farms(raw, normalize=TRUE) #### -- Djork-Arn? Clevert, PhD Phone: +49 30 6883 5306 Fax: +49 30 6883 5307 Email: okko at clevert.de Am 13.11.2013 um 14:32 schrieb Fraser Sim <fjsim at="" buffalo.edu="">: > Hi, > > > > I am analyzing Affymetrix hugene2.0st arrays using the oligo package and had > hoped to be able to use informative/non-informative probe calling (FARMS, > Tolloen et 2007) as a means on filtering data. I had used this on older 3' > biased arrays analyzed with affy but the farms package only works with an > affyBatch object. Is this possible with another package or should I be > considering a different approach to filter array data on the newer Affy > arrays? > > > > Cheers, > > Fraser > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT
0
Entering edit mode

Dear Okko,

I am using the "farms" code you listed here for Affymetrix HTA-2.0 arrays that have ~6.8 million probes in total. The code worked well for data with smaller sample sizes (tested at 168) but it failed to process ~217 samples together. The error message said "cannot allocate vector of size 11.7 Gb" at the core level (14.6 Gb at probeset level). It seemed a "farms" infrastructure issue as I was running the code in the unix environment with a 32gb-ram node. Any advice or modifications would be much appreciated. Thanks you!

Jinsheng Yu

=============================

Genome Technology Access Center

Washington University School of Medicine

Saint Louis, MO 53110

====================================

ADD REPLY

Login before adding your answer.

Traffic: 801 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6