option rm.mask=TRUE in ReadAffy


t-kawai@hhc.eisai.co.jp ▴ 50


Thank you for your comment, Laurent.

Yes, your way is a smart way to remove the in-file MASKS probes.

By the way, '.MSK' files are provided by Affymetrix or created by the
individual user, and they list the probe cells to be masked in the analysis
step. Old chip designs such as Hu6800 have many masked probes, in order to
remove cross-hybridization data. A detailed description can be found in
Appendix G of the MAS5 user's guide.

Regards,
Kawai

> This is a nice way to solve the problem.
>
> I did not suspect many were suffering from that.
> In my particular case I skip the bg.correct step quite often
> (and this appears to be where the trouble is; most of the
> other preprocessing steps seem to behave, see my previous
> mail in this thread). Note that Ben Bolstad
> recalled (earlier in this very same thread)
> that the bg.correct step tries to "correct" the signal
> for the probes. Masking probes away removes part of the need
> for background correction (in my humble opinion, for the
> Affymetrix chips, and for many of the chips I have had; this
> might not be good for everyone).
>
> The masked probes (and outliers) found in the CEL files should probably
> be stored in the annotation/description for easier retrieval (AFAIR,
> this was briefly in the devel version some time ago, or was almost there
> ;); public request will hopefully bring it back).
>
> I might have another way to solve the problem.
> (I have never seen '.MSK' files. The 'masked probes' I know of
> are included in the CEL files. The CEL files also contain
> an 'outliers' section.)
>
>
> ## have the file names in filenames. For example:
> filenames <- list.files("where/I/want", "CEL", full.names=TRUE)
>
> ## read the same files, so that column i matches filenames[i]
> abatch <- ReadAffy(filenames=filenames)
>
> ## compressed CEL files ?
> compress <- FALSE
>
> ids.list <- vector("list", length=length(filenames))
>
> for (i in seq(along=filenames)) {
>   file <- filenames[i]
>   masked.xy <- .Call("getIndexExtraFromCEL", as.character(file),
>                      as.character("MASKS"),
>                      as.integer(compress))
>   masked.i <- xy2indices(masked.xy[, 1], masked.xy[, 2], abatch=abatch)
>   outliers.xy <- .Call("getIndexExtraFromCEL", as.character(file),
>                        as.character("OUTLIERS"),
>                        as.integer(compress))
>   outliers.i <- xy2indices(outliers.xy[, 1], outliers.xy[, 2], abatch=abatch)
>   ids.list[[i]] <- rbind(masked.i, outliers.i)
> }
>
> ## then, whenever one judges the time has come to mask the outliers:
>
> for (i in seq(along=filenames)) {
>   intensity(abatch)[ids.list[[i]], i] <- NA
> }
>
>
> Best,
>
> L.
>
>
> On Fri, Oct 31, 2003 at 07:08:03PM +0900, t-kawai@hhc.eisai.co.jp wrote:
> > I have the same problem with expresso() as weiss.
> >
> > To solve this problem, I executed a series of procedures step by step.
> >
> > Background correction, I think, needs all of the CEL data, whether or
> > not the cells are to be masked. From the normalization step onward, I
> > removed the masked data using Affymetrix's original mask file.
> >
> > For example,
> >
> > ## read CEL files
> > dat0 <- ReadAffy();
> >
> > ## Background correction step
> > dat <- bg.correct.mas(dat0);
> >
> > ## read mask file (in this case, 1803 probes to be masked)
> > msk <- scan("Hu6800_ClassA.MSK", skip=2, list("", ""));
> >
> > ## set ids (cell index list to be masked)
> > ## (in this case, 59540 cells will be masked)
> > for (i in 1:length(msk[[1]])) {
> >   nam <- msk[[1]][i];
> >   txt <- gsub("-", ":", msk[[2]][i]);
> >   lst <- eval(parse(text=paste("c(", txt, ")")));
> >
> >   if (i == 1) {
> >     ids <- pmindex(dat, nam)[[1]][lst];
> >   } else {
> >     ids <- c(ids, pmindex(dat, nam)[[1]][lst]);
> >   }
> >   ids <- c(ids, mmindex(dat, nam)[[1]][lst]);
> > }
> >
> > ## set NAs to cells to be masked
> > intensity(dat)[ids, ] <- NA;
> >
> > ## Normalization step
> > dat1 <- normalize.AffyBatch.qspline(dat);
> >
> > ## Probe correction & summary step
> > dat2 <- computeExprSet(dat1, pmcorrect="mas", summary.method="liwong");
> > write.exprs(dat2, file="result.lst");
> >
> >
> > The file "Hu6800_ClassA.MSK" looks like
> >
> > Hu6800
> > [Call]
> > A28102_at 17,18,19,20
> > AB000381_s_at 7
> > AC000064_cds2_at 18,19,20
> > ...
> > M81830_at 1-20
> > M83181_at 1-20
> > ...
> >
> >
> > Does the above script work correctly? Please give me your comments.
> >
> > Kawai
> >
> > _______________________________________
> >
> > Takatoshi Kawai, Ph.D.
> >
> > Senior Scientist, Bioinformatics
> > Laboratory of Seeds Finding Technology
> > Eisai Co., Ltd.
> > 5-1-3 Tokodai, Tsukuba-shi,
> > Ibaraki 300-2635, Japan
> >
> > TEL: +81-29-847-7192
> > FAX: +81-29-847-7614
> > e-mail: t-kawai@hhc.eisai.co.jp
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor@stat.math.ethz.ch
> > https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
>
> --
> --------------------------------------------------------------
> Laurent Gautier                 CBS, Building 208, DTU
> PhD. Student                    DK-2800 Lyngby, Denmark
> tel: +45 45 25 24 89            http://www.cbs.dtu.dk/laurent

_______________________________________

Takatoshi Kawai, Ph.D.

Senior Scientist, Bioinformatics
Laboratory of Seeds Finding Technology
Eisai Co., Ltd.
5-1-3 Tokodai, Tsukuba-shi,
Ibaraki 300-2635, Japan

TEL: +81-29-847-7192
FAX: +81-29-847-7614
e-mail: t-kawai@hhc.eisai.co.jp
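
The script quoted above turns each '.MSK' position field (for example "17,18,19,20" or "1-20") into a vector by rewriting "-" as ":" and evaluating the text. As a minimal sketch only, the same expansion can be done in base R without eval(parse()). The helper name expand_positions and the variables msk_lines and mask are hypothetical and are not part of the affy package; the sample lines are copied from the file excerpt in the post, and it is an assumption that each entry is either a single number or a "from-to" range.

```r
## Hypothetical helper (not part of affy): expand a position
## specification such as "17,18,19,20" or "1-20" into an integer vector.
expand_positions <- function(spec) {
  parts <- strsplit(spec, ",", fixed = TRUE)[[1]]
  positions <- lapply(parts, function(p) {
    # A part is assumed to be a single number ("7") or a range ("1-20").
    bounds <- as.integer(strsplit(trimws(p), "-", fixed = TRUE)[[1]])
    if (length(bounds) == 1L) bounds else seq(bounds[1], bounds[2])
  })
  unlist(positions)
}

## Sample lines copied from the excerpt of "Hu6800_ClassA.MSK" in the post
## (the two header lines, "Hu6800" and "[Call]", are already skipped).
msk_lines <- c("A28102_at 17,18,19,20",
               "AB000381_s_at 7",
               "AC000064_cds2_at 18,19,20",
               "M81830_at 1-20")

## Split each line into probe set name and position field, then expand.
fields <- strsplit(msk_lines, "[[:space:]]+")
mask <- lapply(fields, function(f) expand_positions(f[2]))
names(mask) <- vapply(fields, function(f) f[1], character(1))
```

Each element of mask could then take the place of lst in the loop above, for example pmindex(dat, nam)[[1]][mask[[nam]]], assuming the positions count probe pairs within a probe set as they do in the original script.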

Normalization Preprocessing hu6800 probe • 920 views

22.2 years ago • t-kawai@hhc.eisai.co.jp ▴ 50
