Jeremy Gollub
Hi, all -
I'm experiencing very poor performance using the marray package (20
minutes to normalize a single <32,000 spot microarray). Can someone
tell me whether this is normal, or what I'm doing wrong?
In the process of hunting down some errors, I also noticed some odd (to
me) behavior in the marrayLayout maSub slot assignment method, described
below. An attempt to "correct" this results in a much faster
normalization (~1 minute), which looks good according to the MA plot
but produces different numbers in maM than the slower calculation.
It seems unlikely that either result is correct (I can choose between
suspiciously bad performance, or messing with the marrayLayout object's
internals).
Thanks for any suggestions - details follow.
I'm using R version 1.9.0 on a sparc system running Solaris 2.9. My
marray version is 1.5.14.
I have a text file, "dat.txt", containing the data I want to normalize.
It has 10 columns, all numeric. In order:
FEATURE  spot number, 1 - 31736
SECTOR   unnecessary and unused
ROW      unnecessary and unused
COL      unnecessary and unused
PLATE    ID of printing plate
Gf       green channel foreground
Rf       red channel foreground
Gb       green channel background
Rb       red channel background
W        spot weights, either 0 or 1
Array parameters are: Ngr = 8, Ngc = 4, Nsr = 31, Nsc = 32, Nspots =
31744. Not all spots are printed (ragged ends to each block). Only
printed spots are included in the data file, so there are gaps in the
FEATURE column sequence but no blank lines in the file.
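The layout arithmetic above can be checked with a minimal base-R sketch. It does not use marray, and the names `feature`, `block_start` and `printed` are illustrative only. The stand-in FEATURE values assume that every block lacks exactly its last 8 spots; that pattern is an assumption, chosen because it reproduces the figures quoted in this post (highest FEATURE 31736, 31488 printed spots), and the real file may differ.

```r
# Layout parameters as stated in the post.
Ngr <- 8; Ngc <- 4; Nsr <- 31; Nsc <- 32
nspots <- Ngr * Ngc * Nsr * Nsc        # full grid, printed or not
spots_per_block <- Nsr * Nsc           # 992 positions per block

# ASSUMPTION: hypothetical stand-in for the FEATURE column, with the
# last 8 positions of every block left unprinted.
block_start <- (seq_len(Ngr * Ngc) - 1) * spots_per_block
feature <- as.vector(outer(seq_len(spots_per_block - 8), block_start, "+"))

# Logical mask over the full grid: TRUE where a spot appears in the file.
printed <- seq_len(nspots) %in% feature

nspots        # 31744
sum(printed)  # 31488
```

Building the mask as a logical vector of length Nspots keeps it in the same type that the maSub slot displays in the session.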
The session:
> library(marray)
>
> # Read file.
>
> dat <- read.table('dat.txt', header = TRUE)
>
> # Construct maSub: 1 for each printed spot, 0 for absent spots.
>
> seq <- c(1:31744)
> int <- intersect(seq, as.numeric(dat[,1]))
> sub <- rep(0, 31744)
> sub[int] <- 1
>
> # Note contents of sub around the end of the first block and beginning
> # of the second:
>
> print(sub[980:1000])
[1] 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
> # total of 31488 present spots
> sum(sub)
[1] 31488
>
> # Construct marrayLayout object.
>
> ml <- new("marrayLayout", maNgr = 8, maNgc = 4, maNsr = 31, maNsc = 32,
+ maNspots = 31744)
> maSub(ml) <- sub
> maPlate(ml) <- as.factor(dat[,5])
>
> # Note contents of maSub:
>
> sum(ml@maSub)
[1] 1
> length(ml@maSub)
[1] 31744
> print(sub[1:20])
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
> print(ml@maSub[1:20])
 [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> print(ml@maSub[980:1000])
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
> # Now meddle with ml@maSub (set it back the way I think it should be).
> # Or don't - see comment on maNormMain step, below.
>
> maSub(ml)[int] <- TRUE
>
> # construct marrayRaw object.
>
> mr <- new("marrayRaw",
+ maGf = matrix(dat[,6], ncol = 1),
+ maRf = matrix(dat[,7], ncol = 1),
+ maGb = matrix(dat[,8], ncol = 1),
+ maRb = matrix(dat[,9], ncol = 1),
+ maW = matrix(dat[,10], ncol = 1),
+ maLayout = ml)
>
> # This step takes about one minute if I do maSub(ml)[int] <- TRUE
> # as indicated above. If I don't, it takes about 20 minutes.
> # The results differ, although the MA plot looks normalized either way.
>
> mn <- maNormMain(mr,
+     f.loc = list(maNormLoess(x = "maA", y = "maM", z = "maPrintTip",
+                              w = NULL, subset = TRUE, span = 0.4)),
+     f.scale = list(maNormMAD(x = "maPrintTip", y = "maM",
+                              geo = FALSE, subset = TRUE)),
+     Mloc = TRUE, Mscale = TRUE)
--
Jeremy Gollub, Ph.D.
jgollub@genome.stanford.edu
(W) 650/736-0075