Hi Maya -- Providing sessionInfo and a transcript of your session
really helped. Please see the comments below. I am responding to the
list, so that others may benefit.
Martin
"Maya Bercovich" <mayab at tauex.tau.ac.il> writes:
> -----Original Message-----
> From: Martin Morgan [mailto:mtmorgan at fhcrc.org]
> Sent: 10 July, 2007 3:58 PM
> To: Maya Bercovich
> Cc: Seth Falcon; bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] how to revert to an older limma version?
>
> Hi Maya -- Here are my suggestions. Most important, cut and paste
> commands and results from your R session, so that we can see what is
> going on.
>
> 1. Please, please include the output of sessionInfo(). You can cut and
> paste this from your R session into the email. Do this after you
> have reproduced the error.
>
>> sessionInfo()
> R version 2.5.0 (2007-04-23)
> i686-redhat-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] "tools" "tcltk" "stats" "graphics" "grDevices" "utils"
> [7] "datasets" "methods" "base"
>
> other attached packages:
>     Biobase     reldist      marray   tkWidgets      DynDoc widgetTools
>    "1.14.0"     "1.5-5"    "1.14.0"    "1.14.0"    "1.14.0"    "1.12.0"
>       limma
>    "2.10.5"
This is really helpful. Here's where my system starts:
> sessionInfo()
R version 2.6.0 Under development (unstable) (2007-07-09 r42160)
x86_64-unknown-linux-gnu
locale:
LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C
Notice that these are en_US, whereas yours are en_US.UTF-8.
> 2. Report the exact command that causes the error, and the error
> message. Do this by copying and pasting the relevant portions of your
> R session. For instance, we do not yet know whether you provide your
> own column names, or what type of files you are trying to read.
>
> The Data file is a GenePix output gpr file
>
> The commands I'm running and error message:
>
>
>> library(limma)
>> library(marray)
> Loading required package: tkWidgets
> Loading required package: widgetTools
> Loading required package: tcltk
> Loading Tcl/Tk interface ... done
> Loading required package: DynDoc
>> library(reldist)
> Relative Distribution Methods
> Version 1.5-5 created on April 1, 2006.
> copyright (c) 2003, Mark S. Handcock, University of Washington
> Martina Morris, University of Washington
> Type help(package="reldist") to get started.
>> library(Biobase)
> Loading required package: tools
>
> Welcome to Bioconductor
>
> Vignettes contain introductory material. To view, type
> 'openVignette()'. To cite Bioconductor, see
> 'citation("Biobase")' and for packages 'citation(pkgname)'.
>
>> setwd("/mnt/lifestore/Biotech/DannyS_Shared/Users/Revital/ssdp/")
>> path<-"/mnt/lifestore/Biotech/DannyS_Shared/Users/Revital/ssdp/"
>>
>> targets.RG1 <- readTargets("ssdpA270607.txt")
>>
>> targets.RG2 <- readTargets("ssdpB270607.txt")
>>
>> RG1 <- read.maimages(targets.RG1$Filename,
> columns=list(Rf="F635Median",Gf="F532Median",Rb="B635Median",Gb="B532Median"),
> path=path)
> Error in readGenericHeader(fullname, columns = columns, sep = sep) :
> Specified column headings not found in file
> In addition: Warning message:
> input string 1 is invalid in this locale in: grep(pattern, x,
> ignore.case, extended, value, fixed, useBytes)
Thanks for the files you forwarded (not included in this
response). When I do
> setwd("~/tmp")
> library(limma)
> targets.RG1 <- readTargets("ssdpA270607.txt")
> fname <- targets.RG1$Filename[[1]]
> fname
[1] "B12Z0471_A.gpr"
The file name has lower-case 'gpr', but the file you sent has
upper-case GPR, so
> columns <- list(Rf="F635Median",Gf="F532Median",
+ Rb="B635Median",Gb="B532Median")
> read.maimages(fname, columns=columns)
Error in file(file, "r") : unable to open connection
In addition: Warning message:
In file(file, "r") :
cannot open file 'B12Z0471_A.gpr', reason 'No such file or directory'
On the other hand
> fname <- toupper(fname)
> res <- read.maimages(fname, columns=columns)
Read B12Z0471_A.GPR
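Since the only mismatch here is the case of the extension, one way to
guard against it is to look the name up on disk without regard to case
before reading. This is a minimal sketch, not something limma provides:
find_file_ignoring_case is a hypothetical helper, and the temporary
directory and empty file stand in for your data.

```r
# Hypothetical helper (not part of limma): return the on-disk spelling of
# `name` inside `dir`, matching without regard to upper/lower case.
find_file_ignoring_case <- function(name, dir = ".") {
  candidates <- list.files(dir)
  hit <- candidates[tolower(candidates) == tolower(name)]
  if (length(hit) != 1L)
    stop("expected exactly one match for '", name, "', found ", length(hit))
  hit
}

# Self-contained demonstration with an empty stand-in file
demo_dir <- tempfile("gpr_demo")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, "B12Z0471_A.GPR")))
on_disk <- find_file_ignoring_case("B12Z0471_A.gpr", demo_dir)
```

The value of on_disk could then be passed to read.maimages in place of
the name taken from the targets file.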
So far so good. Since these are GenePix files, a better way to read
them might be
> res <- read.maimages(fname, "genepix.custom", columns=columns)
Custom background: LocalFeature
Read B12Z0471_A.GPR
This reads information about the printer as well, which can be useful
during normalization.
I now change my system locale to be like yours
> Sys.setlocale("LC_ALL", "en_US.UTF-8")
[1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C"
and try to read the files
> read.maimages(fname, columns=columns)
Error in readGenericHeader(fullname, columns = columns, sep = sep) :
Specified column headings not found in file
In addition: Warning message:
In grep(a, txt) : input string 1 is invalid in this locale
Ah ha! This seems to be the problem. So set the locale to "en_US"
> Sys.setlocale("LC_ALL", "en_US")
[1] "LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C"
> res <- read.maimages(fname, columns=columns)
Read B12Z0471_A.GPR
Does this work for you? Alternatively here's an interesting solution
that 'just works':
> Sys.setlocale("LC_ALL", "en_US.UTF-8")
[1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C"
> res <- read.maimages(fname, "genepix.custom", columns=columns)
Custom background: LocalFeature
Read B12Z0471_A.GPR
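If you would rather not change the locale for the whole session, the
change can be limited to a single call. The following is only a sketch:
with_ctype is a hypothetical helper of my own naming, it touches only
LC_CTYPE (the category that governs character handling), and I have not
checked that LC_CTYPE alone is enough for your files.

```r
# Hypothetical helper: evaluate `expr` with LC_CTYPE temporarily set to
# `locale`, restoring the previous setting afterwards, even on error.
with_ctype <- function(locale, expr) {
  old <- Sys.getlocale("LC_CTYPE")
  on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)
  new <- suppressWarnings(Sys.setlocale("LC_CTYPE", locale))
  if (identical(new, ""))
    stop("locale '", locale, "' is not available on this system")
  expr  # lazily evaluated, so it runs under the new locale
}

# The "C" locale exists everywhere, so it makes a portable demonstration
before <- Sys.getlocale("LC_CTYPE")
inside <- with_ctype("C", Sys.getlocale("LC_CTYPE"))
after  <- Sys.getlocale("LC_CTYPE")
```

On a system where the en_US locale is installed, the call from this
thread would become with_ctype("en_US", read.maimages(fname,
columns=columns)).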
I'll just delve a little further, and see if we can figure out where
that warning message is coming from:
> options(warn=2)
> res <- read.maimages(fname, columns=columns)
Error in grep(a, txt) :
(converted from warning) input string 1 is invalid in this locale
Setting warn=2 causes the warning to become an error. Here's where the
error occurs (I've edited frame 2 with [...] to shorten the
presentation):
> traceback()
8: doWithOneRestart(return(expr), restart)
7: withOneRestart(expr, restarts[[1]])
6: withRestarts({
.Internal(.signalCondition(simpleWarning(msg, call), msg,
call))
.Internal(.dfltWarn(msg, call))
}, muffleWarning = function() NULL)
5: .signalSimpleWarning("input string 1 is invalid in this locale",
quote(grep(a, txt)))
4: grep(a, txt)
3: readGenericHeader(fullname, columns = columns, sep = sep)
2: switch(source2, quantarray = {
[...]
}, {
skip <- readGenericHeader(fullname, columns = columns, sep =
sep)$NHeaderRecords
obj <- read.columns(fullname, required.col, text.to.search,
skip = skip, sep = sep, quote = quote, as.is = TRUE,
fill = TRUE, flush = TRUE, ...)
nspots <- nrow(obj)
})
1: read.maimages(fname, columns = columns)
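An aside on the options(warn=2) step used to get this traceback: the
setting stays in force until it is changed back. A small sketch of
scoping it to one call, where warnings_as_errors is a hypothetical
helper and as.integer() on a non-number is just a convenient way to
provoke a warning:

```r
# Hypothetical helper: promote warnings to errors while evaluating `expr`,
# then put the previous 'warn' option back.
warnings_as_errors <- function(expr) {
  old <- options(warn = 2)
  on.exit(options(old), add = TRUE)
  expr  # lazily evaluated, so it runs with warn = 2 in force
}

warn_before <- getOption("warn")
# as.integer() warns about coercion; with warn = 2 that becomes an error,
# which tryCatch turns into its message text
result <- tryCatch(
  warnings_as_errors(as.integer("not a number")),
  error = function(e) conditionMessage(e)
)
warn_after <- getOption("warn")
```

At the prompt you would leave out the tryCatch, let the error happen,
and then call traceback() as above.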
Frame 4 is where the grep call is; it's inside readGenericHeader.
Here's readGenericHeader:
> readGenericHeader
function (file, columns, sep = "\t")
{
    if (missing(columns) || !length(columns))
        stop("must specify column headings to find")
    columns <- protectMetachar(as.character(columns))
    if (!length(columns))
        stop("column headings must be specified")
    con <- file(file, "r")
    on.exit(close(con))
    out <- list()
    Found <- FALSE
    i <- 0
    repeat {
        i <- i + 1
        txt <- readLines(con, n = 1)
        if (!length(txt))
            stop("Specified column headings not found in file")
        Found <- TRUE
        for (a in columns) Found <- Found && length(grep(a, txt))
        if (Found)
            break
    }
    out$NHeaderRecords <- i - 1
    out$ColumnNames <- strsplit(txt, split = sep)[[1]]
    out
}
<environment: namespace:limma>
The 'grep' is toward the end, and is in a loop that looks at each
column name and compares it with the tab-delimited line of headings
from the gpr file. A little bit more snooping shows that the header
line has a field 'Rgn R^2 (635/532)', where the 'R^2' is rendered with
a superscripted '2'. This causes the problem. In the file that
character is the single byte "\xb2", which is how latin1 encodes a
superscript two; on its own that byte is not a valid UTF-8 sequence,
which would explain why grep complains in a UTF-8 locale. I don't know
enough about locales to say more than that.
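To make the "\xb2" byte concrete, here is a small sketch that rebuilds
the header field byte by byte and inspects it. It assumes a reasonably
recent R (validUTF8 is not available in the versions used in this
thread), and the variable names are mine:

```r
# The header field as it sits in the file: the superscript two is the
# single latin1 byte 0xb2
field_latin1 <- rawToChar(c(charToRaw("Rgn R"), as.raw(0xb2),
                            charToRaw(" (635/532)")))

# A lone 0xb2 is not a valid UTF-8 sequence
is_valid_before <- validUTF8(field_latin1)

# Re-encoding from latin1 turns it into the two-byte UTF-8 form c2 b2
field_utf8 <- iconv(field_latin1, from = "latin1", to = "UTF-8")
is_valid_after <- validUTF8(field_utf8)
bytes_after <- charToRaw(field_utf8)

# Byte-wise fixed matching sidesteps the encoding question entirely
found <- grepl("635", field_latin1, fixed = TRUE, useBytes = TRUE)
```

Matching with useBytes = TRUE is one way around the warning when the
column names themselves are plain ASCII, as they are here; I have not
tried it inside limma.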
The header line is read in a few lines above, with
    txt <- readLines(con, n = 1)
The help page for readLines indicates that there is an argument
'encoding'. We 'know' (experience, I guess, and your browser reports
ISO-8859-1) that the file is in 'latin1', and in fact changing the
readLines call to
    txt <- readLines(con, n = 1, encoding="latin1")
allows readGenericHeader to work correctly:
> res <- readGenericHeader(fname, columns)
>
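The same idea can be tried outside limma. The sketch below writes a
stand-in header line (made up for illustration, not your actual file)
containing the latin1 byte, reads it back with encoding="latin1", and
checks for the column names. In current versions of R the 'encoding'
argument marks the strings as latin1 rather than re-encoding them;
enc2utf8 does the conversion if it is wanted.

```r
# Stand-in for a GenePix header line: tab-separated, with the superscript
# two stored as the single latin1 byte 0xb2
header_bytes <- c(charToRaw("F635Median\tF532Median\tRgn R"), as.raw(0xb2),
                  charToRaw(" (635/532)\n"))
demo_file <- tempfile(fileext = ".gpr")
writeBin(header_bytes, demo_file)

# Read one line and declare its encoding, as in the modified call above
con <- file(demo_file, "r")
txt <- readLines(con, n = 1, encoding = "latin1")
close(con)

# With the encoding declared, each column name can be looked up in the line
columns <- c("F635Median", "F532Median")
found <- vapply(columns, function(a) grepl(a, txt, fixed = TRUE), logical(1))

# Optional: convert the marked string to UTF-8 for further processing
txt_utf8 <- enc2utf8(txt)
```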
I really don't know if this is the 'right' long-term solution for
limma or other package maintainers.
Martin
> 3. After the error occurs, run the command traceback() and include the
> results. This shows where the error likely occurred.
>
> After I got the error:
>
>> oldOpt=options(warn=2)
>> traceback()
> 4: stop("Specified column headings not found in file")
> 3: readGenericHeader(fullname, columns = columns, sep = sep)
> 2: switch(source2, quantarray = {
> firstfield <- scan(fullname, what = "", sep = "\t", flush = TRUE,
> quiet = TRUE, blank.lines.skip = FALSE, multi.line = FALSE,
> allowEscapes = FALSE)
> skip <- grep("Begin Data", firstfield)
> if (length(skip) == 0)
> stop("Cannot find \"Begin Data\" in image output file")
> nspots <- grep("End Data", firstfield) - skip - 2
> obj <- read.columns(fullname, required.col, text.to.search,
> skip = skip, sep = sep, quote = quote, as.is = TRUE,
> fill = TRUE, nrows = nspots, flush = TRUE, ...)
> }, arrayvision = {
> skip <- 1
> cn <- scan(fullname, what = "", sep = sep, quote = quote,
> skip = 1, nlines = 1, quiet = TRUE, allowEscape = FALSE)
> fg <- grep(" Dens - ", cn)
> if (length(fg) != 2)
> stop(paste("Cannot find foreground columns in", fullname))
> bg <- grep("^Bkgd$", cn)
> if (length(bg) != 2)
> stop(paste("Cannot find background columns in", fullname))
> columns <- list(R = fg[1], Rb = bg[1], G = fg[2], Gb = bg[2])
> obj <- read.columns(fullname, required.col, text.to.search,
> skip = skip, sep = sep, quote = quote, as.is = TRUE,
> fill = TRUE, flush = TRUE, ...)
> fg <- grep(" Dens - ", names(obj))
> bg <- grep("^Bkgd$", names(obj))
> columns <- list(R = fg[1], Rb = bg[1], G = fg[2], Gb = bg[2])
> nspots <- nrow(obj)
> }, bluefuse = {
> skip <- readGenericHeader(fullname, columns = c(columns$G,
> columns$R))$NHeaderRecords
> obj <- read.columns(fullname, required.col, text.to.search,
> skip = skip, sep = sep, quote = quote, as.is = TRUE,
> fill = TRUE, flush = TRUE, ...)
> nspots <- nrow(obj)
> }, genepix = {
> h <- readGPRHeader(fullname)
> if (verbose && source == "genepix.custom")
> cat("Custom background:", h$Background, "\n")
> skip <- h$NHeaderRecords
> obj <- read.columns(fullname, required.col, text.to.search,
> skip = skip, sep = sep, quote = quote, as.is = TRUE,
> fill = TRUE, flush = TRUE, ...)
> nspots <- nrow(obj)
> }, smd = {
> skip <- readSMDHeader(fullname)$NHeaderRecords
> obj <- read.columns(fullname, required.col, text.to.search,
> skip = skip, sep = sep, quote = quote, as.is = TRUE,
> fill = TRUE, flush = TRUE, ...)
> nspots <- nrow(obj)
> }, {
> skip <- readGenericHeader(fullname, columns = columns, sep =
> sep)$NHeaderRecords
> obj <- read.columns(fullname, required.col, text.to.search,
> skip = skip, sep = sep, quote = quote, as.is = TRUE,
> fill = TRUE, flush = TRUE, ...)
> nspots <- nrow(obj)
> })
> 1: read.maimages(targets.RG1$Filename, columns = list(Rf = "F635Median",
> Gf = "F532Median", Rb = "B635Median", Gb = "B532Median"),
> path = path)
>
>
> 4. Evaluate the command
>
>> oldOpt = options(warn=2)
>
> (this will cause the warning to become an error), rerun the command
> and report the results of traceback(). This will indicate where the
> suspicious warning about 'grep' occurs.
>
> 5. Does the error occur when only some files are used for input, or
> does it occur with any file? If it is with only some files, then
> can you verify that the column names are present in those files?
> Can you determine the character encoding of those files, for
> instance by opening them in a browser such as Firefox and looking
> at View --> Character encoding?
>
> This occurs with any file. I attached an example of one of them
> (B12Z0471_A.GPR), and one of the targets files, in case you would
> like to make a test run. The column names are present.
> Character encoding: Western (ISO-8859-1).
>
>
> Thanks for your assistance.
>
> Martin
>
> "Maya Bercovich" <mayab at tauex.tau.ac.il> writes:
>
>> -----Original Message-----
>> From: Seth Falcon [mailto:sfalcon at fhcrc.org]
>> Sent: 09 July, 2007 11:18 PM
>> To: Maya Bercovich
>> Cc: Marcus Davy; Kasper Daniel Hansen; bioconductor at stat.math.ethz.ch
>> Subject: Re: [BioC] how to revert to an older limma version?
>>
>> "Maya Bercovich" <mayab at tauex.tau.ac.il> writes:
>>
>>> See below and thank you so much.
>>
>> In general, I would recommend using the most recent version of limma.
>> It would be helpful to include the output of sessionInfo() after the
>> error occurs. The error message does suggest a locale or encoding
>> mismatch. Can you try setting your locale to "C":
>>
>> Sys.setlocale(locale="C")
>>
>> I tried it, and I still get the same error. Any more suggestions?
>>
>> Appreciate your assistance,
>>
>> Maya
>>
>>
>>
>> + seth
>>
>> --
>> Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
>> http://bioconductor.org
>>
>
> --
> Martin Morgan
> Bioconductor / Computational Biology
> http://bioconductor.org
>
>
--
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org