Question: Problem reading in files with marrayInput
14.6 years ago by
Richard Friedman (2.0k) wrote:
Fellow Expressionists,

I am attempting to duplicate the results of Dudoit, Yang, and coworkers on the data of Callow as an exercise for learning to use Spot with my own data. I have tried a subset of six Spot Output files (3 WT, 3 KO) as a proof of concept. I am having difficulty reading in the files correctly. I am running marrayInput downloaded on October 30, 2003, R1.8.0, and Windows XP.

I would greatly appreciate it if someone could tell me what I am doing incorrectly, suggest something that I can do to debug the procedure, or be willing to look at my input files to see if the problem lies there.

Here is a record of my session:

###########################################################
> array1.gnames <- read.marrayInfo(file.path(datadir, "array1.gdl"),
+ info.id =1 labels = 1, skip = 1)
Error: syntax error
> info.id =1, labels = 1, skip = 1)
Error: syntax error
> array1.gnames <- read.marrayInfo(file.path(datadir, "array1.gdl"),
+ info.id =1, labels = 1, skip = 1)
> array1.gnames
Object of class marrayInfo.

   maLabels dat[, info.id]
1         2              2
2         3              3
3         4              4
4         5              5
5         6              6
6         7              7
7         8              8
8         9              9
9        10             10
10       11             11
...

Number of labels: 6383
Dimensions of maInfo matrix: 6383 rows by 1 columns

Notes:
C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.gdl

> help.start()
updating HTML package listing
updating HTML search index
If nothing happens, you should open 'C:\PROGRA~1\R\rw1080\doc\html\rwin.html' yourself
> array1.gnames <- read.marrayInfo(file.path(datadir, "array1.gdl"),info.id =6:8, labels = 6, skip = 1)
> array1.gnames
Object of class marrayInfo.

maLabels
1 Cy5RT
2 mSRB1
3 BLANK
4 BLANK
5 BLANK
6
7 5' similar to SW:BTF3_HUMAN P20290 TRANSCRIPTION FACTOR BTF3 ;. gi|1287559|gb|W13502|W13502 [1287559]
8 5' similar to gb:J04794 ALCOHOL DEHYDROGENASE (HUMAN);. gi|1287584|gb|W13547|W13547 [1287584]
9 5'. gi|1287586|gb|W13549|W13549 [1287586]
10 "5' similar to gb:X03747_cds1 SODIUM/POTASSIUM-TRANSPORTING ATPASE BETA-1 (HUMAN); gb:X16646 Mouse mRNA for Na,K-ATPase beta subunit (MOUSE);. gi|1315970|gb|W34060|W34060 [1315970]"
Cy3RT
1 Cy5RT
2 mSRB1
3 BLANK
4 BLANK
5 BLANK
6
7 5' similar to SW:BTF3_HUMAN P20290 TRANSCRIPTION FACTOR BTF3 ;. gi|1287559|gb|W13502|W13502 [1287559]
8 5' similar to gb:J04794 ALCOHOL DEHYDROGENASE (HUMAN);. gi|1287584|gb|W13547|W13547 [1287584]
9 5'. gi|1287586|gb|W13549|W13549 [1287586]
10 "5' similar to gb:X03747_cds1 SODIUM/POTASSIUM-TRANSPORTING ATPASE BETA-1 (HUMAN); gb:X16646 Mouse mRNA for Na,K-ATPase beta subunit (MOUSE);. gi|1315970|gb|W34060|W34060 [1315970]"
   Control  BLANK
1  Control  BLANK
2  cDNA     mSRB1
3  BLANK    BLANK
4  BLANK    BLANK
5  BLANK    BLANK
6  cDNA     317448
7  cDNA     317452
8  cDNA     317456
9  cDNA     317460
10 cDNA     317464
...

Number of labels: 6383
Dimensions of maInfo matrix: 6383 rows by 3 columns

Notes:
C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.gdl

> array1.layout <- read.marrayLayout(fname = file.path(datadir,
+ "array1.gdl"), ngr =4, ngc = 4, nsr = 19, nsc =21, skip = 1,
+ ctrl.col =7, id.col = 6)
Error in scan(fname, quiet = TRUE, what = h, sep = sep, skip = skip + :
unused argument(s) (ctrl.col ...)
> array1.layout <- read.marrayLayout(fname = file.path(datadir,
+ "array1.gdl"), ngr =4, ngc = 4, nsr = 19, nsc =21, skip = 1)
> array1.layout
Array layout: Object of class marrayLayout.

Total number of spots: 6384
Dimensions of grid matrix: 4 rows by 4 cols
Dimensions of spot matrices: 19 rows by 21 cols

Currently working with a subset of 6384 spots.

Control spots:

Notes on layout:
C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.gdl

> arrqy1.samples
Error: Object "arrqy1.samples" not found
> array1.samples
Object of class marrayInfo.

  maLabels # of slide Names         experiment Cy3 experiment Cy5 date       comments
1 1        1          array1.1.spot wildtype       ref            10/31/2003 NA
2 2        2          array1.2.spot wildtype       ref            10/31/2003 NA
3 3        3          array1.3.spot wildtype       ref            10/31/2003 NA
4 4        4          array1.4.spot ko             ref            10/31/2003 NA
5 5        5          array1.5.spot ko             ref            10/31/2003 NA
6 6        6          array1.6.spot ko             ref            10/31/2003 NA

Number of labels: 6
Dimensions of maInfo matrix: 6 rows by 6 columns

Notes:
C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1Sample.txt

> fnames
[1] "array1.1.spot" "array1.2.spot" "array1.3.spot" "array1.4.spot" "array1.5.spot" "array1.6.spot"
> array1.raw <- read.marrayRaw(fnames, path = datadir, name.Gf = "Gmean",
+ name.Gb = "morphG", name.Rf = "Rmean", name.Rb = "morphR",
+ layout = array1.layout, gnames = array1.gnames, targets = array1.samples)
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.1.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.2.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.3.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.4.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.5.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.6.spot"
Warning messages:
1: number of items read is not a multiple of the number of columns
2: number of rows of result is not a multiple of vector length (arg 2) in: cbind(Gf, as.numeric(dat[[name.Gf]]))
3: number of rows of result is not a multiple of vector length (arg 2) in: cbind(Gb, as.numeric(dat[[name.Gb]]))
4: number of rows of result is not a multiple of vector length (arg 2) in: cbind(Rf, as.numeric(dat[[name.Rf]]))
5: number of rows of result is not a multiple of vector length (arg 2) in: cbind(Rb, as.numeric(dat[[name.Rb]]))
>
##############################################################

Thanks and best wishes,
Rich

------------------------------------------------------------
Richard A. Friedman, PhD
Associate Research Scientist
Herbert Irving Comprehensive Cancer Center
Oncoinformatics Core
Lecturer
Department of Biomedical Informatics
Box 95, Room 130BB or P&S 1-420C
Columbia University
630 W. 168th St.
New York, NY 10032
modified 14.6 years ago by James W. MacDonald46k • written 14.6 years ago by Richard Friedman2.0k
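One detail in the session above may be worth checking before looking at the Spot files themselves: the layout reports 6384 spots, while the gene list reports 6383 labels. The following is only a minimal sketch of that arithmetic; `expected_spots` is a hypothetical helper, not part of marrayInput, and the label count is copied from the printed output rather than recomputed.

```r
# Hypothetical helper (not part of marrayInput): number of spots implied
# by a grid of ngr x ngc blocks, each holding nsr x nsc spots.
expected_spots <- function(ngr, ngc, nsr, nsc) {
  ngr * ngc * nsr * nsc
}

# Layout arguments as passed to read.marrayLayout() in the session above
n_layout <- expected_spots(ngr = 4, ngc = 4, nsr = 19, nsc = 21)

# "Number of labels" printed for array1.gnames in the session above
n_labels <- 6383

n_layout             # 6384
n_layout - n_labels  # 1
```

A one-row difference could mean that `skip = 1` makes the first data row of array1.gdl be read as the header. This is an assumption based only on the printed output (the column heading "Cy3RT" looks like a data value), and the thread does not confirm it.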
14.6 years ago by
Jean Yee Hwa Yang (920) wrote:
On Mon, 3 Nov 2003, Jean Yee Hwa Yang wrote:
> Hi Richard,
>
> Look at the examples in help(read.marrayRaw)
> Copy and paste the whole set of examples and see if it works.
> If so, it's possible something is not right with the spot files.
>
> Try simply reading the data in:
> array1.raw <- read.Spot(fnames, path=datadir)
> and see if it works.
>
> Cheers
>
> Jean
>

Dear Jean and Everybody,

Thank you for your reply. I first ran the test case in "Introduction to the Bioconductor marrayInput package", without any error messages. When I ran the session in help(read.marrayRaw) I got the following error messages:

#####################################################################
> datadir <- system.file("data", package="marrayInput")
>
> skip <- grep("Row", readLines(file.path(datadir,"fish.gal"), n=100)) - 1
Error in file(con, "r") : unable to open connection
In addition: Warning message:
cannot open file 'C:/PROGRA~1/R/rw1080/library/marrayInput/data/fish.gal'
>
> swirl.layout <- read.marrayLayout(ngr=4, ngc=4, nsr=22, nsc=24)
>
> swirl.targets <- read.marrayInfo(file.path(datadir, "SwirlSample.txt"))
Error in file(con, "r") : unable to open connection
In addition: Warning message:
cannot open file 'C:/PROGRA~1/R/rw1080/library/marrayInput/data/SwirlSample.txt'
>
> swirl.gnames <- read.marrayInfo(file.path(datadir, "fish.gal"),
+ info.id=4:5, labels=5, skip=skip)
Error in file(con, "r") : unable to open connection
In addition: Warning message:
cannot open file 'C:/PROGRA~1/R/rw1080/library/marrayInput/data/fish.gal'
>
> x <- maInfo(swirl.gnames)[,1]
> y <- rep(0, maNspots(swirl.layout))
> y[x == "control"] <- 1
> slot(swirl.layout, "maControls") <- as.factor(y)
>
> fnames <- dir(path=datadir,pattern=paste("*", "spot", sep="\."))
> swirl<- read.Spot(fnames, path=datadir,
+ layout = swirl.layout,
+ gnames = swirl.gnames,
+ targets = swirl.targets)
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.1.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.2.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.3.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.4.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.5.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.6.spot"
Warning messages:
1: number of items read is not a multiple of the number of columns
2: number of rows of result is not a multiple of vector length (arg 2) in: cbind(Gf, as.numeric(dat[[name.Gf]]))
3: number of rows of result is not a multiple of vector length (arg 2) in: cbind(Gb, as.numeric(dat[[name.Gb]]))
4: number of rows of result is not a multiple of vector length (arg 2) in: cbind(Rf, as.numeric(dat[[name.Rf]]))
5: number of rows of result is not a multiple of vector length (arg 2) in: cbind(Rb, as.numeric(dat[[name.Rb]]))
Error in read.marrayRaw(fnames = fnames, path = path, name.Gf = name.Gf, :
Object "swirl.targets" not found
>
##########################################################################

Two things (at least) are puzzling to me about the above session.

1. I seemed able to read fish.gal when I ran the exercises in the Introduction. fish.gal is in the data directory under the marrayInput directory, in which I am working. Since I opened fish.gal with Notepad, it appears as a Notepad file on the screen. Is that okay?

2. The computer started reading the array1.?.spot files, with which the present series of commands had nothing to do.
Then I tried reading the spot files the way that you said, and there were problems:

##########################################################################
> fnames <- dir(path=datadir,pattern=paste("*","spot",sep="\."))
> array1.raw <- read.Spot(fnames,path=datadir)
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.1.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.2.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.3.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.4.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.5.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.6.spot"
Warning messages:
1: number of items read is not a multiple of the number of columns
2: number of rows of result is not a multiple of vector length (arg 2) in: cbind(Gf, as.numeric(dat[[name.Gf]]))
3: number of rows of result is not a multiple of vector length (arg 2) in: cbind(Gb, as.numeric(dat[[name.Gb]]))
4: number of rows of result is not a multiple of vector length (arg 2) in: cbind(Rf, as.numeric(dat[[name.Rf]]))
5: number of rows of result is not a multiple of vector length (arg 2) in: cbind(Rb, as.numeric(dat[[name.Rb]]))
>
################################################################

So clearly there is a problem with the spot files. When I ran summary statistics on array1.raw, I got the following:

#################################################################
> objects()
 [1] "array1" "array1.gnames" "array1.layout" "array1.raw" "array1.samples"
 [6] "ctl" "datadir" "fileIndex" "fnames" "last.warning"
[11] "read.marrayInfo" "swirl" "swirl.gnames" "swirl.layout" "swirl.raw"
[16] "swirl.samples" "swirl2" "swirl2.gnames" "swirl2.layout" "swirl2.samples"
[21] "swirl3" "swirl3..samples" "swirl3.gnames" "swirl3.layout" "swirl3.samples"
[26] "x" "y"
> array1.raw
Pre-normalization intensity data: Object of class marrayRaw.

Number of arrays: 6 arrays.

A) Layout of spots on the array:
Array layout: Object of class marrayLayout.

Total number of spots:
Dimensions of grid matrix: rows by cols
Dimensions of spot matrices: rows by cols

Currently working with a subset of spots.

Control spots:

Notes on layout:

B) Samples hybridized to the array:
Object of class marrayInfo.

NULL data frame with 1 rows

Number of labels: 0
Dimensions of maInfo matrix: 0 rows by 0 columns

Notes:

C) Summary statistics for log-ratio distribution:
                  Min. 1st Qu. Median  Mean 3rd Qu.   Max   NA
1 array1.1.spot  -2.16   -0.76  -0.52 -0.44   -0.20  3.50   NA
2 array1.2.spot  -2.15   -0.66  -0.44 -0.44   -0.21  2.01   NA
3 array1.3.spot  -2.59   -0.84  -0.58 -0.58   -0.31  1.04   NA
4 array1.4.spot  -3.46   -0.33   0.09  0.13    0.52  3.53   NA
5 array1.5.spot  -3.05   -0.43  -0.15 -0.15    0.12  3.16   NA
6 array1.6.spot -13.08   -0.93   0.29  0.28    2.41 15.43 3664

D) Notes on intensity data:
Spot Data
>
###############################################################

Clearly something is wrong. The files look good to me. May I send you the files (off-list)?

Thanks and best wishes,
Rich

------------------------------------------------------------
Richard A. Friedman, PhD
Associate Research Scientist
Herbert Irving Comprehensive Cancer Center
Oncoinformatics Core
Lecturer
Department of Biomedical Informatics
Box 95, Room 130BB or P&S 1-420C
Columbia University
630 W. 168th St.
New York, NY 10032
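The warning "number of items read is not a multiple of the number of columns", together with the 3664 missing log-ratios reported for array1.6.spot, suggests that at least one Spot file may contain lines with an unexpected number of fields. One way to look for such lines is sketched below using base R's `count.fields`. The helper name `field_profile` and the tab separator are assumptions; the thread does not say how the Spot files are delimited.

```r
# Hypothetical helper: tabulate how many fields each line of a file has.
# Quoting is disabled so that apostrophes (such as the 5' in the gene
# descriptions above) are not treated as quote characters.
field_profile <- function(path, sep = "\t") {
  n <- count.fields(path, sep = sep, quote = "", comment.char = "")
  table(n)
}

# Self-contained demonstration on a small temporary file in which the
# last line is one field short.
demo <- tempfile(fileext = ".spot")
writeLines(c("indexs\tGmean\tRmean", "1\t100\t200", "2\t150"), demo)
prof <- field_profile(demo)
print(prof)

# With the real data, the same check could be run over every file, e.g.:
# for (f in file.path(datadir, fnames)) print(field_profile(f))
```

A well-formed file should give a table with a single entry. More than one distinct field count, or a line total that differs between the six files, would point to the file worth opening in a text editor.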
14.6 years ago by
United States
James W. MacDonald (46k) wrote: