Question

Reading Imagene files with read.imagene in limma

STKH Steen Krogsgaard ▴ 150

Hi,

I have a bunch of Imagene data files that I want to read into limma using read.imagene. It all works very nicely like this:

    RG <- read.imagene(files)

where files is a matrix with two columns and one row for each array. The names of the two dye files are in the two columns.

I want to wrap this call in another function in order to do some processing of the quality flags that Imagene uses (I want to change the flags into weights). The idea is:

    read.my.imagene <- function(files, path = NULL, ext = NULL, names = NULL,
                                columns = NULL, wt.fun = NULL, verbose = TRUE,
                                sep = " ", quote = "\"",
                                exclude.flags = c(1,2,3), ...)
    {
      obj = read.imagene(files=files, path=path, ext=ext, names=names,
                         columns=columns, wt.fun=wt.fun, verbose=verbose,
                         sep=sep, quote=quote, ...)
      # do some more stuff
    }

But when I call read.my.imagene I get an error:

    Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
      line 1 did not have 27 elements

I understand that scan is complaining about the number of columns in the data file, but I'm really puzzled why it works when I call read.imagene directly but not when I call read.my.imagene, which in turn calls read.imagene.

I also tried copying the source code for read.imagene into a new function (which I called test.read), and this produced the exact same error. No code changes were made, just a copy of the source code into Notepad (yes, I'm on a Windows box) preceded by "test.read = ".

I use R 1.9.0 and the latest Bioconductor (downloaded last week).

Thanks in advance for any suggestions
Steen

Updated 19.9 years ago by Gordon Smyth • written 19.9 years ago by STKH Steen Krogsgaard

score 0 · Answer 1 · 2004-06-07

Steen,

On Mon, 7 Jun 2004, STKH (Steen Krogsgaard) wrote:
> wt.fun = NULL, verbose = TRUE, sep = " ", quote = "\"",
> exclude.flags=c(1,2,3), ...)

Perhaps you took the code from read.imagene using an unreliable method, like typing "read.imagene" at the R prompt and pressing enter, or typing "edit(read.imagene)" and viewing the code in a text editor like Notepad. One reason why both of these methods are unreliable is that (at least on Windows) they will try to convert the special tab character "\t" in sep="\t" to an actual tab: " "

A more reliable way is to download the source of limma (the .tar.gz file) and find the read.imagene function in R/read.R using a reliable text editor.

So, try changing sep=" " to sep="\t". Could this be the problem?

Regards,
James
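The effect James describes can be checked in base R without limma. The following is a minimal sketch, assuming a small made-up tab-delimited file whose fields contain spaces (the file contents and column names are invented for illustration and are not real Imagene output). It uses count.fields() to show how many fields each line splits into under each separator.

```r
# Write a tiny tab-delimited file; some fields contain spaces on purpose.
# The contents are invented for illustration, not real Imagene output.
tmp <- tempfile(fileext = ".txt")
writeLines(c("Gene ID\tSignal Mean\tFlag",
             "gene 1\t1023.5\t0"), tmp)

# Splitting on the escape sequence "\t" (a real tab character)
# gives the same number of fields on every line.
n_tab <- count.fields(tmp, sep = "\t")

# Splitting on a single space breaks lines wherever a field happens
# to contain a space, so the field counts no longer agree.
n_space <- count.fields(tmp, sep = " ")

print(n_tab)
print(n_space)
unlink(tmp)
```

When the per-line field counts disagree like this, scan() has no consistent number of columns to work with, which is the kind of situation that produces a "line 1 did not have N elements" error.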

score 0 · Answer 2 · 2004-06-07

Hi James,

YES! That definitely solved the problem. I feel kinda stupid...

Thanks a million
Steen

score 0 · Answer 3 · 2004-06-07

At 08:14 PM 7/06/2004, STKH (Steen Krogsgaard) wrote:
>I want to wrap this call into another function in order to do some
>processing of the quality flags that Imagene uses (I want to change the
>flags into weights). The idea is:

Why re-invent the wheel? read.maimages() already gives you complete facility to do this. The argument wt.fun given to read.maimages() can be any function you like which takes an Imagene file as a data frame and converts flags into weights. The help says:

    'wt.fun' may be any user-supplied function which accepts
    a data.frame argument and returns a vector of non-negative
    weights. The columns of the data.frame are as in the image
    analysis output files. See 'QualityWeights' for provided weight
    functions.

Gordon

score 0 · Answer 4 · 2004-06-08

Hi Gordon,

I really don't want to reinvent the wheel (or any other part of the excellent functionality of limma). What I want to do is to save the flag columns from the data files in the RGList as RGList$flags and then set the weights to either 0 or 1 depending on the values of the flags. Since the flags say that the spot is either OK or unusable (with nothing in between), it doesn't really make sense (to me at least) to use weights between 0 and 1.

I couldn't figure out how to add the flags "column" to the RGList within the wt.fun function, so that's why I wrapped read.imagene instead. The pseudocode is like this:

    read.my.imagene = function(..., exclude.flags=c(1,2,3))
    {
      obj = read.imagene(..., wt.fun=wt.myimagene)
      obj$flags = obj$weights
      obj$weights = rep(1, length(obj$weights))
      obj$weights[obj$flags %in% exclude.flags] = 0
      obj
    }

    wt.myimagene = function(ima)
    {
      ima[,"Flags"]
    }

So, during read.imagene the Flags column from the data files is put in the weights list (it's a list, right?), and this is subsequently copied to a $flags list. Then $weights is set to all 1's, and the $weights where $flags is in an exclude list (exclude.flags) are set to 0. This mimics the data preparation step used in GeneSight from BioDiscovery, where "bad" flags are filtered out. The reason for not putting the filtering into wt.myimagene is that I want to let the user specify which flags to exclude. To exclude only flags 1 and 2, for instance, the call would be

    RG = read.my.imagene(files=files, exclude.flags=c(1,2))

This is probably not the best way to accomplish this, but it's the one I could think up! I can see that, for instance, wtarea is defined by two function statements and that this allows a parameter to be specified in the call to read.imagene, but I don't really understand this notation. Besides, I still would like to have the original flag values stored in the RGList object so that I can re-do the filtering later or plot the flagged spots or whatever.

Thanks for all the feedback
Steen
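The post-processing Steen describes (keep the raw flags, then derive 0/1 weights from them) can be sketched in base R. This is only an illustration under stated assumptions: a plain list stands in for the RGList, the helper name flags_to_weights is hypothetical, and the flags are assumed to be stored as a spots-by-arrays matrix. Unlike rep(1, length(...)), which returns a plain vector, the helper keeps the matrix dimensions.

```r
# Hypothetical helper: turn a matrix of flag values into 0/1 weights
# while preserving the spots-by-arrays layout of the input.
flags_to_weights <- function(flags, exclude.flags = c(1, 2, 3)) {
  w <- array(1, dim = dim(flags), dimnames = dimnames(flags))
  w[flags %in% exclude.flags] <- 0   # linear indexing over the matrix
  w
}

# Toy data: 3 spots on 2 arrays. A plain list stands in for an RGList here.
flags <- matrix(c(0, 1, 2,
                  0, 3, 0),
                nrow = 3, dimnames = list(NULL, c("array1", "array2")))
obj <- list(flags = flags)

# Exclude only flags 1 and 2; the raw flags stay available for later re-filtering.
obj$weights <- flags_to_weights(obj$flags, exclude.flags = c(1, 2))
print(obj$weights)
```

Because the flags are kept next to the weights, the filtering can be repeated later with a different exclude.flags without re-reading the files.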

score 0 · Answer 5 · 2004-06-08

To exclude any set of flag values, define

    mywtfun <- function(exclude.flags=c(1,2,3)) function(obj) 1-(obj$Flag %in% exclude.flags)

Now you can use

    RG <- read.maimages(files, source="imagene", wt.fun=mywtfun(c(1,2)))

to remove only 1 and 2, or

    RG <- read.maimages(files, source="imagene", wt.fun=mywtfun(c(1,2,4)))

to remove 1, 2 and 4, etc, etc.

Gordon
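The "two function statements" notation is a function that returns another function: the outer call fixes exclude.flags, and the inner function is what wt.fun receives for each file. A small sketch of how this behaves, run on a toy data frame instead of a real Imagene file (the toy data and the column name "Flag" are assumptions for illustration):

```r
# Function factory: calling mywtfun(...) returns a new function of one
# argument (the data frame for a single file) that remembers exclude.flags.
mywtfun <- function(exclude.flags = c(1, 2, 3)) {
  function(obj) 1 - (obj$Flag %in% exclude.flags)
}

# Toy stand-in for one image-analysis file read as a data frame.
spots <- data.frame(Flag = c(0, 1, 2, 4, 0))

# The same factory yields different weight functions for different settings.
w12  <- mywtfun(c(1, 2))(spots)      # flags 1 and 2 get weight 0
w124 <- mywtfun(c(1, 2, 4))(spots)   # flags 1, 2 and 4 get weight 0

print(w12)
print(w124)
```

This is why the exclusion list can be chosen at call time without wrapping the reading function itself.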

score 0 · Answer 6 · 2004-06-08

At 06:04 PM 8/06/2004, STKH (Steen Krogsgaard) wrote:
>Besides, I still would like to have the original flag values
>stored in the RGList object so that I can re-do the filtering later or
>plot the flagged spots or whatever.

limma doesn't include the flag values themselves in the output object, because the different image analysis programs have quite different treatments of this concept. It is desirable to produce a data object all the components of which have meanings not dependent on the specific image analysis program used. Moreover, there may not be a single flag column from which the quality weights are computed: the weights may depend on two or more columns. Therefore the approach has been taken to compute numeric weights directly within the read.maimages() function, but to provide the facility for users to define how these are computed.

Gordon
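To illustrate the point that weights may depend on two or more columns, here is a minimal sketch of a weight function that combines a flag column with a second column. The function name wt_flag_and_area, the column names "Flag" and "Area", and the threshold are all hypothetical choices for this example, not part of limma or of the Imagene format.

```r
# Hypothetical weight function factory combining two columns:
# a spot gets weight 1 only if its flag is not excluded AND its area
# reaches the minimum; otherwise it gets weight 0.
wt_flag_and_area <- function(exclude.flags = c(1, 2, 3), min.area = 50) {
  function(obj) {
    as.numeric(!(obj$Flag %in% exclude.flags) & obj$Area >= min.area)
  }
}

# Toy data frame standing in for one image-analysis output file.
spots <- data.frame(Flag = c(0, 0, 1, 0),
                    Area = c(80, 20, 90, 50))

w <- wt_flag_and_area(exclude.flags = 1, min.area = 50)(spots)
print(w)
```

A function built this way has the shape the help text describes: it accepts a data.frame and returns a vector of non-negative weights.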