Curious error with 'subseq' function from BSgenome (IRanges)
@jdelasherasedacuk-1189
Last seen 8.6 years ago
United Kingdom
Hi everyone,

I am using the BSgenome package and annotations to retrieve several thousand sequences (22k) corresponding to a promoter microarray.

Basically I run a loop through the whole list of chromosome name, start, and stop coordinates, and retrieve each 1Kb sequence using the 'subseq' function.

When I run it, I get the following error *sometimes*:

    Error in get(name, envir = .classTable) :
      formal argument "envir" matched by multiple actual arguments

The first time, I retrieved the index at which it had encountered the error, and ran the 'subseq' command alone. No problem. In fact, if I re-run the whole thing the error may occur at another point. Once it even ran the whole thing without a hitch.

I ended up putting the loop within a 'try' function, so that if there was an error, the loop could restart where it left off and eventually retrieve the whole list. The number of times there's an error varies from run to run, and I see that the error messages are also varied.

I just re-ran the loop again, just for fun. This is the code:

    library(BSgenome.Mmusculus.UCSC.mm8)
    # create vectors to store results in:
    newseq2<-vector(mode="character", length=dim(UInfo)[1])
    newstart2<-vector(mode="numeric", length=dim(UInfo)[1])
    newstop2<-vector(mode="numeric", length=dim(UInfo)[1])
    ambiguous.orientation<-c()

    # UInfo is a data frame containing annotations. I extract chr,start,stop from it
    j<-1
    i<-1
    while(i<=dim(UInfo)[1])
    {
      if (i==dim(UInfo)[1]) stop("finished")
      try(
        for (i in j:dim(UInfo)[1])
        {
          # first extract chromosome name from the "NimbleGenID" included
          # in the annotation.
          # It is in the same format as the BSgenome annotation package
          # for mouse, so it's a straight extraction:
          chr<-sub(":.+$","",unlist(strsplit(UInfo[i,"NimbleGenID"],split=" "))[1])
          if (chr=="NA") next
          # extract start and stop:
          start<-as.numeric(UInfo[i,"Start"])
          stop<-as.numeric(UInfo[i,"End"])
          # extract strand orientation:
          strand<-UInfo[i,"Frame"]
          # calculate the coordinates for the 1Kb upstream region:
          if (strand=="-")
          {
            upstart<-stop+1
            upstop<-min(upstart+1000,length(Mmusculus[[chr]]))
          }
          if (strand=="+")
          {
            upstart<-max(start-1000,1)
            upstop<-max(start-1,1)
          }
          if (!(strand %in% c("+","-")))
          {
            upstart<-upstop<-NA
            # when orientation is not clearly given, store indices for
            # further processing:
            ambiguous.orientation<-c(ambiguous.orientation,i)
            newseq2[i]<-"NNN"
            newstart2[i]<-upstart
            newstop2[i]<-upstop
            next
          }
          # extract sequence:
          sequence<-subseq(Mmusculus[[chr]],upstart,upstop)
          sequence<-as.character(sequence)
          # store results:
          newstart2[i]<-upstart
          newstop2[i]<-upstop
          newseq2[i]<-sequence
        })
      # check whether the last index done is the last in the list.
      # if not, it means there was an abnormal exit.
      # update "j" to the value of the last index "i", and the
      # loop will restart from the point it left earlier:
      if (i!=dim(UInfo)[1]) j<-i
      # write a tell-tale file so I can see where the problems occur as they
      # happen:
      write.table(1, paste(i,"_"))
    }

This time it produced an error 7 times. The errors reported were:

    Error in get(name, envir = .classTable) :
      formal argument "envir" matched by multiple actual arguments
    Error in assign(".target", method@target, envir = envir) :
      formal argument "envir" matched by multiple actual arguments
    Error in assign(".defined", method@defined, envir = envir) :
      formal argument "envir" matched by multiple actual arguments
    Error in assign("disabled", disabled, envir = .validity_options) :
      formal argument "envir" matched by multiple actual arguments
    Error in assign(".defined", method@defined, envir = envir) :
      no function to return from, jumping to top level
    Error in shift(restrict(nir, start = solved_start, end = solved_end), :
      error in evaluating the argument 'x' in selecting a method for function 'shift'
    Error in assign(".Method", method, envir = envir) :
      formal argument "envir" matched by multiple actual arguments
    Error: finished

The last one is not really an error; I just used the 'stop' function to report that the job was done, so it says "Error"...

Clearly there is nothing wrong with the coordinates or other parameters in the subseq command, because the same call succeeds when I repeat it. I find it very strange that the errors happen at different points... or sometimes (rarely) nowhere at all.

I got the result I was after by embedding the loop in a 'try' command, and that inside a 'while' loop... But I wonder why this happened in the first place.

My session info follows:

    > sessionInfo()
    R version 2.10.0 (2009-10-26)
    i386-pc-mingw32

    locale:
    [1] LC_COLLATE=English_United Kingdom.1252
    [2] LC_CTYPE=English_United Kingdom.1252
    [3] LC_MONETARY=English_United Kingdom.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United Kingdom.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] BSgenome.Mmusculus.UCSC.mm8_1.3.16 BSgenome_1.14.2
    [3] Biostrings_2.14.12 IRanges_1.4.16

    loaded via a namespace (and not attached):
    [1] Biobase_2.6.1 tools_2.10.0

Jose

--
Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6507095
Institute for Cell & Molecular Biology        Fax: +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK
*********************************************
Alternative email: nach.mcnach at gmail.com
*********************************************

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
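For readers who need the same "carry on after a sporadic failure" behaviour, here is a minimal sketch of one alternative to wrapping the whole loop in 'try': handle the error per index with tryCatch and retry only that index. This is not the code from the post. The names make_flaky_fetch, fetch_all and max_tries are made up for illustration, and the stand-in fetch function only simulates a transient failure; in real use it would wrap something like the subseq call above.

```r
# Hypothetical stand-in for the real extraction step, e.g. a wrapper around
# as.character(subseq(Mmusculus[[chr]], upstart, upstop)).
# It fails on the first call for every third index to mimic a transient error.
make_flaky_fetch <- function() {
  seen <- integer(0)
  function(i) {
    if (i %% 3 == 0 && !(i %in% seen)) {
      seen <<- c(seen, i)
      stop("transient failure at index ", i)
    }
    paste0("SEQ", i)
  }
}

# Call fetch(i) for each index, retrying a failed index up to max_tries times.
# Returns the results plus a log with one entry per failed attempt.
# An index that fails on every attempt keeps its NA placeholder.
fetch_all <- function(n, fetch, max_tries = 3) {
  results <- rep(NA_character_, n)
  failed_attempts <- integer(0)
  for (i in seq_len(n)) {
    for (attempt in seq_len(max_tries)) {
      value <- tryCatch(fetch(i), error = function(e) NULL)
      if (!is.null(value)) {
        results[i] <- value
        break
      }
      failed_attempts <- c(failed_attempts, i)
    }
  }
  list(results = results, failed_attempts = failed_attempts)
}

out <- fetch_all(7, make_flaky_fetch())
```

Because the error is caught inside the loop body, the loop index never has to be recovered after an abnormal exit, and the failure log replaces the tell-tale files. A retry wrapper like this only hides the symptom, though; it does not explain why the errors occur.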
Microarray Annotation BSgenome • 1.2k views
@martin-morgan-1513
Last seen 6 weeks ago
United States
On 06/04/2010 03:52 AM, J.delasHeras at ed.ac.uk wrote:
> Hi everyone,
>
> I am using the BSgenome package and annotations to retrieve several
> thousand sequences (22k) corresponding to a promoter microarray.
>
> Basically I run a loop through the whole list of chromosome name, start,
> and stop coordinates, and retrieve each 1Kb sequence using the 'subseq'
> function.
>
> When I run it, I get the following error *sometimes*:
> Error in get(name, envir = .classTable) :
>   formal argument "envir" matched by multiple actual arguments

Hi Jose --

This sounds like a bug in R, fixed in the R-2.11.* series, and updating your R (and packages, see http://bioconductor.org/docs/install/) should fix this. If not, it would be great to hear...

Martin

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
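A small sketch of how one might guard a long-running script against the affected R versions, assuming (as suggested in this reply, not independently verified here) that 2.11.0 is the first release with the fix. The variable names min_fixed and running_fixed are made up for illustration.

```r
# Minimum R version in which the bug is reported fixed in this thread
# (an assumption taken from the reply above).
min_fixed <- "2.11.0"

# getRversion() returns a version object that can be compared with a string.
running_fixed <- getRversion() >= min_fixed

if (!running_fixed) {
  warning("R ", as.character(getRversion()), " predates ", min_fixed,
          "; the intermittent 'envir' errors may still occur")
}
```

Running sessionInfo() afterwards, as in the original post, is still the quickest way to confirm which R and package versions a session actually uses.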
Quoting Martin Morgan <mtmorgan at fhcrc.org>:
> Hi Jose --
>
> This sounds like a bug in R, fixed in the R-2.11.* series, and updating
> your R (and packages, see http://bioconductor.org/docs/install/) should
> fix this. If not, it would be great to hear...
>
> Martin

Hi Martin,

I have the latest R installed as well; I just hadn't installed the annotation packages for it, so I used my previous R install for this task. I'll try again later today with the latest shiniest R/BioC and will report back.

Jose
Quoting Martin Morgan <mtmorgan at fhcrc.org>:
> This sounds like a bug in R, fixed in the R-2.11.* series, and updating
> your R (and packages, see http://bioconductor.org/docs/install/) should
> fix this. If not, it would be great to hear...
>
> Martin

Just a quick update: I ran the same code with the latest R/BioC a couple of times and didn't encounter a single error, so it does seem that it's OK now.

Thank you.

Jose