GOstats locked database
Janet Young ▴ 740
@janet-young-2360
Last seen 4.5 years ago
Fred Hutchinson Cancer Research Center,…
Hi all,

I'm having some trouble with a locked database with GOstats, perhaps due to running multiple simultaneous processes that are all accessing GO.db?

I'm using R CMD BATCH to run an R script I wrote, and I'm doing that simultaneously from 12 different terminal windows, each logged in to a single node of a linux cluster. Some processes may be sharing a node (2 CPU per node). I'm happy to send the entire script, if that's useful, but for now here are just some snippets.

Here's the basic problem:

> params <- new("GOHyperGParams", geneIds = geneentrezIDs, universeGeneIds = allgeneentrezIDs, ontology="BP", annotation="org.Hs.eg.db",pvalueCutoff=hgCutoff, conditional=FALSE, testDirection = "over")
> thishgOver<-hyperGTest(params)
Error in sqliteFetch(rs, n = -1, ...) :
  RSQLite driver: (RS_SQLite_fetch: failed first step: database is locked)
Calls: hyperGTest ... dbGetQuery -> sqliteQuickSQL -> sqliteFetch -> .Call
Execution halted

It's a very sporadic problem - I'm actually using the script to loop through a bunch of simulated datasets and run hyperGTest - it does fine for a while and then suddenly has a problem. I can't be sure, but it seems like several of the processes I was running simultaneously all had a problem around the same time (which wouldn't be surprising if something suddenly happened to the database).

It's also possible that our linux nodes are having some intermittent connectivity issues to the mounted drives - could that cause the database locked error? If so, would there be a way to make hyperGTest robust to a temporary problem like that?

As well as hyperGTest, the script also accesses GO information at various points, with commands like these:

> Term(get(names(genes)[b],GOTERM))
> geneentrezIDs <- geneentrezIDs[!is.na(mget(geneentrezIDs,envir=org.Hs.egGO,ifnotfound=NA))]

I was running a very similar version of the script last week, with no problem, and I think the above two commands are the only things I've added that might be accessing the GO data. I'm not clear on which of these commands use the same database as one another: (a) mget from org.Hs.egGO, (b) hyperGTest with annotation="org.Hs.eg.db", (c) get from GOTERM.

Here is the output of sessionInfo(), run just before I started looping through the datasets, so several iterations of the mget from org.Hs.egGO and the hyperGTest have happened after running this sessionInfo, but I think all relevant libraries were loaded. (Is there a way to make R output sessionInfo immediately before it terminates with error, when running in batch mode?)

> sessionInfo()
R version 2.6.1 Patched (2007-12-02 r43572)
i686-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] splines   tools     stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] org.Hs.eg.db_2.0.2  GOstats_2.4.0       Category_2.4.0
 [4] genefilter_1.16.0   survival_2.34       RBGL_1.14.0
 [7] annotate_1.16.1     xtable_1.5-2        GO.db_2.0.2
[10] AnnotationDbi_1.0.6 RSQLite_0.6-4       DBI_0.2-4
[13] Biobase_1.16.2      graph_1.16.1

loaded via a namespace (and not attached):
[1] cluster_1.11.9

And here's some other, possibly pertinent information:

[12] kpvpt50:/home/jayoung/traskdata/janet/forOthers/forIlona/GOanalysis/doGOmoreregions_slightly_better_again/DCLoss_10percent> ls -l ~/traskdata/lib_linux/R/library/GO.db/extdata/
total 37364
-rw-r--r-- 1 jayoung trasklab 38252544 Dec 3 13:55 GO.sqlite

So I can write to GO.sqlite. Should it be read-only, to myself? Will that mess me up if I want to over-write it in future?

[93] bedrock:/home/jayoung/traskdata/janet/forOthers/forIlona/GOanalysis/doGOmoreregions_slightly_better_again> ls -l ~/traskdata/lib_linux/R/library/org.Hs.eg.db/extdata/
total 187130
-rw-r--r-- 1 jayoung trasklab 95802368 Dec 13 14:50 org.Hs.eg.sqlite

Thanks for any advice - this is a tricky one as it happens sometime in the middle of a ~12 hour run, and is not necessarily reproducible. Hopefully I've provided enough information here to track down the problem.

Janet

-------------------------------------------------------------------
Dr. Janet Young (Trask lab)
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.
tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung at fhcrc.org
http://www.fhcrc.org/labs/trask/
Janet Young ▴ 740
I have an additional piece of information for this problem. Not sure it helps, but thought I'd send it anyway.

My script just crashed again, but this time not while running a hyperGTest, although possibly while looking at the results of a hyperGTest. It crashed somewhere within the following loop, perhaps at the geneIdsByCategory(alltests[[a]],sigCategories(alltests[[a]])) line (I only suspect that line because it uses the hyperGTest result stored in alltests[[a]], and it was hyperGTest that was the problem every other time, but I could easily be wrong).

> for (a in 1:4) {
+   if (length(sigCategories(alltests[[a]])) == 0) next
+   genes <- geneIdsByCategory(alltests[[a]],sigCategories(alltests[[a]]))
+   for (b in 1:length(genes)) {
+     thesegenes <- genes[[b]]
+     if (length(thesegenes)==0) next
+     for (c in 1:length(thesegenes)) {
+       signifGeneInfo[rowcount,"Test"] <- names(alltests)[a]
+       signifGeneInfo[rowcount,"GO_ID"] <- names(genes)[b]
+       signifGeneInfo[rowcount,"EntrezID"] <- thesegenes[c]
+       signifGeneInfo[rowcount,"GeneName"] <- symbolsfromAnnPkg[[ thesegenes[c] ]]
+       signifGeneInfo[rowcount,"Term"] <- Term(get(names(genes)[b],GOTERM))
+       rowcount <- rowcount + 1
+     }
+   }
+ }
Error in sqliteFetch(rs, n = -1, ...) :
  RSQLite driver: (RS_SQLite_fetch: failed first step: database is locked)
Calls: get ... dbGetQuery -> sqliteQuickSQL -> sqliteFetch -> .Call
Execution halted

On Jan 23, 2008, at 2:29 PM, Janet Young wrote:

> Hi all,
>
> I'm having some trouble with a locked database with GOstats,
> perhaps due to running multiple simultaneous processes that are all
> accessing GO.db?
>
> [...]
Janet Young ▴ 740
Janet Young <jayoung at ...> writes:

> Error in sqliteFetch(rs, n = -1, ...) :
>   RSQLite driver: (RS_SQLite_fetch: failed first step: database is locked)
> Calls: get ... dbGetQuery -> sqliteQuickSQL -> sqliteFetch -> .Call
> Execution halted

Another update: our systems admin people this morning reported file server problems (should now be fixed). I strongly suspect that's what was causing my error with hyperGTest, etc. I'm re-running now and will confirm whether the problem has gone ASAP.
Janet Young ▴ 740
> Janet Young <jayoung at ...> writes:
> > Error in sqliteFetch(rs, n = -1, ...) :
> >   RSQLite driver: (RS_SQLite_fetch: failed first step: database is locked)
> > Calls: get ... dbGetQuery -> sqliteQuickSQL -> sqliteFetch -> .Call
> > Execution halted
>
> Another update: our systems admin people this morning reported file server
> problems (should now be fixed). I strongly suspect that's what was causing my
> error with hyperGTest, etc. I'm re-running now and will confirm whether the
> problem has gone ASAP.

Last message on this subject, I promise - I kept on trying the script, and finally got it to run all the way through without errors. So I don't think there is a problem with GOstats - I'm pretty sure the problem was due to our flaky file server (which is apparently fixed now).

Sorry to have bothered you all,

Janet
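For readers who hit the same transient "database is locked" error in a long batch run, the following is a minimal sketch of one way to retry a failing call instead of letting the whole script halt. The helper name `with_retry` and its arguments are hypothetical and not part of GOstats, AnnotationDbi, or RSQLite. It assumes a reasonably recent R (it uses `grepl`), and it assumes that simply waiting and calling again is enough once the file server recovers, which may not hold in every setup.

```r
# Hypothetical helper (not part of any Bioconductor package): call `fun`
# and retry it if it fails with an error message containing `pattern`.
with_retry <- function(fun, max_tries = 5, wait_seconds = 30,
                       pattern = "database is locked") {
  stopifnot(max_tries >= 1)
  for (attempt in seq_len(max_tries)) {
    failure <- NULL
    # Capture any error instead of letting it stop the script
    result <- tryCatch(fun(), error = function(e) {
      failure <<- e
      NULL
    })
    if (is.null(failure)) {
      return(result)
    }
    transient <- grepl(pattern, conditionMessage(failure), fixed = TRUE)
    # Re-signal errors we do not recognise, or give up after the last attempt
    if (!transient || attempt == max_tries) {
      stop(failure)
    }
    message("Attempt ", attempt, " failed (", conditionMessage(failure),
            "); retrying in ", wait_seconds, " s")
    Sys.sleep(wait_seconds)
  }
}
```

With the objects from the original post, the call would look something like `thishgOver <- with_retry(function() hyperGTest(params))`. Errors that do not match the pattern are passed through unchanged, so genuine bugs in the script still stop the run.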
Hi Janet,

Janet Young wrote:
>> Janet Young <jayoung at ...> writes:
>>> Error in sqliteFetch(rs, n = -1, ...) :
>>>   RSQLite driver: (RS_SQLite_fetch: failed first step: database is locked)
>>> Calls: get ... dbGetQuery -> sqliteQuickSQL -> sqliteFetch -> .Call
>>> Execution halted
>>
>> Another update: our systems admin people this morning reported file server
>> problems (should now be fixed). I strongly suspect that's what was causing my
>> error with hyperGTest, etc. I'm re-running now and will confirm whether the
>> problem has gone ASAP.
>
> Last message on this subject, I promise - I kept on trying the script, and
> finally got it to run all the way through without errors. So I don't think there
> is a problem with GOstats - I'm pretty sure the problem was due to our flaky
> file server (which is apparently fixed now).
>
> sorry to have bothered you all,

Thanks for reporting this with so many details! It is very useful for us to get this kind of feedback; there is always something to learn from it. I'm glad that you finally found the cause of the problem and fixed it.

Cheers,
H.
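On the side question from the original post (getting sessionInfo() printed right before a batch job dies), one possible approach is R's `error` option, which accepts an expression to evaluate when an uncaught error occurs. This is only a sketch: the banner text is made up, and it assumes the script is run non-interactively (for example with R CMD BATCH) so that quitting with a non-zero status is the desired outcome.

```r
# Put this near the top of the batch script. When an uncaught error occurs,
# R evaluates this expression instead of just halting.
options(error = quote({
  cat("---- sessionInfo() at time of error ----\n")
  print(sessionInfo())
  # Exit without saving the workspace, signalling failure to the shell
  q(save = "no", status = 1)
}))
```

Under R CMD BATCH the printed output should end up in the .Rout file next to the error message, so the package versions in use at the moment of failure are recorded.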