R crashes with GEOmetadb
Guido Hooiveld ★ 4.1k
@guido-hooiveld-2020
Wageningen University, Wageningen, the …
Dear Sean and others,

I am exploring the functionality of 'GEOmetadb'. I am specifically interested in downloading all CEL files for arrays run on a certain platform. To this end I am using the example from the GEOmetadb vignette (page 8), which should retrieve the number of GEO entries and CEL files for the Affymetrix HGU133A array. However, when I execute that code R crashes and needs to exit... The error messages are not informative to me, but maybe you can deduce what is going wrong. Any feedback is appreciated.

Regards,
Guido

R version 2.13.0 (2011-04-13)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-unknown-linux-gnu (64-bit)

> library(GEOmetadb)
Loading required package: GEOquery
Loading required package: Biobase

Welcome to Bioconductor

  Vignettes contain introductory material. To view, type
  'browseVignettes()'. To cite Bioconductor, see
  'citation("Biobase")' and for packages 'citation("pkgname")'.

Setting options('download.file.method.GEOquery'='curl')
Loading required package: RSQLite
Loading required package: DBI
> getSQLiteFile()
trying URL 'http://gbnci.abcc.ncifcrf.gov/geo/GEOmetadb.sqlite.gz'
Content type 'text/plain; charset=ISO-8859-1' length 109446149 bytes (104.4 Mb)
opened URL
================================================
downloaded 104.4 Mb

Unzipping...
Metadata associate with downloaded file:
                name               value
1     schema version                 1.0
2 creation timestamp 2011-06-18 09:50:00
[1] "/home.local/guidoh/GEOmetadb.sqlite"
>
> con <- dbConnect(SQLite(), "GEOmetadb.sqlite")
> dbDisconnect(con)
[1] TRUE
>
> rs <- dbGetQuery(con,paste("select gsm,supplementary_file",
+                            "from gsm where gpl='GPL96'",
+                            "and supplementary_file like '%CEL.gz'"))

 *** caught segfault ***
address 0x8, cause 'memory not mapped'

Traceback:
 1: .Call("RS_SQLite_exec", conId, statement, bind.data, PACKAGE = .SQLitePkgName)
 2: sqliteExecStatement(con, statement, bind.data)
 3: sqliteQuickSQL(conn, statement, ...)
 4: dbGetQuery(con, paste("select gsm,supplementary_file", "from gsm where gpl='GPL96'", "and supplementary_file like '%CEL.gz'"))
 5: dbGetQuery(con, paste("select gsm,supplementary_file", "from gsm where gpl='GPL96'", "and supplementary_file like '%CEL.gz'"))

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: dim(rs)
Selection:

---------------------------------------------------------
Guido Hooiveld, PhD
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
Wageningen University
Biotechnion, Bomenweg 2
NL-6703 HD Wageningen
the Netherlands
tel: (+)31 317 485788
fax: (+)31 317 483342
email: guido.hooiveld@wur.nl
internet: http://nutrigene.4t.com
http://www.researcherid.com/rid/F-4912-2010
@sean-davis-490
United States
Hi, Guido.

The output of sessionInfo() might be helpful (or not). It is certainly possible that this is an issue with GEOmetadb, but the error suggests that RSQLite might be to blame, also. I'd suggest reinstalling RSQLite as a start. If you still have issues after doing so, definitely include the output of sessionInfo().

Sean
Hi Sean,

The reason I did not include the output of sessionInfo() was that my R session hangs. Anyway, below is the output after loading GEOmetadb, but BEFORE running any other commands.

[below: output of sessionInfo()]

> library(GEOmetadb)
Loading required package: GEOquery
Loading required package: Biobase
Setting options('download.file.method.GEOquery'='curl')
Loading required package: RSQLite
Loading required package: DBI
> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GEOmetadb_1.12.0 RSQLite_0.9-4    DBI_0.2-5        GEOquery_2.19.1
[5] Biobase_2.12.1

loaded via a namespace (and not attached):
[1] RCurl_1.5-0  tools_2.13.0 XML_3.4-0
>

To be sure, I also tried to update RSQLite, but the installed version was already up to date (v0.9-4). Next I ran the sample script. Unfortunately, it crashes again... I did notice that the SQLite database that was downloaded is today's version (29 June); earlier today it was from 18 June. Taken together, despite updating RSQLite and the SQLite database, the problem still persists.

Regards,
Guido

[below: again the lines of code that result in the crash]

> library(GEOmetadb)
Loading required package: GEOquery
Loading required package: Biobase

Welcome to Bioconductor

  Vignettes contain introductory material. To view, type
  'browseVignettes()'. To cite Bioconductor, see
  'citation("Biobase")' and for packages 'citation("pkgname")'.

Setting options('download.file.method.GEOquery'='curl')
Loading required package: RSQLite
Loading required package: DBI
> getSQLiteFile()
trying URL 'http://gbnci.abcc.ncifcrf.gov/geo/GEOmetadb.sqlite.gz'
Content type 'text/plain; charset=ISO-8859-1' length 109446149 bytes (104.4 Mb)
opened URL
================================================
downloaded 104.4 Mb

Unzipping...
Metadata associate with downloaded file:
                name               value
1     schema version                 1.0
2 creation timestamp 2011-06-18 09:50:00
[1] "/home.local/guidoh/GEOmetadb.sqlite"
>
> file.info("GEOmetadb.sqlite")
                       size isdir mode               mtime               ctime
GEOmetadb.sqlite 1565664256 FALSE  644 2011-06-29 19:48:53 2011-06-29 19:48:53
                               atime  uid gid  uname grname
GEOmetadb.sqlite 2011-06-29 19:49:14 1001 100 guidoh  users
>
> con <- dbConnect(SQLite(), "GEOmetadb.sqlite")
> dbDisconnect(con)
[1] TRUE
>
> rs <- dbGetQuery(con,paste("select gsm,supplementary_file",
+                            "from gsm where gpl='GPL96'",
+                            "and supplementary_file like '%CEL.gz'"))

 *** caught segfault ***
address 0x8, cause 'memory not mapped'

Traceback:
 1: .Call("RS_SQLite_exec", conId, statement, bind.data, PACKAGE = .SQLitePkgName)
 2: sqliteExecStatement(con, statement, bind.data)
 3: sqliteQuickSQL(conn, statement, ...)
 4: dbGetQuery(con, paste("select gsm,supplementary_file", "from gsm where gpl='GPL96'", "and supplementary_file like '%CEL.gz'"))
 5: dbGetQuery(con, paste("select gsm,supplementary_file", "from gsm where gpl='GPL96'", "and supplementary_file like '%CEL.gz'"))

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:
@sean-davis-490
United States
See below.

On Wed, Jun 29, 2011 at 11:36 AM, Hooiveld, Guido wrote:
>> con <- dbConnect(SQLite(), "GEOmetadb.sqlite")
>> dbDisconnect(con)

Sorry, Guido. I missed this point in my first pass through your email. Here, you disconnect the connection.

> [1] TRUE
>>
>> rs <- dbGetQuery(con,paste("select gsm,supplementary_file",
> +                            "from gsm where gpl='GPL96'",
> +                            "and supplementary_file like '%CEL.gz'"))

Here, you are using a disconnected connection object (con) to perform the query; it should fail with an error message but probably not a segmentation fault. If you DO NOT disconnect the connection object, this query works fine. Perhaps RSQLite should have a check of the connection object to make sure that it is connected to avoid the segmentation fault?

Sean

> sessionInfo()
R version 2.13.0 Under development (unstable) (2011-02-26 r54608)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] RSQLite_0.9-4 DBI_0.2-5
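For anyone landing on this thread later, the ordering issue described above can be sketched as follows. This is only an illustration under stated assumptions: it uses a small in-memory table as a stand-in for GEOmetadb.sqlite (the rows are made up), `safe_query()` is a hypothetical helper and not part of RSQLite or GEOmetadb, and `dbIsValid()` assumes a reasonably current DBI/RSQLite rather than the 0.9-4 release discussed here.

```r
library(DBI)

# Toy stand-in for GEOmetadb.sqlite: an in-memory database with a tiny,
# made-up 'gsm' table (assumption: only the three columns used below).
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "gsm", data.frame(
  gsm = c("GSM1", "GSM2", "GSM3"),
  gpl = c("GPL96", "GPL96", "GPL570"),
  supplementary_file = c("ftp://example.org/GSM1.CEL.gz",
                         NA,
                         "ftp://example.org/GSM3.CEL.gz"),
  stringsAsFactors = FALSE
))

# Hypothetical helper: refuse to run a query on a closed connection,
# so the failure is an ordinary R error.
safe_query <- function(con, sql) {
  if (!dbIsValid(con)) {
    stop("connection is closed; call dbConnect() again before querying")
  }
  dbGetQuery(con, sql)
}

sql <- paste("select gsm,supplementary_file",
             "from gsm where gpl='GPL96'",
             "and supplementary_file like '%CEL.gz'")

# Run the query while the connection is still open ...
rs <- safe_query(con, sql)
dim(rs)

# ... and disconnect only once all queries are done.
dbDisconnect(con)

# Reusing the closed connection is caught by the guard; the error
# message is captured as a string instead of stopping the script.
err <- tryCatch(safe_query(con, sql), error = function(e) conditionMessage(e))
err
```

The point is simply that dbDisconnect() belongs at the very end of the session, after the last dbGetQuery() call.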
Hi Sean,

Indeed, you are correct! Due to my inexperience with performing database queries, and clumsy interpretation of some example code I inadvertently closed the connection to the database... Well, after omitting this line the example is working fine now! :)

One thing though, through GEOmetadb I locate 17751 CEL files for GPL96, whereas a query directly @ GEO indicates it hosts a considerably larger number of these arrays (i.e. Samples (28011)). Any idea what may cause this discrepancy?
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL96

Thanks again for your assistance,
Guido
On Thu, Jun 30, 2011 at 8:50 AM, Hooiveld, Guido wrote:
> One thing though, through GEOmetadb I locate 17751 CEL files for GPL96, whereas a query directly @ GEO indicates it hosts a considerably larger number of these arrays (i.e. Samples (28011)). Any idea what may cause this discrepancy?
> http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL96

If you change your query to "%CEL%" rather than "%CEL.gz", you will pick up another 4k samples, but there are still about 6k samples without CEL files. GEO has not always required raw data. As an aside, the GEOmetadb database is updated often, but not continuously, so there will be a bit of a lag (sample numbers may be lower in GEOmetadb than in GEO proper).

Sean
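To see how the two LIKE patterns can give different counts, here is a small sketch on made-up rows. Everything in the toy table is an assumption for illustration (in particular the row that lists a CEL file followed by a second file); it is not a statement about what the real GEOmetadb records look like. `count_like()` is a hypothetical helper, and the `params` argument assumes a current DBI/RSQLite.

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Made-up 'gsm' rows, all on GPL96, with different supplementary_file values.
dbWriteTable(con, "gsm", data.frame(
  gsm = c("GSM1", "GSM2", "GSM3", "GSM4"),
  gpl = "GPL96",
  supplementary_file = c(
    "ftp://example.org/GSM1.CEL.gz",                                 # ends in CEL.gz
    "ftp://example.org/GSM2.CEL.gz; ftp://example.org/GSM2.CHP.gz",  # CEL listed, but not last
    "ftp://example.org/GSM3.txt.gz",                                 # no CEL file
    NA                                                               # no supplementary file
  ),
  stringsAsFactors = FALSE
))

# Hypothetical helper: count GPL96 samples whose supplementary_file
# matches a LIKE pattern, passed as a bound parameter.
count_like <- function(pattern) {
  dbGetQuery(
    con,
    "select count(*) as n from gsm
      where gpl = 'GPL96' and supplementary_file like ?",
    params = list(pattern)
  )$n
}

n_strict <- count_like("%CEL.gz")  # value must END with CEL.gz
n_loose  <- count_like("%CEL%")    # CEL may appear anywhere in the value
n_total  <- dbGetQuery(con, "select count(*) as n from gsm where gpl = 'GPL96'")$n

c(strict = n_strict, loose = n_loose, total = n_total)

dbDisconnect(con)
```

On this toy table the looser pattern finds one more sample than the strict one, and the remaining rows have no CEL file at all, which mirrors the three groups described above.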