Dirk -
Thank you for following up on this. It's possible I'm just
misunderstanding the code, but I grabbed the source for RPostgreSQL
ver. 0.1-7, and as far as I can tell, the work for dbWriteTable (in
postgresqlWriteTable) still works by creating a temporary file. (Lines
649-660 of PostgreSQLSupport.R.)
- Elliot
On Fri, Jan 14, 2011 at 11:56:47AM -0600, Dirk Eddelbuettel wrote:
>
> On 14 January 2011 at 12:46, Sean Davis wrote:
> |
> |
> | On Fri, Jan 14, 2011 at 12:36 PM, Martin Morgan <mtmorgan at="" fhcrc.org=""> wrote:
> |
> | On 01/14/2011 06:25 AM, Elliot Joel Bernstein wrote:
> | > Martin -
> | >
> | > Thanks for your reply. Is RdbiPgSQL no longer maintained at
all? I
> |
> | Hi Elliot --
> |
> | (sorry for the last post, oops). The package hasn't had any
substantial
> | change since 2006. I'm cc'ing the currently listed maintainer,
and also
> | the maintainer of RPostgreSQL, who might have additional
insights.
> |
> |
> |
> | Perhaps Dirk will comment directly but, with regards to
RPosgreSQL, there is a
> | google code site and an issue for dbWriteTable was already filed.
> |
> |
http://code.google.com/p/rpostgresql/issues/detail?id=4&can=1
> |
> | The resolution is that Dirk would appreciate a patch (which I have
not been
> | able to supply--someone else?).
>
> Did the OP try a current version? NEWS follows below. Tomoaki has
done
> some wonderful work and we are in much better shape than we were,
say, one
> year ago.
>
> That said, there are always open issues. See the Google Code site;
and I cc
> out dev list.
>
> Dirk
>
> Version 0.1-7 -- 2010-10-17
>
> o Several potential buffer overruns were fixed
>
> o dbWriteTable now writes a data.frame to database through a
network
> connection rather than a temporary file. Note that row_names
may be
> changed in future releases. Also, passing in filenames
instead of
> data.frame is not supported at this time.
>
> o When no host is specified, a connection to the PostgreSQL
server
> is made via UNIX domain socket (just like psql does)
>
> o Table and column names are case sensitive, and identifiers
are escaped
> or quoted appropriately, so that any form of table/column
names can be
> created, searched, or removed, including upper-, lower- and
mixed-case.
>
> o nullOk in dbColumnInfo has a return value of NA when the
column does
> not correspond to a column in the table. The utility of
nullOk is
> doubtful but not removed at this time.
>
> o Correct Windows getpid() declaration (with thanks to Brian
D. Ripley)
>
> o A call of as.POSIXct() with a time format string wrongly
passed to TZ
> has been corrected; this should help with intra-day
timestamps (with
> thanks to Steve Eick)
>
> o Usage of tmpdir has been improved on similarly to Linux
(with thanks
> to Robert McGehee)
>
> Dirk
>
>
> |
> | Sean
> |
> | ?
> |
> | Martin
> |
> | have tried RPostgreSQL, but it has an even worse problem. In
that
> | package, dbWriteTable is implemented by calling write.table on
the data
> | frame to write it to a temporary file, and then load it into
the
> | database, which means that the formatting options set in R
affect the
> | result. Specifically, write.table by default formats some
numbers in
> | scientific notation, which causes the database load to fail.
> | >
> | > After looking into the problem with RdbiPgSQL a little more,
it
> | > seems
> | that it's caused by dbDisconnect calling the same function
that is
> | registered as the finalizer. I think it could be easily solved
by using
> | an indicator variable to keep track of whether PQfinish has
been called.
> | (I can also just call rm instead of dbDisconnect, but it's a
little
> | disconcerting that the package contains a function that causes
memory
> | errors when used as intended.)
> |
> | > - Elliot
> | >
> | >
> | > On Tue, Jan 11, 2011 at 08:12:21PM -0800, Martin Morgan
wrote:
> | >> On 01/11/2011 11:37 AM, Elliot Joel Bernstein wrote:
> | >>> I am having a problem using the dbWriteTable function in
package
> | RdbiPgSQL to insert data into a PostgreSQL table.
Specifically, the
> | following code sometimes works, but sometimes fails with an
error message
> | indicating that there is "no database connection":
> | >>>
> | >>>
> | >>>
> | >>>
=====================================================================
> | >>> dbtest.R
> | >>>
=====================================================================
> | >>>
> | >>> require(RdbiPgSQL)
> | >>
> | >> Hi Elliot -- not so much an answer, but unless RdbiPgSQL is
central you
> | >> might try the CRAN package RPostgreSQL, which is actively
maintained.
> | >>
> | >> Martin
> | >>
> | >>>
> | >>> test <- function(conn, n) {
> | >>>
> | >>> ? dbSendQuery(conn, "create temp table temp (x int, y
int)")
> | >>>
> | >>> ? n <- 10000
> | >>> ? dat <- data.frame(x=sample(1000,n,replace=T),
y=sample(1000,n,replace
> | =T))
> | >>>
> | >>> ? for (i in 1:dim(dat)[1]) {
> | >>>
> | >>> ? ? err <- try(dbSendQuery(conn, sprintf("insert into temp
values
> | (%d,%d)", dat$x[i], dat$y[i])))
> | >>> ? ? if (inherits(err, "try-error")) {
> | >>> ? ? ? print(conn)
> | >>> ? ? ? break
> | >>> ? ? }
> | >>>
> | >>> ? }
> | >>>
> | >>> ? print(dbGetQuery(conn, "select count(1) from temp"))
> | >>>
> | >>> }
> | >>>
> | >>> conn <- dbConnect(PgSQL(), dbname="gf", host="gf-host")
> | >>> test(conn, 10000)
> | >>> dbDisconnect(conn)
> | >>>
> | >>>
=====================================================================
> | >>>
> | >>>
> | >>>
> | >>> The following is sample output from multiple runs of this
script:
> | >>>
> | >>>
> | >>>
> | >>>
=====================================================================
> | >>> Sample output
> | >>>
=====================================================================
> | >>>
> | >>> R version 2.12.0 (2010-10-15)
> | >>> Copyright (C) 2010 The R Foundation for Statistical
Computing
> | >>> ISBN 3-900051-07-0
> | >>> Platform: x86_64-redhat-linux-gnu (64-bit)
> | >>>
> | >>> R is free software and comes with ABSOLUTELY NO WARRANTY.
> | >>> You are welcome to redistribute it under certain
conditions.
> | >>> Type 'license()' or 'licence()' for distribution details.
> | >>>
> | >>> ? Natural language support but running in an English
locale
> | >>>
> | >>> R is a collaborative project with many contributors.
> | >>> Type 'contributors()' for more information and
> | >>> 'citation()' on how to cite R or R packages in
publications.
> | >>>
> | >>> Type 'demo()' for some demos, 'help()' for on-line help,
or
> | >>> 'help.start()' for an HTML browser interface to help.
> | >>> Type 'q()' to quit R.
> | >>>
> | >>>> source("dbtest.R")
> | >>> Loading required package: RdbiPgSQL
> | >>> Loading required package: Rdbi
> | >>> ? count
> | >>> 1 10000
> | >>>> source("dbtest.R")
> | >>> Error in dbSendQuery.PgSQL.conn(conn, sprintf("insert into
temp values
> | (%d,%d)", ?:
> | >>> ? No database connection
> | >>> status = 1
> | >>> database.name = ?a
> | >>> host.name = p??
> | >>> options = 8}G
> | >>> password = P?
> | >>> port = ?
> | >>> tty = ??
> | >>> socket = -1
> | >>> client.encoding = -1
> | >>> backend.pid = 0
> | >>> Error in dbSendQuery.PgSQL.conn(conn, ...) : No database
connection
> | >>>
> | >>>
=====================================================================
> | >>>
> | >>>
> | >>>
> | >>> Here is the result of a traceback:
> | >>>
> | >>>
> | >>>
> | >>>
=====================================================================
> | >>> traceback
> | >>>
=====================================================================
> | >>>
> | >>>> traceback()
> | >>> 11: .Call("PgSQLsendQuery", conn, query, PACKAGE =
"RdbiPgSQL")
> | >>> 10: dbSendQuery.PgSQL.conn(conn, ...) at dbtest.R#21
> | >>> 9: dbSendQuery(conn, ...) at dbtest.R#21
> | >>> 8: dbGetResult(dbSendQuery(conn, ...)) at dbtest.R#21
> | >>> 7: dbGetQuery.PgSQL.conn(conn, "select count(1) from
temp") at dbtest.R
> | #21
> | >>> 6: dbGetQuery(conn, "select count(1) from temp") at
dbtest.R#21
> | >>> 5: print(dbGetQuery(conn, "select count(1) from temp")) at
dbtest.R#21
> | >>> 4: test(conn, 10000)
> | >>> 3: eval.with.vis(expr, envir, enclos)
> | >>> 2: eval.with.vis(ei, envir)
> | >>> 1: source("dbtest.R")
> | >>>
> | >>>
=====================================================================
> | >>>
> | >>>
> | >>>
> | >>> And here is my sessionInfo:
> | >>>
> | >>>
> | >>>
> | >>>
=====================================================================
> | >>> sessionInfo
> | >>>
=====================================================================
> | >>>
> | >>>> sessionInfo()
> | >>> R version 2.12.0 (2010-10-15)
> | >>> Platform: x86_64-redhat-linux-gnu (64-bit)
> | >>>
> | >>> locale:
> | >>> ?[1] LC_CTYPE=en_US.UTF-8 ? ? ? LC_NUMERIC=C
> | >>> ?[3] LC_TIME=en_US.UTF-8 ? ? ? ?LC_COLLATE=en_US.UTF-8
> | >>> ?[5] LC_MONETARY=C ? ? ? ? ? ? ?LC_MESSAGES=en_US.UTF-8
> | >>> ?[7] LC_PAPER=en_US.UTF-8 ? ? ? LC_NAME=C
> | >>> ?[9] LC_ADDRESS=C ? ? ? ? ? ? ? LC_TELEPHONE=C
> | >>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> | >>>
> | >>> attached base packages:
> | >>> [1] stats ? ? graphics ?grDevices utils ? ? datasets
?methods ? base
> | >>>
> | >>> other attached packages:
> | >>> [1] RdbiPgSQL_1.24.0 Rdbi_1.24.0
> | >>>
> | >>>
=====================================================================
> | >>>
> | >>>
> | >>>
> | >>> I am using version 8.4.4 of the PostgreSQL-devel
libraries, and the
> | server is running version 8.4.1. The problem does not always
occur on the
> | second run; I am sometimes able to run the script multiple
times before it
> | eventually fails. Also, it does not always fail at the same
row of the
> | dataframe, nor is there anything obviously wrong with the data
values on
> | the row where it fails. I would appreciate any help in
diagnosing the
> | problem.
> | >>>
> | >>> Thank you very much.
> | >>>
> | >>> - Elliot
> | >>>
> | >>> _______________________________________________
> | >>> Bioconductor mailing list
> | >>> Bioconductor at r-project.org
> | >>>
https://stat.ethz.ch/mailman/listinfo/bioconductor
> | >>> Search the archives:
http://news.gmane.org/
> | gmane.science.biology.informatics.conductor
> | >>
> | >>
> | >> --
> | >> Computational Biology
> | >> Fred Hutchinson Cancer Research Center
> | >> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> | >>
> | >> Location: M1-B861
> | >> Telephone: 206 667-2793
> |
> |
> | --
> | Computational Biology
> | Fred Hutchinson Cancer Research Center
> | 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> |
> | Location: M1-B861
> | Telephone: 206 667-2793
> |
> | _______________________________________________
> | Bioconductor mailing list
> | Bioconductor at r-project.org
> |
https://stat.ethz.ch/mailman/listinfo/bioconductor
> | Search the archives:
http://news.gmane.org/
> | gmane.science.biology.informatics.conductor
> |
> |
>
> --
> Dirk Eddelbuettel | edd at debian.org |
http://dirk.eddelbuettel.com