R: R: Is there a way to extract some fields data from HTML pages through any R function?
@mauedealiceit-3511
It works if the web file address is of the type "http://". It does not work if the web file address is of the type "ftp://".

> outFile <- read.xls("ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls")
Error in xls2csv(xls, sheet, verbose = verbose, ..., perl = perl) :
  Unable to read xls file 'ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls'.
Error in file.exists(tfn) : invalid 'file' argument

But the file does exist, as the following shows:

> download.file("ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls","outFile")
trying URL 'ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls'
ftp data connection made, file length 2563072 bytes
opened URL
downloaded 2.4 Mb

Can the two steps (download + read.xls) be performed in a single command line?

Thank you,
Maura

-----Original Message-----
From: r-help-bounces@r-project.org on behalf of Daniel Nordlund
Sent: Mon 06/07/2009 20.45
To: r-help@stat.math.ethz.ch
Subject: Re: [R] R: R: Is there a way to extract some fields data from HTML pages through any R function ?

> -----Original Message-----
> From: r-help-bounces@r-project.org
> [mailto:r-help-bounces@r-project.org] On Behalf Of mauede@alice.it
> Sent: Sunday, July 05, 2009 11:28 PM
> To: Martin Morgan
> Cc: r-help@stat.math.ethz.ch
> Subject: [R] R: R: Is there a way to extract some fields data
> from HTML pages through any R function ?
>
> It helps. But it is overly sophisticated.
> I have already downloaded and used the Excel file containing
> the validated stuff.
>
> Since there are R commands to download gzip as well as FASTA
> files, I wonder whether it is possible to
> automatically download the Excel file from
> http://mirecords.umn.edu/miRecords/download.php
> Actually the latter may not be the actual file URL because it
> is necessary to click on the word "here" to download the file.
>
> Thank you,
> Maura
>

Maura,

I haven't seen a response to your question (however, I may just have missed it, or you may have received an off-line response). I went to the URL above and found that the Excel file is at

http://mirecords.umn.edu/miRecords/download_data.php?v=1

I think you could use the read.xls() function from the gdata package to get the file, something like this:

library(gdata)
df <- read.xls("http://mirecords.umn.edu/miRecords/download_data.php?v=1")

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
@gabor-grothendieck-2332
I will send you offline an enhancement for read.xls that accepts ftp connections.

On Sun, Jul 26, 2009 at 11:32 AM, <mauede at alice.it> wrote:
> It works if the web file address is of the type "http://".
> It does not work if the web file address is of the type "ftp://".
>> outFile <- read.xls("ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls")
> Error in xls2csv(xls, sheet, verbose = verbose, ..., perl = perl) :
>   Unable to read xls file 'ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls'.
> Error in file.exists(tfn) : invalid 'file' argument
>
> But the file does exist, as the following shows:
>> download.file("ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls","outFile")
> trying URL 'ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls'
> ftp data connection made, file length 2563072 bytes
> opened URL
> downloaded 2.4 Mb
>
> Can the two steps (download + read.xls) be performed in a single command line?
>
> Thank you,
> Maura
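The two steps discussed above (download.file followed by read.xls) can also be wrapped in a small helper so that the call site is a single line. What follows is only a minimal sketch and not the read.xls enhancement mentioned in the reply: the names read_remote, fileext and fetch are hypothetical, and it assumes that download.file can reach the ftp URL, as the transcript in the question suggests.

```r
# Hypothetical helper: download a remote file to a temporary location,
# then hand the local copy to any reader function (e.g. gdata::read.xls).
read_remote <- function(url, reader, ..., fileext = "",
                        fetch = utils::download.file) {
  tmp <- tempfile(fileext = fileext)
  on.exit(unlink(tmp), add = TRUE)   # remove the temporary copy afterwards

  # mode = "wb" keeps binary files such as .xls intact on Windows
  status <- fetch(url, destfile = tmp, mode = "wb", quiet = TRUE)
  if (!identical(as.integer(status), 0L)) {
    stop("download failed for ", url)
  }

  reader(tmp, ...)                   # extra arguments go to the reader
}

# Intended use (needs network access, the gdata package and Perl):
# library(gdata)
# mirna <- read_remote(
#   "ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls",
#   read.xls, fileext = ".xls")
```

Because the reader is passed in as an argument, the same helper should work with read.csv or read.delim for other file types; the fetch argument exists mainly so that the download step can be swapped out or stubbed when testing offline.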