Importing data from GEOquery
@kini-aditya-m-2878
Last seen 9.6 years ago
Hi,

I am repeatedly getting this error message when I try to import a file from GEO. Here is the code:

> gsm.1 <- getGEO("GSM3612")
trying URL 'http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?targ=self&acc=GSM3612&form=text&view=full'
Content type 'geo/text' length unknown
opened URL
downloaded 1.5 Mb

File stored at:
C:\Users\Aditya\AppData\Local\Temp\Rtmp5ZePB5/GSM3612.soft
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  scan() expected 'a real', got '!sample_table_end'
> gsm.1
Error: object "gsm.1" not found

Please let me know what the problem is.

Thanks,
Adi
@vincent-j-carey-jr-4
Last seen 6 weeks ago
United States
Please read the posting guide before posting. You should identify your system and version of R.

I can verify that this problem occurs on Windows R 2.7.0 and on Mac OS X R 2.8.0 r45836 with RCurl 0.8-3 and GEOquery 2.5.0.

However, example(getGEO) works on those systems, so I wonder if the problem is with the GSM3612 files rather than GEOquery. We have no way of verifying that the files on GEO are parseable short of trying to read them.

---
Vince Carey, PhD
Assoc. Prof Med (Biostatistics)
Harvard Medical School
Channing Laboratory - ph 6175252265 fa 6177311541
181 Longwood Ave Boston MA 02115 USA
stvjc at channing.harvard.edu

On Wed, 25 Jun 2008, Kini, Aditya M wrote:
> > gsm.1 <- getGEO("GSM3612")
> [...]
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   scan() expected 'a real', got '!sample_table_end'
On Wed, Jun 25, 2008 at 3:57 PM, Vincent Carey 525-2265 <stvjc at channing.harvard.edu> wrote:
> however example(getGEO) works on those systems, so i wonder
> if the problem is with the GSM3612 files rather than GEOquery.
> we have no way of verifying that the files on GEO are parseable
> short of trying to read them.

Thanks, Vince, for checking into the problem. It looks like those files have a multibyte character (that was supposed to be a degree symbol, from the looks of it) that is problematic in at least some locales. I don't know of an easy way to fix the problem, as the files at NCBI are supposed to all be in the same character encoding (UTF-8). If anyone knows of a solution, let me know.

Sean
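One possible way to check a downloaded SOFT file for the kind of stray byte described above is sketched here in base R. This is only an illustration, not something GEOquery does itself: the file contents are a made-up stand-in for GSM3612.soft, the guess that the offending byte is a Latin-1 degree sign (0xB0) is an assumption, and validUTF8() needs a much newer R (3.3.0 or later) than the versions in this thread.

```r
# Hypothetical stand-in for a SOFT file: one header line carries a lone
# Latin-1 degree sign (byte 0xB0), which is not valid UTF-8 by itself.
soft_path <- tempfile(fileext = ".soft")
con <- file(soft_path, open = "wb")
writeBin(c(charToRaw("!Sample_description = hybridized at 45"),
           as.raw(0xB0),
           charToRaw("C\n!sample_table_begin\nID_REF\tVALUE\n1\t0.5\n!sample_table_end\n")),
         con)
close(con)

# Read the lines as they are, without any re-encoding.
txt <- readLines(soft_path, warn = FALSE)

# Flag the lines that are not valid UTF-8.
bad <- which(!validUTF8(txt))

# Assumption: the invalid lines are really Latin-1, so convert only those.
fixed <- txt
fixed[bad] <- iconv(txt[bad], from = "latin1", to = "UTF-8")

# Write a cleaned copy; useBytes = TRUE keeps the UTF-8 bytes untouched.
clean_path <- tempfile(fileext = ".soft")
writeLines(fixed, clean_path, useBytes = TRUE)

cat("Lines with invalid UTF-8:", bad, "\n")
```

If the guess about the source encoding is right, the cleaned copy could then be handed to the parser from disk, for example with getGEO(filename = clean_path), instead of downloading the record again. Whether that parses cleanly for GSM3612 itself has not been checked here.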
On Wed, 25 Jun 2008, Sean Davis wrote:
> Thanks, Vince, for checking into the problem. It looks like those
> files have a multibyte character (that was supposed to be a degree
> symbol, from the looks of it) that is problematic in at least some
> locales. I don't know of an easy way to fix the problem, as the files
> at NCBI are supposed to all be in the same character encoding (UTF-8).
> If anyone knows of a solution, let me know.

I just checked it on a more recent version of R and we get more info:

downloaded 1.5 Mb

File stored at:
/tmp/RtmpjQn5oh/GSM3612.soft
Error in make.names(col.names, unique = TRUE) :
  invalid multibyte string 29
In addition: Warning messages:
1: In grep("!\\w+_table_begin", txt[i], perl = TRUE) :
  input string 1 is invalid in this locale
2: In grep("!\\w+_table_begin", txt[i], perl = TRUE) :
  input string 1 is invalid in this locale
3: In grep("^#", txt, perl = TRUE) :
  input string 42 is invalid in this locale
4: In grep("^#", txt, perl = TRUE) :
  input string 67 is invalid in this locale
5: In grep("!\\w*_", txt, perl = TRUE, value = TRUE) :
  input string 42 is invalid in this locale
6: In grep("!\\w*_", txt, perl = TRUE, value = TRUE) :
  input string 67 is invalid in this locale
7: In grep(leader, txt) : input string 42 is invalid in this locale
8: In grep(leader, txt) : input string 67 is invalid in this locale

> sessionInfo()
R version 2.8.0 Under development (unstable) (--)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] tools stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] GEOquery_2.5.0 RCurl_0.9-3 Biobase_2.1.7

While Sean has diagnosed the problem, I report the sessionInfo() output because the locale may also contribute to problems in such endeavors. People reporting data read problems should take care to provide all this information in the future.
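For anyone putting together such a report, the locale details can be collected with a few base R calls. The following is a minimal sketch; which fields matter for a given encoding problem is a judgment call, and the variable names are only illustrative.

```r
# Character-type locale, which governs how R interprets multibyte text.
ctype <- Sys.getlocale("LC_CTYPE")

# l10n_info() reports whether the session is multibyte and/or UTF-8.
info <- l10n_info()

cat("LC_CTYPE:        ", ctype, "\n")
cat("Multibyte locale:", info$MBCS, "\n")
cat("UTF-8 locale:    ", info[["UTF-8"]], "\n")

# Full session details (R version, platform, attached packages).
print(sessionInfo())
```

Pasting this output alongside the failing call and the error message gives readers the R version, platform, locale and package versions in one place.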