GEOquery-Problem related to importing data set
@roopa-subbaiaih-5490
Last seen 9.4 years ago
United States
Hi All,

I am trying to import a few affy datasets into Bioconductor. The example given in the R script works fine, but that is not the case with the dataset I need to import.

gds <- getGEO(filename=system.file("extdata/GDS2478.soft.gz",package="GEOquery"))
Error in read.table(con, sep = "\t", header = FALSE, nrows = nseries) :
  invalid 'nlines' argument
In addition: Warning messages:
1: In file(fname, "r") :
  file("") only supports open = "w+" and open = "w+b": using the former
2: In file(con, "r") :
  file("") only supports open = "w+" and open = "w+b": using the former
3: In file(fname, "r") :
  file("") only supports open = "w+" and open = "w+b": using the former

This is what I get. Please let me know how to rectify this.

Thanks,
Roopa
affy affy • 1.4k views
@james-w-macdonald-5106
Last seen 7 hours ago
United States
Hi Roopa,

On 9/10/2012 4:08 PM, Roopa Subbaiaih wrote:
> gds <- getGEO(filename=system.file("extdata/GDS2478.soft.gz",package="GEOquery"))

Unless you put the file there, I doubt you are pointing to the right place. Note that the vignette uses the extdata directory for data that come with the GEOquery package; that isn't where you should put things.

Here is what I got:

> getGEO("GDS2478")
File stored at:
C:\Users\BIOINF~1\AppData\Local\Temp\RtmpYDVgJB/GDS2478.soft.gz
> dat <- getGEO(filename="C:/Users/BIOINF~1/AppData/Local/Temp/RtmpYDVgJB/GDS2478.soft.gz")
> class(dat)
[1] "GDS"
attr(,"package")
[1] "GEOquery"

And to see what is in a GDS object,

?GDS-class
and
?GEOData-class

and from there I get

> head(Table(dat))
     ID_REF IDENTIFIER GSM148887 GSM148888 GSM148889 GSM148890 GSM148892
1 1007_s_at       DDR1   760.999   754.827   758.415   746.281   776.563
2   1053_at       RFC2    93.978    113.67    110.74   114.423   140.641
3    117_at      HSPA6    39.268    40.784    42.345    37.779    44.122
4    121_at       PAX8    189.36   194.713   208.084   180.112    223.06
5 1255_g_at     GUCA1A    11.864     8.857    10.754     9.224    10.396
6   1294_at       UBA7    70.985    78.527    79.646    70.705    77.107

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
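A note on the likely cause, offered as a sketch and not as documented GEOquery behavior: system.file() returns an empty string when the requested file is not found in the package, and the file("") warnings in the original error are consistent with getGEO() having been handed such an empty path. The helper check_soft_path() below is hypothetical (it is not part of GEOquery), and the file path in the commented-out usage is a placeholder.

```r
# Hypothetical helper (not part of GEOquery): fail early with a readable
# message when a SOFT file path is empty or does not exist.
check_soft_path <- function(path) {
  if (!nzchar(path) || !file.exists(path)) {
    stop("SOFT file not found: '", path, "'", call. = FALSE)
  }
  path
}

# system.file() returns "" when the requested file is not in the package.
# The "stats" package is used here only because it is always installed.
missing_path <- system.file("extdata/no-such-file.soft.gz", package = "stats")
identical(missing_path, "")

# Intended use (requires GEOquery and a file you downloaded yourself;
# the path below is a placeholder):
# library(GEOquery)
# gds <- getGEO(filename = check_soft_path("C:/data/GDS2478.soft.gz"))
```

Checking the path first turns the confusing read.table() error into a message that names the missing file.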
Thank you very much. I tried to follow what you did. This is what I get:

> getGEO("GDS2478")
Error in download.file(myurl, destfile, mode = mode, quiet = TRUE, method = getOption("download.file.method.GEOquery")) :
  cannot open URL 'ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/SOFT/GDS/GDS2478.soft.gz'

Am I missing something? Thank you for the quick response.

Roopa
Hi Roopa,

On 9/10/2012 4:50 PM, Roopa Subbaiaih wrote:
> getGEO("GDS2478")
> Error in download.file(myurl, destfile, mode = mode, quiet = TRUE,
> method = getOption("download.file.method.GEOquery")) :
> cannot open URL
> 'ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/SOFT/GDS/GDS2478.soft.gz'
> Am I missing something?

Well, there is the obvious: you are connected to the internet, yes? If so, does the following code download some data?

read.table("ftp://ftp.stats.ox.ac.uk/pub/datasets/csb/ch11b.dat")

If not, does this help (here I am assuming you are on Windows)?

setInternet2(NA)
read.table("ftp://ftp.stats.ox.ac.uk/pub/datasets/csb/ch11b.dat")

If so, you should then be able to use getGEO().

Best,

Jim
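The connectivity check suggested above can also be wrapped so that it reports TRUE or FALSE instead of stopping with an error. This is only a sketch: can_reach() is a hypothetical helper, the 10-second timeout is an arbitrary choice, and whether R can open ftp:// URLs at all depends on how R was built and on the local network.

```r
# Hypothetical helper: TRUE if `address` can be opened for reading,
# FALSE if opening the connection fails for any reason.
can_reach <- function(address, timeout = 10) {
  old <- options(timeout = timeout)   # limit how long R waits
  on.exit(options(old), add = TRUE)   # restore the previous timeout
  tryCatch({
    con <- suppressWarnings(url(address, open = "rb"))
    close(con)
    TRUE
  }, error = function(e) FALSE)
}

# Usage with the URL from the error message in this thread (needs network):
# can_reach("ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/SOFT/GDS/GDS2478.soft.gz")
```

If this returns FALSE for the GEO address while the same address opens in a web browser, the problem is more likely in R's network settings than in GEOquery. If it fails everywhere, a firewall or proxy is a plausible suspect.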
I am using Windows XP 32-bit. I tried your suggestion and this is what I get:

> setInternet2(NA)
[1] FALSE
> read.table("ftp://ftp.stats.ox.ac.uk/pub/datasets/csb/ch11b.dat")
Error in file(file, "rt") : cannot open the connection

I am fairly new to R. Please let me know if I can try any other option.

Thank you,
Roopa

--
Roopa Shree Subbaiaih
Post Doctoral Fellow
Department of Orthopaedics
School of Medicine
Case Western Reserve University
Cleveland, OH-44106
Tel:+1 216 368 1380
Got it. Looks like the firewall in our department was creating trouble. Thanks for your patience.

Roopa
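For readers who hit the same firewall problem, one possible workaround is sketched below. It rests on two assumptions: the error message earlier in the thread shows GEOquery passing getOption("download.file.method.GEOquery") to download.file(), so that option appears to select the download method, and getGEO() accepts a destdir argument for where downloads are stored. The directory name geo_cache is hypothetical, and no download method is guaranteed to get through a particular firewall.

```r
# Choose the download method GEOquery hands to download.file().
# "libcurl" is one of the methods download.file() documents; whether it
# works behind a given firewall is site-specific.
old <- options(download.file.method.GEOquery = "libcurl")
getOption("download.file.method.GEOquery")

# Keep downloads in a directory you control, so a file fetched once (or
# copied over from another machine) can be reused without network access.
geo_cache <- file.path(tempdir(), "geo_cache")   # hypothetical location
dir.create(geo_cache, showWarnings = FALSE, recursive = TRUE)
local_copy <- file.path(geo_cache, "GDS2478.soft.gz")

# Requires GEOquery; the download branch also requires network access:
# library(GEOquery)
# gds <- if (file.exists(local_copy)) {
#   getGEO(filename = local_copy)
# } else {
#   getGEO("GDS2478", destdir = geo_cache)
# }

options(old)  # restore the previous setting
```

If no method works from inside R, downloading GDS2478.soft.gz by other means and passing its location to getGEO(filename = ...) avoids the network step entirely, which is the approach shown in the first answer.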