Question: GEOquery bug, NCBI or package?
11 months ago, cschot3 wrote:

I recently updated R and some packages and ran into a problem when using GEOquery to download a dataset. When extracting the expression set, I noticed that the column names were no longer the sample names but expression values, as if the first data row had been taken as the header. Is this a bug in GEOquery, Biobase, or NCBI's database? Are there any solutions?

 

$`GSE38958_series_matrix.txt.gz`
ExpressionSet (storageMode: lockedEnvironment)
assayData: 21787 features, 115 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: 6.156901 6.317044 ... 6.314153 (115 total)
  varLabels: title geo_accession ... data_row_count (35 total)
  varMetadata: labelDescription
featureData
  featureNames: 2315633 2315674 ... 7385696 (21787 total)
  fvarLabels: ID GB_LIST ... category (12 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL5175 

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bindrcpp_0.2.2      GEOquery_2.48.0     Biobase_2.40.0      BiocGenerics_0.26.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0         rstudioapi_0.8     bindr_0.1.1        xml2_1.2.0         magrittr_1.5      
 [6] hms_0.4.2          tidyselect_0.2.5   R6_2.3.0           rlang_0.3.0.1      dplyr_0.7.8       
[11] tools_3.5.1        assertthat_0.2.0   tibble_1.4.2       crayon_1.3.4       BiocManager_1.30.4
[16] purrr_0.2.5        readr_1.2.1        tidyr_0.8.2        curl_3.2           glue_1.3.0        
[21] limma_3.36.5       stringi_1.2.4      compiler_3.5.1     pillar_1.3.0       pkgconfig_2.0.2

 

Tags: biobase, geoquery, ncbi, geo

This is a bug caused by a change in the behavior of readr's read_tsv function (see https://github.com/tidyverse/readr/issues/925). There is an ongoing conversation about how to handle it. Sorry for the instability.

Written 11 months ago by Sean Davis

Just a quick update: the RStudio team has a working fix, and it is going through code review. Once the review is complete and the new readr package is released, the GEOquery problem will be resolved.

Written 11 months ago by Sean Davis
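
For anyone unsure whether their own download is affected, a minimal sketch of a check follows. It is not part of GEOquery or Biobase: looks_like_gsm is a hypothetical helper, and it assumes that correctly parsed sample names are GEO sample accessions of the form "GSM" followed by digits. The "bad" names are the ones shown in the question; the "good" names are made up for illustration.

```r
# Hypothetical helper (not part of GEOquery or Biobase): returns TRUE only if
# every name looks like a GEO sample accession such as "GSM100001".
looks_like_gsm <- function(sample_names) {
  length(sample_names) > 0 && all(grepl("^GSM[0-9]+$", sample_names))
}

# Sample names as reported in the question: expression values, not accessions.
bad_names <- c("6.156901", "6.317044", "6.314153")

# Made-up accessions, for illustration only.
good_names <- c("GSM100001", "GSM100002", "GSM100003")

looks_like_gsm(bad_names)   # FALSE
looks_like_gsm(good_names)  # TRUE
```

With a real download, the same check could be applied as looks_like_gsm(Biobase::sampleNames(gse[[1]])), where gse is an assumed name for the list returned by getGEO. If the check fails, the parsed matrix may also be missing its first data row, so simply renaming the columns is probably not a safe repair.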