Question: A recent update to readr causes issues loading data with GEOquery 2.50.0
stun24 wrote, 9 months ago:

Hello all,

This post is to report an issue I have discovered, not to ask for assistance or help.

Today I found an issue with GEOquery that appears after upgrading the readr package to 1.2.1. I have posted it on the readr GitHub issue page (https://github.com/tidyverse/readr/issues/925).

Some of my code stopped working after I upgraded readr from 1.1.1 to 1.2.1. It took me a good couple of hours to identify and test this issue. The upgrade to readr 1.2.1 results in a parsing error when using GEOquery::getGEO: column names are no longer parsed correctly.

I have confirmed this issue on Windows and Linux, using different R versions (see the GitHub issue linked above).

Example of this error on my Ubuntu 16 machine:

> test <- GEOquery::getGEO('GSE76885', GSEMatrix = T, AnnotGPL=TRUE)
Setting options('download.file.method.GEOquery'='auto')
Setting options('GEOquery.inmemory.gpl'=FALSE)
Found 1 file(s)
GSE76885_series_matrix.txt.gz
trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE76nnn/GSE76885/matrix/GSE76885_series_matrix.txt.gz'
Content type 'application/x-gzip' length 10006900 bytes (9.5 MB)
==================================================
downloaded 9.5 MB

Parsed with column specification:
cols(
  .default = col_double(),
  A_23_P100001 = col_character()
)
See spec(...) for full column specifications.
File stored at: 
/tmp/RtmpJ5L0WL/GPL6480.annot.gz
Warning message:
Duplicated column names deduplicated: '-0.615' => '-0.615_1' [65], '-0.267' => '-0.267_1' [95], '-0.303' => '-0.303_1' [96], '-0.105' => '-0.105_1' [101], '0.089' => '0.089_1' [107], '0.146' => '0.146_1' [110], '-0.184' => '-0.184_1' [124], '-0.45' => '-0.45_1' [149], '-0.16' => '-0.16_1' [154], '-0.047' => '-0.047_1' [155], '0.019' => '0.019_1' [157], '-0.074' => '-0.074_1' [158], '-0.113' => '-0.113_1' [159], '0.009' => '0.009_1' [168], '-0.149' => '-0.149_1' [170], '-0.085' => '-0.085_1' [175], '0.096' => '0.096_1' [176], '-0.281' => '-0.281_1' [177], '-0.096' => '-0.096_1' [178], '0.248' => '0.248_1' [179], '-0.308' => '-0.308_1' [181], '-0.22' => '-0.22_1' [190], '-0.306' => '-0.306_1' [195] 
> Biobase::sampleNames(test[[1]])[1:10]
 [1] "0.04"   "-0.173" "-0.288" "0.089"  "-0.227" "-0.254" "-0.184" "0.453" 
 [9] "0.264"  "-0.179"
>

The last command should instead return the GSM sample accessions, beginning with [1] "GSM2039774" "GSM2039775" "GSM2039776" "GSM2039777" "GSM2039778"
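
For anyone who wants to catch this early in a script, here is a minimal sketch of a sanity check on the sample names. The helper `looks_like_gsm` is a hypothetical name of my own, not part of GEOquery or Biobase, and it assumes that sample names for a series matrix should all be GSM accessions.

```r
# Hypothetical helper (not part of GEOquery): TRUE only if every name
# looks like a GEO sample accession such as "GSM2039774".
looks_like_gsm <- function(sample_names) {
  length(sample_names) > 0 && all(grepl("^GSM[0-9]+$", sample_names))
}

bad  <- c("0.04", "-0.173", "-0.288")   # names returned by the failing run
good <- c("GSM2039774", "GSM2039775")   # names expected for GSE76885

looks_like_gsm(bad)    # FALSE
looks_like_gsm(good)   # TRUE
```

In a real script this could be applied right after the download, for example `stopifnot(looks_like_gsm(Biobase::sampleNames(test[[1]])))`, so that a misparsed header fails loudly instead of propagating.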

My session info output:

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bindrcpp_0.2.2

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0          tidyr_0.8.2         crayon_1.3.4       
 [4] dplyr_0.7.8         assertthat_0.2.0    R6_2.3.0           
 [7] magrittr_1.5        pillar_1.3.0        stringi_1.2.4      
[10] rlang_0.3.0.1       curl_3.2            limma_3.34.9       
[13] xml2_1.2.0          tools_3.4.4         readr_1.2.1        
[16] Biobase_2.38.0      glue_1.3.0          purrr_0.2.5        
[19] hms_0.4.2           parallel_3.4.4      compiler_3.4.4     
[22] BiocGenerics_0.24.0 pkgconfig_2.0.2     bindr_0.1.1        
[25] tidyselect_0.2.5    tibble_1.4.2        GEOquery_2.46.15 

Please note, I have confirmed this behavior with GEOquery 2.50.0 on Windows as well. Reverting to readr 1.1.1 resolves the issue.
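
A rough sketch of how one might check the installed readr version against the two versions named in this report. The function `readr_status` is a hypothetical helper; it only knows about 1.2.1 (affected) and 1.1.1 (works), and treats every other version as untested because I have not tried them. The commented downgrade line assumes the remotes package is installed.

```r
# Versions taken from this report: 1.2.1 misparses, 1.1.1 works.
# Any other readr version was not tested here and is reported as "untested".
readr_status <- function(version) {
  v <- package_version(version)
  if (v == package_version("1.2.1")) {
    "affected"
  } else if (v == package_version("1.1.1")) {
    "works"
  } else {
    "untested"
  }
}

readr_status("1.2.1")   # "affected"

# Report on the locally installed readr, if there is one.
if (requireNamespace("readr", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("readr"))
  message("Installed readr ", installed, ": ", readr_status(installed))
}

# To downgrade (assumes 'remotes' is installed; needs network access):
# remotes::install_version("readr", version = "1.1.1")
```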

 
Answer: A recent update to readr causes issues loading data with GEOquery 2.50.0
James W. MacDonald (United States) wrote, 9 months ago:

You have mixed current release versions of packages with an old R version. That mix cannot arise if you install with biocLite (on the old version of R you are running) or with BiocManager (if you update).

We can only support people who are using the current version of R/BioC (which is R-3.5.2 and Bioc3.8), and categorically cannot support installations where end users have mixed and matched packages that were never intended (by us) to be combined, as there are a nearly infinite number of such combinations.

Your best bet is to install the current version of R, and then BiocManager, and then all the packages you would like using BiocManager::install, as detailed here.
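
Roughly, and only as a sketch, those steps look like the following once a current R is installed. The wrapper name `install_geoquery` is just for illustration; the calls inside it need network access, so the function is defined here but not run.

```r
# Sketch of the suggested setup; run it on a current R release.
install_geoquery <- function() {
  # Install BiocManager from CRAN if it is not already available.
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  # Install GEOquery at the version matched to this Bioconductor release.
  BiocManager::install("GEOquery")
  # Report packages that are out of date or too new for this release.
  BiocManager::valid()
}

# install_geoquery()   # uncomment to run
```

BiocManager::valid() is worth running on its own as well, since it lists packages that do not match the Bioconductor release in use.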

If after doing that you still have problems, please do let us know, in a new thread.
