NAs returned for GEO GPL3921 platform probesets
Zach Roe ▴ 10
@zach-roe-11189

Hi,

I am re-downloading a GSE series that I have worked with for several years, and I suddenly hit an error in code that I have also used without issue for several years. As it turns out, the probeset information for the GPL3921 platform is not downloading properly. I am not sure whether this is a GEOquery download issue or whether the GPL SOFT file on GEO is corrupt (it was last updated on the GEO website on 8/12/16).

I have tried downloading several times (deleting the cached copy each time), and either no probeset information is downloaded, or only partial probeset information is downloaded, with the rest left as NA.

This GSE has two datasets attached to it; there is no problem with the second GPL, GPL4685.

Case #1: No probeset information is downloaded (see "featureData: none" below), and a warning is produced.

> gse50444 <- getGEO('gse50444', GSEMatrix = TRUE)

Download warning produced:

Warning message:
In readLines(con, 1) :
  incomplete final line found on '/var/folders/4z/w7_jy74n1nx7sf4hdcn4h4tm0000gn/T//RtmpWDrYrH/GPL3921.soft'

No featureData is associated with the ExpressionSet:

> gse50444.gpl3921 <- gse50444[[1]]
> gse50444.gpl3921
ExpressionSet (storageMode: lockedEnvironment)
assayData: 22277 features, 13 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: GSM1219374 GSM1219375 ... GSM1219390 (13 total)
  varLabels: title geo_accession ... data_row_count (35 total)
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation: GPL3921 
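
The "incomplete final line" warning suggests that the cached GPL3921.soft file was cut off mid-transfer. A minimal sketch of a sanity check, assuming a complete platform SOFT file closes its data table with a `!platform_table_end` line; `soft_is_complete` is a hypothetical helper written for illustration, not part of GEOquery:

```r
# Hypothetical helper (not part of GEOquery): returns TRUE if a GPL SOFT file
# looks complete, i.e. it contains the table-begin marker and its last
# non-empty line is the table-end marker.
soft_is_complete <- function(path) {
  # A truncated file triggers the "incomplete final line" warning; silence it
  # here because the check below detects truncation explicitly.
  lines <- suppressWarnings(readLines(path))
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]
  if (length(lines) == 0) {
    return(FALSE)
  }
  has_begin <- any(lines == "!platform_table_begin")
  ends_ok <- lines[length(lines)] == "!platform_table_end"
  has_begin && ends_ok
}

# Possible usage, assuming getGEO() cached the file in the session temp dir:
# soft_is_complete(file.path(tempdir(), "GPL3921.soft"))
```

If the check returns FALSE, deleting the cached file and downloading again is probably the simplest remedy.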

Case #2: No download warning is produced, but only part of the probeset information is downloaded (different attempts produced different partial downloads, e.g. the first 53 probes, the first 10 probes, etc.), and NAs are introduced for the remainder.

> gse50444 <- getGEO('gse50444', GSEMatrix = TRUE)
> gse50444.gpl3921 <- gse50444[[1]]
> gse50444.gpl4685 <- gse50444[[2]]
> gse50444.gpl3921
ExpressionSet (storageMode: lockedEnvironment)
assayData: 22277 features, 13 samples
  element names: exprs
protocolData: none
phenoData
  sampleNames: GSM1219374 GSM1219375 ... GSM1219390 (13 total)
  varLabels: title geo_accession ... data_row_count (35 total)
  varMetadata: labelDescription
featureData
  featureNames: 1007_s_at 1053_at ... NA.22263 (22277 total)
  fvarLabels: ID GB_ACC ... Gene Ontology Molecular Function (16 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL3921

Partial probeset information downloaded:

> featureData(gse50444.gpl3921)$ID[1:50]
[1] 1007_s_at 1053_at   117_at    121_at    1255_g_at 1294_at   1316_at   1320_at
[9] 1405_i_at 1431_at   1438_at   1487_at   1494_f_at <NA>      <NA>      <NA>   
[17] <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>   
[25] <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>   
[33] <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>   
[41] <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>   
[49] <NA>      <NA>

> exprs(gse50444.gpl3921)[1:25, 1:5]
           GSM1219374  GSM1219375  GSM1219376  GSM1219377  GSM1219382
1007_s_at  0.41385892 -0.63996374 -1.73610412 -1.57517607 -1.07331220
1053_at   -0.12878988 -1.00403804 -0.73231870 -1.16391881 -0.77008163
117_at    -1.26070358  1.26937376  1.00263645  0.95988674  0.85146114
121_at     1.77211067  1.66987587 -0.98424704 -1.59976481 -0.50925136
1255_g_at  2.10582626  0.12738699 -0.12883987 -0.31430216  0.76752995
1294_at   -1.35346212 -1.54300280 -1.51740293 -0.78928326  0.08942978
1316_at    0.64331735  0.88807124  0.72578215  0.57944596 -2.87895589
1320_at   -2.48933703 -0.68146255  0.04693923 -1.29023574  1.42692626
1405_i_at -1.24342923 -0.98998659 -0.76558138 -0.89682371  1.56075510
1431_at    1.11565231  0.19878199  0.82711198  0.26595467  0.34545898
1438_at    1.39958352  1.15349628  1.25051712  1.27550538 -0.99547150
1487_at    0.07843472  0.85933824  1.45393915  1.96911334 -2.01832498
1494_f_at  0.46256203 -1.68187670  0.68478356  0.57685428  0.16596064
NA        -0.11693547  0.43283197 -1.72592572 -1.81656636  0.59357536
NA.1       2.11614697  2.09774366 -0.46119476 -0.67307001  0.17831501
NA.2      -0.87134590 -0.01708055 -2.50694865 -1.15106562 -0.04434486
NA.3      -0.40180877 -0.43061066 -0.81721768 -0.99795365 -0.25247241
NA.4       1.14204017 -0.75941455 -1.89897048 -1.57962924  1.43090640
NA.5       0.37910692  0.12075928 -0.05456004  0.97155238 -0.64881493
NA.6       1.43481442  0.10490435  2.01492490  1.28410018 -0.17900697
NA.7       1.43346276  0.52050705  1.46460961  1.89953704 -0.96485518
NA.8       1.78761277  0.91176840 -0.07419414  0.03137913  1.46269737
NA.9       0.58698465  0.78548994  1.77385158  1.52612317 -1.36141323
NA.10      0.98628179  0.81391549  1.72908608  1.69564966 -1.22285385
NA.11      0.03509782  0.43000220 -2.50945545 -1.18538264  0.38360561
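
Because Case #2 raises no warning, a truncated platform table can slip through unnoticed. A minimal sketch of a check that could be run after each download, assuming the feature IDs and the expression row names are passed in as plain vectors; `check_feature_ids` is a hypothetical helper, not a GEOquery or Biobase function:

```r
# Hypothetical helper: summarises how many feature IDs are missing and how many
# expression rows carry placeholder names such as "NA", "NA.1", "NA.2", ...
check_feature_ids <- function(feature_ids, expr_rownames) {
  feature_ids <- as.character(feature_ids)   # IDs may arrive as a factor
  n_missing <- sum(is.na(feature_ids))
  n_na_rows <- sum(grepl("^NA(\\.[0-9]+)?$", expr_rownames))
  list(
    n_features    = length(feature_ids),
    n_missing_ids = n_missing,
    n_na_rownames = n_na_rows,
    ok            = n_missing == 0 && n_na_rows == 0
  )
}

# Possible usage with the objects above (requires Biobase):
# check_feature_ids(fData(gse50444.gpl3921)$ID,
#                   rownames(exprs(gse50444.gpl3921)))
```

A result with `ok` equal to FALSE would indicate that the annotation is incomplete and the download should be repeated.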

Session Info is below.

Thank you.

 

R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
 [1] stats4    parallel  grid      stats     graphics  grDevices utils     datasets 
 [9] methods   base     

other attached packages:
 [1] HGNChelper_0.3.1     org.Hs.eg.db_3.3.0   AnnotationDbi_1.34.4 IRanges_2.6.1       
 [5] S4Vectors_0.10.2     lumi_2.24.0          mclust_5.2           limma_3.28.17       
 [9] affy_1.50.0          cluster_2.0.4        reshape_0.8.5        ggplot2_2.1.0       
[13] gplots_3.0.1         RColorBrewer_1.1-2   GEOquery_2.38.4      Biobase_2.32.0      
[17] BiocGenerics_0.18.0  BiocInstaller_1.22.3 gridExtra_2.2.1      dendextend_1.2.0    

 

 

Zach Roe ▴ 10
@zach-roe-11189

FYI, I received an answer from GEO support and wanted to share that I have confirmed this was simply due to a network error.

I reran my code on a different day and no longer encountered the problem, so it seems to have been a network problem specific to the day I was running my code.
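
For anyone who runs into the same thing, one possible guard against transient network failures is to retry the download and treat warnings as failures. This is only a minimal sketch: `with_retries` is a hypothetical helper, and the retry count and wait time are arbitrary assumptions.

```r
# Hypothetical helper: calls fun() up to max_tries times and returns its value
# from the first attempt that finishes without a warning or an error.
with_retries <- function(fun, max_tries = 3, wait_seconds = 5) {
  last_condition <- NULL
  for (i in seq_len(max_tries)) {
    result <- tryCatch(
      list(value = fun(), ok = TRUE),
      warning = function(w) list(condition = w, ok = FALSE),
      error   = function(e) list(condition = e, ok = FALSE)
    )
    if (result$ok) {
      return(result$value)
    }
    last_condition <- result$condition
    message("Attempt ", i, " failed: ", conditionMessage(last_condition))
    if (i < max_tries) {
      Sys.sleep(wait_seconds)
    }
  }
  stop("All ", max_tries, " attempts failed; last message: ",
       conditionMessage(last_condition))
}

# Possible usage (requires GEOquery and network access):
# gse50444 <- with_retries(function() getGEO("GSE50444", GSEMatrix = TRUE))
```

Treating every warning as a failure may be too strict for some workflows, and a cached partial file would likely need to be deleted between attempts, as I did manually above. Note also that this would only catch Case #1, since Case #2 produced no warning.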
