Downloading two-channel microarray data from ArrayExpress
Andy91 ▴ 60

Dear Bioconductor,

I am currently analysing several microarray datasets, some of which are two-channel arrays. For consistency, I would like to download the raw two-channel data, so I turned to the ArrayExpress R package. However, the download fails most of the time with the following error:

> mtab5095.eset <- ArrayExpress("E-MTAB-5095")
trying URL 'https://www.ebi.ac.uk/arrayexpress/files/A-MEXP-2104/A-MEXP-2104.adf.txt'
Content type 'text/plain' length 5941699 bytes (5.7 MB)
==================================================
downloaded 5.7 MB

trying URL 'https://www.ebi.ac.uk/arrayexpress/files/E-MTAB-5095/E-MTAB-5095.sdrf.txt'
Content type 'text/plain' length 13791 bytes (13 KB)
==================================================
downloaded 13 KB

trying URL 'https://www.ebi.ac.uk/arrayexpress/files/E-MTAB-5095/E-MTAB-5095.idf.txt'
Content type 'text/plain' length 5589 bytes
==================================================
downloaded 5589 bytes

Copying raw data files

trying URL 'https://www.ebi.ac.uk/arrayexpress/files/E-MTAB-5095/E-MTAB-5095.raw.1.zip'
Content type 'application/zip' length 113238265 bytes (108.0 MB)
==================================================
downloaded 108.0 MB

Unpacking data files
ArrayExpress: Reading pheno data from SDRF
Error in `row.names<-.data.frame`(`*tmp*`, value = c("US10020348_252800421889_S03_GE2_107_Sep09_1_1.txt",  :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘US10020348_252800419789_S01_GE2_107_Sep09_1_1.txt’, ‘US10020348_252800420180_S01_GE2_107_Sep09_1_4.txt’, ‘US10020348_252800421147_S01_GE2_107_Sep09_1_1.txt’, ‘US10020348_252800421889_S03_GE2_107_Sep09_1_1.txt’, ‘US10020348_252800421889_S03_GE2_107_Sep09_1_3.txt’

Based on the output, it seems that the row names are not unique. Does that mean that the uploader made a mistake when preparing the .sdrf file? Interestingly, when I run the example from the article describing the ArrayExpress package (Kauffmann et al. 2009), I get exactly the same error:

> AEset <- ArrayExpress("E-ATMX-18")
trying URL 'https://www.ebi.ac.uk/arrayexpress/files/A-ATMX-8/A-ATMX-8.adf.txt'
Content type 'text/plain' length 3743536 bytes (3.6 MB)
==================================================
downloaded 3.6 MB

trying URL 'https://www.ebi.ac.uk/arrayexpress/files/E-ATMX-18/E-ATMX-18.sdrf.txt'
Content type 'text/plain' length 21142 bytes (20 KB)
==================================================
downloaded 20 KB

trying URL 'https://www.ebi.ac.uk/arrayexpress/files/E-ATMX-18/E-ATMX-18.idf.txt'
Content type 'text/plain' length 6889 bytes
==================================================
downloaded 6889 bytes

Copying raw data files

trying URL 'https://www.ebi.ac.uk/arrayexpress/files/E-ATMX-18/E-ATMX-18.raw.1.zip'
Content type 'application/zip' length 28842045 bytes (27.5 MB)
==================================================
downloaded 27.5 MB

Unpacking data files
ArrayExpress: Reading pheno data from SDRF
Error in `row.names<-.data.frame`(`*tmp*`, value = c("4.txt", "4.txt",  :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘10.txt’, ‘11.txt’, ‘12.txt’, ‘2.txt’, ‘3.txt’, ‘4.txt’, ‘5.txt’, ‘6.txt’, ‘7.txt’, ‘8.txt’, ‘9.txt’

Does anybody else have the same issue with two-channel microarrays from ArrayExpress, and has anybody figured out how to fix it?
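
A possible explanation, offered only as an unconfirmed assumption: in a two-channel experiment the SDRF may list one row per channel (for example Cy3 and Cy5), so both rows point to the same raw data file. Using the file name as row name would then fail in exactly the way shown above. The sketch below reproduces the error on a toy table in base R and shows one way to collapse the table to one row per file. The column names `Source.Name`, `Label` and `Array.Data.File` and all values are illustrative, not taken from the actual SDRF files, and this is not necessarily what the ArrayExpress package does internally.

```r
# Toy stand-in for an SDRF table of a two-channel experiment.
# Column names and values are assumptions made for illustration only.
sdrf <- data.frame(
  Source.Name      = c("sampleA", "refA", "sampleB", "refB"),
  Label            = c("Cy3", "Cy5", "Cy3", "Cy5"),
  Array.Data.File  = c("1.txt", "1.txt", "2.txt", "2.txt"),
  stringsAsFactors = FALSE
)

# Setting the file names as row names fails because each file occurs twice.
# The attempt is made on a copy so that 'sdrf' itself stays untouched.
err <- tryCatch({
  tmp <- sdrf
  suppressWarnings(rownames(tmp) <- tmp$Array.Data.File)
  NA_character_
}, error = function(e) conditionMessage(e))
print(err)

# One possible workaround: split the table by channel and merge the halves,
# which gives one row per raw data file with channel-specific columns.
cy3 <- sdrf[sdrf$Label == "Cy3", ]
cy5 <- sdrf[sdrf$Label == "Cy5", ]
pheno <- merge(cy3, cy5, by = "Array.Data.File",
               suffixes = c(".Cy3", ".Cy5"))

# The file names are now unique, so they can serve as row names.
rownames(pheno) <- pheno$Array.Data.File
print(pheno)
```

If the real SDRF files have this layout, the same reshaping could be applied after reading the .sdrf.txt file manually with `read.delim()`, but the actual column names would need to be checked first.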

> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] pd.u133.x3p_3.12.0         pd.hg.u133a.2_3.12.0       convert_1.52.0             marray_1.54.0              ArrayExpress_1.36.1       
 [6] limma_3.32.10              primeviewcdf_2.18.0        pd.huex.1.0.st.v2_3.14.1   hthgu133pluspmcdf_2.18.0   hgu133plus2cdf_2.18.0     
[11] hgu133acdf_2.18.0          pd.hugene.1.0.st.v1_3.14.1 DBI_0.7                    RSQLite_2.0                affy_1.54.0               
[16] oligo_1.40.2               Biostrings_2.44.2          XVector_0.16.0             IRanges_2.10.5             S4Vectors_0.14.7          
[21] oligoClasses_1.38.0        BiocInstaller_1.26.1       GEOquery_2.42.0            Biobase_2.36.2             BiocGenerics_0.22.1       
[26] rafalib_1.0.0             

loaded via a namespace (and not attached):
[1] SummarizedExperiment_1.6.5 splines_3.4.2              lattice_0.20-35            blob_1.1.0                 XML_3.98-1.9              
 [6] rlang_0.1.2                bit64_0.9-7                RColorBrewer_1.1-2         affyio_1.46.0              matrixStats_0.52.2        
[11] GenomeInfoDbData_0.99.0    foreach_1.4.3              zlibbioc_1.22.0            codetools_0.2-15           memoise_1.1.0             
[16] knitr_1.17                 ff_2.2-13                  GenomeInfoDb_1.12.3        AnnotationDbi_1.38.2       preprocessCore_1.38.1     
[21] Rcpp_0.12.13               DelayedArray_0.2.7         affxparser_1.48.0          bit_1.1-12                 digest_0.6.12             
[26] GenomicRanges_1.28.6       grid_3.4.2                 tools_3.4.2                bitops_1.0-6               RCurl_1.95-4.8            
[31] tibble_1.3.4               pkgconfig_2.0.1            Matrix_1.2-11              httr_1.3.1                 iterators_1.0.8           

 

Hi. I am getting the same error. Did you solve it?
